Data Science

Data Mining (Definitions)

“Data mining is the process of examining large amounts of aggregated data. The objective of data mining is to either predict what may happen based on trends or patterns in the data or to discover interesting correlations in the data.” (Microsoft Corporation, “Microsoft SQL Server 7.0 Data Warehouse Training Kit”, 2000)

“A class of undirected queries, often against the most atomic data, that seek to find unexpected patterns in the data. The most valuable results from data mining are clustering, classifying, estimating, predicting, and finding things that occur together. There are many kinds of tools that play a role in data mining. The principal tools include decision trees, neural networks, memory- and case-based reasoning tools, visualization tools, genetic algorithms, fuzzy logic, and classical statistics. Generally, data mining is a client of the data warehouse.” (Ralph Kimball & Margy Ross, “The Data Warehouse Toolkit” 2nd Ed., 2002)

“An information extraction activity whose goal is to discover hidden facts contained in databases. Using a combination of machine learning, statistical analysis, modeling techniques, and database technology, data mining finds patterns and subtle relationships in data and infers rules that allow the prediction of future results.” (Sharon Allen & Evan Terry, “Beginning Relational Data Modeling” 2nd Ed., 2005)

“The process of sifting through large amounts of data using pattern recognition, fuzzy logic, and other knowledge discovery statistical techniques to identify previously unknown, unsuspected, and potentially meaningful data content relationships and trends.” (DAMA International, “The DAMA Dictionary of Data Management”, 2011)

“Field of analytics with structured data. The model inference process minimally has four stages: data preparation, involving data cleaning, transformation and selection; initial exploration of the data; model building or pattern identification; and deployment, putting new data through the model to obtain their predicted outcomes.” (Gary Miner et al, “Practical Text Mining and Statistical Analysis for Non-structured Text Data Applications”, 2012)

“The practice of analyzing big data using mathematical models to develop insights, usually including machine learning algorithms as opposed to statistical methods.” (Brenda L Dietrich et al, “Analytics Across the Enterprise”, 2014)

“A class of analytical applications that help users search for hidden patterns in a data set. Data mining is a process of analyzing large amounts of data to identify data–content relationships. Data mining is one tool used in decision support special studies. This process is also known as data surfing or knowledge discovery.” (Daniel J Power & Ciara Heavin, “Decision Support, Analytics, and Business Intelligence” 3rd Ed., 2017)

“The process of identifying commercially useful patterns or relationships in databases or other computer repositories through the use of advanced statistical tools.” (Microsoft)

“Data mining is the process of discovering meaningful correlations, patterns and trends by sifting through large amounts of data stored in repositories. Data mining employs pattern recognition technologies, as well as statistical and mathematical techniques.” (Gartner)

“Data mining is the work of analyzing business information in order to discover patterns and create predictive models that can validate new business insights. […] Unlike data analytics, in which discovery goals are often not known or well defined at the outset, data mining efforts are usually driven by a specific absence of information that can’t be satisfied through standard data queries or reports. Data mining yields information from which predictive models can be derived and then tested, leading to a greater understanding of the marketplace.” (Informatica)
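
The four stages named in the Miner et al. definition above (data preparation, initial exploration, model building, deployment) can be made concrete with a small sketch. Everything in it is an illustrative assumption and not taken from any of the quoted sources: the toy records, the function names, and the choice of a least-squares line as the "model".

```python
from statistics import mean

# Hypothetical raw records as (x, y) pairs; None marks a missing value.
raw = [(1.0, 2.1), (2.0, 3.9), (None, 5.0), (3.0, 6.2), (4.0, 7.8)]


def prepare(records):
    """Stage 1, data preparation: drop records with missing fields."""
    return [(x, y) for x, y in records if x is not None and y is not None]


def explore(records):
    """Stage 2, initial exploration: basic summary statistics."""
    xs, ys = zip(*records)
    return {"n": len(records), "mean_x": mean(xs), "mean_y": mean(ys)}


def fit_line(records):
    """Stage 3, model building: ordinary least-squares line y = slope * x + intercept."""
    xs, ys = zip(*records)
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in records)
    slope = sxy / sxx
    return slope, my - slope * mx


def predict(model, x):
    """Stage 4, deployment: put a new observation through the fitted model."""
    slope, intercept = model
    return slope * x + intercept


clean = prepare(raw)
summary = explore(clean)
model = fit_line(clean)
print(summary)
print(predict(model, 5.0))
```

A real project would replace each stage with something far richer (type conversion, feature selection, model validation), but the order of the stages stays the same.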

