The days when everything was reduced to simple terminology like reports or queries are gone. One can see it in the market trends related to reporting or data, as well as in the jargon soup IT people use on a daily basis - Business Intelligence (BI), Data Mining (DM), Analytics, Data Science, Data Warehousing (DW), Machine Learning (ML), Artificial Intelligence (AI) and so on. What’s more confusing for users and other spectators is the ease with which all these concepts are used, sometimes interchangeably, and often it feels like nothing makes sense.
BI is used nowadays to refer to the technologies, architectures, methodologies, processes and practices used to transform data into what is meant to be meaningful and useful information. Since its early beginnings in the 60s, the “intelligence” in Business Intelligence has referred to the ability to apprehend the interrelationships of the facts to be processed (aka data) in such a way as to guide action towards a desired goal.
The main purpose of BI was and is to guide actions and provide a solid basis for decision making, an aspect not necessarily reflected in the way organizations use their BI infrastructure. Except for basic operational, tactical and strategic reports and metrics that reflect organizations’ goals to a higher or lower degree, BI often fails to provide the expected value. The causes are multiple, ranging from an organization’s maturity in devising a strategy and dividing it into SMART goals and objectives, to the misuse of technologies for the wrong purposes.
Despite its basic data analysis techniques, rich visualizations and navigation functionality, BI by itself often fails to deliver more than ordinary and already known information. Information becomes valuable when it brings novelty, when it can be easily transformed into knowledge, or even better, when knowledge is extracted directly. To address the limitations of BI, a series of techniques appeared in parallel, and in the 90s they were given the name Data Mining.
Mining is the process of obtaining something valuable from a resource. What DM tries to achieve as a process is the extraction of knowledge, in the form of patterns, from the data by categorizing, clustering, and identifying dependencies or anomalies. Compared with data analysis, the main characteristics of DM are that it is used to test models and hypotheses, and that it uses a set of semiautomatic and automatic out-of-the-box statistics packages, AI or predictive algorithms with applicability in different areas - Web, text, speech, business processes, etc.
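To make "identifying anomalies" a bit more tangible, here is a minimal sketch in Python. It flags values that lie far from the mean of a small series, using a z-score. The function name `find_anomalies`, the threshold of 2.0 and the `daily_orders` figures are all made up for illustration; real DM tooling uses considerably more robust techniques.

```python
from statistics import mean, stdev


def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score exceeds the threshold.

    Hypothetical helper for illustration only: a z-score is one of the
    simplest ways to spot values that deviate from the rest of the data.
    """
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        # All values are identical, so nothing stands out.
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Made-up sample data: orders per day, with one unusual day.
daily_orders = [102, 98, 105, 97, 101, 99, 103, 250, 100, 96]
print(find_anomalies(daily_orders))
```

With this sample only the day with 250 orders is flagged. Whether that day is an error, a fraud case or a successful campaign is something the technique cannot say, which is where understanding the data comes in.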
DM proved to be useful by making it possible to build models rooted in historical data, models which allowed predicting outcomes or behavior. However, the models are pretty basic and there’s always a threshold beyond which they can’t go. Furthermore, the costs of preparing the data and of the needed infrastructure seem to be high compared with the benefits data mining provides. There are scenarios in which DM proves beneficial, while in others it raises more challenges than it can solve. Privacy, security, misuse of information and the blind use of techniques without understanding the data or the models behind them are just some of these challenges.
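As a rough illustration of a model "rooted in historical data", the sketch below fits a straight line to a few past observations with ordinary least squares and uses it to predict the next one. The names `fit_line` and `predict`, as well as the monthly sales figures, are assumptions made up for this example. It also shows the limitation mentioned above: a model this basic can only extrapolate the trend it has already seen.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of the x/y co-deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new x value."""
    slope, intercept = model
    return slope * x + intercept


# Made-up historical data: sales per month for the first half of a year.
months = [1, 2, 3, 4, 5, 6]
sales = [10.2, 11.9, 14.1, 15.8, 18.3, 19.9]

model = fit_line(months, sales)
print(f"Predicted sales for month 7: {predict(model, 7):.1f}")
```

The prediction is only as good as the assumption that the future looks like the past. A seasonal dip or a new competitor is invisible to such a model.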
Information seems too common, while knowledge can become expensive to obtain. The middle way between the two found its future in another buzzword - analytics - the systematic analysis of data or statistics using specific mathematical methods. Analytics combines the agility of data analysis techniques with the power of the predictive and prescriptive techniques used in DM to discover patterns in the data. Analytics attempts to identify why something happens by using a chain of inferences resulting from analyzing and understanding the data. From another perspective, analytics seems to be a rebranded and slightly enhanced version of BI.