What is Data Mining?

Data mining is the process of sifting through large quantities of information to gain insight into the underlying processes. A classic example comes from law enforcement, where officers may comb through reams of information (phone records, credit card receipts, noted meetings, and so forth) to identify the relationships in a crime syndicate.

Another form of data mining is running large volumes of transactional data through a process that finds patterns in the transactions. An example is crunching through years of sales receipts for a grocery store to identify the buying patterns of its customers. This type of data mining is a natural application of OLAP technologies, because it depends on aggregating data. An interesting aspect of this use of the OLAP engine is that you most likely won't be operating on a cube. Instead, you will create a data-mining model, train it on transactional data, and then use it to process further transactional data. To some degree, data-mining engines coexist in the same box as multidimensional cubes, but the two are only tangentially related.
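To make the grocery-receipt example concrete, here is a minimal sketch in Python that counts how often pairs of items appear together in the same basket. The receipts, the item names, and the pair_support function are all hypothetical and invented for illustration; a real data-mining engine would use a trained model over far larger data rather than this simple co-occurrence count.

```python
from collections import Counter
from itertools import combinations

# Hypothetical receipts: each inner list is one customer's basket.
receipts = [
    ["bread", "milk", "eggs"],
    ["bread", "milk"],
    ["milk", "eggs"],
    ["bread", "butter"],
]

def pair_support(baskets):
    """Return the fraction of baskets that contain each unordered item pair."""
    counts = Counter()
    for basket in baskets:
        # Deduplicate and sort so ("bread", "milk") and ("milk", "bread") match.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    total = len(baskets)
    return {pair: n / total for pair, n in counts.items()}

support = pair_support(receipts)

# List the pairs from most to least frequently bought together.
for pair, share in sorted(support.items(), key=lambda kv: -kv[1]):
    print(pair, round(share, 2))
```

Pairs with a high share are candidates for a buying pattern worth investigating, for instance items that might be shelved or promoted together.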

Data mining is a powerful technology with great potential to help companies focus on the most important information in their data warehouses. Data mining tools can help predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions.

Data mining is primarily used today by companies with a strong consumer focus: retail, financial, communication, and marketing organizations. It enables these companies to determine relationships among "internal" factors such as price, product positioning, or staff skills, and "external" factors such as economic indicators, competition, and customer demographics. It also enables them to determine the impact of these factors on sales, customer satisfaction, and corporate profits. Finally, it enables them to "drill down" from summary information into the detailed transactional data behind it.
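The idea of drilling down can be sketched in a few lines of Python: first aggregate transactions into a summary, then filter back to the detail rows behind one summary figure. The records, field names, and the summarize and drill_down helpers below are hypothetical, chosen only to illustrate the idea.

```python
from collections import defaultdict

# Hypothetical transaction records; field names are assumptions.
transactions = [
    {"region": "East", "product": "coffee", "amount": 12.0},
    {"region": "East", "product": "tea", "amount": 8.0},
    {"region": "West", "product": "coffee", "amount": 20.0},
]

def summarize(rows, key):
    """Total the 'amount' field for each distinct value of `key`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row["amount"]
    return dict(totals)

def drill_down(rows, key, value):
    """Return the detail rows that make up one summary figure."""
    return [row for row in rows if row[key] == value]

summary = summarize(transactions, "region")
east_detail = drill_down(transactions, "region", "East")

print(summary)
for row in east_detail:
    print(row)
```

The summary gives one total per region, and the drill-down returns the individual transactions that add up to the chosen region's total.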

Data mining consists of five major elements:

  • Extract, transform, and load transaction data onto the data warehouse system.
  • Store and manage the data in a multidimensional database system.
  • Provide data access to business analysts and information technology professionals.
  • Analyze the data with application software.
  • Present the data in a useful format, such as a graph or table.
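The five elements above can be sketched as a tiny end-to-end pipeline. This is only an illustration under stated assumptions: the CSV data, table name, and function names are hypothetical, and an in-memory SQLite database stands in for the data warehouse (it is a relational store, not a multidimensional one).

```python
import csv
import io
import sqlite3

# Hypothetical raw export; a real system would read from operational sources.
RAW_CSV = """date,store,product,amount
2024-01-05,North,apples,3.50
2024-01-05,North,bread,2.25
2024-01-06,South,apples,4.00
"""

def extract(text):
    """Element 1 (extract): parse the raw export into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Element 1 (transform): convert amounts to numbers, fix column order."""
    return [(r["date"], r["store"], r["product"], float(r["amount"])) for r in rows]

def load(conn, records):
    """Elements 1 and 2 (load, store): write the records into the stand-in warehouse."""
    conn.execute(
        "CREATE TABLE sales (date TEXT, store TEXT, product TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", records)
    conn.commit()

def analyze(conn):
    """Elements 3 and 4 (access, analyze): query total sales per store."""
    cur = conn.execute(
        "SELECT store, SUM(amount) FROM sales GROUP BY store ORDER BY store"
    )
    return cur.fetchall()

def present(results):
    """Element 5 (present): format the results as a plain-text table."""
    lines = [f"{'store':<8}{'total':>8}"]
    lines += [f"{store:<8}{total:>8.2f}" for store, total in results]
    return "\n".join(lines)

conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
totals = analyze(conn)
print(present(totals))
```

Keeping each element in its own function mirrors the list: the storage layer or the presentation format can be swapped without touching the other steps.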