DATABASE MINING CONCEPTS



Data mining is the discovery of new information, in the form of patterns or rules, from vast amounts of data. To be useful, data mining must be carried out efficiently on large files and databases.

Goals of Data Mining

  • Prediction: Determine how certain attributes will behave in the future. For example, how much sales volume a store will generate in a given period.

  • Identification: Identify patterns in data. For example, newlywed couples tend to spend more money on furniture.

  • Classification: Partition data into classes. For example, customers can be classified into categories that differ in their shopping behavior.

  • Optimization: Optimize the use of limited resources such as time, space, money or materials. For example, how best to use advertising to maximize sales or profits.
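The prediction goal above can be illustrated with a minimal sketch. It assumes a straight-line trend fitted by ordinary least squares, which is only one of many possible prediction methods and is not prescribed by these notes. The function names and the sales figures are hypothetical.

```python
def fit_trend(values):
    """Fit y = intercept + slope * t by least squares, with t = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_t = (n - 1) / 2
    mean_y = sum(values) / n
    cov = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(values))
    var = sum((t - mean_t) ** 2 for t in range(n))
    slope = cov / var
    intercept = mean_y - slope * mean_t
    return intercept, slope


def predict_next(values):
    """Extrapolate the fitted line one period past the last observation."""
    intercept, slope = fit_trend(values)
    return intercept + slope * len(values)


# Hypothetical monthly sales volumes (made-up numbers, for illustration only).
monthly_sales = [100.0, 110.0, 120.0, 130.0]
forecast = predict_next(monthly_sales)
print(forecast)  # 140.0 for this perfectly linear series
```

Real sales data is rarely this regular, so a production forecast would normally account for seasonality and noise as well.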

Types of Knowledge Discovered during Data Mining

  • Association rules: For example, when a male shopper buys a new car, he is likely to buy a car CD.

  • Classification hierarchies: For example, mutual funds may be classified into three categories: growth, income and stable.

  • Sequence patterns: Sequence patterns are temporal associations. For example, if the mortgage interest rate drops, house sales will increase by a certain percentage within a six-month period.

  • Patterns within time series: For example, the behavior of stock prices over time.

  • Detection of similarity, or segmentation: For example, health data may reveal similarities among subgroups of people.
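As an illustration of the association-rule item in the list above, the strength of a rule such as "car => car CD" is commonly measured by its support and confidence. These two measures are standard, but the notes do not define them, so the sketch below is an assumption about how such a rule might be scored. The function names and the transaction data are hypothetical.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if items <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) from the transactions."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / base


# Hypothetical purchase records (made-up data, for illustration only).
purchases = [
    {"car", "car CD"},
    {"car"},
    {"car", "car CD", "floor mats"},
    {"floor mats"},
]

rule_support = support(purchases, {"car", "car CD"})          # 2 of 4 records
rule_confidence = confidence(purchases, {"car"}, {"car CD"})  # 2 of 3 car buyers
print(rule_support, rule_confidence)
```

Here the rule holds in half of all records, and about two thirds of the car buyers also bought a car CD.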

Techniques of Data Mining

Data mining is closely related to Knowledge Discovery in Databases (KDD). The techniques of KDD include pattern recognition, clustering, classification trees, case-based reasoning, AI techniques, statistics, neural networks and others.
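To make one of the listed techniques concrete, the sketch below clusters one-dimensional values with k-means. The notes name clustering only in general, so the choice of k-means is an assumption, and the function name, starting centroids and data are hypothetical.

```python
def kmeans_1d(points, centroids, max_iter=100):
    """Cluster 1-D points around the given starting centroids (k-means)."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: the assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


# Hypothetical measurements forming two obvious groups (made-up data).
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, groups = kmeans_1d(values, centroids=[1.0, 12.0])
print(centers)  # [2.0, 11.0]
print(groups)   # [[1.0, 2.0, 3.0], [10.0, 11.0, 12.0]]
```

The result of k-means depends on the starting centroids, so they are passed in explicitly here to keep the example deterministic.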

Applications of Data Mining

  • Marketing

  • Finance

  • Manufacturing

  • Health Care

Commercial Data Mining Tools

Intelligent Miner from IBM applies classification and association rules to detect rules and patterns and to make predictions.

Enterprise Miner from SAS applies decision trees, neural networks, clustering techniques, statistics, and association rules.

Many new tools have come onto the market in recent years, making data mining a very active area of research and development.