Data mining is the mining, or discovery, of new
information in the form of patterns or rules from vast
amounts of data. To be useful, data mining must
be carried out efficiently on large files and databases.
Goals of Data Mining
Prediction: Determine how certain attributes will behave
in the future. For example, how much sales volume a store
will generate in a given period.
Identification: Identify patterns in data. For example,
newlywed couples tend to spend more money on furniture.
Classification: Partition data into classes. For example,
customers can be classified into different categories with
different behavior in shopping.
Optimization: Optimize the use of limited resources such
as time, space, money or materials. For example, how best to
use advertising to maximize sales or profits.
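
As a rough illustration of the prediction goal, the sketch below fits a
straight-line trend to past sales by ordinary least squares and extrapolates
one period ahead. The monthly figures and the names fit_line and predict are
made up for this example; a real forecast would usually need a richer model
(seasonality, promotions and so on).

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx              # slope: change in sales per period
    a = mean_y - b * mean_x    # intercept
    return a, b


def predict(a, b, x):
    """Evaluate the fitted line at period x."""
    return a + b * x


# Hypothetical sales volume for five past months (illustrative data only).
months = [1, 2, 3, 4, 5]
sales = [100.0, 110.0, 120.0, 130.0, 140.0]

a, b = fit_line(months, sales)
forecast = predict(a, b, 6)    # extrapolated sales volume for month 6
print(forecast)
```

Because the sample data lie exactly on a line, the fit recovers a slope of 10
per month and the forecast simply continues that trend.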
Types of Knowledge Discovered during Data Mining
Association rules: For example, when a male shopper buys
a new car, he is also likely to buy a car CD player.
Classification hierarchies: For example, mutual funds may be
classified into three categories: growth, income and stable.
Sequence patterns: Sequence patterns are temporal associations.
For example, if the mortgage interest rate drops, house sales
will increase by a certain percentage within a six-month period.
Patterns within time series: For example, the behavior of
stock prices over time.
Detection of similarity, or segmentation: For example, health
data may indicate similarity among subgroups of people.
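
The strength of an association rule such as the car example above is commonly
measured by its support (the fraction of transactions containing all items in
the rule) and its confidence (the fraction of transactions containing the
left-hand side that also contain the right-hand side). The sketch below
computes both. The transactions and item names are hypothetical, chosen only
to mirror the example.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) from the transactions."""
    both = support(transactions, set(antecedent) | set(consequent))
    ante = support(transactions, antecedent)
    return both / ante if ante > 0 else 0.0


# Hypothetical purchase records (illustrative data only).
transactions = [
    {"car", "car_cd"},
    {"car", "car_cd", "floor_mats"},
    {"car"},
    {"floor_mats"},
]

rule_support = support(transactions, {"car", "car_cd"})
rule_confidence = confidence(transactions, {"car"}, {"car_cd"})
print(rule_support, rule_confidence)
```

Here the rule car => car_cd holds in 2 of 4 transactions (support 0.5), and
in 2 of the 3 transactions that contain a car (confidence about 0.67).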
Techniques of Data Mining
Data mining is closely related to Knowledge Discovery in Databases
(KDD). The techniques of KDD include pattern recognition,
clustering, classification trees, case-based reasoning,
AI techniques, statistics, neural networks and others.
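
To make one of these techniques concrete, here is a minimal sketch of
clustering using the k-means algorithm on two-dimensional points. It is only
one of many clustering methods, and the choices made here (seeding the
centroids with the first k points, a fixed number of iterations, the sample
points) are assumptions for illustration, not a recommended setup.

```python
def kmeans(points, k, iterations=10):
    """Plain k-means on 2-D points; the first k points seed the centroids."""
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, (x, y) in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: (x - centroids[c][0]) ** 2
                + (y - centroids[c][1]) ** 2,
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return labels, centroids


# Hypothetical data: two loose groups of points (illustrative only).
points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.0), (1.0, 0.5), (9.0, 8.5)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

With this data the points near (1, 1) end up in cluster 0 and the points near
(9, 9) in cluster 1. In practice the result depends on the initial centroids,
so implementations often run several random restarts.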
Applications of Data Mining
Marketing
Finance
Manufacturing
Health Care
Commercial Data Mining Tools
Intelligent Miner from IBM applies classification and
association rules to detect rules and patterns and
make predictions.
Enterprise Miner from SAS applies decision trees, neural
nets, clustering techniques, statistics, and association rules.
Many new tools have come onto the market in recent
years, making data mining a very active research and
development area.