Lecture 5 (Data Mining) Flashcards
Why Data Mining?
More intense competition
Recognition of the value in data sources
Availability of quality data on customers, vendors, transactions
Definition of Data Mining?
The nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data stored in structured databases.
Data Mining Characteristics and Objectives?
The source of data for DM is often a consolidated data warehouse
DM environment is usually a client-server or a Web-based information system architecture
Data is the most critical ingredient for DM, and it may include soft/unstructured data
The miner is often an end user
How does data mining work?
DM extracts patterns from data
Types of patterns in data mining?
Association
Prediction
Cluster
Sequential
Association methods?
Market-basket
Link analysis
Sequence analysis
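Market-basket example?
A minimal sketch of the two measures commonly used to score a market-basket rule, support and confidence. The transactions and the helper names `support` and `confidence` are invented for illustration and do not come from the lecture.

```python
# Hypothetical shopping baskets, made up for this example.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Estimate of P(consequent | antecedent) from the baskets."""
    base = support(antecedent, baskets)
    if base == 0:
        return 0.0  # the antecedent never occurs, so the rule has no evidence
    return support(set(antecedent) | set(consequent), baskets) / base

# Score the rule {diapers} -> {beer}.
print(support({"diapers", "beer"}, transactions))
print(confidence({"diapers"}, {"beer"}, transactions))
```

A rule is usually kept only if both values exceed thresholds chosen by the analyst.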
Prediction methods?
Classification
Regression
Time Series
Segmentation methods?
Clustering
Outlier analysis
Supervised Learning problems?
Classification
- The domain of the target is finite and categorical
- A classifier must assign a class to an unseen example
Regression
- The target attribute takes values from an infinite (continuous) domain
- A model is fitted to learn the output target attribute as a function of the input attributes
Time Series Analysis
- Making predictions in time
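Classification vs. regression example?
A minimal sketch contrasting the two problem types: a 1-nearest-neighbour classifier returns a label from a finite set, while a least-squares line returns a continuous value. Both techniques, the data, and the function names are assumptions chosen for illustration; the lecture does not prescribe a specific algorithm.

```python
def nearest_neighbor_classify(train, query):
    """Classification: return the label of the closest training example (1-NN)."""
    closest = min(
        train,
        key=lambda ex: sum((a - b) ** 2 for a, b in zip(ex[0], query)),
    )
    return closest[1]

def fit_line(xs, ys):
    """Regression: ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical labeled examples: (features, class).
labeled = [
    ((1.0, 1.0), "low"),
    ((1.5, 2.0), "low"),
    ((8.0, 9.0), "high"),
    ((9.0, 8.5), "high"),
]
# The classifier assigns a class to an unseen example.
print(nearest_neighbor_classify(labeled, (2.0, 1.5)))

# The regression model predicts a numeric target for an unseen input.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope * 5 + intercept)
```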
Unsupervised Learning Problems?
Clustering
Association Rules
Pattern Mining
- It is adopted as a more general term than frequent pattern mining or association mining
Outlier Detection
- It is the process of finding data examples whose behaviour is very different from what is expected
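Outlier detection example?
A minimal sketch of one simple approach, the z-score rule: flag values that lie more than a chosen number of standard deviations from the mean. The data, the function name `zscore_outliers`, and the threshold of 2.0 are assumptions for illustration; other detection methods exist and may suit real data better.

```python
from statistics import mean, pstdev

def zscore_outliers(values, threshold=2.0):
    """Return the values whose absolute z-score exceeds threshold."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values are identical, so nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical daily transaction amounts with one unusual entry.
amounts = [10, 12, 11, 13, 12, 11, 10, 12, 95]
print(zscore_outliers(amounts))
```

With very small samples a single extreme value inflates the standard deviation, so a lower threshold than the textbook 3.0 is used here.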
Data Mining Applications?
Customer Relationship Management
Banking and Other Financial
Retailing and Logistics
Manufacturing and Maintenance
Brokerage and Securities Trading
Insurance
Computer Hardware and Software
Science and Engineering
Government and Defense
Homeland security and law enforcement
Travel, entertainment, sports
Healthcare and medicine
Virtually everywhere
Customer Relationship Management?
Maximize return on marketing campaigns
Improve customer retention
Maximize customer value
Identify and treat most valued customers
Banking and Other Financial?
Automate the loan application process
Detect fraudulent transactions
Maximize customer value
Optimize cash reserves with forecasting
Retailing and Logistics?
Optimize inventory levels at different locations
Improve the store layout and sales promotions
Optimize logistics by predicting seasonal effects
Minimize losses due to limited shelf life
Manufacturing and Maintenance?
Predict/prevent machinery failures
Identify anomalies in production systems to optimize the use of manufacturing capacity
Discover novel patterns to improve product quality