Data Mining Flashcards
Data mining
Exploration and analysis of large amounts of data to discover meaningful patterns or rules
Goes beyond summary-style analysis -> uses past data to build a model for the future
Data mining is used where?
In commerce, medicine, industrial process control, law enforcement
Pattern categories
Descriptive: characterise properties of the data
Predictive: perform inference on current data to make predictions
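To make the two pattern categories concrete, here is a minimal sketch in Python. The sales figures and the helper name fit_line are hypothetical, made up for illustration. The straight-line fit is only one simple choice of predictive model, not the method the cards prescribe.

```python
from statistics import mean

# Hypothetical monthly sales figures, made up for illustration.
sales = [10.0, 12.0, 14.0, 16.0]

# Descriptive: characterise properties of the data we already have.
average_sales = mean(sales)
sales_range = max(sales) - min(sales)


def fit_line(ys):
    """Least-squares straight line through the points (0, ys[0]), (1, ys[1]), ..."""
    xs = list(range(len(ys)))
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Predictive: infer a trend from current data and extrapolate one month ahead.
slope, intercept = fit_line(sales)
next_month = slope * len(sales) + intercept

print(f"average={average_sales}, range={sales_range}, forecast={next_month}")
```

The descriptive values only summarise what has been observed, while the forecast is an inference about a month that is not in the data.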
Data mining is used for?
Classification
Estimation
Prediction
Affinity grouping/Association
Clustering
Profiling
Market basket analysis
Analysis of what’s in a customer’s basket
Why do market basket analysis?
Insight into who customers are
Insight into why customers make certain purchases
Analysis of which products are bought together -> informs new store layouts, promotions, etc.
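The "bought together" idea can be sketched by counting how often pairs of items share a basket. The transactions below are hypothetical, and the helper names support and confidence are assumptions for this example. They follow the standard association-rule measures, which these cards do not define.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions, made up for illustration.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"beer", "bread"},
    {"butter", "milk"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in baskets:
    item_counts.update(basket)
    # Sort first so each pair is always counted under the same key.
    pair_counts.update(combinations(sorted(basket), 2))


def support(a, b):
    """Fraction of all baskets that contain both a and b."""
    return pair_counts[tuple(sorted((a, b)))] / len(baskets)


def confidence(a, b):
    """Of the baskets that contain a, the fraction that also contain b."""
    return pair_counts[tuple(sorted((a, b)))] / item_counts[a]


for pair, count in pair_counts.most_common():
    print(pair, count)
```

Pairs with high support and confidence are the candidates for shelf placement or joint promotions. Note that confidence is not symmetric: confidence("butter", "milk") and confidence("milk", "butter") generally differ.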
Data mining architecture
Data sources: where data comes from
Data servers: for fetching and storing the data
Knowledge base: holds knowledge used to guide the search and evaluate the usefulness of patterns found
Model base: stores interesting models found so that they can be applied to the data at a later stage
Data mining architecture (continued)
Data mining engine: programs that carry out the different data mining functions
Pattern evaluation module: works with the data mining engine to focus the search on interesting patterns
User interface: facilitates communication between user and data mining system
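As a rough illustration of how the architecture components could fit together, here is a toy Python sketch. Every name, record, and threshold in it is an assumption made for this example. A real system would use databases and far richer mining functions.

```python
from collections import Counter

# Data source: hypothetical raw records, made up for illustration.
data_source = [["bread", "milk"], ["bread", "milk"], ["beer", "bread"]]


def data_server():
    """Fetch the stored records for the engine."""
    return data_source


# Knowledge base: guidance for judging patterns (the threshold is an assumption).
knowledge_base = {"min_count": 2}


def mining_engine(records):
    """One simple mining function: count how often each item occurs."""
    return Counter(item for record in records for item in record)


def pattern_evaluation(patterns, knowledge):
    """Keep only the patterns the knowledge base deems interesting."""
    return {p: n for p, n in patterns.items() if n >= knowledge["min_count"]}


# Model base: store interesting results so they can be reused later.
model_base = {
    "frequent_items": pattern_evaluation(mining_engine(data_server()), knowledge_base)
}


def user_interface():
    """Report the stored findings to the user as readable text."""
    items = sorted(model_base["frequent_items"].items())
    return ", ".join(f"{item} x{n}" for item, n in items)


print(user_interface())
```

Each function stands in for one component, so swapping the mining function or the threshold changes a single piece without touching the rest.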