Data Mining Introductions Flashcards
is the science of extracting useful knowledge from huge data repositories.
Data Mining
is an open standard process model.
CRISP-DM REFERENCE MODEL
(Cross Industry Standard Process for Data Mining)
6 PHASES OF THE CRISP-DM REFERENCE MODEL
- Business Understanding
- Data Understanding
- Data Preparation
- Modeling
- Evaluation
- Deployment
2 DATA MINING METHODS
- Descriptive Method
- Predictive Method
is a method where we find human-interpretable patterns that describe the data.
Descriptive Method
is a method that uses some features (variables) to predict unknown or future values of other variables.
Predictive Method
5 DATA MINING TASKS
- Clustering
- Association Rule Discovery
- Regression
- Classification
- Deviation / Anomaly Detection
is a type of data mining task that predicts the value of a given continuous-valued variable based on the values of other variables.
Regression
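Example (Regression): a minimal sketch of the idea, assuming a single input variable and an ordinary least-squares straight line. The function names fit_line and predict and the sample numbers are made up for illustration; practical regression work usually relies on a library in R or Python.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Predict the continuous target value for a new input x."""
    slope, intercept = model
    return slope * x + intercept


# Illustrative data only: the target grows by 2 for each unit of x.
model = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(predict(model, 5))
```

The known (x, y) pairs are the "other variables" with known outcomes; the fitted line is then used to estimate the continuous value for an x that was not in the data.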
is a type of data mining task that detects significant deviations from normal behavior.
Deviation / Anomaly Detection
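Example (Deviation / Anomaly Detection): one simple way to flag deviations is a z-score rule, sketched below. The function name find_anomalies, the threshold of 2.0, and the sample readings are assumptions chosen for illustration; this is only one of many possible techniques.

```python
import statistics


def find_anomalies(values, threshold=2.0):
    """Return values whose z-score magnitude exceeds the threshold."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        # All values are identical, so nothing deviates from normal.
        return []
    return [v for v in values if abs(v - mean) / stdev > threshold]


# Illustrative readings: most sit near 10, one is far away.
readings = [10, 11, 9, 10, 12, 10, 50]
print(find_anomalies(readings))
```

Here "normal behavior" is summarized by the mean and standard deviation of the data, and a value counts as a significant deviation when it lies more than the chosen number of standard deviations from the mean.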
5 CHALLENGES OF DATA MINING
- Scalability
- Dimensionality
- Complex and Heterogeneous Data
- Data Quality
- Data Ownership and Privacy
3 TYPES OF DATA MINING TOOLS
- Simple Graphical User Interface
- Process Oriented
- Programming Oriented
2 COMMON PROGRAMMING ORIENTED TOOLS
- R
- Python
4 CHARACTERISTICS OF A DATA WAREHOUSE
- Subject Oriented
- Integrated
- Nonvolatile
- Time Variant
data warehouses are designed to help you analyze data.
Subject Oriented
integrates data from disparate sources into a consistent format.
Integrated
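Example (Integrated): a minimal sketch of bringing records from two sources into one consistent format. The source layouts, field names (cust_id, CustomerID, amount_usd, amount_cents), and helper functions are hypothetical and only illustrate the idea.

```python
def from_source_a(record):
    """Source A (hypothetical) already stores amounts in dollars."""
    return {
        "customer_id": int(record["cust_id"]),
        "amount_usd": float(record["amount_usd"]),
    }


def from_source_b(record):
    """Source B (hypothetical) uses a string ID and amounts in cents."""
    return {
        "customer_id": int(record["CustomerID"]),
        "amount_usd": record["amount_cents"] / 100,
    }


def integrate(source_a, source_b):
    """Combine both sources into one list with a shared schema."""
    unified = [from_source_a(r) for r in source_a]
    unified += [from_source_b(r) for r in source_b]
    return unified


# Illustrative records only.
rows = integrate(
    [{"cust_id": 1, "amount_usd": 10.0}],
    [{"CustomerID": "2", "amount_cents": 550}],
)
print(rows)
```

After integration every record uses the same field names, types, and units, regardless of which source it came from.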