Data Mining Chapter 1 Flashcards
Into which two major categories are data mining tasks divided?
Predictive tasks and Descriptive tasks
What is the objective of a predictive task?
The objective of a predictive task is to predict the value of a particular attribute based on the values of other attributes.
The variables used to make the prediction are known as the explanatory or independent variables.
The to-be-predicted variable is known as the target or dependent variable.
What is the objective of a descriptive task?
The objective of a descriptive task is to derive patterns (correlations, trends, clusters, trajectories, and anomalies) that summarize the underlying relationships in data.
What techniques are there for performing data mining tasks?
- Predictive Modelling
- Anomaly detection
- Association Analysis
- Cluster analysis
There are two types of predictive modelling tasks. What are they, and what does each one mean?
- Classification: predicting a discrete (non-continuous) target variable, such as a class label.
- Regression: predicting a continuous-valued target variable from the values of other variables, assuming a linear or nonlinear model of dependency.
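To make the contrast concrete, here is a minimal sketch in plain Python. The data values, the function names, and the choice of a least-squares line and a nearest-class-mean rule are all assumptions made for illustration; they are only two of many possible regression and classification methods.

```python
# Toy data, invented for illustration only.
ad_spend = [1.0, 2.0, 3.0, 4.0]          # explanatory variable
sales = [3.0, 5.0, 7.0, 9.0]             # continuous target -> regression

amounts = [20.0, 35.0, 50.0, 900.0, 1200.0]   # explanatory variable
labels = ["legitimate", "legitimate", "legitimate",
          "fraudulent", "fraudulent"]          # discrete target -> classification


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (a linear model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_class_means(values, class_labels):
    """Compute the mean value of each class from labelled examples."""
    sums, counts = {}, {}
    for value, label in zip(values, class_labels):
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def classify(value, class_means):
    """Assign the label whose class mean is closest to the value."""
    return min(class_means, key=lambda label: abs(value - class_means[label]))


slope, intercept = fit_line(ad_spend, sales)
predicted_sales = slope * 5.0 + intercept      # a number on a continuous scale

class_means = fit_class_means(amounts, labels)
predicted_label = classify(1000.0, class_means)  # one label from a fixed set

print(predicted_sales, predicted_label)
```

The regression output can be any real number, while the classification output is always one of the labels seen in the training data.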
Are the following data mining techniques predictive or descriptive?
- Predictive Modelling
- Cluster Analysis
- Association Analysis
- Anomaly Detection
- Predictive Modelling: predictive
- Cluster Analysis: descriptive
- Association Analysis: descriptive
- Anomaly Detection: predictive
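What is the difference between intra-cluster and inter-cluster distances?
Intra-cluster distance is the distance between members of the same cluster, whereas inter-cluster distance is the distance between members of different clusters. A good clustering keeps intra-cluster distances small and inter-cluster distances large.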
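The two distances can be illustrated with a small sketch. The points and function names below are made up for illustration, and averaging all pairwise Euclidean distances is only one of several common definitions (centroid-based or minimum/maximum pairwise distances are alternatives).

```python
from itertools import combinations, product
from math import dist  # Euclidean distance, available in Python 3.8+

# Two hypothetical clusters of 2-D points.
cluster_a = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
cluster_b = [(5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]


def mean_intra_cluster_distance(cluster):
    """Average distance over all pairs of points within one cluster."""
    pairs = list(combinations(cluster, 2))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


def mean_inter_cluster_distance(first, second):
    """Average distance over all pairs with one point from each cluster."""
    pairs = list(product(first, second))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


intra_a = mean_intra_cluster_distance(cluster_a)
intra_b = mean_intra_cluster_distance(cluster_b)
inter_ab = mean_inter_cluster_distance(cluster_a, cluster_b)

print(intra_a, intra_b, inter_ab)
```

For well-separated clusters such as these, both intra-cluster averages come out much smaller than the inter-cluster average.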
What are applications of:
- Association Analysis
- Cluster Analysis
- Classification
- Regression
- Anomaly Detection
- Association Analysis: market-basket analysis, i.e. finding items that customers often buy together.
- Cluster Analysis: document clustering, i.e. grouping documents that use similar words, for example to recommend related documents.
- Classification: classifying credit card transactions as legitimate or fraudulent.
- Regression: predicting the sales of a new product based on advertising expenditure.
- Anomaly Detection: detecting fraud or network intrusions.
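As a rough illustration of the market-basket application listed above, the sketch below counts how often each pair of items appears in the same basket. The baskets, the function name, and the minimum count of 3 are assumptions chosen for this example; real association analysis uses more scalable algorithms and also considers larger itemsets.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, one set of items per transaction.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]


def count_item_pairs(transactions):
    """Count the number of transactions that contain each pair of items."""
    counts = Counter()
    for basket in transactions:
        # Sort so each pair has one canonical (alphabetical) form.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


pair_counts = count_item_pairs(baskets)

# Keep pairs bought together in at least min_count baskets (assumed threshold).
min_count = 3
frequent_pairs = {pair: n for pair, n in pair_counts.items() if n >= min_count}

for pair, n in sorted(frequent_pairs.items()):
    print(pair, n, n / len(baskets))  # pair, count, support (fraction of baskets)
```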