Data Mining Chapter 1 Flashcards

1
Q

Into what two major categories are data mining tasks divided?

A

Predictive tasks and Descriptive tasks

2
Q

What is the objective of a Predictive Task?

A

The objective of a predictive task is to predict the value of a particular attribute based on the values of other attributes.

The variables used to make the prediction are known as the explanatory or independent variables.
The variable to be predicted is known as the target or dependent variable.

3
Q

What is the objective of a descriptive task?

A

The objective of a descriptive task is to derive patterns (correlations, trends, clusters, trajectories, and anomalies) that summarize the underlying relationships in data.

4
Q

What techniques are there for performing data mining tasks?

A
  1. Predictive Modelling
  2. Anomaly Detection
  3. Association Analysis
  4. Cluster Analysis
5
Q

There are two types of predictive modelling tasks. What are they, and what does each one mean?

A
  1. Classification predicts a discrete (non-continuous) target variable.
  2. Regression predicts the value of a continuous target variable from the values of other variables, assuming a
    linear or nonlinear model of dependency.
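
To make the distinction concrete, here is a minimal sketch in plain Python. The numbers, the variable names, and the choice of least squares and 1-nearest-neighbour are assumptions made for illustration; they are not taken from the chapter. The regression part returns a real number, while the classification part returns one of a fixed set of labels.

```python
# Toy data, invented for illustration only.

# Regression: the target (sales) is continuous.
spend = [1.0, 2.0, 3.0, 4.0, 5.0]   # explanatory variable
sales = [2.1, 3.9, 6.2, 8.0, 9.8]   # target variable


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(spend, sales)
predicted_sales = slope * 6.0 + intercept   # a real-valued prediction

# Classification: the target (label) is discrete.
amounts = [12.0, 15.0, 11.0, 950.0, 1020.0]   # explanatory variable
labels = ["legitimate", "legitimate", "legitimate", "fraudulent", "fraudulent"]


def nearest_neighbor_label(x, xs, ys):
    """1-nearest-neighbour: copy the label of the closest training point."""
    closest = min(range(len(xs)), key=lambda i: abs(xs[i] - x))
    return ys[closest]


predicted_label = nearest_neighbor_label(900.0, amounts, labels)
```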
6
Q

Are the following data mining techniques predictive or descriptive?

  1. Predictive Modelling
  2. Cluster Analysis
  3. Association Analysis
  4. Anomaly Detection
A
  1. Predictive
  2. Descriptive
  3. Descriptive
  4. Predictive
7
Q

What is the difference between intra-cluster and inter-cluster distances?

A

Intra-cluster distance is the distance between members of the same cluster, whereas inter-cluster distance is the distance between two different clusters.
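
As a rough illustration, the sketch below computes both quantities for two hand-made clusters of 2-D points. The points, the function names, and the use of the average pairwise Euclidean distance are assumptions; other definitions (for example centroid-to-centroid distance) are equally common. A clustering is usually considered good when the intra-cluster distance is small and the inter-cluster distance is large.

```python
from itertools import combinations, product
from math import dist

# Two hypothetical clusters of 2-D points.
cluster_a = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0)]
cluster_b = [(8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]


def mean_intra_distance(cluster):
    """Average distance over all pairs of points inside one cluster."""
    pairs = list(combinations(cluster, 2))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


def mean_inter_distance(c1, c2):
    """Average distance over all pairs with one point from each cluster."""
    pairs = list(product(c1, c2))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


intra_a = mean_intra_distance(cluster_a)
inter_ab = mean_inter_distance(cluster_a, cluster_b)
print(f"intra(A) = {intra_a:.2f}, inter(A, B) = {inter_ab:.2f}")
```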

8
Q

What are applications of:

  1. Association Analysis
  2. Cluster Analysis
  3. Classification
  4. Regression
  5. Anomaly Detection
A
  1. Market-basket analysis; finding items that people often buy together.
  2. Document clustering; recommending documents that have similar words.
  3. Classifying credit card transactions as legitimate or fraudulent.
  4. Predicting sales amounts of a new product based on advertising expenditure.
  5. The detection of fraud or network intrusions.
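
The market-basket example in item 1 can be sketched as a simple pair count. The baskets and the threshold of 3 are made up for illustration, and real association analysis typically uses a dedicated algorithm such as Apriori rather than counting every pair directly.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions; each set is one customer's basket.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

# Count how many baskets contain each pair of items.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Keep pairs bought together in at least min_count baskets (assumed threshold).
min_count = 3
frequent_pairs = {
    pair: count / len(baskets)   # support: fraction of baskets with the pair
    for pair, count in pair_counts.items()
    if count >= min_count
}
```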