Midterm Flashcards

1
Q

Classification Accuracy

A

percentage of data correctly classified

2
Q

Classification Coverage

A

percentage of data to which the classification rule applies
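
As a rough illustration of the two measures for a single classification rule, the sketch below uses a made-up data set and a made-up rule (the attribute names, values, and the rule itself are hypothetical). It assumes the common convention for rule-based classifiers: coverage is the fraction of all tuples the rule applies to, and rule accuracy is the fraction of covered tuples the rule labels correctly.

```python
# Hypothetical toy data set: (age, buys_computer) pairs.
records = [
    ("youth", "yes"),
    ("youth", "yes"),
    ("youth", "no"),
    ("senior", "no"),
    ("senior", "yes"),
]


def rule_applies(record):
    # Hypothetical rule: IF age == "youth" THEN buys_computer = "yes".
    return record[0] == "youth"


def rule_coverage_and_accuracy(data, applies, predicted_label):
    """Return (coverage, accuracy) of one rule as fractions in [0, 1]."""
    covered = [r for r in data if applies(r)]
    correct = [r for r in covered if r[1] == predicted_label]
    coverage = len(covered) / len(data)
    # Accuracy is measured only over the tuples the rule covers.
    accuracy = len(correct) / len(covered) if covered else 0.0
    return coverage, accuracy


coverage, accuracy = rule_coverage_and_accuracy(records, rule_applies, "yes")
print(f"coverage = {coverage:.0%}, accuracy = {accuracy:.0%}")
```

Here the rule covers 3 of the 5 tuples and is right on 2 of those 3.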

3
Q

Supervised learning (include example)

A

training data includes class labels (ex. classification)

4
Q

Unsupervised learning (include example)

A

training data has no class labels (ex. clustering)

5
Q

Semi-supervised learning

A

uses both labeled and unlabeled training data

6
Q

Issues with data mining

A

Methodologies, user interaction, efficiency and scalability, diversity of data types, data mining and society

7
Q

Types of qualitative attributes

A

Nominal, binary, Boolean, ordinal

8
Q

Types of quantitative/numeric attributes

A

Interval-scaled, ratio-scaled, discrete, continuous

9
Q

Nominal attributes

A

Symbols or names (categorical)

10
Q

Binary attributes

A

0 or 1

11
Q

Boolean attributes

A

True or false

12
Q

Ordinal attributes

A

Values have order/rank, but magnitude between values unknown (ex. large and small)
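
A small sketch of what "ordered, but magnitude unknown" means in practice. The size labels and their ranking are hypothetical. Sorting and taking a median use only the order, so they are safe for ordinal values; averaging the rank numbers would wrongly assume equal gaps between ranks.

```python
# Hypothetical ordinal attribute with an assumed ranking.
SIZE_ORDER = ["small", "medium", "large"]
rank = {label: position for position, label in enumerate(SIZE_ORDER)}

sizes = ["large", "small", "medium", "small", "medium"]

# Sorting only needs the order of the values, not the distance between them.
sorted_sizes = sorted(sizes, key=lambda label: rank[label])

# The middle element of the sorted list is the median (odd-length list).
median_size = sorted_sizes[len(sorted_sizes) // 2]

print(sorted_sizes)
print(median_size)
```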

13
Q

Interval-scaled attributes

A

measured on a scale of equal-sized units but with no true zero point (e.g. temperature in Celsius or Fahrenheit)

14
Q

Ratio-scaled attributes

A

measured on a scale of equal-sized units with a true zero point
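
To see why the zero point matters, the sketch below compares the same two temperatures on the Celsius scale (interval-scaled) and the Kelvin scale (ratio-scaled). The two readings are made-up values. Differences are meaningful on both scales, but "twice as hot" is only meaningful on the scale with a true zero.

```python
def celsius_to_kelvin(celsius):
    # Kelvin has a true zero point, so it is ratio-scaled.
    return celsius + 273.15


t1_c, t2_c = 10.0, 20.0  # hypothetical readings in degrees Celsius

# Differences are meaningful on an interval scale.
diff_c = t2_c - t1_c

# This ratio is 2.0, but it does not mean "twice as hot":
# 0 degrees Celsius is an arbitrary reference point, not a true zero.
naive_ratio = t2_c / t1_c

# On the Kelvin scale the ratio is meaningful, and it is only slightly above 1.
true_ratio = celsius_to_kelvin(t2_c) / celsius_to_kelvin(t1_c)

print(diff_c, naive_ratio, round(true_ratio, 3))
```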

15
Q

Discrete attributes

A

finite or countably infinite set of values

16
Q

Continuous attributes

A

not discrete; real-valued, typically represented as floating-point numbers

17
Q

Data quality factors

A

accuracy, completeness, consistency, timeliness, believability, interpretability

18
Q

Steps of data preprocessing

A

Cleaning, integration, reduction, transformation/discretization

19
Q

Data cleaning

A

Fill in missing values, smooth out noise, ID outliers, correct inconsistencies
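
A minimal sketch of two of these cleaning tasks, using made-up measurements. It assumes one possible strategy for each: missing values are filled with the median of the observed values, and outliers are flagged with the 1.5 x IQR rule. Other choices (mean, a global constant, a model-based prediction) would be equally valid.

```python
from statistics import median, quantiles

# Hypothetical measurements; None marks a missing value.
values = [12.0, None, 11.5, 12.5, 95.0, None, 12.0]

# Fill missing values with the median of the observed values
# (the median is less sensitive to the extreme reading than the mean).
observed = [v for v in values if v is not None]
fill_value = median(observed)
filled = [fill_value if v is None else v for v in values]

# Flag outliers with the 1.5 * IQR rule.
q1, _, q3 = quantiles(filled, n=4)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [v for v in filled if v < lower or v > upper]

print(filled)
print(outliers)
```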

20
Q

Noise

A

random error or variance in a measured variable

21
Q

Noise smoothing techniques

A

Binning, Regression, Outlier Analysis
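
Of the three techniques, binning is the easiest to show in a few lines. The sketch below assumes equal-frequency bins and smoothing by bin means: values are sorted, split into bins of a fixed size, and each value is replaced by the mean of its bin. The price list and the helper name `smooth_by_bin_means` are illustrative only.

```python
def smooth_by_bin_means(values, bin_size):
    """Sort values, split into equal-frequency bins, replace each by its bin mean."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        bin_values = ordered[start:start + bin_size]
        bin_mean = sum(bin_values) / len(bin_values)
        smoothed.extend([bin_mean] * len(bin_values))
    return smoothed


prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]  # hypothetical toy data
smoothed = smooth_by_bin_means(prices, bin_size=3)
print(smoothed)
```

With three bins of three values each, the bin means are 9, 22, and 29.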

22
Q

Data integration issues

A

Entity ID problem, attribute correlation, tuple duplication, data value conflicts
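
Two of these issues can be sketched briefly. The example below assumes Pearson's correlation coefficient as the check for redundant numeric attributes, and exact-match comparison as the check for duplicate tuples; the column names and rows are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Attribute correlation: the same price stored in dollars by one source
# and in cents by another. A coefficient near 1 suggests redundancy.
price_dollars = [1.0, 2.5, 4.0, 5.5]
price_cents = [100, 250, 400, 550]
r = pearson(price_dollars, price_cents)

# Tuple duplication: drop exact duplicate rows while keeping the original order.
rows = [("a1", "Ann"), ("b2", "Bob"), ("a1", "Ann")]
unique_rows = list(dict.fromkeys(rows))

print(round(r, 6))
print(unique_rows)
```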