Data mining Flashcards

Question 1

Q

name the qualitative attribute types

Answer

A

nominal
ordinal

Question 2

Q

name the quantitative attribute types

Answer

A

interval
ratio

Question 3

Q

data preprocessing techniques

Answer

A

aggregation
sampling
dimensionality reduction
feature subset selection
feature creation
discretization and binarization
attribute transformation

Question 4

Q

aggregation

Answer

A

combining two or more attributes (or objects) into a single attribute (or object)
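
A minimal sketch of aggregation, assuming hypothetical daily sales records: the per-day objects of each store are combined into a single object per store. The record layout, the function name `aggregate_by_store` and the values are made up for illustration.

```python
from collections import defaultdict

# Hypothetical records: (store, day, sales). The values are illustrative only.
records = [
    ("A", "Mon", 10.0),
    ("A", "Tue", 14.0),
    ("B", "Mon", 7.0),
    ("B", "Tue", 9.0),
]

def aggregate_by_store(rows):
    """Combine all daily objects of a store into one object holding total sales."""
    totals = defaultdict(float)
    for store, _day, sales in rows:
        totals[store] += sales
    return dict(totals)

store_totals = aggregate_by_store(records)
print(store_totals)
```

Four objects become two, which reduces the data size and usually gives more stable values than the individual daily figures.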

Question 5

Q

types of sampling

Answer

A

simple random sampling
sampling with replacement
sampling without replacement
stratified sampling
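
The sampling types above can be sketched with Python's standard `random` module. The helper names and the toy data set are assumptions for illustration; the stratified version here draws an equal number of items from each group, which is only one of several possible schemes.

```python
import random
from collections import defaultdict

def sample_without_replacement(items, n, rng):
    """Each item can be picked at most once."""
    return rng.sample(items, n)

def sample_with_replacement(items, n, rng):
    """The same item can be picked more than once."""
    return rng.choices(items, k=n)

def stratified_sample(items, key, n_per_group, rng):
    """Draw up to n_per_group items without replacement from each group."""
    groups = defaultdict(list)
    for item in items:
        groups[key(item)].append(item)
    picked = []
    for members in groups.values():
        picked.extend(rng.sample(members, min(n_per_group, len(members))))
    return picked

rng = random.Random(0)  # fixed seed so runs are repeatable
# Toy data: class "a" is common (8 items), class "b" is rare (2 items).
data = [("a", i) for i in range(8)] + [("b", i) for i in range(2)]

without = sample_without_replacement(data, 4, rng)
with_repl = sample_with_replacement(data, 4, rng)
strat = stratified_sample(data, key=lambda item: item[0], n_per_group=2, rng=rng)
```

Simple random sampling could miss the rare class "b" entirely, while the stratified draw always includes it.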

Question 6

Q

dimensionality reduction

Answer

A

principal component analysis (PCA)
singular value decomposition (SVD)

Question 7

Q

feature subset selection

Answer

A

brute-force approach
embedded approach
filter approach
wrapper approach

Question 8

Q

attribute transformation

Answer

A

standardization
normalization
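
A short sketch of both transformations, assuming that standardization means the z-score and that normalization means min-max scaling to [0, 1]. The terms are sometimes used interchangeably, so this reading is an assumption, as are the function names.

```python
from statistics import mean, pstdev

def standardize(values):
    """z-score: subtract the mean, divide by the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]

def min_max_normalize(values):
    """Rescale so the smallest value maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

values = [2.0, 4.0, 6.0, 8.0]
z = standardize(values)            # mean 0, standard deviation 1
scaled = min_max_normalize(values)  # range [0, 1]
```

Both functions assume the values are not all identical; otherwise the divisor would be zero.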

Question 9

Q

pro MIN

Answer

A

can handle non-elliptical shapes

Question 10

Q

limitation MIN

Answer

A

sensitive to noise and outliers

Question 11

Q

pro MAX, group average, Ward’s method

Answer

A

less susceptible to noise and outliers

Question 12

Q

limitation group average, Ward’s method

Answer

A

biased towards globular clusters

Question 13

Q

limitation MAX

Answer

A

tends to break large clusters
biased towards globular clusters
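
The difference between MIN, MAX and group average lies in how the proximity of two clusters is defined. A minimal sketch, assuming Euclidean distance and two hypothetical 2-D clusters (Ward’s method is left out):

```python
from itertools import product
from math import dist

def single_link(a, b):
    """MIN: distance between the closest pair of points, one from each cluster."""
    return min(dist(p, q) for p, q in product(a, b))

def complete_link(a, b):
    """MAX: distance between the farthest pair of points, one from each cluster."""
    return max(dist(p, q) for p, q in product(a, b))

def group_average(a, b):
    """Average distance over all pairs of points, one from each cluster."""
    return sum(dist(p, q) for p, q in product(a, b)) / (len(a) * len(b))

# Hypothetical clusters, for illustration only.
cluster_a = [(0.0, 0.0), (0.0, 1.0)]
cluster_b = [(3.0, 0.0), (4.0, 0.0)]

d_min = single_link(cluster_a, cluster_b)
d_max = complete_link(cluster_a, cluster_b)
d_avg = group_average(cluster_a, cluster_b)
```

MIN depends on a single closest pair, so one noisy point between two clusters can pull them together; MAX and group average take more of each cluster into account.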

Question 14

Q

4 advantages of using decision tree

Answer

A

inexpensive to construct
extremely fast for classifying unknown records
easy to interpret
accuracy is comparable to other classification techniques

Question 15

Q

4 disadvantages of using decision tree

Answer

A

do not generalize well to certain boolean functions
the induction algorithm is greedy
not expressive enough for modeling continuous variables
tree replication (the same subtree can appear in multiple branches)
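
To illustrate what "greedy" means here: at each node the algorithm picks the split that looks best locally and never revisits it. The sketch below chooses one threshold on one numeric attribute, assuming the Gini index as the impurity measure (the cards do not name one); the data and function names are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini index of a list of class labels (0.0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_threshold(values, labels):
    """Return (threshold, weighted_gini) of the best split 'value <= threshold'."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))  # no split at all is the baseline
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical training records, for illustration only.
ages = [22, 25, 30, 35, 40, 52]
buys = ["no", "no", "no", "yes", "yes", "yes"]
threshold, impurity = best_threshold(ages, buys)
```

Each test has the form `value <= threshold` on a single attribute, which is why a tree can only draw axis-parallel boundaries over continuous variables.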

Question 16

Q

4 disadvantages of using MAX

Answer

A

tendency to break large clusters
biased towards globular clusters
once a decision to merge two clusters is made, it can’t be undone
no objective function is minimized

Question 17

Q

classification techniques

Answer

A

decision tree
rule-based
memory-based
neural networks
support vector machines