Chapter 9: Unsupervised Machine Learning Flashcards

1
Q

What is a popular algorithm for identifying clusters?

A

k-means

2
Q

What is k-means?

A

k-means is a clustering algorithm that partitions observations into k groups. The k in k-means represents the number of clusters, or groupings.
Note that k-means works only for numerical data; that is, all attributes being considered must have numerical values.
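
To make the idea concrete, here is a minimal sketch of the usual k-means loop (assign each point to its nearest centroid, then recompute the centroids). The customer points, the fixed iteration count, and the choice of the first k points as starting centroids are illustrative assumptions, not something the card specifies.

```python
from math import dist  # Euclidean distance, Python 3.8+

def k_means(points, k, iterations=10):
    """Assign points to the nearest centroid, then move each centroid to its cluster mean."""
    # Naive initialization (assumption): use the first k points as centroids.
    centroids = [tuple(p) for p in points[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[i] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, clusters

# Hypothetical numerical attributes for six customers.
customers = [(1.0, 2.0), (1.5, 1.8), (8.0, 8.0), (9.0, 9.5), (1.2, 2.2), (8.5, 9.0)]
centroids, clusters = k_means(customers, k=2)
```

With k=2, the three low-valued customers and the three high-valued customers end up in separate clusters.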

3
Q

Is k-means acceptable to use for master data?

A

No. It should be used only with transactional data.

4
Q

What can be used for clustering categorical data?

A

k-modes

5
Q

What is the optimum number of clusters?

A

Somewhere between 1 and N, where N is the number of observations.

6
Q

What is cluster size?

A

the number of members within a cluster

7
Q

What is cluster density?

A

How closely packed the members of a cluster are; in a denser cluster, more customers fall within the same area than in another cluster.

8
Q

What is cluster distance?

A

Indicates how dissimilar customers in one cluster are from customers in another cluster.
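
The card does not name a metric. One common choice, assumed here, is the Euclidean distance between cluster centroids. The two clusters below are made-up data.

```python
from math import dist  # Euclidean distance, Python 3.8+

def centroid(members):
    """Mean position of a cluster's members, one coordinate at a time."""
    return tuple(sum(axis) / len(members) for axis in zip(*members))

# Hypothetical clusters of customers described by two numerical attributes.
cluster_a = [(1.0, 2.0), (1.0, 4.0)]
cluster_b = [(4.0, 6.0), (4.0, 8.0)]

# A larger value means the customers in the two clusters are more dissimilar.
cluster_distance = dist(centroid(cluster_a), centroid(cluster_b))  # 5.0
```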

9
Q

What is association analysis, or “affinity analysis”?

A

A type of unsupervised descriptive data model. It is used to find the hidden connections between sets of items that frequently occur together.

10
Q

“{PIZZA-BY-THE-SLICE} → {SOFT DRINK 20 OZ}” is an example of what?

A

An association rule (association analysis)

11
Q

Businesses can use these association rules to do what?

A

To promote and recommend items that often occur together.

12
Q

What is an example of an {antecedent(s)} → {consequent(s)} rule?

A

{Knee Pads} → {Off Road Helmet}

{Deluxe Touring Bike Black, Elbow Pads} → {Off Road Helmet}

13
Q

What is the support of an association rule?

A

Support for a rule is the fraction or percentage of transactions that contain all of the items within the rule.
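
As a small worked example, support can be computed by counting the transactions that contain every item in the rule. The four transactions below are invented for illustration, and `support` is a hypothetical helper name.

```python
# Hypothetical transaction data: each set is one customer's basket.
transactions = [
    {"Knee Pads", "Off Road Helmet"},
    {"Knee Pads", "Off Road Helmet", "Elbow Pads"},
    {"Knee Pads"},
    {"Elbow Pads"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain all items in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for basket in transactions if itemset <= basket)
    return hits / len(transactions)

# Rule {Knee Pads} -> {Off Road Helmet}: 2 of the 4 baskets contain both items.
rule_support = support({"Knee Pads", "Off Road Helmet"}, transactions)  # 0.5
```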

14
Q

An excessive number of rules represents an ______ to effective analysis.

A

obstacle

15
Q

How is confidence described?

A

Confidence is the probability that the consequent items appear in transactions that contain the antecedent items. It is a conditional probability.
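
A short sketch of that conditional probability: among the transactions that contain the antecedent, count how many also contain the consequent. The baskets and the `confidence` helper are illustrative assumptions.

```python
# Hypothetical transaction data: each set is one customer's basket.
transactions = [
    {"Knee Pads", "Off Road Helmet"},
    {"Knee Pads", "Off Road Helmet", "Elbow Pads"},
    {"Knee Pads"},
    {"Elbow Pads"},
]

def confidence(antecedent, consequent, transactions):
    """P(consequent | antecedent), estimated from transaction counts."""
    antecedent, consequent = set(antecedent), set(consequent)
    with_antecedent = [b for b in transactions if antecedent <= b]
    if not with_antecedent:
        return 0.0  # rule never applies in this data
    with_both = [b for b in with_antecedent if consequent <= b]
    return len(with_both) / len(with_antecedent)

# Three baskets contain Knee Pads; two of those also contain Off Road Helmet.
rule_confidence = confidence({"Knee Pads"}, {"Off Road Helmet"}, transactions)  # 2/3
```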

16
Q

What is used to measure the absence of the antecedent?

A

Lift

17
Q

What is lift?

A

Lift is the measure of how accurately a rule depicts affinity or association compared to the random (coincidental) co-occurrence of the items.

18
Q

Lift = _______________________?

A

Lift = confidence of a rule / support of the consequent
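
The formula can be checked on a toy example. The baskets and helper names below are hypothetical; only the formula itself comes from the card.

```python
# Hypothetical transaction data: each set is one customer's basket.
transactions = [
    {"Knee Pads", "Off Road Helmet"},
    {"Knee Pads", "Off Road Helmet", "Elbow Pads"},
    {"Knee Pads"},
    {"Elbow Pads"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain all items in `itemset`."""
    return sum(1 for b in transactions if set(itemset) <= b) / len(transactions)

def lift(antecedent, consequent, transactions):
    """Lift = confidence of the rule / support of the consequent."""
    both = support(set(antecedent) | set(consequent), transactions)
    rule_confidence = both / support(antecedent, transactions)
    return rule_confidence / support(consequent, transactions)

# Confidence is 2/3 and the helmet appears in 2/4 of baskets, so lift is about 1.33.
rule_lift = lift({"Knee Pads"}, {"Off Road Helmet"}, transactions)
```

Here the lift is above 1, so in this made-up data buying knee pads goes along with a higher-than-baseline chance of buying the helmet.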

19
Q

Lift values greater than _______ imply that the antecedent and consequent are associated (correlated) with each other.

A

1

20
Q

Lift values less than 1 imply what?

A

The two items are negatively correlated.

21
Q

What does the Apriori algorithm do?

A

It makes association analysis straightforward and trims out infrequent rules by default.
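
The trimming can be sketched as a level-wise search that keeps only itemsets meeting a minimum support and builds larger candidates from the survivors. This is a simplified illustration of the Apriori idea, not a production implementation; the data, the `min_support` threshold, and the function name are assumptions.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Level-wise search: grow itemsets one item at a time, dropping infrequent ones."""
    n = len(transactions)

    def support(itemset):
        return sum(1 for basket in transactions if itemset <= basket) / n

    items = sorted({item for basket in transactions for item in basket})
    current = [frozenset([i]) for i in items if support(frozenset([i])) >= min_support]
    frequent = {}
    size = 1
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset)
        size += 1
        # Join step: combine frequent itemsets into candidates one item larger.
        candidates = {a | b for a, b in combinations(current, 2) if len(a | b) == size}
        # Prune step: a candidate survives only if all of its smaller subsets
        # were frequent and it meets the support threshold itself.
        current = [
            c for c in candidates
            if all(frozenset(sub) in frequent for sub in combinations(c, size - 1))
            and support(c) >= min_support
        ]
    return frequent

# Hypothetical baskets; with min_support=0.5 only one pair survives.
transactions = [
    {"Knee Pads", "Off Road Helmet"},
    {"Knee Pads", "Off Road Helmet", "Elbow Pads"},
    {"Knee Pads"},
    {"Elbow Pads"},
]
result = frequent_itemsets(transactions, min_support=0.5)
```

Rules would then be generated only from the itemsets in `result`, which is why infrequent rules never appear.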