Quiz #2 Flashcards

Exam Prep

1
Q
Which of these is a method employed by the k-means algorithm to mitigate the effects of the random initialization trap?
  A. Random initialization escape
  B. K-means++
  C. Centroid placement
  D. K-medoids
A

B. K-means++
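
Base R's kmeans() seeds its centroids at random, which is exactly the trap this card names. Below is a minimal sketch of the k-means++ seeding idea, assuming the built-in iris measurements as stand-in data; kmeans_pp_centers is a hypothetical helper, not a base R function.

```r
# k-means++ seeding: pick the first center uniformly at random, then
# pick each later center with probability proportional to its squared
# distance from the nearest center chosen so far.
kmeans_pp_centers <- function(x, k) {
  x <- as.matrix(x)
  centers <- x[sample(nrow(x), 1), , drop = FALSE]  # first center
  while (nrow(centers) < k) {
    d2 <- apply(x, 1, function(p) min(colSums((t(centers) - p)^2)))
    centers <- rbind(centers, x[sample(nrow(x), 1, prob = d2), ])
  }
  centers
}

set.seed(42)
fit <- kmeans(iris[, 1:4], centers = kmeans_pp_centers(iris[, 1:4], 3))
```

Spreading the initial centers apart this way makes a poor random start much less likely.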

2
Q
In association rule mining, a collection of one or more items is known as __________.
  A. a set of items
  B. a ruleset
  C. a set of rules
  D. an itemset
A

D. an itemset

3
Q

K-means clustering is useful for creating non-spherical clusters.
True
False

A

False

4
Q
The increased likelihood that a rule occurs in a dataset relative to its expected rate of occurrence is known as __________.
  A. Lift
  B. Count
  C. Support
  D. Confidence
A

A. Lift
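
A worked check of the idea, with illustrative supports (the 0.4, 0.5, and 0.3 below are made up for the example, not taken from any card):

```r
# lift(A -> B) = support(A and B) / (support(A) * support(B))
# If support(A) = 0.4 and support(B) = 0.5, independence predicts a
# joint support of 0.2; an observed 0.3 means the rule occurs at
# 1.5x its expected rate.
0.3 / (0.4 * 0.5)   # 1.5, and lift > 1 signals positive association
```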

5
Q

As we discussed in class, the elbow method makes use of the Within-Cluster Sum of Squares (WCSS) metric to suggest the appropriate value for “k”. If we keep increasing the value for “k”, what will happen to the value for WCSS?

A. The value for WCSS will tend towards 0.
B. The value for WCSS will tend towards 1.
C. The value for WCSS will eventually become negative.
D. The value for WCSS will grow infinitely.

A

A. The value for WCSS will tend towards 0.
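
A minimal elbow-method sketch in R, assuming the built-in iris measurements as stand-in data; tot.withinss is the WCSS that kmeans() reports, and it shrinks toward 0 as k grows (reaching exactly 0 once every point is its own centroid):

```r
# Compute WCSS for k = 1..10 and plot; the "elbow" is where the
# decrease levels off, suggesting a reasonable k.
set.seed(1)
wcss <- sapply(1:10, function(k)
  kmeans(iris[, 1:4], centers = k, nstart = 10)$tot.withinss)
plot(1:10, wcss, type = "b", xlab = "k", ylab = "WCSS")
```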

6
Q
TID	Items Bought
T1	bread, milk, beer
T2	bread, diaper, beer, eggs
T3	milk, diaper, beer, coke
T4	bread, milk, diaper, beer
T5	bread, milk, diaper, coke

What is the support of the itemset {beer, coke} in the dataset above?

A. 4
B. 0.4
C. 0.2
D. 1

A

C. 0.2
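
Only T3 contains both beer and coke, so the support is 1/5 = 0.2. A quick check in R over the same five transactions:

```r
# Support of {beer, coke} = fraction of transactions containing both.
trans <- list(
  T1 = c("bread", "milk", "beer"),
  T2 = c("bread", "diaper", "beer", "eggs"),
  T3 = c("milk", "diaper", "beer", "coke"),
  T4 = c("bread", "milk", "diaper", "beer"),
  T5 = c("bread", "milk", "diaper", "coke")
)
mean(sapply(trans, function(t) all(c("beer", "coke") %in% t)))  # 0.2
```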

7
Q
The Amelia package in R is useful for dealing with ___________ data.
  A. imbalanced
  B. skewed
  C. missing
  D. aggregate
A

C. missing
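
A minimal sketch using Amelia's own bundled example data: amelia() multiply-imputes the missing cells, and missmap() plots the missingness pattern.

```r
library(Amelia)
data(africa)      # example dataset shipped with the package
missmap(africa)   # visualize which cells are missing
a.out <- amelia(africa, m = 5, ts = "year", cs = "country")
summary(a.out)    # imputed datasets live in a.out$imputations
```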

8
Q

Association rules imply causality in that they explain why item B is bought whenever item A is bought.
True
False

A

False

9
Q
Good clustering will produce clusters with _____ inter-class similarity and ______ intra-class similarity.
  A. low, low
  B. low, high
  C. high, low
  D. high, high
A

B. low, high

10
Q
The anti-monotone property of support states that the support of an itemset is _________ than that of its subsets.
  A. always more
  B. always less
  C. sometimes more
  D. sometimes less
A

B. always less
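
A concrete check against the card-6 transactions: {beer} appears in four of the five baskets but {beer, coke} in only one, so growing an itemset can only hold its support constant or shrink it.

```r
# Anti-monotonicity: a superset is never more frequent than its subsets.
support_beer      <- 4 / 5   # {beer} is in T1, T2, T3, T4
support_beer_coke <- 1 / 5   # {beer, coke} is only in T3
support_beer_coke <= support_beer   # TRUE
```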

11
Q

One of the strengths of association rules is that they are easy to understand.
True
False

A

True

12
Q

K-means clustering only works with numeric data.
True
False

A

True

13
Q
Which of these is NOT a method used in choosing the appropriate value for "k"?
  A. Elbow Method
  B. A priori knowledge
  C. Gap statistic
  D. Ankle Method
A

D. Ankle Method

14
Q
Which of these is a distance measure employed by k-means clustering?
  A. Euclidean distance
  B. Centroid distance
  C. Manhattan distance
  D. Cluster distance
A

A. Euclidean distance
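
A two-point check in base R (the points are arbitrary): dist() defaults to Euclidean distance, the measure k-means uses when assigning each point to its nearest centroid.

```r
p <- c(1, 2)
q <- c(4, 6)
sqrt(sum((p - q)^2))   # 5, the Euclidean distance computed by hand
dist(rbind(p, q))      # 5 again; Euclidean is dist()'s default
```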

15
Q
A clustering method in which every object belongs to every cluster with a membership weight between 0 (if it absolutely does not belong to the cluster) and 1 (if it absolutely belongs to the cluster) is known as _______ clustering.
  A. overlapping
  B. partitional
  C. hierarchical
  D. fuzzy
A

D. fuzzy
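
A hedged sketch using the cmeans() implementation in the e1071 package (one common choice; the course may use a different one), again with iris as stand-in data:

```r
# Fuzzy c-means: each point gets a membership weight in [0, 1] for
# every cluster instead of a single hard assignment.
library(e1071)
fcm <- cmeans(as.matrix(iris[, 1:4]), centers = 3, m = 2)
head(fcm$membership)   # rows are points; each row sums to 1
```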

16
Q
The confidence of an association rule is the __________ of the rule.
  A. support strength
  B. complete coverage
  C. likelihood level
  D. predictive power
A

D. predictive power
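
Confidence quantifies that predictive power as confidence(A → B) = support(A and B) / support(A). A worked check against the card-6 transactions:

```r
# Beer appears in 4 of 5 transactions, {beer, diaper} in 3 of 5, so
# 75% of the baskets that contain beer also contain diapers.
(3 / 5) / (4 / 5)   # confidence(beer -> diaper) = 0.75
```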

17
Q
One way to reduce the computational complexity of frequent itemset generation is to use ____________.
  A. the apriori algorithm
  B. the FP-Growth algorithm
  C. the association rules algorithm
  D. the post-pruning algorithm
A

A. the apriori algorithm
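
A minimal sketch with the arules package's bundled Groceries data: apriori() uses the anti-monotone support bound to prune candidate itemsets before generating rules.

```r
# Any candidate with an infrequent subset is pruned immediately,
# which keeps the search over itemsets tractable.
library(arules)
data(Groceries)
rules <- apriori(Groceries, parameter = list(supp = 0.01, conf = 0.5))
inspect(head(sort(rules, by = "lift"), 5))   # top five rules by lift
```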

18
Q
Clustering assigns labels to previously unlabeled data, which is why it is sometimes referred to as _____________________.
  A. predictive partitioning
  B. supervised labeling
  C. predictive labeling
  D. unsupervised classification
A

D. unsupervised classification

19
Q

Association rules work well with small data sets.
True
False

A

False

20
Q

Clustering assigns labels to previously unlabeled data, which is why it is sometimes referred to as unsupervised classification.
True
False

A

True