KMeans Flashcards

Question 1

Q

Good clustering will produce clusters with:

Question 2

Q

Other distance measures used in Clustering include:

Answer

A

Minkowski distance
Pearson correlation distance
Spearman correlation distance
Kendall correlation distance
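
Example (a minimal sketch, not part of the original card): two of the measures above written in plain Python. The function names `minkowski` and `pearson_distance` are illustrative, and the Pearson distance is assumed to be defined as 1 minus the correlation coefficient, which is one common convention.

```python
import math

def minkowski(a, b, p=2):
    """Minkowski distance of order p (p=1 is Manhattan, p=2 is Euclidean)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)

def pearson_distance(a, b):
    """1 - Pearson correlation (assumed convention).

    Close to 0 for perfectly correlated vectors, close to 2 for perfectly
    anti-correlated ones. Assumes neither vector is constant.
    """
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return 1 - cov / (norm_a * norm_b)

print(minkowski([0, 0], [3, 4], p=2))  # Euclidean
print(minkowski([0, 0], [3, 4], p=1))  # Manhattan
print(pearson_distance([1, 2, 3], [2, 4, 6]))
```

Note that the two kinds of measure answer different questions: Minkowski distance compares magnitudes, while correlation-based distances compare the shape of the two profiles regardless of scale.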

Question 3

Q

Challenges with k-Means Clustering

k-Means is very sensitive to the initial, randomly chosen cluster centers (this is known as the ________).

Answer

A

random initialization trap
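
Example (a hedged sketch, not part of the original card): the trap can be reproduced on four points at the corners of a wide rectangle. The function `lloyd` below is an illustrative plain k-means loop, and the two sets of starting centers are hand-picked assumptions chosen to show a good and a bad outcome.

```python
def lloyd(points, centers, iters=100):
    """Plain k-means (Lloyd's algorithm) from the given starting centers."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    for _ in range(iters):
        # Assignment step: index of the nearest center for each point.
        labels = [min(range(len(centers)), key=lambda j: sq_dist(pt, centers[j]))
                  for pt in points]
        # Update step: move each center to the mean of its assigned points.
        new_centers = []
        for j, old in enumerate(centers):
            members = [pt for pt, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(old)  # keep an empty cluster's center in place
        if new_centers == centers:
            break  # converged: no center moved
        centers = new_centers
    wcss = sum(sq_dist(pt, centers[lab]) for pt, lab in zip(points, labels))
    return centers, labels, wcss

points = [(0, 0), (0, 1), (4, 0), (4, 1)]

# Start with one center on each short side: splits into left and right pairs.
_, good_labels, good_wcss = lloyd(points, [(0, 0), (4, 0)])

# Start with both centers in the middle: gets stuck splitting bottom and top.
_, bad_labels, bad_wcss = lloyd(points, [(2, 0), (2, 1)])

print(good_labels, good_wcss)  # left/right split, WCSS 1.0
print(bad_labels, bad_wcss)    # bottom/top split, WCSS 16
```

Both runs converge, but the second converges to a much worse local optimum. The algorithm itself cannot tell the difference, which is why the starting centers matter.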

Question 4

Q

The _______ initialization approach mitigates the effects of the random initialization trap.

Answer

A

K-means++
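
Example (a minimal sketch, not part of the original card): k-means++ picks the first center uniformly at random, then picks each further center with probability proportional to its squared distance from the nearest center already chosen. The function name `kmeans_pp_init` and the fixed seed are assumptions for illustration.

```python
import random

def kmeans_pp_init(points, k, rng=None):
    """Choose k starting centers using k-means++ style seeding.

    Assumes the data contains at least k distinct points; otherwise all
    sampling weights can become zero and random.choices raises ValueError.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed only to make the sketch repeatable
    centers = [rng.choice(points)]  # first center: uniform at random
    while len(centers) < k:
        # Squared distance from each point to its nearest chosen center.
        d2 = [min(sum((p - c) ** 2 for p, c in zip(pt, ctr)) for ctr in centers)
              for pt in points]
        # Far-away points are more likely to be picked; chosen points have weight 0.
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers

points = [(0, 0), (0, 1), (4, 0), (4, 1)]
print(kmeans_pp_init(points, k=2))
```

Because distant points are favored, the starting centers tend to be spread out, which makes a poor start less likely. It does not rule one out.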

Question 5

Q

Methods for choosing the right K include:

Answer

A

Elbow Method
Information Criterion Approach
Silhouette method
Jump method
Gap statistic
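
Example (a hedged sketch, not part of the original card): the silhouette method scores a clustering by comparing, for each point, the mean distance to its own cluster (a) with the mean distance to the nearest other cluster (b), using s = (b - a) / max(a, b). The function name `silhouette_score` and the hand-assigned labels are assumptions for illustration. The sketch needs at least two clusters and scores singleton clusters as 0.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for pt, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(pt)

    total = 0.0
    for pt, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # singleton cluster contributes s = 0
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(pt, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(pt, q) for q in other) / len(other)
                for name, other in clusters.items() if name != lab)
        total += (b - a) / max(a, b)
    return total / len(points)

points = [(0, 0), (0, 1), (4, 0), (4, 1)]
print(silhouette_score(points, [0, 0, 1, 1]))  # left/right split
print(silhouette_score(points, [0, 1, 0, 1]))  # bottom/top split
```

Scores near 1 suggest compact, well-separated clusters, and scores below 0 suggest points sit closer to another cluster than their own. To choose K, the score is computed for several values of K and the highest is preferred.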

Question 6

Q

WCSS stands for ________ and is associated with the _____ method for choosing K.

Answer

A

Within Cluster Sum of Squares

Elbow Method
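
Example (a minimal sketch, not part of the original card): WCSS is the sum of squared distances from each point to the centroid of its cluster. The function name `wcss` is illustrative, and the labels for each K are hand-assigned here rather than produced by running k-means.

```python
def wcss(points, labels):
    """Within Cluster Sum of Squares for a given assignment of points to clusters."""
    clusters = {}
    for pt, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(pt)

    total = 0.0
    for members in clusters.values():
        # Centroid: coordinate-wise mean of the cluster's points.
        centroid = [sum(col) / len(members) for col in zip(*members)]
        total += sum(sum((p - c) ** 2 for p, c in zip(pt, centroid))
                     for pt in members)
    return total

points = [(0, 0), (0, 1), (4, 0), (4, 1)]
print(wcss(points, [0, 0, 0, 0]))  # K = 1
print(wcss(points, [0, 0, 1, 1]))  # K = 2
print(wcss(points, [0, 1, 2, 3]))  # K = 4
```

On this toy data WCSS falls from 17 at K = 1 to 1 at K = 2, then only to 0 at K = 4. The sharp bend at K = 2 is the "elbow": adding clusters beyond it buys very little.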

Question 7

Q

Strengths of K-Means Clustering?

Answer

A

- Uses simple non-statistical principles.
- Very flexible and malleable algorithm.
- Wide set of real-world applications.

Question 8

Q

Weaknesses of K-Means Clustering?

Answer

A

- Simplistic algorithm.
- Relies on chance (initial k centroids).
- Sometimes requires some domain knowledge in determining the ideal number of clusters.
- Not ideal for non-spherical clusters.
- Works with numeric data only.

(8 cards)