Lecture 10 - Clustering: K-Means Flashcards

1
Q

What is the main goal of K-Means clustering?

A

To partition a dataset into k clusters by minimizing the sum of squared distances (reconstruction error) between data points and their respective cluster centroids.
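Written as a formula, the objective might look like the following. The symbols J, C_j, mu_j, and x_i are notation assumed here for illustration; they do not come from the lecture.

```latex
\[
J = \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2,
\qquad
\mu_j = \frac{1}{|C_j|} \sum_{x_i \in C_j} x_i
\]
```

Here C_j is the set of points assigned to cluster j and mu_j is that cluster's centroid (the mean of its points).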

2
Q

What are the steps in the K-Means algorithm?

A
  1. Initialize k centroids randomly.
  2. Assign each data point to the nearest centroid.
  3. Recalculate the centroids as the mean of all points in a cluster.
  4. Repeat steps 2–3 until convergence (e.g., no changes in cluster assignments).
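The four steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function names, the choice of k random data points as initial centroids, the `max_iters` cap, and the handling of empty clusters are all assumptions made for the example.

```python
import random

def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, max_iters=100, seed=0):
    """Basic K-Means on a list of equal-length tuples (illustrative sketch)."""
    rng = random.Random(seed)
    # Step 1: initialize by picking k distinct data points at random (assumed scheme).
    centroids = [list(p) for p in rng.sample(points, k)]
    assignments = None
    for _ in range(max_iters):
        # Step 2: assign each point to the nearest centroid.
        new_assignments = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Step 4: stop once no assignment changed since the previous pass.
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Step 3: recompute each centroid as the mean of its assigned points.
        for j in range(k):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # an empty cluster keeps its previous centroid (assumption)
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return centroids, assignments

points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = kmeans(points, k=2)
print(centroids, labels)
```

On this toy data the two groups are far apart, so the first three points end up in one cluster and the last three in the other.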
3
Q

How is convergence determined in K-Means?

A

When the reconstruction error (sum of squared distances) stabilizes.

When the change in error between iterations is below a threshold.

When the maximum number of iterations is reached (strictly a stopping safeguard rather than true convergence).

When cluster assignments stop changing.
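These criteria can be combined into one check that runs at the end of every iteration. The sketch below is only one possible arrangement: the function name `should_stop` and the default values for `tol` and `max_iters` are hypothetical. "Error stabilizes" and "change below a threshold" are treated as the same test.

```python
def should_stop(prev_error, error, prev_labels, labels, iteration,
                tol=1e-4, max_iters=100):
    """Return the reason for stopping, or None if iteration should continue.

    `iteration` is the zero-based index of the pass that just finished.
    """
    # Cluster assignments stopped changing.
    if prev_labels is not None and labels == prev_labels:
        return "assignments unchanged"
    # Reconstruction error stabilized: its change fell below the threshold.
    if prev_error is not None and abs(prev_error - error) < tol:
        return "error change below threshold"
    # Safety cap on the number of iterations.
    if iteration + 1 >= max_iters:
        return "max iterations reached"
    return None

print(should_stop(10.0, 8.0, [0, 1], [0, 0], iteration=1))
print(should_stop(10.0, 8.0, [0, 0], [0, 0], iteration=2))
```

Returning the reason instead of a bare boolean makes it easy to see which criterion actually ended a run.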

4
Q

What is the "elbow method" in K-Means?

A

A technique to find the optimal number of clusters k by plotting the reconstruction error against k. The "elbow point" indicates diminishing returns in error reduction.
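A small sketch of how the error-versus-k values could be produced before plotting. It uses a compact one-dimensional K-Means so the block stays self-contained. The helper name `kmeans_error`, the toy values, the fixed iteration count, and the use of a few random restarts per k are all assumptions made for illustration.

```python
import random

def kmeans_error(values, k, iters=50, seed=0):
    """Run a basic 1-D K-Means and return the final sum of squared distances."""
    rng = random.Random(seed)
    centroids = rng.sample(values, k)  # k distinct data values as initial centroids
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: (v - centroids[j]) ** 2)
            clusters[nearest].append(v)
        # Mean of each cluster; an empty cluster keeps its old centroid.
        centroids = [sum(c) / len(c) if c else centroids[j]
                     for j, c in enumerate(clusters)]
    return sum(min((v - c) ** 2 for c in centroids) for v in values)

# Toy data with three visibly separate groups (made up for this example).
values = [1.0, 1.2, 0.8, 5.0, 5.1, 4.9, 9.0, 9.2, 8.8]

# Keep the best of a few restarts per k, since a random start can get stuck
# in a poor local optimum.
errors = {
    k: min(kmeans_error(values, k, seed=s) for s in range(5))
    for k in range(1, 6)
}
for k, err in errors.items():
    print(k, round(err, 3))
```

Plotting `errors` against k and looking for the point where the curve flattens is the elbow heuristic. On data like this the flattening would be expected around k = 3, though reading the elbow remains a judgment call.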

5
Q

What is a major limitation of K-Means?

A

It assumes clusters are spherical and evenly sized, making it unsuitable for datasets with arbitrary cluster shapes or varying densities.

6
Q

What is reconstruction error in K-Means?

A

The sum of squared distances between data points and their assigned cluster centroid.
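As a concrete illustration, the error can be computed directly from the points, their cluster labels, and the centroids. The function name and the three-point example are hypothetical.

```python
def reconstruction_error(points, labels, centroids):
    """Sum of squared Euclidean distances from each point to its assigned centroid."""
    total = 0.0
    for point, label in zip(points, labels):
        centroid = centroids[label]
        total += sum((a - b) ** 2 for a, b in zip(point, centroid))
    return total

points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
labels = [0, 0, 1]                      # cluster index for each point
centroids = [(1.0, 0.0), (10.0, 0.0)]   # mean of each cluster
print(reconstruction_error(points, labels, centroids))  # 1 + 1 + 0 = 2.0
```

Each of the first two points lies one unit from its centroid and the third sits exactly on its centroid, so the total is 2.0.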

7
Q

What are the types of cluster representations in K-Means?

A

Hard clustering: Each point belongs to exactly one cluster. Standard K-Means produces hard assignments.

Soft clustering: Points have a degree of belonging to multiple clusters, as in soft variants such as fuzzy c-means.
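The difference can be illustrated for a single point. In the sketch below the hard assignment is the index of the nearest centroid. The soft assignment uses an exponential weighting of squared distances, which is just one possible scheme chosen for this example and is not taken from the lecture. The `stiffness` parameter and both function names are hypothetical.

```python
import math

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def hard_assignment(point, centroids):
    """Index of the single nearest centroid."""
    return min(range(len(centroids)), key=lambda j: sq_dist(point, centroids[j]))

def soft_assignment(point, centroids, stiffness=1.0):
    """Membership weights that sum to 1; closer centroids get larger weights."""
    weights = [math.exp(-stiffness * sq_dist(point, c)) for c in centroids]
    total = sum(weights)
    return [w / total for w in weights]

centroids = [(0.0, 0.0), (4.0, 0.0)]
print(hard_assignment((1.0, 0.0), centroids))
print(soft_assignment((1.0, 0.0), centroids))
print(soft_assignment((2.0, 0.0), centroids))
```

A point midway between the two centroids receives equal membership in both under the soft scheme, while the hard scheme still has to pick exactly one.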
