Chapter 5. Clustering Flashcards

1
Q

How does clustering work?

P 192

A

Clustering attempts to group objects together based on similarity. It does this without using any labels, by comparing how similar the data for one observation is to the data for other observations and groups.

2
Q

Before we perform clustering, we will ____.

P 194

A

Reduce the dimensionality of the data (using PCA).

Clustering algorithms generally perform better, in terms of both time and clustering accuracy, on dimensionality-reduced datasets. P 205
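A minimal sketch of the "reduce first, then cluster" order, assuming scikit-learn is the library in use. The synthetic data from `make_blobs`, the choice of 10 components, and the variable names are illustrative assumptions, not taken from the book.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA

# Synthetic stand-in data (an assumption): 300 observations, 50 features, 4 groups.
X, _ = make_blobs(n_samples=300, n_features=50, centers=4, random_state=0)

# Step 1: reduce the dimensionality with PCA (10 components chosen arbitrarily).
pca = PCA(n_components=10, random_state=0)
X_reduced = pca.fit_transform(X)

# Step 2: cluster the reduced data instead of the original 50 features.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_reduced)

print(X_reduced.shape)      # shape of the reduced data
print(np.bincount(labels))  # number of observations per cluster
```

The clustering step only ever sees `X_reduced`, so its cost depends on the number of retained components rather than on the original feature count.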

3
Q

What are the three major clustering algorithms?

P 195

A
  • k-means
  • hierarchical clustering
  • DBSCAN
4
Q

Do we choose the number of clusters before using K-means?

P 196

A

Yes. In k-means clustering, we specify the number of desired clusters k, and the algorithm will assign each observation to exactly one of these k clusters.

5
Q

How does K-means work?

P 196

A

The algorithm optimizes the groups by minimizing the within-cluster variation (also known as inertia) such that the sum of the within-cluster variations across all k clusters is as small as possible.

Typically, the k-means algorithm does several runs (n_init) and chooses the run that has the best separation, defined as the lowest total sum of within-cluster variations across all k clusters.
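As a rough illustration of both points, the sketch below recomputes inertia by hand and imitates what `n_init` does by keeping the best of several single runs. It assumes scikit-learn's `KMeans`; the synthetic data, the seeds, and the use of `init="random"` (so that single runs can differ) are illustrative choices.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in data (an assumption): 200 points in 2-D around 3 centers.
X, _ = make_blobs(n_samples=200, n_features=2, centers=3, random_state=42)

# n_init=10 runs k-means 10 times and keeps the run with the lowest inertia.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)

# Inertia by hand: squared distance from each point to its assigned centroid, summed.
assigned_centers = kmeans.cluster_centers_[kmeans.labels_]
manual_inertia = float(((X - assigned_centers) ** 2).sum())
print(kmeans.inertia_, manual_inertia)

# Imitating n_init manually: several single runs, keep the lowest inertia.
single_run_inertias = [
    KMeans(n_clusters=3, init="random", n_init=1, random_state=seed).fit(X).inertia_
    for seed in range(5)
]
best_of_runs = min(single_run_inertias)
print(single_run_inertias, best_of_runs)
```

The two printed inertia values should agree up to floating-point error, which shows that `inertia_` is the total within-cluster sum of squared distances.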

6
Q

In K-means, the inertia decreases as the number of clusters ____.

P 197

A

Increases.
This makes sense: the more clusters we have, the greater the homogeneity among observations within each cluster.
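The trend can be checked with a short loop over k. This is a sketch that assumes scikit-learn and synthetic data; the range of k values is arbitrary. Because k-means only finds a local optimum, the drop from one k to the next is expected but not strictly guaranteed at every step.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in data (an assumption): 300 observations, 5 features.
X, _ = make_blobs(n_samples=300, n_features=5, centers=4, random_state=1)

# Fit k-means for k = 1..8 and record the inertia of each fit.
inertias = {}
for k in range(1, 9):
    model = KMeans(n_clusters=k, n_init=10, random_state=1).fit(X)
    inertias[k] = model.inertia_

for k, value in inertias.items():
    print(k, round(value, 1))
```

In the limit where every observation is its own cluster, inertia reaches zero, so a low inertia alone does not indicate a good choice of k.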

7
Q

“If PCA does a good job of capturing the underlying structure in the data as compactly as possible, the clustering algorithm (K-means) will have an easy time grouping similar instances together, regardless of whether the clustering happens on just a fraction of the principal components or many more.” What does this mean?

P 203

A

In other words, clustering should perform just as well using 10 or 50 principal components as it does using one hundred or several hundred principal components.

As the example shows, overall accuracy changes only minimally when PCA is used (about 3% for a number of components in range(10,749)). Without PCA, changing the number of features used for K-means results in a drastic change in overall accuracy.
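One way to probe this claim is to cluster on different numbers of principal components and compare a label-based accuracy. The sketch below makes several assumptions: it uses scikit-learn's small bundled digits dataset rather than the book's data, 20 clusters, and a hypothetical helper `majority_label_accuracy` that scores each cluster by its most frequent true label. The numbers it produces are not the book's results.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA


def majority_label_accuracy(cluster_labels, true_labels):
    """Hypothetical helper: fraction of points whose true label matches
    the most frequent true label in their cluster."""
    correct = 0
    for cluster in np.unique(cluster_labels):
        members = true_labels[cluster_labels == cluster]
        correct += np.bincount(members).max()
    return correct / len(true_labels)


# Stand-in dataset (an assumption): 1,797 digit images with 64 features each.
X, y = load_digits(return_X_y=True)

# Cluster on progressively more principal components and score each run.
results = {}
for n_components in (10, 20, 40, 64):
    X_reduced = PCA(n_components=n_components, random_state=0).fit_transform(X)
    labels = KMeans(n_clusters=20, n_init=10, random_state=0).fit_predict(X_reduced)
    results[n_components] = majority_label_accuracy(labels, y)

for n_components, accuracy in results.items():
    print(n_components, round(accuracy, 3))
```

The labels are used only to score the clusters after the fact; the clustering itself never sees them.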
