Chapter 5. Clustering Flashcards

Question 1

Q

How does clustering work?

P 192

Answer

A

Clustering attempts to group objects together based on similarity. It does this without using any labels: it compares how similar the data for one observation is to the data for other observations, and groups similar observations together.

Question 2

Q

Before we perform clustering, we will ____.

P 194

Answer

A

reduce the dimensionality of the data (using PCA).

Clustering algorithms generally perform better, in terms of both time and clustering accuracy, on dimensionality-reduced datasets. P 205
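
A minimal sketch of this order of operations, assuming scikit-learn and its small bundled digits dataset as a stand-in for the book's data. The choice of 10 components and 10 clusters is illustrative only, not taken from the book.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

# Stand-in data (assumption): 1,797 digit images with 64 pixel features each.
X, _ = load_digits(return_X_y=True)

# Step 1: reduce dimensionality with PCA (10 components is an arbitrary choice).
pca = PCA(n_components=10, random_state=0)
X_reduced = pca.fit_transform(X)

# Step 2: cluster the reduced data. No labels are used at any point.
kmeans = KMeans(n_clusters=10, n_init=10, random_state=0)
cluster_labels = kmeans.fit_predict(X_reduced)

print(X.shape, "->", X_reduced.shape)
```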

Question 3

Q

What are the three major clustering algorithms?

P 195

Answer

A

k-means
hierarchical clustering
DBSCAN

Question 4

Q

Do we choose the number of clusters before using k-means?

P 196

Answer

A

Yes. In k-means clustering, we specify the number of desired clusters k, and the algorithm will assign each observation to exactly one of these k clusters.
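
As a rough illustration, assuming scikit-learn's `KMeans` and synthetic data from `make_blobs` (both are assumptions, not the book's example), k is fixed up front through `n_clusters`, and every observation comes back with exactly one cluster label:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

# Synthetic 2-D data (assumption): 300 points around 4 centers.
X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

# k must be chosen before fitting.
k = 4
kmeans = KMeans(n_clusters=k, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

# One label per observation, each an integer in 0..k-1.
print(labels.shape, np.unique(labels))
```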

Question 5

Q

How does k-means work?

P 196

Answer

A

The algorithm optimizes the groups by minimizing the within-cluster variation (also known as inertia) such that the sum of the within-cluster variations across all k clusters is as small as possible.

Typically, the k-means algorithm does several runs (n_init) and chooses the run that has the best separation, defined as the lowest total sum of within-cluster variations across all k clusters.
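
To make the quantity being minimized concrete, here is a small sketch, assuming scikit-learn and synthetic data, that recomputes inertia by hand as the sum of squared distances from each point to its assigned centroid and compares it with the `inertia_` attribute of the fitted model. With `n_init=10`, scikit-learn keeps the run with the lowest inertia.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

# Synthetic data (assumption): 300 points around 3 centers.
X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

# n_init=10 runs k-means 10 times and keeps the lowest-inertia result.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)

# Inertia by hand: squared distance from each point to its own centroid, summed.
assigned_centroids = kmeans.cluster_centers_[kmeans.labels_]
manual_inertia = np.sum((X - assigned_centroids) ** 2)

print(manual_inertia, kmeans.inertia_)
```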

Question 6

Q

In k-means, the inertia decreases as the number of clusters ____.

P 197

Answer

A

Increases.
This makes sense: the more clusters we have, the greater the homogeneity among observations within each cluster.
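
A short sketch of this trend, assuming scikit-learn and synthetic data. The best achievable inertia can never go up when k grows, although a single randomly initialized run could in principle land on a worse local optimum, which is one reason to use several initializations.

```python
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

# Synthetic data (assumption): 500 points around 5 centers.
X, _ = make_blobs(n_samples=500, centers=5, random_state=1)

inertias = []
for k in range(1, 9):
    km = KMeans(n_clusters=k, n_init=10, random_state=1).fit(X)
    inertias.append(km.inertia_)
    print(k, round(km.inertia_, 1))
```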

Question 7

Q

“If PCA does a good job of capturing the underlying structure in the data as compactly as possible, the clustering algorithm (K-means) will have an easy time grouping similar instances together, regardless of whether the clustering happens on just a fraction of the principal components or many more.” What does this mean?

P 203

Answer

A

In other words, clustering should perform just as well using 10 or 50 principal components as it does using one hundred or several hundred principal components.

As we see in the example, the change in overall accuracy when PCA is used is minimal (about 3% for number of components in range(10,749)), but without PCA, changing the number of features used for k-means results in a drastic change in overall accuracy.
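
A hedged sketch of this kind of experiment, assuming scikit-learn and its bundled digits dataset as a small stand-in for the book's data. The helper `overall_accuracy` is hypothetical: it credits each cluster with its most common true label, which is one plausible reading of "overall accuracy" and not necessarily the book's exact definition. The component counts and k=10 are also illustrative.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

# Stand-in data (assumption); y is used only to score the clusters afterwards.
X, y = load_digits(return_X_y=True)

def overall_accuracy(cluster_labels, true_labels):
    """Fraction of points that match their cluster's most common true label."""
    correct = 0
    for c in np.unique(cluster_labels):
        members = true_labels[cluster_labels == c]
        correct += np.bincount(members).max()
    return correct / len(true_labels)

accuracies = {}
for n in [10, 20, 40, 64]:
    # Cluster on the first n principal components.
    X_reduced = PCA(n_components=n, random_state=0).fit_transform(X)
    labels = KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(X_reduced)
    accuracies[n] = overall_accuracy(labels, y)
    print(n, round(accuracies[n], 3))
```

Comparing the printed values across component counts shows how sensitive (or not) the clustering is to the number of principal components on this particular dataset.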

(7 cards)