Clustering Flashcards

1
Q

What is unsupervised learning?

A

Learning patterns in data without labeled outputs or a “teacher”.

2
Q

What is the goal of clustering?

A

To partition data into groups or clusters based on similarity.

3
Q

What does the K-means algorithm minimize?

A

The within-cluster point scatter/variance.
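
In a common formulation, with C_k the k-th cluster and μ_k its center, the objective is

$$J = \sum_{k=1}^{K} \sum_{x_i \in C_k} \lVert x_i - \mu_k \rVert^2$$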

4
Q

What are the two main steps of the K-means algorithm?

A

1) Assign each point to its nearest cluster center; 2) update each cluster center to the mean of the points assigned to it. The two steps repeat until convergence.
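
A minimal NumPy sketch of this loop (all names are illustrative; it assumes no cluster goes empty during the updates):

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Minimal K-means sketch: alternate assignment and update steps."""
    rng = np.random.default_rng(seed)
    # Initialize centers as k distinct random data points.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Step 1: assign each point to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 2: move each center to the mean of its assigned points
        # (assumes every cluster keeps at least one point).
        new_centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        # Converged once the centers stop moving.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels
```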

5
Q

What is the K-means++ algorithm used for?

A

To initialize the cluster centers for K-means in a way that improves convergence, by spreading the initial centers out across the data.
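
A sketch of the standard D²-weighted seeding in NumPy (names are illustrative):

```python
import numpy as np

def kmeans_pp_init(X, k, seed=0):
    """K-means++ seeding sketch: spread initial centers apart."""
    rng = np.random.default_rng(seed)
    # First center: a uniformly random data point.
    centers = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        # Squared distance from each point to its nearest chosen center.
        diffs = X[:, None, :] - np.array(centers)[None, :, :]
        d2 = np.min(np.linalg.norm(diffs, axis=2) ** 2, axis=1)
        # Next center: sampled with probability proportional to d^2,
        # so far-away points are more likely to seed new clusters.
        centers.append(X[rng.choice(len(X), p=d2 / d2.sum())])
    return np.array(centers)
```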

6
Q

How can the number of clusters K be determined in non-probabilistic models?

A

By computing the sum of squared errors (SSE) for different values of K and applying the elbow method: pick the K at which the SSE curve shows a marked change of slope.
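
A sketch using scikit-learn's KMeans, whose inertia_ attribute is the SSE; the three-blob data here is made up for the demo:

```python
import numpy as np
from sklearn.cluster import KMeans

# Illustrative data: three Gaussian blobs in 2-D.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(100, 2)) for c in (0, 4, 8)])

# Compute the SSE (inertia) over a range of K and look for the "elbow",
# the K where the curve's slope changes sharply (here, around K = 3).
for k in range(1, 8):
    sse = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
    print(k, round(sse, 1))
```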

7
Q

What is a mixture model?

A

A probabilistic model that represents the presence of subpopulations within an overall population.
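
Formally, a mixture density is a weighted sum of component densities f_k with mixing weights π_k:

$$f(x) = \sum_{k=1}^{K} \pi_k\, f_k(x), \qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1$$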

8
Q

What are the two steps in generating samples from a Gaussian mixture model?

A

1) Draw a categorical variable Z to select a component, 2) Draw an observation from the selected Gaussian component.
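
A sketch of the two-step sampler in NumPy (the toy 1-D mixture parameters are assumed, just for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
# Assumed toy 1-D mixture: weights, means, and std devs of three components.
pi = np.array([0.5, 0.3, 0.2])
mu = np.array([-2.0, 0.0, 3.0])
sd = np.array([0.5, 1.0, 0.8])

def sample_gmm(n):
    # Step 1: draw the categorical indicator Z ~ Cat(pi) for each sample.
    z = rng.choice(len(pi), size=n, p=pi)
    # Step 2: draw each observation from its selected Gaussian component.
    return rng.normal(mu[z], sd[z])

x = sample_gmm(1000)
```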

9
Q

What is the EM algorithm used for in mixture models?

A

To estimate the parameters of the mixture model by maximizing the likelihood.
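
The objective being maximized is the (incomplete-data) log-likelihood, here written for a Gaussian mixture:

$$\ell(\Theta) = \sum_{i=1}^{N} \log \sum_{k=1}^{K} \pi_k\, \mathcal{N}(x_i \mid \mu_k, \Sigma_k)$$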

10
Q

What are the two main steps of the EM algorithm?

A

The Expectation (E) step and the Maximization (M) step, repeated until convergence, i.e. until the assignments and parameters stabilize.
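
A compact, self-contained EM sketch for a 1-D Gaussian mixture (a toy setup; function names and defaults are illustrative):

```python
import numpy as np

def gmm_em(x, k, n_iters=200, tol=1e-6, seed=0):
    """EM sketch for a 1-D Gaussian mixture; x is a 1-D NumPy array."""
    rng = np.random.default_rng(seed)
    n = len(x)
    pi = np.full(k, 1.0 / k)                   # mixing weights
    mu = rng.choice(x, size=k, replace=False)  # component means
    var = np.full(k, x.var())                  # component variances
    prev_ll = -np.inf
    for _ in range(n_iters):
        # E-step: responsibilities, proportional to pi_k * N(x | mu_k, var_k).
        dens = np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        r = pi * dens
        ll = np.log(r.sum(axis=1)).sum()
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from the responsibilities.
        nk = r.sum(axis=0)
        pi = nk / n
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        # Converged once the log-likelihood stops improving.
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return pi, mu, var
```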

11
Q

In the context of Gaussian Mixture Models, what does the E-step compute?

A

The E-step computes the expected value of the latent variables: specifically, the posterior probabilities (responsibilities) that each data point belongs to each Gaussian component.
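
In standard GMM notation, with γ_ik the responsibility of component k for point x_i:

$$\gamma_{ik} = P(z_i = k \mid x_i, \Theta) = \frac{\pi_k\, \mathcal{N}(x_i \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j\, \mathcal{N}(x_i \mid \mu_j, \Sigma_j)}$$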

12
Q

In the context of Gaussian Mixture Models, what does the M-step compute?

A

The M-step updates the parameters of the model (means, covariances, and mixing coefficients) to maximize the expected log-likelihood found in the E-step. This means re-estimating the parameters to better fit the data, weighted by the current responsibilities.
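
The standard closed-form updates, using the responsibilities γ_ik from the E-step:

$$N_k = \sum_{i=1}^{N} \gamma_{ik}, \qquad \pi_k = \frac{N_k}{N}, \qquad \mu_k = \frac{1}{N_k}\sum_{i=1}^{N} \gamma_{ik}\, x_i, \qquad \Sigma_k = \frac{1}{N_k}\sum_{i=1}^{N} \gamma_{ik}\,(x_i - \mu_k)(x_i - \mu_k)^{\top}$$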

13
Q

How does K-means compare to Gaussian Mixture Models?

A

K-means is usually faster, since it involves less computation per iteration and typically needs fewer iterations. K-means assumes spherical clusters with equal variance, while a GMM allows clusters of different shapes and sizes. GMMs also use soft assignments (a probability of belonging to each cluster), while K-means uses hard assignments.

14
Q

What is an advantage of mixture models over K-means?

A

Mixture models allow explicit distributional assumptions, and the fit to the data can be assessed by computing the likelihood.

15
Q

What is an issue with mixture models in terms of identifiability?

A

The likelihood is invariant to permutation of class memberships, making the estimators valid only up to permutation.

16
Q

What is hierarchical agglomerative clustering?

A

A clustering method that starts with singleton clusters and merges the most similar clusters iteratively.
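
A sketch using SciPy's agglomerative routines (average linkage is chosen for illustration; the two-blob data is made up):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Illustrative data: two Gaussian blobs in 2-D.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (20, 2)), rng.normal(4, 0.5, (20, 2))])

# Start from singletons and repeatedly merge the two closest clusters;
# "average" linkage measures cluster distance by mean pairwise distance.
Z = linkage(X, method="average")
# Cut the merge tree to obtain, e.g., two flat clusters.
labels = fcluster(Z, t=2, criterion="maxclust")
```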

17
Q

What is the purpose of the Bayes classifier in mixture models?

A

To obtain class posterior probabilities when parameters are known: P(y | x, Θ) ∝ f_Θ(x | y) π_y
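
Written out with the normalizing constant (assuming K classes), the proportionality becomes:

$$P(y = k \mid x, \Theta) = \frac{\pi_k\, f_{\Theta}(x \mid y = k)}{\sum_{j=1}^{K} \pi_j\, f_{\Theta}(x \mid y = j)}$$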

18
Q

What is the main challenge in maximizing the likelihood for mixture models?

A

The lack of a closed-form solution, requiring an iterative procedure like EM.

19
Q

What is the relationship between K-means and Gaussian Mixture Models?

A

K-means is equivalent to a special case of a GMM in which all clusters share the same spherical covariance σ²I, in the limit as σ approaches 0.
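
A sketch of why, using the E-step responsibilities from the EM cards with Σ_k = σ²I:

$$\gamma_{ik} = \frac{\pi_k \exp\!\left(-\lVert x_i - \mu_k\rVert^2 / 2\sigma^2\right)}{\sum_{j} \pi_j \exp\!\left(-\lVert x_i - \mu_j\rVert^2 / 2\sigma^2\right)} \;\longrightarrow\; \begin{cases} 1 & \text{if } k = \arg\min_j \lVert x_i - \mu_j \rVert \\ 0 & \text{otherwise} \end{cases} \quad \text{as } \sigma \to 0,$$

so the soft E-step degenerates into the hard nearest-center assignment of K-means.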

20
Q

What is a disadvantage of mixture models compared to K-means?

A

Mixture models require explicit distributional assumptions.

21
Q

How does the EM algorithm’s performance depend on initialization?

A

EM’s performance can vary significantly based on parameter initialization due to multiple local maxima.

22
Q

How can missing values be handled in mixture models?

A

Mixture models can naturally infer missing values as part of the model fitting process.