2020 A3 Flashcards

1
Q

k-means only finds clusters corresponding to a local (but not global) optimum of its objective
function. What does this mean with regard to the quality of the clusters found by k-means?
Give a strategy for reducing the impact of this effect on the quality of the final model.

A

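k-means alternates an assignment step and a centroid-update step, neither of which ever increases the objective, so it converges to a local minimum determined by the initial centroids. The final clusters may therefore be far from the best possible partition, and different initializations can give different results. To reduce this effect, run k-means several times from different initializations and keep the run with the lowest objective, or use a more careful initialization such as binary split.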
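As an illustration of the restart strategy, here is a minimal sketch in plain Python on 1-D data. The function names `kmeans_1d` and `kmeans_restarts`, the toy data, and the number of restarts are assumptions made for this example, not part of the original answer.

```python
import random

def kmeans_1d(points, k, rng, iters=100):
    """One run of k-means on 1-D data; returns (centres, sum of squared errors)."""
    centres = rng.sample(points, k)  # random initialisation from the data
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centre.
        clusters = [[] for _ in range(k)]
        for x in points:
            nearest = min(range(k), key=lambda j: (x - centres[j]) ** 2)
            clusters[nearest].append(x)
        # Update step: move each centre to the mean of its cluster
        # (an empty cluster keeps its previous centre).
        new_centres = [sum(c) / len(c) if c else centres[j]
                       for j, c in enumerate(clusters)]
        if new_centres == centres:
            break  # converged, possibly only to a local optimum
        centres = new_centres
    sse = sum(min((x - c) ** 2 for c in centres) for x in points)
    return sorted(centres), sse

def kmeans_restarts(points, k, n_restarts=20, seed=0):
    """Run k-means from several random initialisations; keep the lowest-SSE run."""
    rng = random.Random(seed)
    runs = [kmeans_1d(points, k, rng) for _ in range(n_restarts)]
    return min(runs, key=lambda run: run[1])

points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2, 10.0, 10.1, 10.2]
best_centres, best_sse = kmeans_restarts(points, k=3)
first_centres, first_sse = kmeans_1d(points, 3, random.Random(0))
```

Because the best of several runs is kept, `best_sse` can never be worse than the objective of the first run alone. More restarts raise the chance of reaching the global optimum but do not guarantee it.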

2
Q

k-means has difficulties effectively clustering data containing outliers. Explain why this is the
case, and suggest what could be done to alleviate the problem.

A

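k-means minimizes squared Euclidean distance and each centroid is the mean of its cluster, so a single distant point pulls the centroid away from the bulk of the data, or ends up with a cluster of its own. Outliers should therefore be detected and removed from the sample set during preprocessing; alternatively, a more robust variant such as k-medoids can be used.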
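The effect on the mean can be seen in a small sketch. The toy values and the median-absolute-deviation filter with a cut-off of 5 are assumptions chosen for illustration; the original answer does not prescribe a particular outlier-removal rule.

```python
from statistics import mean, median

cluster = [1.0, 1.2, 0.9, 1.1, 0.8]
with_outlier = cluster + [50.0]

# One distant point moves the mean a long way; the median barely moves.
mean_shift = abs(mean(with_outlier) - mean(cluster))
median_shift = abs(median(with_outlier) - median(cluster))

# Hypothetical preprocessing step: drop points far from the median,
# measured in units of the median absolute deviation (MAD).
med = median(with_outlier)
mad = median(abs(x - med) for x in with_outlier)
cleaned = [x for x in with_outlier if abs(x - med) <= 5 * mad]
```

Here the mean shifts by more than 8 units while the median shifts by about 0.05, and the filter recovers the original five points.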

3
Q

How many parameters are needed to specify a Gaussian mixture model for d-dimensional data
using K components:
◦ in the general case (i.e. full covariance matrices)?
◦ when the covariance matrices are diagonal?

A

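General case - 1/2 Kd(d+3) for the means and covariances (Kd mean values plus K d(d+1)/2 covariance values, since each covariance matrix is symmetric), plus K-1 independent mixing weights
Diagonal - 2Kd for the means and variances, plus K-1 independent mixing weights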
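The count can be checked with a short sketch. The function name `gmm_param_count` is made up for this example, and it assumes the mixing weights are counted as K-1 free parameters because they must sum to 1.

```python
def gmm_param_count(K, d, diagonal=False):
    """Number of free parameters in a K-component GMM on d-dimensional data."""
    means = K * d
    # A full symmetric covariance has d(d+1)/2 free entries; a diagonal one has d.
    covariances = K * d if diagonal else K * d * (d + 1) // 2
    weights = K - 1  # mixing weights sum to 1, so one is determined by the rest
    return means + covariances + weights

full_count = gmm_param_count(K=3, d=2)                  # 6 + 9 + 2
diag_count = gmm_param_count(K=3, d=2, diagonal=True)   # 6 + 6 + 2
```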

4
Q

Explain why it is often preferable for code working with likelihoods to represent quantities in
the log-domain.

A

Multiplying a long sequence of probabilities, each between 0 and 1, gives a product that shrinks with every factor and eventually underflows to zero in floating point.
In the log domain the product becomes a sum of log-probabilities, which are negative numbers of moderate magnitude, so the total stays representable. When probabilities must be added rather than multiplied (e.g. summing over mixture components), the log-sum-exp trick does this without leaving the log domain.
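A minimal sketch of both points, using only the standard library. The probability values and the helper name `logsumexp` are assumptions made for illustration.

```python
import math

probs = [1e-5] * 100

# Direct product: the true value 1e-500 is below the smallest double,
# so the result underflows to exactly 0.0.
product = 1.0
for p in probs:
    product *= p

# Log domain: the same quantity is a sum of moderately sized negative numbers.
log_product = sum(math.log(p) for p in probs)

def logsumexp(log_values):
    """Compute log(sum(exp(v))) stably by factoring out the largest term."""
    m = max(log_values)
    return m + math.log(sum(math.exp(v - m) for v in log_values))

# Adding two such tiny probabilities without leaving the log domain.
log_total = logsumexp([log_product, log_product])
```

Subtracting the maximum before exponentiating keeps every `exp` argument at or below zero, so the largest term becomes exactly 1 and the sum cannot underflow to zero.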
