2020 A3 Flashcards

Question 1

Q

k-means only finds clusters corresponding to a local (but not global) optimum of its objective
function. What does this mean with regard to the quality of the clusters found by k-means?
Give a strategy for reducing the impact of this effect on the quality of the final model.

Answer

A

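k-means does not use gradient descent: it alternates between assigning each point to its nearest centroid and recomputing each centroid as the mean of its cluster. Each step can only lower the objective (the within-cluster sum of squared distances), so the algorithm converges, but only to a local minimum that depends on the initial centroids. The clusters found may therefore be noticeably worse than the best possible partition. To reduce this effect, run k-means several times from different random initializations and keep the run with the lowest objective, or use a more careful initialization such as binary split.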
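As a rough illustration of the restart idea, here is a minimal sketch of k-means on 1-D data that is run several times from random initial centroids, keeping the run with the lowest sum of squared errors. The function names `kmeans` and `kmeans_restarts`, the 1-D restriction, and the toy data are assumptions made for this example only.

```python
import random

def kmeans(points, k, rng, iters=100):
    """One run of k-means on a list of 1-D points; returns (centroids, sse)."""
    centroids = rng.sample(points, k)  # random initialization from the data
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda i: (p - centroids[i]) ** 2)
            clusters[j].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new = [sum(c) / len(c) if c else centroids[i]
               for i, c in enumerate(clusters)]
        if new == centroids:  # no change, so the run has converged
            break
        centroids = new
    sse = sum(min((p - c) ** 2 for c in centroids) for p in points)
    return centroids, sse

def kmeans_restarts(points, k, n_restarts=20, seed=0):
    """Run k-means n_restarts times and keep the run with the lowest SSE."""
    rng = random.Random(seed)
    runs = (kmeans(points, k, rng) for _ in range(n_restarts))
    return min(runs, key=lambda run: run[1])

# Hypothetical toy data: three well-separated groups.
data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2, 10.0, 10.1, 10.2]
best_centroids, best_sse = kmeans_restarts(data, k=3)
print(sorted(best_centroids), best_sse)
```

Restarts do not guarantee the global optimum; they only make a poor local minimum less likely to be the one that is kept.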

Question 2

Q

k-means has difficulties effectively clustering data containing outliers. Explain why this is the
case, and suggest what could be done to alleviate the problem.

Answer

A

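k-means minimizes squared Euclidean distances, so a point far from the rest contributes disproportionately to the objective and pulls the mean of its cluster towards it (or ends up with a centroid of its own). For this reason, outliers should be detected and removed from the sample set during preprocessing. Alternatively, a variant built on a more robust cluster centre, such as k-medoids, can be used.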
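A small numeric sketch of the effect, using made-up values: a single extreme point moves the mean of a cluster a long way, while a robust centre such as the median barely changes.

```python
from statistics import mean, median

# Hypothetical 1-D cluster, with and without one extreme value.
cluster = [1.0, 2.0, 3.0, 4.0, 5.0]
with_outlier = cluster + [100.0]

print(mean(cluster), median(cluster))            # 3.0 3.0
print(mean(with_outlier), median(with_outlier))  # mean is 115/6 (about 19.17), median is 3.5
```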

Question 3

Q

How many parameters are needed to specify a Gaussian mixture model for d-dimensional data
using K components:
◦ in the general case (i.e. full covariance matrices)?
◦ when the covariance matrices are diagonal?

Answer

A

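General case: K d(d+3)/2, i.e. per component d parameters for the mean and d(d+1)/2 for the symmetric covariance matrix.
Diagonal: 2Kd, i.e. per component d parameters for the mean and d for the variances.
In both cases the mixing weights add a further K - 1 free parameters, since they must sum to 1.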
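The count can be checked with a short helper. This is only a sketch: the function name `gmm_param_count` is made up for this card, and it assumes the K - 1 free mixing weights are counted on top of the means and covariances.

```python
def gmm_param_count(d, k, covariance="full"):
    """Free parameters of a Gaussian mixture with k components in d dimensions."""
    mean_params = k * d
    if covariance == "full":
        # A symmetric d x d matrix has d(d+1)/2 distinct entries.
        cov_params = k * d * (d + 1) // 2
    elif covariance == "diag":
        # Only the d variances on the diagonal are free.
        cov_params = k * d
    else:
        raise ValueError("covariance must be 'full' or 'diag'")
    weight_params = k - 1  # the mixing weights sum to 1
    return mean_params + cov_params + weight_params

print(gmm_param_count(d=2, k=3, covariance="full"))  # 17 = 3*2*5/2 + 2
print(gmm_param_count(d=2, k=3, covariance="diag"))  # 14 = 2*3*2 + 2
```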

Question 4

Q

Explain why it is often preferable for code working with likelihoods to represent quantities in
the log-domain.

Answer

A

When multiplying long sequences of probabilities with values between 0 and 1, the product shrinks towards zero and eventually underflows the floating-point range.
In the log domain the product becomes a sum of log-probabilities, which are negative numbers of moderate magnitude, so underflow is avoided. Where probabilities have to be added rather than multiplied (e.g. when summing over mixture components), the log-sum-exp trick computes the log of the sum without leaving the log domain.
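To make the underflow concrete, the sketch below multiplies 100 made-up probabilities of 1e-5 directly and then does the same computation in the log domain. The helper `logsumexp` is written out by hand here as an illustration of the log-sum-exp trick; it is not taken from any particular library.

```python
import math

def logsumexp(log_values):
    """Return log(sum(exp(v) for v in log_values)) without underflow."""
    m = max(log_values)
    if m == -math.inf:  # every term has probability zero
        return -math.inf
    # Subtracting the maximum keeps the largest exponent at exp(0) = 1.
    return m + math.log(sum(math.exp(v - m) for v in log_values))

probs = [1e-5] * 100

naive = math.prod(probs)                   # true value is 1e-500, stored as 0.0
log_lik = sum(math.log(p) for p in probs)  # about -1151.29, no underflow

# Adding two such likelihoods while staying in the log domain.
log_total = logsumexp([log_lik, log_lik])  # equals log_lik + log(2)

print(naive, log_lik, log_total)
```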


(4 cards)