Lec6 - Unsupervised Learning Flashcards

1
Q

What is the difference between Supervised and Unsupervised Learning?

A

In supervised learning the data is labelled while in unsupervised learning it is unlabelled.

2
Q

What is clustering?

A

Clustering is the task of grouping data items together in the feature space.
A cluster is a collection of data items which are “similar” to each other and “dissimilar” to data items in other clusters.

3
Q

Describe the k-means algorithm.

A
  1. Choose the number of clusters k.
  2. Randomly place k centroids in the feature space.
  3. Assign each data point to the nearest centroid using a distance metric such as Euclidean distance.
  4. Update each centroid's position by computing the mean position of all data points assigned to it.
  5. Repeat steps 3–4 until convergence.
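The steps above can be sketched in NumPy (an illustrative sketch only; the function name, the data-point initialisation, and the convergence check are assumptions, not part of the lecture):

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Minimal k-means sketch. X has shape (n_samples, n_features)."""
    rng = np.random.default_rng(seed)
    # Step 2: initialise centroids by picking k distinct data points
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Step 3: assign each point to its nearest centroid (Euclidean distance)
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 4: move each centroid to the mean of its assigned points
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Step 5: stop once the centroids no longer move (convergence)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels
```

On two well-separated blobs this recovers one centroid per blob and a pure labelling of each.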
4
Q

What is the elbow method?

A

The elbow method is a method of selecting the best k for k-means.

We run k-means for different values of k (e.g. 1–10) and plot the score for each. The score is the average distance from each data point to its nearest centroid. We then select the k at which the rate of decrease of the score sharply shifts (which looks like an elbow joint on the graph).
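A minimal sketch of this procedure, assuming SciPy is available (`elbow_scores` is an illustrative helper name; `scipy.cluster.vq.kmeans` happens to return the mean distance to the nearest centroid as its "distortion", which matches the score described above):

```python
import numpy as np
from scipy.cluster.vq import kmeans  # assumed available

def elbow_scores(X, k_values):
    """For each k, fit k-means and record the score:
    the mean distance from each point to its nearest centroid."""
    # kmeans returns (codebook, distortion); distortion is exactly our score
    return [kmeans(X.astype(float), k)[1] for k in k_values]
```

Plotting the returned scores against k, the elbow is the k where the curve's slope flattens sharply: for data with three clear blobs, the score drops steeply up to k = 3 and only slowly after.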

5
Q

What are two methods of selecting k for k-means?

A
  • Elbow Method
  • Cross Validation

6
Q

Describe the strengths and weaknesses of k-means.

A

Strengths:

  • Simple to understand and implement
  • Efficient
  • Popular

Weaknesses:

  • We have to define k
  • Only applicable when the mean is defined (e.g. not for categorical data).
  • Sensitive to Initialisation
  • Sensitive to Outliers
  • Not suitable for discovering clusters which are not hyper-ellipsoids/spheres
7
Q

What is one of the main applications of Density Estimation?

A

One of the main applications of density estimation is anomaly detection or novelty detection.

8
Q

What is a probability density function (PDF)?

A

A PDF models how likely a sample is to be generated in a specific region of the space.

It is very likely or usual to observe samples where the PDF is high. Conversely, it is rare to observe samples where the PDF is low.

9
Q

Give the equation of a univariate Gaussian distribution and name its parameters.

A

Parameters:
mean μ, variance σ^2

N(x | μ, σ^2) = (1 / sqrt(2πσ^2)) exp(-(x - μ)^2 / (2σ^2))
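As a quick numerical check of the formula (a minimal sketch; `gaussian_pdf` is an illustrative name, with the variance σ² passed directly):

```python
import math

def gaussian_pdf(x, mu, sigma2):
    """Univariate Gaussian density N(x | mu, sigma^2)."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)
```

Sanity checks: the density peaks at the mean (for the standard normal, `gaussian_pdf(0, 0, 1)` equals 1/√(2π) ≈ 0.3989), it is symmetric about μ, and it integrates to 1 over the real line.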

10
Q

What is the likelihood p(x | θ) and why do we often compute the negative log likelihood instead of the likelihood directly?

A

The likelihood p(x | θ) is the probability of observing our data x given our parameters θ. We often compute the negative log likelihood because it is much more numerically stable: taking the log turns a product of many small terms into a sum. The negation is because optimisers conventionally minimise rather than maximise.
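The stability point can be seen concretely (an illustrative sketch for i.i.d. data under a univariate Gaussian; the function name is an assumption): multiplying thousands of small per-point likelihoods underflows to 0.0 in floating point, while summing their logs stays finite.

```python
import math

def negative_log_likelihood(xs, mu, sigma2):
    """NLL of i.i.d. data under N(mu, sigma^2): -sum_n log p(x_n | theta).
    Summing log-densities avoids the underflow of multiplying tiny values."""
    return -sum(
        -0.5 * math.log(2 * math.pi * sigma2) - (x - mu) ** 2 / (2 * sigma2)
        for x in xs
    )
```

For 2000 points the direct product of densities underflows to exactly 0.0, whereas the NLL remains a perfectly usable finite number.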

11
Q

What is a Gaussian Mixture Model (GMM)?

A

It is a weighted linear combination (mixture) of several Gaussian distributions, with mixing weights that are non-negative and sum to 1.
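For instance, a 1-D mixture density can be evaluated as follows (an illustrative sketch; `gmm_pdf` and its parameter names are assumptions):

```python
import math

def gmm_pdf(x, weights, means, variances):
    """Mixture density p(x) = sum_k pi_k * N(x | mu_k, sigma_k^2),
    where the mixing weights pi_k are non-negative and sum to 1."""
    return sum(
        w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
        for w, m, v in zip(weights, means, variances)
    )
```

With a single component of weight 1 this reduces to an ordinary Gaussian, and for any valid weights the mixture still integrates to 1.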

12
Q

What is Expectation Maximisation (EM)?

A

EM is an iterative approach to finding parameters, which can be used with GMMs. It’s composed of two steps:
1. E-step: Compute responsibilities r_nk (the posterior probability that data point n belongs to mixture component k).
2. M-step: Use the updated responsibilities to re-estimate the parameters θ.
Repeat until convergence.

13
Q

Describe the GMM-EM Algorithm

A
  1. Initialise the parameters (weights, means, covariances).
  2. E-Step:
    - Compute the responsibilities.
  3. M-Step:
    - Update the weights.
    - Update the means.
    - Update the covariances.
  4. Repeat the E and M steps until convergence.
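The full loop can be sketched for 1-D data in NumPy (an illustrative sketch only; the quantile-based initialisation and convergence tolerance are assumptions, not from the lecture):

```python
import numpy as np

def gmm_em(x, k, n_iters=100, tol=1e-6):
    """Minimal 1-D GMM fitted with EM. x has shape (n,)."""
    n = len(x)
    # 1. Initialise: uniform weights, quantile-spread means, overall variance
    weights = np.full(k, 1.0 / k)
    means = np.quantile(x, (np.arange(k) + 0.5) / k)
    variances = np.full(k, x.var())
    for _ in range(n_iters):
        # 2. E-step: responsibilities r_nk proportional to pi_k * N(x_n | mu_k, var_k)
        dens = np.exp(-(x[:, None] - means) ** 2 / (2 * variances)) \
            / np.sqrt(2 * np.pi * variances)
        r = weights * dens
        r /= r.sum(axis=1, keepdims=True)
        # 3. M-step: re-estimate weights, means and variances from r
        nk = r.sum(axis=0)
        new_means = (r * x[:, None]).sum(axis=0) / nk
        weights = nk / n
        variances = (r * (x[:, None] - new_means) ** 2).sum(axis=0) / nk
        # 4. Converged when the means stop changing significantly
        if np.allclose(new_means, means, atol=tol):
            means = new_means
            break
        means = new_means
    return weights, means, variances
```

On a 30/70 mixture of two well-separated Gaussians, this recovers the component means and mixing weights closely.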
14
Q

Give two ways of determining convergence in GMMs.

A
  • No significant variation of the parameters
  • Stagnation of the likelihood

15
Q

Describe the differences between k-means and GMMs, and mention which is the key difference between them.

A
K-means:

  • Objective function: Minimises the sum of squared Euclidean distances to the centroids
  • Can be optimised by an EM-style algorithm
    • E-step: assign points to clusters
    • M-step: optimise the centroids
  • Performs hard assignment during the E-step
  • Assumes spherical clusters, each with equal probability

GMM-EM:

  • Objective function: Maximises the log-likelihood
  • Optimised by the EM algorithm
    • E-step: compute the posterior probability of membership
    • M-step: optimise the parameters
  • Performs soft assignment during the E-step
  • Can be used for non-spherical clusters
  • Can generate clusters with different probabilities

Key difference: k-means makes hard assignments, whereas GMM-EM makes soft (probabilistic) assignments.