Lec6 - Unsupervised Learning Flashcards

1
Q

What is the difference between Supervised and Unsupervised Learning?

A

In supervised learning the data is labelled while in unsupervised learning it is unlabelled.

2
Q

What is clustering?

A

Clustering is the task of grouping data items together in the feature space.
A cluster is a collection of data items which are “similar” to each other and “dissimilar” to data items in other clusters.

3
Q

Describe the k-means algorithm.

A
  1. Choose the number of clusters k.
  2. Randomly place k centroids in the feature space.
  3. Assign each data point to the nearest centroid using a distance metric such as Euclidean distance.
  4. Update each centroid's position by computing the mean position of all data points assigned to it.
  5. Repeat steps 3–4 until convergence.
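The steps above can be sketched in NumPy (an illustrative sketch only; the function name, the data-point initialisation, and the convergence check are assumptions, not part of the lecture):

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Minimal k-means sketch. X has shape (n_samples, n_features)."""
    rng = np.random.default_rng(seed)
    # Step 2: initialise centroids by picking k distinct data points
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Step 3: assign each point to its nearest centroid (Euclidean distance)
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 4: move each centroid to the mean of its assigned points
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Step 5: stop once the centroids no longer move (convergence)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels
```

On two well-separated blobs this recovers one centroid per blob and a pure labelling of each.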
4
Q

What is the elbow method?

A

The elbow method is a method of selecting the best k for k-means.

We run k-means for different values of k (e.g. 1–10) and plot the score for each. The score is the average distance from each data point to its nearest centroid. We then select the k at which the rate of decrease of the score sharply shifts (which looks like an elbow joint on the graph).
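A minimal sketch of this procedure, assuming SciPy is available (`elbow_scores` is an illustrative helper name; `scipy.cluster.vq.kmeans` happens to return the mean distance to the nearest centroid as its "distortion", which matches the score described above):

```python
import numpy as np
from scipy.cluster.vq import kmeans  # assumed available

def elbow_scores(X, k_values):
    """For each k, fit k-means and record the score:
    the mean distance from each point to its nearest centroid."""
    # kmeans returns (codebook, distortion); distortion is exactly our score
    return [kmeans(X.astype(float), k)[1] for k in k_values]
```

Plotting the returned scores against k, the elbow is the k where the curve's slope flattens sharply: for data with three clear blobs, the score drops steeply up to k = 3 and only slowly after.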

5
Q

What are two methods of selecting k for k-means?

A
  • Elbow Method
  • Cross Validation

6
Q

Describe the strengths and weaknesses of k-means.

A

Strengths:

  • Simple to understand and implement
  • Efficient
  • Popular

Weaknesses:

  • We have to define k
  • Only applicable when the mean is defined (e.g. not for categorical data).
  • Sensitive to Initialisation
  • Sensitive to Outliers
  • Not suitable for discovering clusters which are not hyper-ellipsoids/spheres
7
Q

What is one of the main applications of Density Estimation?

A

One of the main applications of density estimation is anomaly detection or novelty detection.

8
Q

What is a probability density function (PDF)?

A

A PDF models how likely a sample is to be generated in a specific region of the space.

It is very likely or usual to observe samples where the PDF is high. Conversely, it is rare to observe samples where the PDF is low.

9
Q

Give the equation of a univariate Gaussian distribution and name its parameters.

A

Parameters:
mean μ, variance σ^2

N(x | μ, σ^2) = (1 / sqrt(2πσ^2)) exp(-(x - μ)^2 / (2σ^2))
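As a quick numerical check of the formula (a minimal sketch; `gaussian_pdf` is an illustrative name, with the variance σ² passed directly):

```python
import math

def gaussian_pdf(x, mu, sigma2):
    """Univariate Gaussian density N(x | mu, sigma^2)."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)
```

Sanity checks: the density peaks at the mean (for the standard normal, `gaussian_pdf(0, 0, 1)` equals 1/√(2π) ≈ 0.3989), it is symmetric about μ, and it integrates to 1 over the real line.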

10
Q

What is the likelihood p(x | θ) and why do we often compute the negative log likelihood instead of the likelihood directly?

A

The likelihood p(x | θ) is the probability of observing our data x given our parameters θ. We often compute the negative log likelihood because it is much more numerically stable: taking the log turns a product of many small terms into a sum. The negation is because optimisers conventionally minimise rather than maximise.
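The stability point can be seen concretely (an illustrative sketch for i.i.d. data under a univariate Gaussian; the function name is an assumption): multiplying thousands of small per-point likelihoods underflows to 0.0 in floating point, while summing their logs stays finite.

```python
import math

def negative_log_likelihood(xs, mu, sigma2):
    """NLL of i.i.d. data under N(mu, sigma^2): -sum_n log p(x_n | theta).
    Summing log-densities avoids the underflow of multiplying tiny values."""
    return -sum(
        -0.5 * math.log(2 * math.pi * sigma2) - (x - mu) ** 2 / (2 * sigma2)
        for x in xs
    )
```

For 2000 points the direct product of densities underflows to exactly 0.0, whereas the NLL remains a perfectly usable finite number.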

11
Q

What is a Gaussian Mixture Model (GMM)?

A

It is a weighted linear combination (mixture) of several Gaussian distributions, with mixing weights that are non-negative and sum to 1.
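For instance, a 1-D mixture density can be evaluated as follows (an illustrative sketch; `gmm_pdf` and its parameter names are assumptions):

```python
import math

def gmm_pdf(x, weights, means, variances):
    """Mixture density p(x) = sum_k pi_k * N(x | mu_k, sigma_k^2),
    where the mixing weights pi_k are non-negative and sum to 1."""
    return sum(
        w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
        for w, m, v in zip(weights, means, variances)
    )
```

With a single component of weight 1 this reduces to an ordinary Gaussian, and for any valid weights the mixture still integrates to 1.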

12
Q

What is Expectation Maximisation (EM)?

A

EM is an iterative approach to finding parameters, which can be used with GMMs. It’s composed of two steps:
1. E-step: Compute responsibilities r_nk (the posterior probability that data point n belongs to mixture component k).
2. M-step: Use the updated responsibilities to re-estimate the parameters θ.
Repeat until convergence.

13
Q

Describe the GMM-EM Algorithm

A
  1. Initialise the parameters (weights, means, covariances).
  2. E-Step:
    - Compute the responsibilities.
  3. M-Step:
    - Update the weights.
    - Update the means.
    - Update the covariances.
  4. Repeat the E and M steps until convergence.
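The full loop can be sketched for 1-D data in NumPy (an illustrative sketch only; the quantile-based initialisation and convergence tolerance are assumptions, not from the lecture):

```python
import numpy as np

def gmm_em(x, k, n_iters=100, tol=1e-6):
    """Minimal 1-D GMM fitted with EM. x has shape (n,)."""
    n = len(x)
    # 1. Initialise: uniform weights, quantile-spread means, overall variance
    weights = np.full(k, 1.0 / k)
    means = np.quantile(x, (np.arange(k) + 0.5) / k)
    variances = np.full(k, x.var())
    for _ in range(n_iters):
        # 2. E-step: responsibilities r_nk proportional to pi_k * N(x_n | mu_k, var_k)
        dens = np.exp(-(x[:, None] - means) ** 2 / (2 * variances)) \
            / np.sqrt(2 * np.pi * variances)
        r = weights * dens
        r /= r.sum(axis=1, keepdims=True)
        # 3. M-step: re-estimate weights, means and variances from r
        nk = r.sum(axis=0)
        new_means = (r * x[:, None]).sum(axis=0) / nk
        weights = nk / n
        variances = (r * (x[:, None] - new_means) ** 2).sum(axis=0) / nk
        # 4. Converged when the means stop changing significantly
        if np.allclose(new_means, means, atol=tol):
            means = new_means
            break
        means = new_means
    return weights, means, variances
```

On a 30/70 mixture of two well-separated Gaussians, this recovers the component means and mixing weights closely.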
14
Q

Give two ways of determining convergence in GMMs.

A
  • No significant variation of the parameters
  • Stagnation of the likelihood

15
Q

Describe the differences between k-means and GMMs, and mention which is the key difference between them.

A
K-means:

  • Objective function: Minimises the sum of squared Euclidean distances to the centroids
  • Can be optimised by an EM-style algorithm
    • E-step: assign points to clusters
    • M-step: optimise the centroids
  • Performs hard assignment during the E-step
  • Assumes spherical clusters, each with equal probability

GMM-EM:

  • Objective function: Maximises the log-likelihood
  • Optimised by the EM algorithm
    • E-step: compute the posterior probability of membership
    • M-step: optimise the parameters
  • Performs soft assignment during the E-step
  • Can be used for non-spherical clusters
  • Can generate clusters with different probabilities

Key difference: k-means makes hard assignments, whereas GMM-EM makes soft (probabilistic) assignments.