Segmentation and Clustering p2 - Lecture 8 - Week 4 Flashcards
How are clusters modelled in Gaussian Mixture Models?
As Gaussians, not just by their mean
What are the parameters to the univariate Normal (Gaussian) Distribution?
The mean and the variance
What are the parameters to the multivariate Normal Distribution?
A vector containing the mean position
A symmetric positive-definite covariance matrix
What are the three types of covariance?
Spherical, diagonal and full
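A minimal sketch of the three covariance types for a 2-D Gaussian, using NumPy; the numeric values are made up for illustration:

```python
import numpy as np

sigma2 = 1.5
# Spherical: a single variance shared by every dimension (a scaled identity).
spherical = sigma2 * np.eye(2)

# Diagonal: one variance per dimension; the ellipse is axis-aligned.
diagonal = np.diag([1.5, 0.3])

# Full: any symmetric positive-definite matrix; the ellipse can be
# oriented arbitrarily. Off-diagonal terms encode correlation.
full = np.array([[1.5, 0.6],
                 [0.6, 0.5]])
```

Note that spherical and diagonal covariances are special cases of the full form, trading expressiveness for fewer parameters.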
What is a generative model?
Instead of treating the data as just a set of points, assume they were all generated by sampling from a continuous probability distribution
The distribution is defined by a vector of parameters theta
How can we model data with multiple clusters using a mixture of Gaussians?
By starting with parameters describing each cluster
Mean mu_c, variance sigma_c, and "size" (mixing weight) pi_c
Together these define the mixture probability density:
p(x) = sum_c pi_c * N(x; mu_c, sigma_c)
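The mixture density above can be sketched in NumPy for the 1-D case; the two clusters and their weights here are illustrative values, not from the lecture:

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    """Univariate Gaussian density N(x; mu, sigma)."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def mixture_pdf(x, pis, mus, sigmas):
    """p(x) = sum_c pi_c * N(x; mu_c, sigma_c)."""
    return sum(pi * normal_pdf(x, mu, s) for pi, mu, s in zip(pis, mus, sigmas))

# Two made-up clusters with weights 0.3 and 0.7 (weights must sum to 1
# for p(x) to be a valid density).
p = mixture_pdf(0.0, pis=[0.3, 0.7], mus=[-1.0, 2.0], sigmas=[1.0, 0.5])
```

Because the weights sum to 1 and each component integrates to 1, the mixture itself integrates to 1 and is a valid probability density.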
What is expectation Maximisation (EM)?
Goal:
Find blob parameters theta that maximise the likelihood function:
P(data|theta) = prod_x p(x|theta)
Approach:
1. E-Step: Given current guess of blobs, compute ownership of each point
2. M-step: given ownership probabilities, update blobs to maximise likelihood function
3. Repeat until convergence
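The E/M loop above can be sketched for a 1-D GMM in NumPy. This is a minimal illustration, not the lecture's reference implementation; the quantile-based initialisation and iteration count are assumptions:

```python
import numpy as np

def em_gmm_1d(x, k=2, n_iter=50):
    """Fit a k-component 1-D Gaussian mixture to data x via EM."""
    # Initialise means from data quantiles (a simple stand-in for the
    # k-means initialisation the lecture recommends).
    mu = np.quantile(x, np.linspace(0.1, 0.9, k))
    var = np.full(k, x.var())
    pi = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: ownership r[n, c] = pi_c N(x_n; mu_c, var_c) / p(x_n)
        dens = pi * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) \
                  / np.sqrt(2.0 * np.pi * var)
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: given ownerships, update each blob's parameters
        nk = r.sum(axis=0)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        pi = nk / len(x)
    return pi, mu, var
```

On data drawn from two well-separated Gaussians, the recovered means and weights approach the true values within a few iterations; with poor initialisation EM can instead settle in a local maximum, which is one of the cons listed below.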
What is expectation maximisation (EM) useful for?
Any clustering problem
Any model estimation problem
Missing data problems
Finding outliers
Segmentation problems
- Based on colour
- Based on motion
- Foreground/background separation
Pros and cons of mixture of gaussians / expectation maximisation (EM)?
Pros:
- Probabilistic interpretation
- Soft assignments between data points and clusters
- Generative model, can predict novel data points
- Relatively compact storage
Cons:
- Local minima
- Initialisation (often a good idea to start with a few k-means iterations)
- Need to know number of components (number of clusters)
- Need to choose a generative model
- Numerical problems are often a nuisance