Clustering & EM Flashcards

1
Q

Clustering

A
  • N d-dimensional data points (no labels)
  • goal: partition into K disjoint sets based on similarity
2
Q

K-Means

A
  • define clusters by minimum Euclidean distance to cluster mean
  • Algorithm:
    • K random points as initial cluster centers
    • Assignment (E-step): assign each point to its closest cluster center
    • Update (M-step): recompute each cluster center as the mean of its assigned points
    • repeat the E and M steps until convergence (to a local minimum)
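The algorithm on this card can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name `kmeans`, the parameters `n_iter` and `seed`, the choice of drawing the initial centers from the data points, and the handling of empty clusters are all assumptions made for the example.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain K-Means on an (N, d) array X; returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    # Initialization (assumption): use K distinct data points as initial centers.
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Assignment step: Euclidean distance of every point to every center,
        # then pick the index of the closest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: mean of the assigned points.
        # Assumption: a cluster with no points keeps its previous center.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        # Stop once the centers no longer move.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels
```

Because the result is only a local minimum, different values of `seed` can give different partitions; running the sketch several times and keeping the solution with the smallest total squared distance is a common workaround.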
3
Q

Gaussian Mixture Model

A
  • assumes all data points are generated from a mixture of a finite number of Gaussian distributions with unknown parameters
  • fitting GMM using EM: hard or soft cluster assignments
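The difference between soft and hard assignments can be illustrated with a small NumPy sketch. It assumes the mixture parameters (weights, means, covariances) are already given, and it only covers the assignment step, not a full EM fit. The function names `gaussian_pdf`, `responsibilities`, and `hard_assignments` are hypothetical and chosen for this example.

```python
import numpy as np

def gaussian_pdf(X, mean, cov):
    """Density of N(mean, cov) evaluated at each row of X (shape (N, d))."""
    d = X.shape[1]
    diff = X - mean
    # Squared Mahalanobis distance; solve a linear system rather than invert cov.
    maha = np.sum(diff * np.linalg.solve(cov, diff.T).T, axis=1)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * maha) / norm

def responsibilities(X, weights, means, covs):
    """Soft assignments: r[n, k] = P(component k | x_n) via Bayes' rule."""
    weighted = np.column_stack([
        w * gaussian_pdf(X, m, c) for w, m, c in zip(weights, means, covs)
    ])
    # Normalize each row so the probabilities over components sum to 1.
    return weighted / weighted.sum(axis=1, keepdims=True)

def hard_assignments(X, weights, means, covs):
    """Hard assignments: each point goes to its most probable component."""
    return responsibilities(X, weights, means, covs).argmax(axis=1)
```

With soft assignments every point contributes to every component in proportion to its responsibility; with hard assignments each point contributes to exactly one component. For points far from all components the densities can underflow to zero, so a practical version would typically work in log space.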
4
Q

GMM EM soft

A
5
Q

GMM EM hard

A
6
Q

K-Means vs GMM

A
  • GMM allows for:
    • unequal cluster variances
    • unequal cluster probabilities
    • non-spherical clusters
    • soft cluster assignments
7
Q

Kullback-Leibler divergence

A
8
Q

EM-summary

A
9
Q

EM - Properties

A