Clustering & EM Flashcards
1
Q
Clustering
A
- N d-dimensional datapoints (no labels)
- goal: partition into K disjoint sets based on similarity
2
Q
K-Means
A
- define clusters by minimum Euclidean distance to cluster mean
- Algorithm:
  - Initialization: choose K random points as initial cluster centers
  - Assignment (E-step): assign each point to its closest cluster center
  - Update (M-step): set each cluster center to the mean of its assigned points
  - repeat the E and M steps until convergence (to a local minimum)
![](https://s3.amazonaws.com/brainscape-prod/system/cm/319/329/388/a_image_thumb.png?1598609425)
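The K-Means steps on this card can be sketched in NumPy as follows. This is a minimal illustration, not a reference implementation: the function name `kmeans`, its parameters, the choice to draw the initial centers from the data points, and the handling of empty clusters are all assumptions made for the example.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Illustrative K-Means on an (N, d) array X (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Initialization (assumption): use k distinct data points as centers.
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)

    def assign(c):
        # Squared Euclidean distance from every point to every center.
        dists = ((X[:, None, :] - c[None, :, :]) ** 2).sum(axis=2)
        return dists.argmin(axis=1)

    for _ in range(n_iter):
        # Assignment (E-step): closest center for each point.
        labels = assign(centers)
        # Update (M-step): mean of assigned points; an empty cluster
        # keeps its previous center (assumption).
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break  # centers stopped moving: a local minimum was reached
        centers = new_centers
    return centers, assign(centers)
```

Because the result depends on the random initialization, it is common to run such a routine several times with different seeds and keep the solution with the lowest total within-cluster distance.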
3
Q
Gaussian Mixture Model
A
- assumes all datapoints are generated from a mixture of a finite number of Gaussian distributions with unknown parameters
- fitting GMM using EM: hard or soft cluster assignments
![](https://s3.amazonaws.com/brainscape-prod/system/cm/319/329/486/a_image_thumb.png?1598609795)
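As a rough sketch of fitting a GMM with soft assignments, the standard EM updates (responsibilities in the E-step, weighted maximum-likelihood estimates in the M-step) can be written in NumPy as below. The function name `gmm_em`, the optional `init_means` argument, and the small regularization term `reg` are assumptions made for this example and do not come from the cards.

```python
import numpy as np

def gmm_em(X, k, n_iter=50, seed=0, reg=1e-6, init_means=None):
    """Illustrative soft-assignment EM for a GMM on an (N, d) array X."""
    n, d = X.shape
    rng = np.random.default_rng(seed)
    # Initialization (assumption): random data points as means, the
    # full-data covariance for every component, uniform mixing weights.
    if init_means is None:
        means = X[rng.choice(n, size=k, replace=False)].astype(float)
    else:
        means = np.array(init_means, dtype=float)
    base_cov = np.cov(X.T).reshape(d, d) + reg * np.eye(d)
    covs = np.array([base_cov.copy() for _ in range(k)])
    weights = np.full(k, 1.0 / k)

    for _ in range(n_iter):
        # E-step: log of (weight * Gaussian density) per point and component.
        log_r = np.empty((n, k))
        for j in range(k):
            diff = X - means[j]
            inv = np.linalg.inv(covs[j])
            _, logdet = np.linalg.slogdet(covs[j])
            maha = np.einsum("ni,ij,nj->n", diff, inv, diff)
            log_r[:, j] = np.log(weights[j]) - 0.5 * (
                maha + logdet + d * np.log(2 * np.pi)
            )
        # Normalize in log space for numerical stability.
        log_r -= log_r.max(axis=1, keepdims=True)
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: responsibility-weighted weights, means, and covariances.
        nk = resp.sum(axis=0)
        weights = nk / n
        means = (resp.T @ X) / nk[:, None]
        for j in range(k):
            diff = X - means[j]
            covs[j] = (resp[:, j, None] * diff).T @ diff / nk[j] + reg * np.eye(d)

    # resp holds the responsibilities from the final E-step.
    return weights, means, covs, resp
```

A hard-assignment variant would replace each row of `resp` with a one-hot vector at its largest entry before the M-step. The sketch does not guard against a component collapsing to near-zero weight, which a practical implementation would need to handle.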
4
Q
GMM EM soft
A
![](https://s3.amazonaws.com/brainscape-prod/system/cm/319/329/655/a_image_thumb.png?1598609819)
5
Q
GMM EM hard
A
![](https://s3.amazonaws.com/brainscape-prod/system/cm/319/329/677/a_image_thumb.png?1598609840)
6
Q
K-Means vs GMM
A
- unlike K-Means, a GMM allows for:
  - unequal cluster variances
  - unequal cluster probabilities
  - non-spherical clusters
  - soft cluster assignments
7
Q
Kullback-Leibler divergence
A
![](https://s3.amazonaws.com/brainscape-prod/system/cm/319/329/799/a_image_thumb.png?1598610081)
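For discrete distributions, the standard definition is KL(p || q) = sum_i p_i log(p_i / q_i). The snippet below is a small sketch of that formula; the function name `kl_divergence` is an assumption for illustration, and the card's image may state the continuous (integral) form instead.

```python
import numpy as np

def kl_divergence(p, q):
    """KL(p || q) in nats for discrete distributions over the same support."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Terms with p_i = 0 contribute 0 by convention, so they are skipped.
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

p = [0.5, 0.5]
q = [0.9, 0.1]
forward = kl_divergence(p, q)   # KL(p || q)
backward = kl_divergence(q, p)  # KL(q || p), generally a different value
```

The divergence is non-negative, equals zero when the two distributions are identical, and is not symmetric, which is why `forward` and `backward` differ here.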
8
Q
EM - Summary
A
![](https://s3.amazonaws.com/brainscape-prod/system/cm/319/329/899/a_image_thumb.png?1598610209)
9
Q
EM - Properties
A
![](https://s3.amazonaws.com/brainscape-prod/system/cm/319/330/017/a_image_thumb.png?1598610259)