Hierarchical Clustering Flashcards
What are the 2 stratagies for hierarchical structuring
Top-down: start with one cluster, recursivly split
Bottom-up: start with singleton clusters, merge
How does aglomerative clustering work?
- Start with a cluster for each point
- Find the pair of clusters that are clostest
- Merge the clusters
- Repeat until 1 cluster
What is a hierarchical tree of clusters diagram called?
Dendrogram
Whats the runtime complexity of aglormerative clustering?
O(n2d + n3)
What is single link cluster distance measure?
Distance between clostest points
What is complete link cluster distance measure?
Distance between farthest elements in cluster
What is the problem with single link?
Produces long chains (often we want spherical clumps)
What does complete link prefer (cluster shape)?
Spherical clusters with constant diameter
What is average link cluster distance measure?
What is centroid cluster distance measure?
Distance between centroids of clusters
What is Ward’s method cluster distance measure?
The total distance from all points from the centroid formed if the two clusters where merged
What is the Lance-Williams algorithm?
- Compute D, the distance matrix between each point (singleton centroid)
- For N iterations
- Pick smallest Di,j
- Delete Di and Dj and add Di+j
- For each k != i or j update the distance (where alpha and beta are constants based on the cluster distance measure used)