10 Clustering Flashcards
List the four iterative steps of standard K‑means.
(1) Randomly initialize k centroids. (2) Assign each point to its nearest centroid. (3) Recompute each centroid as the mean of its assigned points. (4) Repeat assignment + update until no assignments change or max iterations reached.
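A minimal NumPy sketch of these four steps (the function name `kmeans` and the sampling-based initialization are illustrative, not a reference implementation):

```python
import numpy as np

def kmeans(X, k, max_iters=100, seed=0):
    rng = np.random.default_rng(seed)
    # (1) Randomly initialize k centroids by sampling distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    labels = np.full(len(X), -1)
    for _ in range(max_iters):
        # (2) Assign each point to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        # (4) Stop as soon as no assignment changes.
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # (3) Recompute each centroid as the mean of its assigned points.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids
```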
What objective does K‑means explicitly minimize?
The within‑cluster sum of squared Euclidean distances (total inertia).
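scikit-learn's `KMeans` exposes this objective as the `inertia_` attribute after fitting; a quick sanity check, with `make_blobs` toy data purely for illustration:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

# Within-cluster sum of squared distances to each point's assigned centroid.
manual = ((X - km.cluster_centers_[km.labels_]) ** 2).sum()
print(km.inertia_, manual)  # agree up to floating-point error
```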
Why is the global optimum of K‑means difficult to obtain?
Finding the global minimum requires jointly optimizing the discrete point–centroid assignments and the continuous centroid locations, which is NP‑hard; the objective is non‑convex with many local minima, and Lloyd's iterations only converge to one of them.
Define single‑linkage distance between two clusters.
Minimum pairwise distance between any member of cluster A and any member of cluster B (nearest‑neighbor criterion).
Define complete‑linkage distance between two clusters.
Maximum pairwise distance between any member of cluster A and any member of cluster B (furthest‑neighbor criterion).
What is Ward’s linkage rule in agglomerative clustering?
Merge the pair of clusters whose union yields the smallest increase in total within‑cluster variance (error sum of squares).
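All three linkage rules above are implemented in SciPy's hierarchy module; a short sketch comparing them on the same toy data (dataset and cut level are illustrative):

```python
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=100, centers=3, random_state=0)

for method in ("single", "complete", "ward"):
    Z = linkage(X, method=method)                    # agglomerative merge tree
    labels = fcluster(Z, t=3, criterion="maxclust")  # cut into 3 flat clusters
    print(method, labels[:10])
```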
Give the high‑level pipeline of spectral clustering (normalized form).
(1) Build a similarity graph with affinity matrix W. (2) Form the normalized Laplacian \(L_{\text{sym}} = I - D^{-1/2} W D^{-1/2}\) (or \(L_{\text{rw}} = I - D^{-1}W\)). (3) Compute the eigenvectors of the k smallest eigenvalues. (4) Row‑normalize them (needed for the \(L_{\text{sym}}\) variant). (5) Run K‑means on the rows.
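A compact NumPy sketch of the \(L_{\text{sym}}\) variant of this pipeline (the fully connected RBF affinity and the bandwidth `sigma` are illustrative choices):

```python
import numpy as np
from sklearn.cluster import KMeans

def spectral_clustering(X, k, sigma=1.0):
    # (1) Similarity graph: fully connected RBF affinity matrix.
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-sq_dists / (2 * sigma**2))
    np.fill_diagonal(W, 0.0)
    # (2) Normalized Laplacian L_sym = I - D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = np.diag(1.0 / np.sqrt(W.sum(axis=1)))
    L_sym = np.eye(len(X)) - d_inv_sqrt @ W @ d_inv_sqrt
    # (3) Eigenvectors of the k smallest eigenvalues (eigh sorts ascending).
    _, eigvecs = np.linalg.eigh(L_sym)
    U = eigvecs[:, :k]
    # (4) Row-normalize the spectral embedding.
    U = U / np.linalg.norm(U, axis=1, keepdims=True)
    # (5) K-means on the embedded rows.
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(U)
```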
What property links graph connectivity and Laplacian eigenvalues?
The number of connected components equals the multiplicity of the eigenvalue 0 of L, \(L_{\text{rw}}\), or \(L_{\text{sym}}\).
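A quick numerical check on a graph with two obvious components (the toy adjacency matrix is made up for illustration):

```python
import numpy as np

# Two disconnected triangles: the adjacency matrix is block-diagonal.
triangle = np.ones((3, 3)) - np.eye(3)
W = np.block([[triangle, np.zeros((3, 3))],
              [np.zeros((3, 3)), triangle]])
L = np.diag(W.sum(axis=1)) - W  # unnormalized Laplacian L = D - W

eigvals = np.linalg.eigvalsh(L)
print(np.sum(np.isclose(eigvals, 0.0)))  # prints 2 = number of components
```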
Explain the eigengap heuristic for choosing k.
Pick k where a large gap appears between \(\lambda_k\) and \(\lambda_{k+1}\); small first k eigenvalues followed by a big gap suggest k well‑separated clusters.
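The heuristic in code, for a symmetric Laplacian `L` such as the one built in the previous sketch (the cap `k_max` is an illustrative choice):

```python
import numpy as np

def choose_k_by_eigengap(L, k_max=10):
    # Eigenvalues sorted ascending: lambda_1 <= lambda_2 <= ...
    eigvals = np.linalg.eigvalsh(L)[: k_max + 1]
    gaps = np.diff(eigvals)           # gaps lambda_{k+1} - lambda_k
    return int(np.argmax(gaps)) + 1   # k with the largest eigengap
```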
State one key advantage of a k‑nearest‑neighbor graph for spectral clustering.
It adapts to local density and can connect points across varying scales without a global distance threshold.
When prefer a mutual k‑nearest‑neighbor graph over standard k‑NN?
When you want to avoid connecting regions of very different density; an edge is kept only if both endpoints are among each other’s k nearest neighbors.
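Both graph variants can be built from scikit-learn's `kneighbors_graph`, symmetrizing with sparse element-wise `maximum` (either direction suffices) or `minimum` (both directions required):

```python
from sklearn.datasets import make_moons
from sklearn.neighbors import kneighbors_graph

X, _ = make_moons(n_samples=200, noise=0.05, random_state=0)
A = kneighbors_graph(X, n_neighbors=10, mode="connectivity")

knn_graph = A.maximum(A.T)   # edge if EITHER endpoint lists the other
mutual_knn = A.minimum(A.T)  # edge only if BOTH endpoints list each other
print(knn_graph.nnz, mutual_knn.nnz)  # the mutual graph is sparser
```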
What does a silhouette coefficient near +1, 0, and –1 indicate?
+1: point is well inside its cluster; 0: on the border; –1: likely mis‑clustered.
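scikit-learn reports both the per-point coefficients and their mean (the blob data is illustrative):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_samples, silhouette_score

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

print(silhouette_score(X, labels))        # mean coefficient in [-1, 1]
print(silhouette_samples(X, labels)[:5])  # per-point coefficients
```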
State the random‑walk view of the Normalized Cut objective.
\(\operatorname{Ncut}(A,\bar A) = P(\bar A \mid A) + P(A \mid \bar A)\); minimizing it finds a split that a stationary random walk on the graph rarely crosses.
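Spelled out in graph-cut notation (a short sketch; \(\operatorname{cut}(A,\bar A)\) is the total weight of edges crossing the split and \(\operatorname{vol}(A)\) the sum of degrees in A):

```latex
\[
  \operatorname{Ncut}(A,\bar A)
    = \frac{\operatorname{cut}(A,\bar A)}{\operatorname{vol}(A)}
    + \frac{\operatorname{cut}(A,\bar A)}{\operatorname{vol}(\bar A)},
  \qquad
  P(\bar A \mid A) = \frac{\operatorname{cut}(A,\bar A)}{\operatorname{vol}(A)},
\]
% where P is the one-step transition probability of the stationary random
% walk, so a small Ncut means the walk rarely crosses between A and its
% complement.
```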
Give two reasons spectral clustering can beat plain K‑means on complex shapes.
(i) Uses graph connectivity rather than raw Euclidean geometry, handling non‑convex clusters. (ii) Eigenvector embedding separates intertwined structures before K‑means.
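The classic two-moons dataset makes both points concrete, comparing plain `KMeans` with scikit-learn's `SpectralClustering` on a nearest-neighbor graph (parameter values are illustrative):

```python
from sklearn.cluster import KMeans, SpectralClustering
from sklearn.datasets import make_moons
from sklearn.metrics import adjusted_rand_score

X, y = make_moons(n_samples=400, noise=0.05, random_state=0)

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
sc = SpectralClustering(n_clusters=2, affinity="nearest_neighbors",
                        n_neighbors=10, random_state=0).fit_predict(X)

# Spectral clustering recovers the intertwined moons; K-means splits them
# with a straight line because it relies on Euclidean distance to centroids.
print(adjusted_rand_score(y, km), adjusted_rand_score(y, sc))
```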
True/False: The normalized Laplacians \(L_{\text{rw}}\) and \(L_{\text{sym}}\) are always positive semidefinite.
True — all their eigenvalues are non‑negative with the smallest equal to 0.