06 Data Mining / Metric Clustering Flashcards
Explain briefly the following three characterizations for clustering-approaches / methods:
- Exclusive vs. non-exclusive
- Crisp vs. fuzzy
- Hierarchical vs. non-hierarchical
- Exclusive: non-overlapping clusters
- Non-exclusive: overlapping clusters
- Crisp: conventional characteristic function α_k: X -> {0, 1} for each cluster C_k (an element either belongs to a cluster or not)
- Fuzzy: fuzzy membership function α_k: X -> [0, 1] for each cluster C_k (an element can belong to multiple clusters with different weights)
- Hierarchical: imposes a tree structure (dendrogram) on the clusters C_k
- Non-hierarchical: imposes no tree structure on the clusters C_k
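A minimal numeric sketch of the crisp vs. fuzzy distinction (the membership values are invented for illustration):

```python
import numpy as np

# Three data points, two clusters.
# Crisp memberships: each row assigns a point to exactly one cluster
# (entries are in {0, 1}).
crisp = np.array([[1, 0],
                  [1, 0],
                  [0, 1]])

# Fuzzy memberships: entries are in [0, 1]; each row sums to 1, so a
# point can belong to several clusters with different weights.
fuzzy = np.array([[0.9, 0.1],
                  [0.6, 0.4],
                  [0.2, 0.8]])

assert np.allclose(crisp.sum(axis=1), 1)
assert np.allclose(fuzzy.sum(axis=1), 1)
```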
The local cost functions for Single Linkage (min) and Complete Linkage (max) are defined as

d_SL(C_i, C_j) = min_{u ∈ C_i, v ∈ C_j} d(u, v) and d_CL(C_i, C_j) = max_{u ∈ C_i, v ∈ C_j} d(u, v),

where d(u, v) denotes the length of the shortest path between nodes u and v. State the analogous local cost functions if the nodes correspond to pattern vectors x_u and x_v in ℝ^n!
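The card records no answer; a plausible one, assuming the Euclidean norm as the metric on ℝ^n (any metric would do):

```latex
d_{SL}(C_i, C_j) = \min_{x_u \in C_i,\, x_v \in C_j} \lVert x_u - x_v \rVert,
\qquad
d_{CL}(C_i, C_j) = \max_{x_u \in C_i,\, x_v \in C_j} \lVert x_u - x_v \rVert
```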
State and explain two advantages of DBSCAN (compared to K-Means)! (1 sentence each)
- DBSCAN does not need to know the number of clusters K in advance, while K-means does; this makes DBSCAN easier to apply when K is unknown.
- DBSCAN finds arbitrarily shaped clusters, whereas K-means favors spherical clusters (improved clustering quality).
- DBSCAN has a notion of noise (outliers), K-means doesn't (improved clustering quality).
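A quick sketch of both advantages on a toy data set (assuming scikit-learn is available; `eps` and `min_samples` are illustrative values, not tuned):

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_moons

# Two interleaved half-moons: non-spherical clusters with some noise.
X, _ = make_moons(n_samples=300, noise=0.08, random_state=0)

# K-means must be told K=2 and still cuts the moons incorrectly,
# since it can only produce convex, roughly spherical clusters.
km_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# DBSCAN needs no K, follows the moon shapes, and labels outliers as -1.
db_labels = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)

print("K-means labels:", set(km_labels))  # {0, 1}
print("DBSCAN labels:", set(db_labels))   # e.g. {0, 1}, plus -1 for noise
```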
Fuzzy C-Means:
a) What is the range of values of the entries of the membership matrix r_nk, compared to K-Means?
b) Fuzziness parameter m in the objective function: informally, what happens in the limits m -> 1 and m -> ∞? (No full mathematical derivation necessary!)
c) Explain the pragmatics behind the (following) additional conditions.
c) Explain the pragmatics behind the (following) additional conditions
- a) The range of values is the interval [0, 1] instead of the set {0, 1} (K-means is crisp, Fuzzy C-Means is fuzzy).
- b) m models the degree of fuzziness:
  m -> 1: K-Means (crisp case)
  m -> ∞: r_nk -> 1/K (where K is the number of clusters) (totally fuzzy)
- c) The first condition (the memberships of each element sum to 1) makes the memberships of an element interpretable as percentages, and therefore as probabilities. The second condition ensures that every cluster receives at least some membership mass, i.e. no cluster is empty or useless.
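For reference, the usual Fuzzy C-Means objective and the two conditions the card presumably refers to (standard formulation with N points, K clusters, and centers μ_k; an assumption, since the slide itself is not reproduced here):

```latex
J_m = \sum_{n=1}^{N} \sum_{k=1}^{K} r_{nk}^{\,m} \, \lVert x_n - \mu_k \rVert^2,
\qquad
\sum_{k=1}^{K} r_{nk} = 1 \;\; \forall n,
\qquad
\sum_{n=1}^{N} r_{nk} > 0 \;\; \forall k
```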
What are advantages of Gaussian Mixture Models compared to Fuzzy C-Means?
Informally explain the nature and geometrical interpretation of the GMM quantities μ_k
and Σ_k!
- GMM does not favor spherical clusters the way Fuzzy C-Means does -> better clustering quality.
- μ_k: mean vector of component k; geometrically, the center of the k-th Gaussian.
- Σ_k: covariance matrix of component k; geometrically, it determines the size, shape, and orientation of the Gaussian's ellipsoidal density contours.
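Both quantities appear in the standard GMM density (with mixing coefficients π_k):

```latex
p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k),
\qquad
\pi_k \ge 0, \;\; \sum_{k=1}^{K} \pi_k = 1
```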
What is the paradigm of the "Maximum Likelihood concept"?
- It's a method of estimating the parameters of a probability distribution by maximizing a likelihood function.
- In other words: find the parameters that best explain the observed data.
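In symbols (standard notation, not reproduced from the slides):

```latex
\hat{\theta}_{ML} = \arg\max_{\theta} \; p(X \mid \theta)
```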
What does "iid" mean? What are the consequences here? Can you point them out in an expression on these slides?
- iid: "independent and identically distributed" -> the joint likelihood of the data set factorizes into a product over the individual samples; this product form is the place to point it out in the likelihood expression (see below).
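The factorization that the iid assumption buys us (standard form):

```latex
p(X \mid \theta) = \prod_{n=1}^{N} p(x_n \mid \theta)
```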
Why do we take the logarithm of the likelihood? How do we choose the base of the
logarithm? Why?
- The logarithm is strictly monotonic, so the log-likelihood attains its maximum at the same position as the likelihood (set the first derivative to 0). Working with the log is much easier because it turns the product into a sum.
- The base is e, because the (Gaussian) likelihood contains exponential functions, which the natural logarithm cancels.
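Applied to the iid product above; the Gaussian case (dimension D) shows why base e is the natural choice:

```latex
\ln p(X \mid \theta) = \sum_{n=1}^{N} \ln p(x_n \mid \theta),
\qquad
\ln \mathcal{N}(x \mid \mu, \Sigma)
= -\tfrac{1}{2}(x-\mu)^{\top}\Sigma^{-1}(x-\mu)
  - \tfrac{1}{2}\ln\lvert\Sigma\rvert
  - \tfrac{D}{2}\ln(2\pi)
```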
Case ONE Gaussian: Are the resulting equations directly solvable?
- Yes: the equation for μ_ML does not involve Σ, so μ_ML is simply the sample mean, and substituting it yields Σ_ML in closed form. (It is the mixture case where the equations depend on each other through the responsibilities and form an "infinite loop".)
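The standard closed-form result for a single Gaussian:

```latex
\mu_{ML} = \frac{1}{N}\sum_{n=1}^{N} x_n,
\qquad
\Sigma_{ML} = \frac{1}{N}\sum_{n=1}^{N} (x_n - \mu_{ML})(x_n - \mu_{ML})^{\top}
```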
What is a 1 of K representation?
It is a binary vector in which exactly one component equals 1 and all other components equal 0 (there are K different such vectors). For example, with K = 4, (0, 0, 1, 0) encodes the third state.
Can you explain the formula for the responsibilities from an intuitive point of view?
- How much is cluster (mixture component) k "responsible" for element x?
- Equivalently: the posterior probability that element x belongs to cluster k.
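The formula in question is presumably the standard GMM responsibility, which is just Bayes' rule applied to the component indicator z_nk:

```latex
\gamma(z_{nk}) = p(z_{nk} = 1 \mid x_n)
= \frac{\pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)}
       {\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x_n \mid \mu_j, \Sigma_j)}
```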
Why don’t we have the full joint probability distribution p(X, Z | θ)?
- The latent variables Z (the cluster assignments) are unobserved: we only see X, so the complete data set {X, Z} is not available and p(X, Z | θ) cannot be evaluated directly.
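What remains is the incomplete-data likelihood, obtained by marginalizing the joint over all possible assignments Z (standard identity):

```latex
p(X \mid \theta) = \sum_{Z} p(X, Z \mid \theta)
```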