Hierarchical clustering Flashcards
Hierarchical clustering algorithm operates in _______ fashion and why
Hierarchical clustering algorithms typically operate in a greedy fashion, making locally optimal choices at each step (merging the closest clusters or splitting the largest clusters) without reconsidering previous steps.
Hierarchical clustering is __________
Divide-and-conquer clustering
Another name for agglomerative clustering
Bottom-up approach
Another name for divisive clustering
Top-down approach
Hierarchical clustering can be used for what, and can't be used for what?
Hierarchical clustering can be used for outlier detection, but not for finding missing values (NA) or detecting fake values.
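As a hedged sketch of the outlier-detection use (SciPy and the toy data are my assumptions, not from the card): points that remain in tiny clusters after cutting the dendrogram can be flagged as outliers.

```python
# Minimal sketch: outliers end up in singleton clusters when the
# dendrogram is cut, because they merge with everything else last.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, size=(50, 2)),     # one dense cluster
               np.array([[8.0, 8.0]])])            # one far-away point

Z = linkage(X, method='single')                    # single linkage merges the outlier last
labels = fcluster(Z, t=3.0, criterion='distance')  # cut the dendrogram at height 3.0

# Points in clusters of size 1 are candidate outliers.
sizes = np.bincount(labels)
outliers = np.where(sizes[labels] == 1)[0]
print(outliers)                                    # expected: index 50 (the injected point)
```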
Hierarchical clustering is primarily used for ______ because ________
Hierarchical clustering is primarily used for exploration because it helps in understanding the natural groupings within the data, which is very useful in exploratory data analysis.
Hierarchical clustering uses _________ visualization
Dendrogram visualization
In hierarchical clustering do we need to specify the number of clusters?
No need to specify the number of clusters in hierarchical clustering
How does hierarchical clustering provide flexibility?
It allows you to choose the number of clusters by cutting the dendrogram at different levels, providing flexibility to explore the data at different granularities.
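A minimal sketch of that "cut at different levels" idea, assuming SciPy and some made-up 2-D data: the hierarchy is built once, then reused at different granularities.

```python
# One hierarchy, two granularities: fcluster cuts the dendrogram
# without recomputing the clustering.
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, fcluster, dendrogram

X = np.random.default_rng(42).random((20, 2))
Z = linkage(X, method='complete')                    # build the full hierarchy once

labels_k2 = fcluster(Z, t=2, criterion='maxclust')   # cut for 2 clusters
labels_k5 = fcluster(Z, t=5, criterion='maxclust')   # cut for 5 clusters

dendrogram(Z)                                        # visualise where the cuts fall
plt.show()
```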
Is hierarchical clustering deterministic or not?
Hierarchical clustering is deterministic because it follows a fixed sequence of merges or splits based on defined criteria such as distance.
Linkage (definition and types)
Linkage defines how the distance between two clusters is measured, i.e., how clusters are linked.
Linkage techniques are of two main types: single linkage and complete linkage (average linkage and Ward's method are also used).
Single linkage
* Another name
* Keyword
* Definition
* Formula
- Another name: Nearest neighbour method
- Keyword: shortest distance
- Definition: This linkage technique focuses on the shortest distance between data points in the two clusters.
- Formula: d(C1, C2) = min { d(x, y) : x ∈ C1, y ∈ C2 }
Complete linkage
* Another name
* Keyword
* Definition
* Formula
- Another name: Farthest neighbour method
- Keyword: longest distance
- Definition: This linkage technique focuses on the longest distance between data points in the two clusters.
- Formula: d(C1, C2) = max { d(x, y) : x ∈ C1, y ∈ C2 }
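Both formulas reduce to taking the min or max over the pairwise distance matrix between the two clusters; a direct translation (the data values are made up for illustration):

```python
# cdist computes all pairwise distances between cluster A and cluster B;
# single linkage takes the min, complete linkage takes the max.
import numpy as np
from scipy.spatial.distance import cdist

A = np.array([[0.0, 0.0], [1.0, 0.0]])
B = np.array([[4.0, 0.0], [6.0, 0.0]])

D = cdist(A, B)                 # D[i, j] = d(A[i], B[j])
single_link = D.min()           # shortest pairwise distance -> 3.0
complete_link = D.max()         # longest pairwise distance  -> 6.0
print(single_link, complete_link)
```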
Agglomerative clustering keyword
Merging approach
Agglomerative clustering uses which linkage?
It can use any linkage:
Single linkage, complete linkage, or average linkage
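For illustration, scikit-learn's AgglomerativeClustering takes the linkage as a parameter; a hedged sketch on random data (the data and cluster count are assumptions):

```python
# Same algorithm, different linkage criteria, chosen per run.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

X = np.random.default_rng(0).random((30, 2))

for link in ('single', 'complete', 'average', 'ward'):
    model = AgglomerativeClustering(n_clusters=3, linkage=link)
    labels = model.fit_predict(X)
    print(link, labels[:10])
```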
Divisive clustering uses which linkage?
Divisive clustering uses only complete linkage
Point to remember in agglomerative clustering problems
The average linkage technique (cluster distance = the average of all pairwise distances between the two clusters) can also be used
Divisive clustering keyword
Splitting approach
How to solve a divisive clustering problem
We create a minimum spanning tree (MST) based on the dissimilarity matrix
Minimum spanning tree characteristics (4)
It is a connected tree.
No loops/closed circuits in the tree.
Every data point (node) is included in the tree.
If ‘n’ nodes are present in the tree, then (n-1) edges are formed in the tree.
If there are ‘n’ nodes in the MST, then
(n-1) edges are present or formed in the tree.
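A hedged sketch of the MST step, assuming SciPy and toy data: build the MST from the dissimilarity matrix, verify the (n-1)-edges property, then split by removing the longest edge.

```python
# Divisive step via MST: remove the longest MST edge to split one
# cluster into two connected components.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree, connected_components

X = np.random.default_rng(5).random((10, 2))
D = squareform(pdist(X))                      # dissimilarity matrix
mst = minimum_spanning_tree(D).toarray()      # MST edges as a dense array

assert np.count_nonzero(mst) == len(X) - 1    # the (n-1)-edges property above

mst[mst == mst.max()] = 0                     # drop the single longest edge
n_comp, labels = connected_components(mst, directed=False)
print(n_comp, labels)                         # 2 clusters
```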
Point to remember in divisive clustering problems
The minimum spanning tree (MST) is built from the dissimilarity matrix
Explain the number of levels in the hierarchy in both agglomerative clustering and divisive clustering
Agglomerative clustering: If there are n observations, there will be n-1 levels in the hierarchy, since n-1 merges are required to combine n observations into a single cluster.
Divisive clustering: The number of levels depends on how the splits occur (e.g., binary splits may create more or fewer than n-1 levels).
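The n-1 figure can be checked directly: SciPy's linkage matrix has exactly one row per merge. A quick sketch (the data is made up):

```python
# n observations -> n-1 merges -> n-1 rows in the linkage matrix.
import numpy as np
from scipy.cluster.hierarchy import linkage

n = 12
X = np.random.default_rng(1).random((n, 2))
Z = linkage(X, method='average')
assert Z.shape[0] == n - 1      # n-1 merges from n singletons to 1 cluster
print(Z.shape)                  # (11, 4)
```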
Ward’s method
A cluster-merging technique.
In this technique, we minimise the increase in variance when merging clusters. This is repeated iteratively until all data points are in a single cluster or until the desired number of clusters is reached.
Similarity of two clusters is based on the increase in squared error when two clusters are merged.
(Similar to group average if distance between points is distance squared).
Less susceptible to noise and outliers.
Biased towards globular clusters.
Hierarchical analogue of k-means
(can be used to initialise k-means)
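A minimal sketch of Ward's method, assuming SciPy: method='ward' merges the pair of clusters whose merge gives the smallest increase in squared error, and the resulting labels could seed k-means as noted above.

```python
# Ward's method: each merge minimises the increase in within-cluster
# variance (squared error).
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

X = np.random.default_rng(7).normal(size=(40, 2))
Z = linkage(X, method='ward')
labels = fcluster(Z, t=4, criterion='maxclust')   # stop at 4 clusters

# These 4 clusters (or their means) could initialise k-means with k=4.
print(np.bincount(labels))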
In divisive clustering, explain cluster splits in terms of variance
At each step, the cluster with the highest within-cluster variance (sum of squared errors) is chosen and split, so that each split yields the largest reduction in total variance.
Which is more sensitive to outliers (single linkage / complete linkage)?
Single linkage is more sensitive to outliers
Single linkage characteristics
Can handle non-elliptical (arbitrary) cluster shapes.
Sensitive to noise and outliers (produces a chaining effect).
Another name for complete linkage
Maximum linkage or the farthest-neighbour method
Complete linkage characteristics
Less susceptible to noise and outliers.
Tends to break large clusters.
Biased towards globular clusters.
Which one is more expensive (Divisive/ agglomerative)
Divisive clustering is computationally more expensive than agglomerative clustering because it requires considering all possible splits at each step.
Which one is more commonly used (divisive / agglomerative)?
Agglomerative is more commonly used whereas divisive is less commonly used due to its complexity.
If we solve a divisive clustering question using k-means, what should the value of k be?
k = 2 (each chosen cluster is split into two at every step)
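A hedged sketch of that idea (bisecting with repeated 2-means); `bisecting_clusters` is an illustrative helper name, not a library function, and the split-selection rule follows the variance card above:

```python
# Divisive clustering via repeated 2-means: always split the cluster
# with the largest SSE until the desired number of clusters is reached.
import numpy as np
from sklearn.cluster import KMeans

def bisecting_clusters(X, n_clusters):
    clusters = [np.arange(len(X))]          # start: one cluster with all points
    while len(clusters) < n_clusters:
        # pick the cluster with the largest SSE (variance) to split
        sse = [((X[idx] - X[idx].mean(axis=0)) ** 2).sum() for idx in clusters]
        idx = clusters.pop(int(np.argmax(sse)))
        labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X[idx])
        clusters.append(idx[labels == 0])
        clusters.append(idx[labels == 1])
    return clusters

X = np.random.default_rng(3).random((60, 2))
print([len(c) for c in bisecting_clusters(X, 4)])
```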
Which is more efficient (k-means / hierarchical clustering)?
k-means: its cost grows roughly linearly with the number of points, whereas hierarchical clustering needs at least the full pairwise-distance matrix, i.e., O(n^2) time and memory.