Unsupervised Learning (Cluster Analysis) Flashcards

Notes from the Unsupervised Learning lecture that may help with the exam.

1
Q

What is the one-line definition of Unsupervised Learning?

A

A type of algorithm that learns hidden patterns from unlabelled data

2
Q

What are two primary aspects of Unsupervised Learning?

A

Cluster Analysis - Divide data into meaningful groups
Dimensionality Reduction - PCA, Autoencoders, Generative Models

3
Q

What are the 4 key aspects of Cluster Analysis?

A
  • Features that describe the subjects
  • Similarity Functions
  • Basic Clustering Algorithms
  • Cluster Validation
4
Q

What are 4 common Similarity Functions in Unsupervised Learning?

A

Euclidean Distance
Cosine Distance
Manhattan Distance
Jaccard Distance

5
Q

What is the equation for Euclidean Distance?

A

Euclidean Distance = sqrt((A - B)^T (A - B))
Where:
A, B: Column Vectors that contain features of two data samples
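As a rough sketch (illustrative helper, not from the lecture), the square root of the sum of squared feature differences can be computed as:

```python
import math

def euclidean_distance(a, b):
    """sqrt((A - B)^T (A - B)): root of the sum of squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# 3-4-5 right triangle: distance between (0, 0) and (3, 4) is 5
print(euclidean_distance([0, 0], [3, 4]))  # -> 5.0
```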

6
Q

What is the equation for Cosine Distance?

A

Cosine Similarity: cos(theta) = (A^T B) / (||A|| * ||B||)
Cosine Distance = 1 - cos(theta)
Where:
A, B: Column Vectors that contain features of two data samples
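A minimal sketch (illustrative helper, not from the lecture), using the common convention that cosine distance is one minus the cosine of the angle between the two vectors:

```python
import math

def cosine_distance(a, b):
    """1 - cos(theta), where cos(theta) = (A . B) / (||A|| * ||B||)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (norm_a * norm_b)

print(cosine_distance([1, 0], [0, 1]))  # orthogonal vectors -> 1.0
print(cosine_distance([1, 2], [2, 4]))  # parallel vectors -> ~0.0
```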

7
Q

What is the equation for the Manhattan Distance?

A

Manhattan Distance = sum_{i=1}^{n} |A_i - B_i|
Where:
A, B: Column Vectors that contain features of two data samples
n: Number of features
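As a quick sketch (illustrative helper, not from the lecture), the sum of absolute per-feature differences is:

```python
def manhattan_distance(a, b):
    """Sum of absolute differences over all n features."""
    return sum(abs(x - y) for x, y in zip(a, b))

# |0 - 3| + |0 - 4| = 7 ("city block" path, vs 5.0 for Euclidean)
print(manhattan_distance([0, 0], [3, 4]))  # -> 7
```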

8
Q

What is the equation for the Jaccard Distance?

A

Jaccard Similarity = |A intersection B| / |A union B|
Jaccard Distance = 1 - Jaccard Similarity
Where:
A, B: Sets of attributes present in two data samples
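A minimal sketch (illustrative helper, not from the lecture), taking the inputs as sets and using the convention that the distance is one minus the Jaccard similarity:

```python
def jaccard_distance(a, b):
    """1 - |A intersection B| / |A union B|, computed on sets."""
    a, b = set(a), set(b)
    return 1 - len(a & b) / len(a | b)

# intersection {2, 3} has 2 elements, union {1, 2, 3, 4} has 4
print(jaccard_distance({1, 2, 3}, {2, 3, 4}))  # -> 0.5
```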

9
Q

What is the step-by-step method of K-Means Algorithm?

A
  1. Randomly select K points as the initial centroids (centres) for each of the K groups
  2. Repeat this process:
    a - Assign each point to its closest centroid (centre)
    b - Re-compute the centroid (centre) of each cluster
  3. Until the centroids (centres) no longer change
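The steps above can be sketched in Python. This is a minimal illustration, not the lecture's implementation; the data, seed, and helper names are made up for the example:

```python
import math
import random

def kmeans(points, k, seed=0, max_iter=100):
    """Minimal K-Means sketch: random init, assign, re-compute, repeat."""
    rng = random.Random(seed)
    # 1. Randomly select K points as the initial centroids
    centroids = rng.sample(points, k)
    for _ in range(max_iter):
        # 2a. Assign each point to its closest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[idx].append(p)
        # 2b. Re-compute the centroid (mean) of each cluster
        new_centroids = [
            tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)  # empty cluster keeps old centroid
        ]
        # 3. Stop once the centroids no longer change
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters

# two well-separated groups: one near the origin, one near (9, 9)
points = [(0, 0), (0, 1), (1, 0), (9, 9), (9, 10), (10, 9)]
centroids, clusters = kmeans(points, k=2)
```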
10
Q

What are the advantages of K-Means Algorithm?

A

Simple
Efficient

11
Q

What are the disadvantages of K-Means Algorithm?

A

Solution dependent on the initialisation

Need to specify number of clusters

Sensitive to Outliers

12
Q

What is the step-by-step method of Agglomerative Hierarchical Clustering?

A
  1. Treat each data point as a cluster, and compute the similarity matrix between each pair of clusters
  2. Repeat this process:
    a - Merge the closest two clusters
    b - Update the similarity matrix
  3. Repeat the above until only one cluster remains
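A minimal sketch of the loop above, assuming single linkage (minimum point-to-point distance) as the similarity between clusters; the data and function name are illustrative, not from the lecture:

```python
import math

def agglomerative(points):
    """Single-linkage agglomerative sketch; records each merge distance."""
    clusters = [[p] for p in points]   # 1. each data point starts as a cluster
    merges = []
    while len(clusters) > 1:           # 3. repeat until one cluster remains
        # 2a. find the closest pair of clusters (single linkage)
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(math.dist(p, q)
                        for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        # 2b. merge them and record the step (this sequence is the dendrogram)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
        merges.append(d)
    return merges

# two tight pairs merge first (distance 1), then the pairs merge together
merges = agglomerative([(0, 0), (0, 1), (5, 5), (5, 6)])
```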
13
Q

What are the advantages of Agglomerative Hierarchical Clustering?

A

Flexible with number of clusters

Can capture hierarchical relationships

14
Q

What are the disadvantages of Agglomerative Hierarchical Clustering?

A

Solution is a local optimum, dependent on the chosen linkage function, e.g. minimum (single linkage), maximum (complete linkage), group average, etc.
Requires more memory and longer computation time

15
Q

What is the step-by-step method for Density-Based Spatial Clustering of Applications with Noise (DBSCAN)?

A
  1. Find the neighbourhood points of every point, i.e. all points within distance E (epsilon)
  2. Identify the core points: those with at least the minimum number of points (MinPts) as neighbours
  3. Connect the core points if they are within distance E of each other
  4. Make each group of connected core points into a separate cluster
  5. Assign each non-core point to a nearby cluster if it is within distance E of one of its core points; these are called border points
  6. Unassigned points are noise points
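The steps above can be sketched as follows. This is a minimal illustration (the data and parameter values are made up for the example, and cluster connection is done with a simple flood fill):

```python
import math

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN sketch; returns one label per point (-1 = noise)."""
    n = len(points)
    # 1. eps-neighbourhood of every point (a point counts as its own neighbour)
    neigh = [[j for j in range(n) if math.dist(points[i], points[j]) <= eps]
             for i in range(n)]
    # 2. core points: at least min_pts neighbours
    core = {i for i in range(n) if len(neigh[i]) >= min_pts}
    labels = [-1] * n
    cluster = 0
    for c in sorted(core):
        if labels[c] != -1:
            continue                      # already reached from another core
        labels[c] = cluster
        stack = [c]
        while stack:                      # 3-4. flood-fill connected cores
            p = stack.pop()
            for q in neigh[p]:
                if labels[q] == -1:
                    labels[q] = cluster   # 5. border points get the label too
                    if q in core:
                        stack.append(q)   # only core points keep expanding
        cluster += 1
    return labels                         # 6. still -1 means noise

labels = dbscan([(0, 0), (0, 1), (1, 0), (10, 10)], eps=1.5, min_pts=3)
print(labels)  # -> [0, 0, 0, -1]: one dense cluster plus one noise point
```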
16
Q

What are the advantages of DBSCAN?

A

Robust to outliers

Can learn non-regular (arbitrarily shaped) cluster patterns

Automatically determine the number of clusters

17
Q

What are the disadvantages of DBSCAN?

A

Not robust to clusters of variable density (since a single E is used)

Computationally expensive

Sensitive to parameter settings

18
Q

What is the step-by-step method for Expectation Maximisation?

A
  1. Assume a distribution model that describes the data e.g. Gaussian
  2. Initialise the model parameters (mean and variance)
  3. Repeat this process:
    a - E step: Compute the expected value of the log-likelihood (i.e. the probability that each data point belongs to each cluster) using the current model parameters
    b - M step: Estimate model parameters that maximise the log likelihood function
  4. Until Convergence
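A minimal sketch of the loop above for a 1-D mixture of two Gaussians with equal mixing weights; the data, initialisation scheme, and variance floor are illustrative assumptions, not from the lecture:

```python
import math

def em_gmm_1d(data, iters=50):
    """EM sketch for a 1-D two-component Gaussian mixture, equal weights."""
    # 1-2. assume Gaussian components; initialise means and variances
    mu = [min(data), max(data)]
    var = [1.0, 1.0]
    for _ in range(iters):
        # E step: responsibility of each component for each data point
        resp = []
        for x in data:
            p = [math.exp(-(x - mu[k]) ** 2 / (2 * var[k]))
                 / math.sqrt(2 * math.pi * var[k]) for k in range(2)]
            total = p[0] + p[1]
            resp.append([p[0] / total, p[1] / total])
        # M step: re-estimate the parameters that maximise the likelihood
        for k in range(2):
            w = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / w
            var[k] = sum(r[k] * (x - mu[k]) ** 2
                         for r, x in zip(resp, data)) / w
            var[k] = max(var[k], 1e-6)    # guard against variance collapse
    return mu, var

# two groups of points, one around 0 and one around 5
data = [0.0, 0.2, -0.2, 5.0, 5.2, 4.8]
mu, var = em_gmm_1d(data)                 # means converge near 0 and 5
```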
19
Q

What are the advantages of Expectation Maximisation?

A

Soft clustering (each data point gets a probability of belonging to each cluster)

20
Q

What are the disadvantages of Expectation Maximisation?

A

Restricted by the distribution model

Sensitive to initialisation

Need to specify the number of clusters

21
Q

What is a sign of a good Clustering Analysis?

A

If the clustering algorithm keeps similar data samples together and separates dissimilar data samples into different clusters, then it has performed well.