5/6 - Unsupervised Learning in the era of generative AI Flashcards

1
Q

Unsupervised learning

A

Learning from data without labels

2
Q

Unsupervised vs supervised

A
  • Unsupervised: cheap (no labelling); no definition of error; can discover new structure
  • Supervised: expensive labelling; requires a definition of error; can at best do as well as the labels
3
Q

Clustering

A

Detecting that data points can be grouped into distinct clusters.

4
Q

What metric is required for clustering?

A

A distance metric, used to compute distances between points.

For example Euclidean distance or Manhattan distance.

Euclidean is the length of the straight line between two points.
Manhattan is the distance if you can only move horizontally and vertically, for example.
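As a sketch, the two distances for points of any dimension (pure Python; the helper names are mine, not from the cards):

```python
import math

def euclidean(p, q):
    # straight-line length between the two points
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan(p, q):
    # horizontal + vertical moves only, like a city grid
    return sum(abs(a - b) for a, b in zip(p, q))

print(euclidean((0, 0), (3, 4)))  # 5.0
print(manhattan((0, 0), (3, 4)))  # 7
```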

5
Q

Hierarchical Clustering

A
  1. Each point is a cluster
  2. Compute all pairwise distances
  3. Find the shortest distance between any two clusters and merge them
  4. Recompute the distance matrix with the remaining points and the new cluster
  5. Repeat from 3
  6. Until you have 1 cluster; looking back at the merges gives a tree (dendrogram)
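The steps above can be sketched as a naive agglomerative loop (pure Python, single linkage assumed; function names are mine):

```python
import math

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def agglomerative(points):
    # 1. each point starts as its own cluster
    clusters = [[p] for p in points]
    merges = []                              # record of the tree being built
    while len(clusters) > 1:
        # 2-3. find the closest pair of clusters (single linkage:
        # min distance over all point pairs, one from each cluster)
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(euclidean(a, b)
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        # 4-5. merge the pair and repeat on the reduced set
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges                            # 6. the merge history is the tree

tree = agglomerative([(0, 0), (0, 1), (5, 5)])
print(tree[0][2])  # 1.0 — the first (shortest) merge distance
```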
6
Q

Clustering methods

A

K-means
Hierarchical

7
Q

K means clustering

A
  1. Start with k centroids, chosen manually or at random
  2. “Place” each data point in the cluster of its nearest centroid
  3. Compute the new barycentre of each cluster (the new centroids)
  4. Repeat from step 2
  5. Stop when the centroids no longer move
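A minimal sketch of those steps in pure Python (names and the toy data are mine):

```python
import math

def kmeans(points, centroids, max_iter=100):
    for _ in range(max_iter):
        # step 2: "place" each point with its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # step 3: new barycentre (coordinate-wise mean) of each cluster
        new = [tuple(sum(c) / len(pts) for c in zip(*pts)) if pts else cen
               for pts, cen in zip(clusters, centroids)]
        # step 5: stop when the centroids no longer move
        if new == centroids:
            return new
        centroids = new                    # step 4: repeat
    return centroids

pts = [(0, 0), (0, 1), (10, 10), (10, 11)]
print(kmeans(pts, [(0, 0), (10, 10)]))  # [(0.0, 0.5), (10.0, 10.5)]
```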
8
Q

Principal component analysis / dimensionality reduction

A

Find the direction of maximum variance of the data; projecting onto it lets us reduce the number of dimensions (dimensionality reduction can also be done with, e.g., an autoencoder).

PCA finds the new axes and “rotates” the data so that the direction of maximum variance becomes the x-axis.
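For 2-D data the direction of maximum variance has a closed form from the 2×2 covariance matrix; a sketch (pure Python, my own helper name):

```python
import math

def pca_first_component(points):
    # centre the data
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    xs = [x - mx for x, _ in points]
    ys = [y - my for _, y in points]
    # 2x2 covariance matrix entries: [[a, b], [b, c]]
    a = sum(v * v for v in xs) / n
    c = sum(v * v for v in ys) / n
    b = sum(u * v for u, v in zip(xs, ys)) / n
    # angle of the leading eigenvector = direction of maximum variance
    theta = 0.5 * math.atan2(2 * b, a - c)
    return (math.cos(theta), math.sin(theta))

# points lying exactly on the line y = x, so that line is the new x-axis
direction = pca_first_component([(0, 0), (1, 1), (2, 2), (3, 3)])
print(direction)  # ≈ (0.707, 0.707), i.e. the 45° direction
```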

9
Q

Auto encoders

A

Neural Net that learns to compress/effectively represent data without labels

This is a neural network with input and output of the same dimensions and a bottleneck in the middle.

x1      x1*
   \   /
     .
   /   \
x2      x2*

. is a latent value (reduced dimensions/compressed layer)

10
Q

How does an auto encoder learn weights?

A

Backprop: loss computed as the difference between input and reconstructed output, e.g. L2 = 1/2 (x* - x)^2

Forward/backward pass:
- Feed the first data point to the input
- Compute the loss as above
- Backpropagate gradients w.r.t. the weights
- Update the weights by adding the negative of the gradients (times a learning rate)
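A toy version of that loop: a linear 2 → 1 → 2 autoencoder trained by hand-written backprop on the L2 loss above (pure Python; the data and hyperparameters are my own illustration):

```python
import random

# encoder weights w (2 inputs -> 1 latent), decoder weights v (1 -> 2 outputs)
random.seed(0)
w = [random.uniform(-0.5, 0.5), random.uniform(-0.5, 0.5)]
v = [random.uniform(-0.5, 0.5), random.uniform(-0.5, 0.5)]
lr = 0.05

# data lying almost on a line, so one latent dimension can capture it
data = [(1.0, 1.0), (2.0, 2.1), (-1.0, -0.9), (0.5, 0.4)]

def step(x):
    # forward: latent z, then reconstruction x*
    z = w[0] * x[0] + w[1] * x[1]
    out = [v[0] * z, v[1] * z]
    err = [out[0] - x[0], out[1] - x[1]]
    loss = 0.5 * (err[0] ** 2 + err[1] ** 2)   # L2 = 1/2 (x* - x)^2
    # backward: gradients of the loss w.r.t. each weight
    gv = [err[0] * z, err[1] * z]
    gz = err[0] * v[0] + err[1] * v[1]
    gw = [gz * x[0], gz * x[1]]
    # update: add the negative gradient times the learning rate
    for i in range(2):
        v[i] -= lr * gv[i]
        w[i] -= lr * gw[i]
    return loss

losses = [sum(step(x) for x in data) for _ in range(200)]
print(losses[0], "->", losses[-1])  # loss shrinks as training proceeds
```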

11
Q

Can you perfectly capture the data again after reducing the dimensionality to 1, then back to 2?

A

No, there will be some loss of information.

The reconstruction produces new data points that resemble the originals but are not exactly the same.

12
Q

Generative Adversarial Networks

Generator/Discriminator

A

Input random noise; the generator network produces a fake image.

The discriminator network is then given both real and fake images.

You train the generator to create images that elicit a “real” response from the discriminator.

13
Q

GANs: Discriminator outputs

A

The discriminator outputs a likelihood in the range 0 to 1 that the image is real: 0 means fake, 1 means real.

Generator wants to produce images that are given a 1.
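That 0-to-1 output typically comes from squashing the network's raw score through a sigmoid; a sketch (the function name is mine):

```python
import math

def discriminator_output(logit):
    # sigmoid squashes any real-valued score into (0, 1):
    # near 0 -> "fake", near 1 -> "real"
    return 1 / (1 + math.exp(-logit))

print(discriminator_output(-4.0))  # ≈ 0.018: confident fake
print(discriminator_output(4.0))   # ≈ 0.982: confident real
```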

14
Q

Interpolating generator images

A

You can interpolate between input vectors, and add or subtract them, to produce all sorts of blends between generated images.

15
Q

Conditional GANs

A

Take an additional input that specifies which class of object you want to generate.

Along with the random vector, you introduce a condition (e.g. a class label).

16
Q

pix2pix: Adversarial Loss

A

Ensures the generated images are indistinguishable from real ones, as judged by the discriminator.

17
Q

pix2pix: L1/L2 Loss

A

Ensures generated images are structurally similar to the target images.

Penalises pixel-wise differences between generated and real images.
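The two pixel-wise penalties can be sketched on flattened images (pure Python; the function names and toy pixel values are mine):

```python
def l1_loss(generated, target):
    # mean absolute pixel-wise difference
    return sum(abs(g - t) for g, t in zip(generated, target)) / len(target)

def l2_loss(generated, target):
    # mean squared pixel-wise difference (punishes large errors more)
    return sum((g - t) ** 2 for g, t in zip(generated, target)) / len(target)

fake = [0.1, 0.5, 0.9]   # 3-pixel "generated image"
real = [0.0, 0.5, 1.0]   # corresponding target image
print(l1_loss(fake, real))  # ≈ 0.0667
print(l2_loss(fake, real))  # ≈ 0.0067
```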

18
Q

CycleGAN

A

Two generators, two discriminators.

G_a->b translates domain A to B
G_b->a translates domain B to A

D_a distinguishes real domain-A images from translated ones.
D_b does the same for domain B.

Cycle:
A cycle-consistency loss penalises the case where an image translated to the other domain and back does not match the original image.
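The cycle idea in miniature: treat each "image" as a single number and the two generators as simple functions (everything here is my own toy illustration):

```python
def cycle_loss(g_ab, g_ba, images_a):
    # penalise |G_b->a(G_a->b(x)) - x| for each image x from domain A
    return sum(abs(g_ba(g_ab(x)) - x) for x in images_a)

# toy 1-pixel "domains": B is just A shifted up by 10
g_ab = lambda x: x + 10
g_ba = lambda x: x - 10        # perfectly undoes g_ab
bad_g_ba = lambda x: x - 9     # slightly wrong inverse

print(cycle_loss(g_ab, g_ba, [1.0, 2.0, 3.0]))      # 0.0: cycle-consistent
print(cycle_loss(g_ab, bad_g_ba, [1.0, 2.0, 3.0]))  # 3.0: each image off by 1
```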

19
Q

Hierarchical Clustering: single linkage

A

The minimum distance between any two points, one from each cluster.

20
Q

Hierarchical clustering: Complete Linkage

A

The maximum distance between any two points, one from each cluster.

21
Q

Hierarchical Clustering: Average Linkage

A

The average distance over all pairs of points, one from each cluster.
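The three linkages side by side (pure Python; helper names and example clusters are mine):

```python
import math

def pair_distances(ca, cb):
    # Euclidean distance for every pair (one point from each cluster)
    return [math.dist(p, q) for p in ca for q in cb]

def single_linkage(ca, cb):
    return min(pair_distances(ca, cb))   # closest pair

def complete_linkage(ca, cb):
    return max(pair_distances(ca, cb))   # farthest pair

def average_linkage(ca, cb):
    d = pair_distances(ca, cb)
    return sum(d) / len(d)               # mean over all pairs

a = [(0, 0), (0, 1)]
b = [(0, 3), (0, 5)]
print(single_linkage(a, b))    # 2.0
print(complete_linkage(a, b))  # 5.0
print(average_linkage(a, b))   # 3.5
```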