8. Unsupervised Learning Flashcards

1
Q

Purpose of PCA

A

( principal components are ordered by the variance they explain )
-> finds a sequence of linear combinations of the variables that have maximal variance and are mutually uncorrelated

  1. Data visualization / pre-processing before supervised techniques are applied.
  2. Dimensionality reduction
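
A minimal scikit-learn sketch of both uses (the data matrix X here is a placeholder assumption):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))               # placeholder data matrix (n_samples x n_features)

X_std = StandardScaler().fit_transform(X)   # PCA is scale-sensitive, so standardize first
pca = PCA(n_components=2)                   # keep the two highest-variance components
Z = pca.fit_transform(X_std)                # scores Z: use for a 2-D plot or as supervised inputs

print(pca.explained_variance_ratio_)        # components are ordered by variance explained
```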
2
Q

Purpose of clustering

A

Discovering unknown sub-groups (homogeneous clusters) in data

3
Q

Unsupervised Learning methods

A

PCA
Clustering

( both serve data exploration and low-complexity data description )

4
Q

SVM vs. Logistic Regression for (almost) separable classes

A

SVM

5
Q

SVM vs. Logistic Regression for non-separable classes

A

SIMILAR:
SVM and logistic regression (with ridge penalty) perform almost identically

6
Q

SVM vs. Logistic Regression for estimating probabilities

A

Logistic regression

7
Q

SVM vs. Logistic Regression for fast and interpretable model

A

Logistic regression

8
Q

SVM vs. Logistic Regression for non-linear boundaries

A

Kernel SVMs

( kernel logistic regression is computationally expensive )

9
Q

K-means clustering partition requirements

A
  1. Each observation belongs to at least one of the K clusters
  2. Each observation belongs to only one cluster

(no overlap; see the set notation below)
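
In set notation, with $C_1, \dots, C_K$ denoting the sets of observation indices, the two requirements read:

$$C_1 \cup C_2 \cup \dots \cup C_K = \{1, \dots, n\}, \qquad C_k \cap C_{k'} = \emptyset \ \text{for all } k \neq k'$$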

10
Q

Clustering potential issues

A
  1. Standardize observations first?
  2. How many clusters?
  3. What type of linkage / dissimilarity measure?
11
Q

Within-Cluster-Variation formula

A

$$W(C_k) = \frac{1}{|C_k|} \sum_{i,\,i' \in C_k} \sum_{j=1}^{p} \left(x_{ij} - x_{i'j}\right)^2$$

( usually the cumulative value $\sum_{k=1}^{K} W(C_k)$ over all clusters is reported; K-means seeks the partition that minimizes it )
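
A small Python sketch of this quantity under the Euclidean formulation above (X and labels are placeholder assumptions):

```python
import numpy as np

def within_cluster_variation(X, labels):
    """Sum over clusters of the average pairwise squared Euclidean distance."""
    total = 0.0
    for k in np.unique(labels):
        Xk = X[labels == k]                        # observations in cluster k
        diffs = Xk[:, None, :] - Xk[None, :, :]    # all pairwise differences
        total += (diffs ** 2).sum() / len(Xk)      # (1/|C_k|) * sum of squared distances
    return total

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))                       # placeholder data
labels = rng.integers(0, 3, size=30)               # placeholder assignment into K=3 clusters
print(within_cluster_variation(X, labels))
```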

12
Q

K-means clustering procedure

A
  1. Initial clustering: randomly assign a number from 1 to K to each of the observations.
  2. Compute the cluster centroids.
  3. Assign each observation to its closest centroid (using Euclidean distance).
  4. Iterate steps 2-3 until the assignments stop changing (see the sketch below).
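
A minimal NumPy sketch of this procedure, assuming placeholder data X and K=3 (not a production implementation; empty clusters are not handled):

```python
import numpy as np

def kmeans(X, K, n_iter=100, seed=0):
    """Lloyd-style K-means following the steps above (sketch only)."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, K, size=len(X))       # 1. random initial assignment
    centroids = None
    for _ in range(n_iter):
        centroids = np.array([X[labels == k].mean(axis=0) for k in range(K)])  # 2. centroids
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)     # squared Euclidean
        new_labels = dists.argmin(axis=1)          # 3. re-assign to the closest centroid
        if np.array_equal(new_labels, labels):     # 4. stop once assignments are stable
            break
        labels = new_labels
    return labels, centroids

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))                      # placeholder data
labels, centroids = kmeans(X, K=3)
```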
13
Q

Hierarchical clustering procedure

A
  1. Treat each observation as its own cluster
  2. Measure all pairwise dissimilarities (e.g. Euclidean distance)
  3. Fuse the two most similar clusters, then re-compute the dissimilarities and repeat

( benefit: the number of clusters K need not be chosen in advance; the dendrogram encodes one clustering per cut height, as in the sketch below )
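
A sketch of the procedure with SciPy (the data and the choice of complete linkage are assumptions for illustration):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))                           # placeholder data: 20 observations, 2 features

Z = linkage(X, method="complete", metric="euclidean")  # steps 1-3: agglomerative fusion
labels = fcluster(Z, t=3, criterion="maxclust")        # cut the dendrogram into 3 clusters
# scipy.cluster.hierarchy.dendrogram(Z) would draw the full tree, one clustering per cut height
```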

14
Q

Linkage / Dissimilarity measures

A
  1. Complete: maximal intercluster dissimilarity
  2. Single: minimal intercluster dissimilarity
  3. Average: mean intercluster dissimilarity
  4. Centroid: dissimilarity between the centroids of the two clusters
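
All four measures map onto SciPy's method argument by name; a brief sketch comparing them on the same placeholder data:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))                     # placeholder data

for method in ["complete", "single", "average", "centroid"]:
    Z = linkage(X, method=method)                # centroid linkage requires Euclidean distance
    labels = fcluster(Z, t=3, criterion="maxclust")
    print(method, labels)                        # the chosen linkage changes the resulting clusters
```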
15
Q

Hyperplane / SVM procedure

A

Among all separating hyperplanes, find the one that makes the biggest gap or margin between the two classes == Maximal Margin Classifier

If not possible:
loosen the "separate" requirement (slack variables)
enlarge the feature space so that separation becomes possible (e.g. feature expansion with transformed variables -> non-linear boundaries)
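
A scikit-learn sketch of both remedies on placeholder data: the parameter C budgets the slack, and an RBF kernel stands in for explicit feature expansion:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                      # placeholder data
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1).astype(int)  # not linearly separable in the original space

soft = SVC(kernel="linear", C=1.0).fit(X, y)       # slack variables: smaller C = softer margin
rbf = SVC(kernel="rbf", C=1.0).fit(X, y)           # implicit feature expansion -> non-linear boundary
print(soft.score(X, y), rbf.score(X, y))
```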

16
Q

Popular kernel functions

A
  • Linear kernel (standard for linear classification)
  • Radial Basis Function
  • Polynomial
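
These correspond directly to scikit-learn's kernel argument; a minimal sketch on placeholder data:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2))                  # placeholder data
y = (X.sum(axis=1) > 0).astype(int)

for kernel in ["linear", "rbf", "poly"]:      # linear, radial basis function, polynomial
    clf = SVC(kernel=kernel, degree=3, gamma="scale").fit(X, y)  # degree only affects "poly"
    print(kernel, clf.score(X, y))
```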