Classification Flashcards

1
Q

Describe the steps of image classification

A

1) Feature extraction
2) Feature description
3) Classification
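
A minimal sketch of the three steps using scikit-learn (the digits dataset, scaling as the description step, and an SVM classifier are illustrative choices, not part of the card):

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# 1) Feature extraction: here simply the raw 8x8 pixel intensities
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 2) Feature description: scale the features; 3) Classification: an SVM
clf = make_pipeline(StandardScaler(), SVC())
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))
```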

2
Q

How can we classify images without labels?

A

Use unsupervised clustering techniques
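
For example, K-means (a sketch with scikit-learn; the cluster count and dataset are assumptions):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits

X, _ = load_digits(return_X_y=True)   # pretend the labels are unavailable

# Group the images into 10 clusters; cluster ids act as surrogate classes
kmeans = KMeans(n_clusters=10, n_init=10, random_state=0).fit(X)
print(kmeans.labels_[:20])
```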

3
Q

What do we mean by semi-supervised learning?

A

We have labels for only part of the dataset

4
Q

What is MNIST?

A

A dataset of 70,000 28x28-pixel grayscale images of handwritten digits (60,000 for training, 10,000 for testing)
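
A quick way to load it (using scikit-learn's OpenML mirror; the loader choice is an assumption, not part of the card):

```python
from sklearn.datasets import fetch_openml

# Downloads MNIST on first call: 70,000 samples with 784 = 28*28 pixel features
X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
print(X.shape, y.shape)   # (70000, 784) (70000,)
```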

5
Q

Describe the typical pre-processing for digit recognition in general images

A
  1. Detect the digits in the large image
  2. Normalize the size of the digit, for example to 28x28 pixels
  3. Normalize the location, e.g. by placing the mass center in the middle (see the sketch below)
  4. Correct the slant so the orientation is canonical
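
A sketch of step 3 (mass-center normalization) using SciPy; the function name and the use of scipy.ndimage are illustrative assumptions:

```python
import numpy as np
from scipy import ndimage

def center_digit(img):
    """Shift a grayscale digit image so its mass center sits at the image center."""
    cy, cx = ndimage.center_of_mass(img)
    h, w = img.shape
    return ndimage.shift(img, (h / 2 - cy, w / 2 - cx))
```
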
6
Q

Describe the K-Nearest Neighbour algorithm

A

Classify by taking a majority vote among the K nearest neighbours
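
A minimal NumPy sketch of the vote (L2 distance and k=5 are arbitrary choices):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=5):
    """Classify x by majority vote among its k nearest training points (L2 distance)."""
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    return Counter(y_train[nearest]).most_common(1)[0][0]
```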

7
Q

Name some distance measurements

A

L2 (Euclidean), L1 (Manhattan), …
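
Both distances in a few lines of NumPy (the example vectors are made up):

```python
import numpy as np

a, b = np.array([1.0, 2.0]), np.array([4.0, 6.0])
l2 = np.linalg.norm(a - b)   # Euclidean: sqrt of summed squared differences -> 5.0
l1 = np.abs(a - b).sum()     # Manhattan: sum of absolute differences -> 7.0
```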

8
Q

Name some advantages and disadvantages of K-Nearest Neighbour

A
  Advantages:
  1. It works reasonably well
  2. No training required
  3. Nonlinear decision boundaries
  4. Naturally handles multi-class problems
  Disadvantages:
  1. All training data must be stored in memory
  2. Long evaluation time

9
Q

Why shouldn’t we use the test set to tune hyperparameters? What should we do instead?

A

Tuning hyperparameters on the test set leaks information about it, so the model overfits the test set and the reported performance is too optimistic. Use a dedicated validation set or cross-validation instead.
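
A sketch of the proper workflow with scikit-learn (dataset, classifier, and the range of k are illustrative assumptions):

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_digits(return_X_y=True)
X_trainval, X_test, y_trainval, y_test = train_test_split(X, y, random_state=0)

# Tune k with cross-validation on the training data only...
best_k = max(
    range(1, 10),
    key=lambda k: cross_val_score(
        KNeighborsClassifier(n_neighbors=k), X_trainval, y_trainval, cv=5
    ).mean(),
)

# ...and touch the test set exactly once, at the very end
final = KNeighborsClassifier(n_neighbors=best_k).fit(X_trainval, y_trainval)
print(best_k, final.score(X_test, y_test))
```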

10
Q

What’s the advantage and disadvantage of using cross-validation vs. a dedicated validation set?

A

Cross-validation makes more data available for training but is computationally more expensive.

11
Q

What’s the formula of a general linear classifier?

A

y = w · x + b, where w is the weight vector, x the feature vector, and b the bias

12
Q
How can we use the formula of a linear classifier,
w · x + b = c, to assign a class to the data?
A

In class (+1) if c > 0, otherwise not in class (-1)
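
The decision rule in a few lines of NumPy (the weights and the test point are made-up numbers):

```python
import numpy as np

def linear_classify(w, b, x):
    """+1 if x lies on the positive side of the hyperplane w . x + b = 0, else -1."""
    c = np.dot(w, x) + b
    return 1 if c > 0 else -1

print(linear_classify(np.array([2.0, -1.0]), 0.5, np.array([1.0, 1.0])))  # c = 1.5 -> +1
```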

13
Q

What is considered the best hyperplane for SVMs?

A

The hyperplane that maximizes the margin, i.e. the combined distance from the closest points of both classes to the hyperplane.

14
Q

What points do we need to determine the hyperplane in an SVM?

A

Only the points closest to the hyperplane (the support vectors)
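
scikit-learn exposes exactly these points; a small sketch (the blob data is an arbitrary stand-in):

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=100, centers=2, random_state=0)
svm = SVC(kernel="linear").fit(X, y)

# Only these points determine the separating hyperplane
print(svm.support_vectors_.shape)
```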

15
Q

Derive the formula for making the margin equal to 2 in an SVM.

A

Let x1 be a point on the negative margin boundary and n = w/||w|| the unit normal of the hyperplane, so that x1 + m·n lies on the positive boundary:

w · x1 + b = -1
w · (x1 + m·n) + b = 1

Subtracting the first equation from the second gives m (w · n) = 2. Since w · n = w · w/||w|| = ||w||, we get m ||w|| = 2, i.e. the margin is m = 2/||w||.

16
Q

How can we deal with outliers in SVMs?

A

We can introduce slack variables that allow some points to violate the margin or be misclassified. This lets the margin grow wider.
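
In scikit-learn's SVC this trade-off is exposed via the C parameter (a sketch; the concrete values are arbitrary):

```python
from sklearn.svm import SVC

# Smaller C -> weaker penalty on slack -> more violations tolerated, wider margin
soft = SVC(kernel="linear", C=0.1)
hard = SVC(kernel="linear", C=1000.0)   # large C approximates a hard margin
```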

17
Q

Why can we always solve the minimization problem of the SVM, assuming that the classes are linearly separable?

A

The loss is convex, so any local minimum is the global minimum

18
Q

How can we solve the minimization problem of the SVM loss in practice?

A
  1. Gradient descent using the subgradient (see the sketch below)
  2. Lagrangian duality
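
A minimal sketch of option 1, subgradient descent on the regularized hinge loss (step size, regularization strength, and epoch count are arbitrary assumptions):

```python
import numpy as np

def svm_subgradient_descent(X, y, lam=0.01, lr=0.01, epochs=100):
    """Minimize (lam/2)*||w||^2 + hinge loss by subgradient steps; y in {-1, +1}."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (w @ xi + b) < 1:      # margin violated: hinge term is active
                w -= lr * (lam * w - yi * xi)
                b += lr * yi
            else:                          # only the regularizer contributes
                w -= lr * lam * w
    return w, b
```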

19
Q

Why do we use a kernel in SVMs?

A

Transforming the data to a higher-dimensional space might make it linearly separable; the kernel computes inner products in that space without forming the transformation explicitly (the kernel trick).

20
Q

Name some SVM kernels

A

Linear, polynomial, Gaussian (RBF).
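
In scikit-learn these correspond directly to the kernel argument of SVC (a sketch; the hyperparameter values are arbitrary):

```python
from sklearn.svm import SVC

linear = SVC(kernel="linear")
poly   = SVC(kernel="poly", degree=3)        # polynomial kernel
rbf    = SVC(kernel="rbf", gamma="scale")    # Gaussian (RBF) kernel
```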

21
Q

Name some strategies for multi-class classification

A

One-vs-rest

One-vs-one with majority voting
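
Both strategies have ready-made wrappers in scikit-learn (a sketch; the base classifier and dataset are illustrative assumptions):

```python
from sklearn.datasets import load_digits
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import LinearSVC

X, y = load_digits(return_X_y=True)
ovr = OneVsRestClassifier(LinearSVC(max_iter=10000)).fit(X, y)  # one binary classifier per class
ovo = OneVsOneClassifier(LinearSVC(max_iter=10000)).fit(X, y)   # one per class pair, majority vote
```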

22
Q

How large is the slack variable in an SVM when a point is on the wrong side of the separating hyperplane?

A

Slack > 1 (a point exactly on the hyperplane has slack = 1)