Lecture 6 - Linear Models Flashcards

1
Q

What is a linear model?

A

f(x) = w1x1 + w2x2 + … + wdxd + b, where the xi are the features, the wi are the coefficients (weights) and b is the intercept term.

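As a quick illustration (not from the lecture), a minimal Python sketch of evaluating such a model; the feature values and weights are made up:

```python
# Evaluate a linear model f(x) = w1*x1 + ... + wd*xd + b.
def linear_model(x, w, b):
    # Dot product of features and weights, plus the intercept term.
    return sum(wi * xi for wi, xi in zip(w, x)) + b

x = [1.0, 2.0]   # feature vector (illustrative values)
w = [0.5, -1.0]  # coefficients
b = 3.0          # intercept
print(linear_model(x, w, b))  # 0.5*1.0 - 1.0*2.0 + 3.0 = 1.5
```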
2
Q

What is usually the error measure for regression line?

A

Mean squared error in the y-direction, meaning that the sum of squared errors in the y-direction is minimized.

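A small sketch of this criterion in Python (the data points are invented for illustration):

```python
# Mean squared error: the average of squared residuals in the y-direction.
def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

print(mse([1.0, 2.0, 3.0], [1.0, 2.5, 2.5]))  # (0 + 0.25 + 0.25) / 3
```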
3
Q

What is the problem with linear regression?

A

It is prone to overfitting in high dimensions (many features relative to the number of samples), and it is sensitive to outliers.

4
Q

What is ridge regression?

A

It is regularized least squares: it adds a regularization term (the squared L2 norm of the weights, scaled by a constant λ) to the mean squared error, penalizing overly complex models.

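A sketch of the ridge objective in Python (λ and all data values are illustrative assumptions):

```python
# Ridge objective: mean squared error plus an L2 penalty on the weights.
def ridge_objective(w, b, X, y, lam):
    squared_errors = sum(
        (sum(wi * xi for wi, xi in zip(w, x)) + b - t) ** 2
        for x, t in zip(X, y)
    )
    penalty = lam * sum(wi ** 2 for wi in w)  # larger weights cost more
    return squared_errors / len(y) + penalty

# With a perfect fit, only the penalty term remains.
print(ridge_objective([1.0], 0.0, [[1.0], [2.0]], [1.0, 2.0], lam=0.1))  # 0.1
```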
5
Q

What are the advantages of ridge regression?

A

It provides a more robust solution than ordinary least squares, especially for high-dimensional data.

6
Q

What is the computational complexity for ridge regression?

A

O(d^3 + d^2·n), so it is feasible if the dimensionality d is not too large.

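The d^3 and d^2·n terms come from the closed-form solution w = (XᵀX + λI)⁻¹Xᵀy: forming XᵀX costs O(d^2·n) and solving the resulting d×d system costs O(d^3). A NumPy sketch (intercept omitted for brevity; the data is illustrative):

```python
import numpy as np

def ridge_fit(X, y, lam):
    n, d = X.shape
    A = X.T @ X + lam * np.eye(d)       # d x d matrix, O(d^2 * n) to form
    return np.linalg.solve(A, X.T @ y)  # O(d^3) linear solve

X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])
print(ridge_fit(X, y, lam=0.0))  # lam = 0 reduces to ordinary least squares
```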
7
Q

What is Lasso?

A

Lasso is basically ridge regression, but instead of squared weights w^2 in the regularization term it uses the absolute values |w| (an L1 penalty), which tends to drive some weights exactly to zero (sparse solutions).

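The difference between the two penalties, sketched in Python (the weight vector is illustrative):

```python
# Ridge penalizes squared weights (L2); lasso penalizes absolute values (L1).
def l2_penalty(w):
    return sum(wi ** 2 for wi in w)

def l1_penalty(w):
    return sum(abs(wi) for wi in w)

w = [0.5, -2.0, 0.0]
print(l2_penalty(w))  # 0.25 + 4.0 + 0.0 = 4.25
print(l1_penalty(w))  # 0.5 + 2.0 + 0.0 = 2.5
```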
8
Q

What is elastic net?

A

A combination of ridge regression and lasso: the regularization term includes both the L2 and the L1 penalty.

9
Q

What is support vector regression?

A

Uses the ridge regularizer, but only penalizes errors that are larger than a tolerance ε (the ε-insensitive loss).

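A sketch of the ε-insensitive loss in Python (the tolerance eps and the values are illustrative):

```python
# Epsilon-insensitive loss: errors within the tolerance eps cost nothing;
# larger errors are penalized by the amount they exceed eps.
def eps_insensitive_loss(y_true, y_pred, eps):
    return max(0.0, abs(y_true - y_pred) - eps)

print(eps_insensitive_loss(1.0, 1.05, eps=0.1))  # 0.0 (within tolerance)
print(eps_insensitive_loss(1.0, 1.50, eps=0.1))  # 0.4
```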
10
Q

What is a linear classifier?

A

A linear model that defines a hyperplane splitting the data into two classes.

11
Q

What is a 0-1 loss function?

A

For labels y in {-1, +1}: if y·pred(y) < 0 (the prediction has the wrong sign), the loss is 1, otherwise it is 0. This means that every misclassified point incurs a loss of 1 and every correct prediction a loss of 0.

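A minimal Python sketch of the 0-1 loss (the prediction values are illustrative):

```python
# 0-1 loss for labels y in {-1, +1}: 1 when the prediction is on the
# wrong side of zero, else 0.
def zero_one_loss(y, f):
    return 1 if y * f < 0 else 0

print(zero_one_loss(+1, 2.3))  # 0: correct classification
print(zero_one_loss(-1, 2.3))  # 1: misclassified
```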
12
Q

What is squared loss?

A

(pred(y) - y)^2

13
Q

How does support vector machine work? What is Hinge loss?

A

It maximises the margin separating the classes.
Hinge loss: max(0, 1-y*pred(y))

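A sketch of the hinge loss in Python (prediction values illustrative): points well on the correct side (y·pred(y) ≥ 1) cost nothing, while points inside the margin or misclassified are penalized linearly:

```python
# Hinge loss for labels y in {-1, +1}.
def hinge_loss(y, f):
    return max(0.0, 1.0 - y * f)

print(hinge_loss(+1, 2.0))   # 0.0: outside the margin
print(hinge_loss(+1, 0.5))   # 0.5: inside the margin
print(hinge_loss(+1, -1.0))  # 2.0: misclassified
```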
14
Q

What is sigmoid function?

A

1/(1+e^(-z)); it always gives values between 0 and 1.

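A direct Python sketch of the function:

```python
import math

# Sigmoid: squashes any real z into the open interval (0, 1).
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

print(sigmoid(0.0))    # 0.5
print(sigmoid(10.0))   # close to 1
print(sigmoid(-10.0))  # close to 0
```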
15
Q

What is logistic regression?

A

Logistic regression maximizes the log-likelihood of the training data (equivalently, it minimizes the negative log-likelihood, the log loss), and gives predictions scaled between 0 and 1 via the sigmoid function.

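Per example, the quantity being minimized is the log loss. A Python sketch for labels y ∈ {0, 1} and predicted probability p (the values are illustrative):

```python
import math

# Log loss (negative log-likelihood): small when the predicted probability
# matches the true label, large when the model is confidently wrong.
def log_loss(y, p):
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

print(log_loss(1, 0.9))  # about 0.105: confident and correct
print(log_loss(1, 0.1))  # about 2.303: confident and wrong
```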
16
Q

What is multi-class classification?

A

Predicts exactly one class (out of many classes), e.g. "This animal is a dog."

17
Q

What is multi-label classification?

A

Predicts a subset of classes, e.g. "This picture contains a dog and a cat."

18
Q

How to predict with multiple classes?

A

Reduce to binary classification, for example
image 1: dog 1, cat 0, rabbit 0, hamster 0.
One-hot encode the classes and choose the one with the highest probability.

For multi-label, set 1 for multiple labels, e.g. dog 1, cat 1, rabbit 0, hamster 0.
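The two decision rules, sketched in Python (class names, probabilities, and the 0.5 threshold are invented for illustration):

```python
# Per-class probabilities, e.g. from one binary classifier per class.
probs = {"dog": 0.70, "cat": 0.20, "rabbit": 0.06, "hamster": 0.04}

# Multi-class: pick exactly one class, the one with the highest probability.
multi_class = max(probs, key=probs.get)

# Multi-label: keep every class whose probability clears a threshold.
multi_label = [c for c, p in probs.items() if p >= 0.5]

print(multi_class)  # dog
print(multi_label)  # ['dog']
```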
