lecture 6: ridge regression and polynomial regression Flashcards

1
Q

learning a vector-valued function is the same as learning a scalar function, apart from what?

A

the weights: W is a matrix instead of a vector w

2
Q

using matrix notation, what is the sum of squared error cost function

A

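J(W) = trace(EᵀE), where E = XW - Y is the error matrix (rows of X are samples, consistent with the ridge formulas later in this deck)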

capital letters represent matrices
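
A minimal sketch of this cost in NumPy, assuming rows of X are samples so that predictions are XW. The function name `sse_cost` and the small data set are made up for illustration.

```python
import numpy as np

def sse_cost(X, W, Y):
    """Sum of squared errors trace(E^T E) for the error matrix E = XW - Y."""
    E = X @ W - Y
    return np.trace(E.T @ E)

# Illustrative data: 3 samples, 2 inputs (bias + 1 feature), 2 outputs
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
W = np.array([[0.0, 1.0], [1.0, 0.0]])
Y = np.array([[0.0, 1.0], [1.0, 1.0], [2.0, 2.0]])

cost = sse_cost(X, W, Y)
# The trace form equals the plain sum of all squared entries of E
same = np.isclose(cost, np.sum((X @ W - Y) ** 2))
```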

3
Q

the formulas for W are the same as those for w, TRUE or FALSE

A

TRUE

4
Q

recap: what type of problem does it correspond to when yᵢ is continuous valued vs discrete valued

A

regression for continuous and classification for discrete

5
Q

what are the 2 types of classification problem handled by linear methods

A

binary classification and multi-category classification

6
Q

what are the 2 class labels in binary classification, and how is a prediction turned into a label?

A

yᵢ ∈ {-1,1}

for each predicted value of y, take its sign as the class label
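
A small sketch of this rule, assuming the weights come from ordinary least squares on labels in {-1, 1}. The data and variable names are invented for illustration.

```python
import numpy as np

# Illustrative training set: bias column plus one feature, labels in {-1, +1}
X = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([-1.0, -1.0, 1.0, 1.0])

# Least-squares weights w = (X^T X)^-1 X^T y
w = np.linalg.solve(X.T @ X, X.T @ y)

# Classify new points by the sign of the predicted value
X_new = np.array([[1.0, -0.5], [1.0, 3.0]])
pred = np.sign(X_new @ w)
```

Note that `np.sign` returns 0 for a prediction of exactly 0, so a tie-breaking rule would be needed in that case.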

7
Q

what is the method of assignment used for multi-category classification

A

one-hot encoding

8
Q

with the final y matrix obtained, how do we classify each item?

A

using argmax: for each row, the column with the largest number determines the class label

for example, if the largest number is in column 1, the item is class 1
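
A short sketch of one-hot encoding and the argmax rule, assuming classes are numbered from 1 as on this card. The label values and the predicted matrix `Y_hat` are made up.

```python
import numpy as np

# Illustrative class labels numbered 1..3
labels = np.array([1, 3, 2, 3])
num_classes = 3

# One-hot encoding: row i has a 1 in the column of its class
Y = np.eye(num_classes)[labels - 1]

# Hypothetical predicted output matrix, one row per item
Y_hat = np.array([[0.9, 0.1, 0.0],
                  [0.2, 0.3, 0.5],
                  [0.1, 0.7, 0.2]])

# argmax is 0-based, so add 1 to make column 1 mean class 1
pred = np.argmax(Y_hat, axis=1) + 1
```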
9
Q

why do we use ridge regression?

A

we cannot guarantee that XᵀX is invertible. ridge regression adds 𝜆I (an identity matrix scaled by a small coefficient 𝜆 > 0), which ensures that whatever is in the bracket, XᵀX + 𝜆I, is invertible

10
Q

what regularisation term is added to the cost function in ridge regression (the term that produces 𝜆I in the solution)

A

𝜆wᵀw

11
Q

ridge regression in primal form

hint: similar to over determined system

A

w = (XᵀX + 𝜆I)⁻¹ Xᵀy
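
A minimal sketch of the primal formula, using a made-up data set with duplicated columns so that XᵀX is singular and plain least squares would fail. The function name `ridge_primal` and the value of 𝜆 are illustrative choices.

```python
import numpy as np

def ridge_primal(X, y, lam):
    """Primal ridge solution w = (X^T X + lam*I)^-1 X^T y."""
    d = X.shape[1]
    # Solve the linear system rather than forming the inverse explicitly
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Two identical columns: X^T X has rank 1 and cannot be inverted
X = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
y = np.array([2.0, 4.0, 6.0])

rank = np.linalg.matrix_rank(X.T @ X)
w = ridge_primal(X, y, lam=0.1)  # well defined because lam > 0
```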

12
Q

ridge regression in dual form

hint: similar to under determined system

A

w = Xᵀ(XXᵀ + 𝜆I)⁻¹ y
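
A sketch of the dual formula on a made-up problem with fewer samples than features, checked against the primal formula (the two agree for any 𝜆 > 0). The function name `ridge_dual` is illustrative.

```python
import numpy as np

def ridge_dual(X, y, lam):
    """Dual ridge solution w = X^T (X X^T + lam*I)^-1 y."""
    n = X.shape[0]
    return X.T @ np.linalg.solve(X @ X.T + lam * np.eye(n), y)

# Illustrative data: 2 samples, 3 features
X = np.array([[1.0, 2.0, 3.0], [0.0, 1.0, -1.0]])
y = np.array([1.0, 2.0])
lam = 0.5

w_dual = ridge_dual(X, y, lam)
# Primal form for comparison: (X^T X + lam*I)^-1 X^T y
w_primal = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
```

The dual form inverts an n x n matrix (n samples) and the primal a d x d matrix (d features), so the dual is the cheaper choice when n is smaller than d.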

13
Q

why do we use polynomial regression

A

to get a better fit when the relationship between inputs and outputs is nonlinear, which a purely linear model cannot capture
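
A small sketch of polynomial regression with a single input, assuming the usual approach of building a matrix of powers of x and then applying the same least-squares formula. The helper `poly_features` and the quadratic data are invented for illustration; with several inputs the feature matrix would also include cross terms.

```python
import numpy as np

def poly_features(x, order):
    """Matrix with columns 1, x, x^2, ..., x^order."""
    return np.vander(x, order + 1, increasing=True)

# Illustrative data generated from y = 1 + x^2
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = x ** 2 + 1.0

P = poly_features(x, 2)
# The model is still linear in the weights, so least squares applies unchanged
w = np.linalg.solve(P.T @ P, P.T @ y)
```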

14
Q

generally, for high-dimensional problems, polynomials of order larger than what are seldom used?

A

3
