Topic 9: Understanding GD: Overparameterisation Flashcards
What is the empirical loss function of linear regression
(mean squared error)
R̂(β) = ½‖y − Xβ‖₂²
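A minimal NumPy sketch of this loss on hypothetical data (the shapes, seed, and variable names are illustrative assumptions, not from the source):

```python
import numpy as np

# Hypothetical overparameterised setup: n = 5 examples, d = 20 parameters.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 20))   # data matrix, one training example per row
y = rng.standard_normal(5)         # targets
beta = rng.standard_normal(20)     # a linear predictor

# Empirical loss: R̂(β) = ½‖y − Xβ‖₂²
loss = 0.5 * np.sum((y - X @ beta) ** 2)
print(loss)
```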
What assumptions do we make about the empirical loss function of linear regression
The model is overparameterised,
i.e. n (the number of training examples) < d (the number of parameters)
The data matrix X is full rank
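A quick check of both assumptions on the same kind of toy data (a sketch; a random Gaussian X is full rank with probability 1):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 20                            # overparameterised: n < d
X = rng.standard_normal((n, d))

# Full (row) rank means rank(X) = n, which makes XX⊺ invertible.
print(np.linalg.matrix_rank(X) == n)    # True (almost surely for Gaussian X)
```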
What is an Invertible Matrix
(non-singular)
Given A, there exists an A⁻¹
such that AA⁻¹ = A⁻¹A = I (the identity matrix)
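A one-line numerical check of the definition (the 2×2 matrix is an arbitrary non-singular example):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])                 # an arbitrary non-singular matrix
A_inv = np.linalg.inv(A)
print(np.allclose(A @ A_inv, np.eye(2)))   # AA⁻¹ = I
print(np.allclose(A_inv @ A, np.eye(2)))   # A⁻¹A = I
```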
In an m × n matrix, which is the row count and which is the column count
Rows by columns: m rows, n columns
What is meant by X has a trivial null space
Its null space has only one element, the zero vector: Xv = 0 implies v = 0
What is a pseudoinverse
X† = X⊺(XX⊺)⁻¹
It acts like an inverse in certain respects (here, XX† = I), even when X does not have a true inverse (X is not square, or is singular)
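A sketch of the formula in NumPy, assuming a full-row-rank X as above; note that X† is only a right inverse here (XX† = Iₙ, but X†X ≠ I_d when d > n):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 20
X = rng.standard_normal((n, d))               # full row rank, almost surely

# X† = X⊺(XX⊺)⁻¹ exists because XX⊺ (n×n) is invertible when rank(X) = n.
X_dag = X.T @ np.linalg.inv(X @ X.T)

print(np.allclose(X @ X_dag, np.eye(n)))      # right inverse: XX† = Iₙ
print(np.allclose(X_dag, np.linalg.pinv(X)))  # agrees with NumPy's pseudoinverse
```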
What can we say about XX⊺
Since X is full rank, XX⊺ has a trivial null space:
it does not map any non-zero vector to zero
Hence it is an invertible matrix
When is the loss R̂(β) = ½‖y − Xβ‖₂² at a global minimum
When
β = X⊺(XX⊺)⁻¹y = X†y
since then Xβ = XX⊺(XX⊺)⁻¹y = y, so the loss is zero
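Verifying this claim numerically (same hypothetical data as above; the loss at β = X†y is zero up to floating-point error):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 20
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

beta = X.T @ np.linalg.inv(X @ X.T) @ y     # β = X⊺(XX⊺)⁻¹y = X†y
print(0.5 * np.sum((y - X @ beta) ** 2))    # ≈ 0: Xβ = y exactly interpolates
```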
How do we express the multiplicity of global minima
‖y − Xβ‖₂² = 0 (i.e. β is a global minimum)
whenever
β = X†y + ξ
where ξ is such that ξ⊺xᵢ = 0 for all i, i.e. ξ is orthogonal to every training point xᵢ (ξ lies in the null space of X)
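A sketch that builds such a ξ by projecting a random vector off the row space of X (using the fact that X†X is the orthogonal projector onto the row space); it also previews the least-norm property from the next card:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 20
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)
X_dag = X.T @ np.linalg.inv(X @ X.T)

beta_min = X_dag @ y                       # X†y lies in the row space of X

# ξ = (I − X†X)v is the component of v orthogonal to every training point xᵢ:
v = rng.standard_normal(d)
xi = v - X_dag @ (X @ v)
print(np.allclose(X @ xi, 0))              # ξ⊺xᵢ = 0 for all i

beta_other = beta_min + xi
print(np.allclose(X @ beta_other, y))      # still zero loss: another global minimum

# Pythagoras: ξ ⊥ X†y, so ‖X†y + ξ‖² = ‖X†y‖² + ‖ξ‖² ≥ ‖X†y‖²
print(np.linalg.norm(beta_other) >= np.linalg.norm(beta_min))   # True
```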
What can we say about β = X†y
It is the solution with ξ = 0
β = X†y is the global minimum with the least norm,
i.e. the global minimum closest to the origin
What is the 2-norm
The Euclidean distance, or standard vector length: ‖v‖₂ = √(v₁² + … + v_d²)
What conditions must be met for gradient descent to converge to the global minimum with the least norm
The data matrix X must be full rank
The initialisation must be β₀ = 0
The step size η must be chosen suitably small, and enough steps must be taken
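A minimal gradient-descent sketch under exactly these conditions (β₀ = 0, a small fixed η chosen by hand, hypothetical Gaussian data); the iterates land on the least-norm minimiser X†y:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 20
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

beta = np.zeros(d)                          # β₀ = 0: start in the row space of X
eta = 0.01                                  # illustrative step size, small enough here
for _ in range(20000):
    beta -= eta * X.T @ (X @ beta - y)      # ∇R̂(β) = X⊺(Xβ − y)

beta_min_norm = X.T @ np.linalg.inv(X @ X.T) @ y    # the least-norm solution X†y
print(np.allclose(beta, beta_min_norm, atol=1e-6))  # True: GD converged to X†y
```

Each update adds a multiple of X⊺(·), so starting from 0 the iterates never leave the row space of X; the only global minimiser in that subspace is X†y.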
What is Implicit Bias
The inherent tendency of machine learning algorithms, particularly neural networks, to prefer certain solutions over others, even when these preferences are not explicitly programmed
What is Algorithmic Regularization
A regularization effect that emerges as a consequence of implicit bias:
the algorithm itself prevents overfitting or improves generalisation performance, without an explicit penalty term being added to the loss
What is β and what are its dimensions
β is the linear predictor (the weight vector)
Its dimensions are d × 1, i.e. β ∈ ℝᵈ