Topic 3: Generalisation Bounds Flashcards

1
Q

What do generalisation bounds enable

A

Enable us to state with confidence that a model performs as we say it will - that it will maintain a certain level of accuracy
- very useful in safety critical or sensitive areas
Eg governmental/public services

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

How do we estimate how bad the testing error might get

A

R^(f,Sn) ~ R(f)
The law of large numbers states an estimate (empirical risk) of some number will approach the true value (population risk), as the sample size increases (n tends to infinity)
use hoeffdings inequality

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What is Hoeffding’s LHS inequality in algebraic terms

A

P(|E[X] - 1/nΣxi| ≥ ε ) ≤ δ

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What is Hoeffding’s inequality in words

A

Given a true value E[Z], and an empirical estimate 1/n ∑ zi, the probability of the estimate
deviating by more than ϵ from the true value is less than or equal to δ.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What is δ

A

The failure threshold (or failure probability)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What is ε

A

Some tolerance value
We want E[Z] - 1/n∑zi to be within this threshold

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What is the relationship between n and δ

A

As n increases, δ decreases to almost 0
(note - use a fixed ε)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

How do we rearrange hoeffdings equation for ε

A

ε = sqrt(ln(2/δ) / 2n )

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What does δ = 0.01 in english terms

A

We want to be 99% confident

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What is the relationship between n and ε

A

As n decreases, ε will decrease to small value

(We can fix a δ eg δ=0.01)
eg the line will asimptote at ε = 0.03 which means we can be 99% confident that the sample risk will be within 0.03 of the true risk

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

How can we solve for n

A

Alternatively, we can fix δ eg 0.05 (95% confidence) and as we can increase ε on the x axis we can see how many samples (n) we need to be within some ε of the true risk

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What do we call F

A

Function class

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What is the formula for Union Bound (Boole’s inequality)

A

P ((M⋃i=1) Vi) ≤ (M∑i=1)P (Vi)

Essentially the probability of one of at least one of the values v1 t vm being true
is less than or equal to
the total sum of all individual probabilities (p(v1) + p(v1)…+p(vm))

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Uniform Deviation Bound (Hoeffdings equation in terms of function class F)

A

P(∃f ∈|R(f) - R^(f,Sn)| ≥ ε ) ≤ δ = 2|F|exp(-2nε^2)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is the uniform deviation bound in English terms

A

The probability that there exists a model which violates our tolerance zone is less than or equal to δ

We can also say with 1-δ probability that all of our estimates are within the zone
i.e. ∀f, we have |R(f) - R^(f,Sn)| < ε

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What assumption are we making about F

A

The space F is finite

17
Q

What is the formula for the generalisation bound for finite function classes

A

R(f) ≤ R^(f,Sn) + √ (ln(|F|) + ln(1/δ) / 2n )

18
Q

What does the generalisation bound for function classes mean in English terms

A

The true error is less than the training error plus a term, dependent on the number of training
samples and the function capacity”.

19
Q

How do size of model and confidence in estimate error relate

A

If we increase the size of the model, model becomes more flexible and may overfit

our confidence drops

Unless we also increase the size of the data alongside it (n) - mitigates overfitting

20
Q

How do we generally relate population and empirical risk in a formula

A

population risk ≤empirical risk + ε

21
Q

What is hoeffdings inequality RHS in algebraic terms

A

2exp(-2nε^2)

22
Q

what is the uniform deviation bound in English terms

A

the probability that there exists a model which violates our tolerance zone is less than or equal
to δ. Equivalently, we can say that with probability 1 − δ, all of our estimates are within the zone, i.e. ∀f,
we have ∣R(f) − Rˆ(f,Sn)∣ < ϵ

23
Q

What does empirical mean

A

Rrefers to something that is derived from or based on observed data rather than theoretical assumptions