Week 4 Flashcards

Question 1

Q

What does the out-of-sample error measure?

Answer

A

How well our training on the data set has generalized to data we have not seen before.

Question 2

Q

What does the in-sample error measure?

Answer

A

The training performance

Question 3

Q

Generalization error

Answer

A

The discrepancy between Ein and Eout.

Question 4

Q

What is the concept of the growth function?

Answer

A

A number that captures how different the hypotheses in H are, and how much overlap the different events have.

Question 5

Q

dichotomy

Answer

A

The N-tuple that you get when you apply a hypothesis h (in H) to a finite sample. It splits the sample in two groups: h=+1 and h=-1.

Question 6

Q

Wat representeert het aantal dichotomiën eigenlijk?

Answer

A

De verscheidenheid van de hypotheseklasse.

Question 7

Q

What is the growth function in the case of positive rays?

Answer

A

mH(N) = N + 1

Question 8

Q

What is the condition with positive rays?

Answer

A

H consists of all hypothesis h: R -> {-1, +1} of the form h(x) = sign(x-a).
-1 is returned left from a and +1 is returned right from a.

Question 9

Q

Describe the condition with positive intervals:

Answer

A

H consists of all hypotheses in one dimension that return +1 if in some interval and -1 otherwise.
Each hypothesis is specified by two end points of that interval.

Question 10

Q

How many different dichotomies are there in the positive intervals condition?

Answer

A

N+1 / 2 different dichotomies.

Question 11

Q

What is the growth function in the case of positive intervals?

Answer

A

mH(N) = 0.5N^2 + 0.5N + 1

Question 12

Q

What is the hypothesis space in the case of convex sets?

Answer

A

H consists of all hypotheses in two dimensions h: R^2 -> {-1,+1} that are positive inside some convex set and negative elsewhere.

Question 13

Q

When is a set convex?

Answer

A

If the line segment connecting any two points within the set lies entirely within the set.

Question 14

Q

What is the growth function in the case of convex sets?

Answer

A

mH(N) = 2^N

Question 15

Q

Gegeven een hypotheseklasse, de VC-dimensie dvc van die hypotheseklasse is…

Answer

A

het grootste aantal punten N zodat mH(N) = 2^N. Oneindig als voor alle N geldt dat mH(N) = 2^N.

Question 16

Q

Wat is het kleinste breekpunt, uitgedrukt in dvc?

Answer

A

k= dvc + 1

Question 17

Q

break point

Answer

A

When no data set of size k can be shattered by H, then k is the break point for H.

Question 18

Q

What kind of dvc do good models have?

Answer

A

A finite dvc.

Question 19

Q

What kind of dvc do bad models have?

Answer

A

A infinite dvc.

Question 20

Q

How do you compute a dvc for the perceptron?

Answer

A

First show that the dvc is at least a certain value, then show it is at most a certain value.

Question 21

Q

What can you conclude about dvc if there is a set of N points that can be shattered by h?

Question 22

Q

What can you conclude about dvc if any set of N points can be shattered by H?

Question 23

Q

What can you conclude about dvc if there is a set of N datapoints that cannot be shattered by H?

Question 24

Q

What can you conclude about dvc if there is no dataset of N points that can be shattered by H?

Question 25

Q

What is the VC dimension of a d-dimensional perceptron^3?

Question 26

Q

Pocket algorithm

Answer

A

Adjustment on PLA: Save the hypothesis with the smallest Ein

Question 27

Q

What is a plus of the pocket algorithm vs the PLA?

Answer

A

It works even if the data is not linearly separable

Question 28

Q

What is a disadvantage of the pocket algorithm>

Answer

A

It takes a long time to calculate Ein.

Question 29

Q

Wat is het plan om een kleine Eout te bereiken?

Answer

A

1) Zorg dat je de hypothese kiest met de kleinste Ein.
2) Zorg dat Eout ~=~ Ein

Question 30

Q

Als twee hypothesen op elkaar lijken…

Answer

A

dan zullen zowel de in-sample als de out-of sample errors op elkaar lijken de gebeurtenissen Bi veel overlappen.

Question 31

Q

Wat telt het aantal dichotomieën?

Answer

A

Het effectieve aantal hypothesen.

Question 32

Q

Wat beschrijft de groeifunctie voor concept?

Answer

A

Het is het grootste aantal dichotomieën dat je op N punten kunt krijgen, als je de punten zelf mag kiezen.

Question 33

Q

Wat is M?

Answer

A

Het aantal hypothesen in de hypotheseklasse

Question 34

Q

Hoe ga je om met dat M pessimistisch is?

Answer

A

Maak gebruik van het begrip ‘aantal effectieve hypothesen’ als vervanging voor M, zo ontwijk je de kans op grote afwijkingen van Eout ten opzichte van Ein.

Question 35

Q

Effectieve aantal hypothesen

Answer

A

Kies N datapunten en kijk alleen naar wat de hypothesen doen op die datapunten.

Question 36

Q

Wat is het verschil tussen een hypothese en een dichotomie?

Answer

A

Hypothese h: X -> {-1, +1}

Dichotomie h: {x1, x2, x…, xN} -> {-1, +1}

Question 37

Q

Wat is het maximale aantal dichotomieën voor een set van N punten?

Question 38

Q

Definieer de groeifunctie:

Answer

A

mH(N) = max voor x1, x…, xN in X |H(x1, x…, xN)|

Question 39

Q

Union bound

Answer

A

For any finite set of events, the probability that at least one of them happens is not greater than the sum of the probabilities of the individual events.

Question 40

Q

In what situation do we say that H shatters the data set of N points?

Answer

A

If H is capable of generating all possible dichotomies on those N points. (H is as diverse as it can be on this set, the max of 2^N dichotomies are realised).

Question 41

Q

If the condition mH(N) = 2^N breaks at any point, …

Answer

A

we can bound mH(N) for all values of N by a simple polynomial based on this break point.

Question 42

Q

What is the upper bound for any mH(N) that has a break point k?

Answer

A

mH(N) <_ B(N,k) met B(N,k) as the maximum amount of dichotomies on N datapoints such that no subset of size k of the N points can be shattered by these dichotomies.

Question 43

Q

Gegeven een hypotheseklasse, de VC-dimensie dvc van die hypotheseklasse is…

Answer

A

het grootste aantal punten N zodat mH(N) = 2^N. Oneindig als voor alle N geldt dat mH(N) = 2^N.

Question 44

Q

Geef weer hoe mH(N) gebonden wordt door de VC-dimensie:

Answer

A

mH(N) <_ N^dvc + 1

Question 45

Q

Kleinste kwadraten schatter

Answer

A

Wanneer je de gradiënt van de in-sample error = 0 oplost en dan wlin krijgt. Je kunt y voorspellen voor een willekeurige x.

Question 46

Q

Ruisterm

Answer

A

Een toevalsvariabele in lineare regressie, y = f(x) + epsilon. Epsilon is dan de ruisterm.

Question 47

Q

Wat representeert het aantal dichotomiën eigenlijk?

Answer

A

De verscheidenheid van de hypotheseklasse.

Question 48

Q

Hoe bepaal je de groeifunctie van een hypotheseklasse H?

Answer

A

1) Kies N=1, en 1 datapunt x->. Kun je beide dichotomiën realiseren met hypothesen uit H? Dan m(H) = 2^N = 2. Anders is N=2 een breekpunt.
2) Ga door voor N=2, N=3 etc. Als je nooit een breekpunt tegenkomt, dan geldt m(H) = 2^N.

Question 49

Q

Give the generalization bound:

Answer

A

E.out(g) <_ E.in(g) + the root of ( (1/2N) * ln(2M/delta) )
with probability of at least 1 - delta.