Week 1 Flashcards
What are the 5 components of a learning problem?
1) Input x in X.
2) Output y in Y.
3) Target function f: X -> Y.
4) The data set of examples (x1, y1)…
5) The learned hypothesis g: X -> Y.
When is ML applicable to a problem? Give 3 conditions.
1) There is a pattern.
2) The pattern cannot be pinned down by analyzing the problem.
3) There is data to learn from.
What is H in ML?
The hypothesis set: the set of candidate formulas under consideration.
What is the role of h(x)?
A functional form that assigns a weight to each component of the input vector.
When is a dataset linearly separable?
There is a choice of parameters that classifies all the training examples correctly.
PLA
perceptron learning algorithm
What is the goal of the perceptron learning algorithm?
Finding a hypothesis that classifies all the data points in data set D correctly.
Supervised learning setting
When the training data contains explicit examples of what the correct output should be for given inputs.
What does the hypothesis space consist of for k-nearest neighbors?
Nearly all functions from inputs to outputs.
Active learning
The data set is acquired by the learner through asking for a label for specific entries.
What is the standard formula for h(x)?
h(x) = sign(w^T x)
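As a minimal sketch (the weights and input are made-up illustrations), the hypothesis can be evaluated with NumPy, where x includes the bias coordinate x0 = 1:

```python
import numpy as np

def h(w, x):
    # perceptron hypothesis: the sign of the weighted sum w^T x
    return np.sign(np.dot(w, x))

# illustrative weights and input; x0 = 1 is the bias coordinate
w = np.array([-1.0, 0.5, 0.5])
x = np.array([1.0, 2.0, 2.0])
print(h(w, x))  # -1 + 1 + 1 = 1, so the sign is +1
```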
Online learning
The data set is given to the algorithm one example at a time. Learning takes place as data becomes available.
Transfer learning
When training an algorithm on data results in a model, and that model is used on a new problem or task. It uses the info learned on the first problem to improve on the second one.
What is the update formula for w?
w(t+1) = w(t) + y(t)*x(t), where (x(t), y(t)) is an example that w(t) misclassifies.
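The update rule can be sketched as a full PLA loop; this is a hypothetical implementation on made-up data, not the course's reference code:

```python
import numpy as np

def pla(X, y, max_iters=1000):
    """Perceptron Learning Algorithm (a sketch).

    X: (N, d) inputs with bias coordinate x0 = 1 prepended.
    y: (N,) labels in {-1, +1}. Assumes linearly separable data.
    """
    w = np.zeros(X.shape[1])
    for _ in range(max_iters):
        misclassified = np.where(np.sign(X @ w) != y)[0]
        if misclassified.size == 0:
            return w  # every training point is classified correctly
        i = misclassified[0]
        w = w + y[i] * X[i]  # the update rule: w(t+1) = w(t) + y(t)*x(t)
    return w

# illustrative linearly separable data
X = np.array([[1.0, 2.0, 3.0], [1.0, 3.0, 1.0],
              [1.0, -1.0, -2.0], [1.0, -2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w = pla(X, y)
print(np.all(np.sign(X @ w) == y))  # True: all points classified correctly
```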
Reinforcement learning
The training example does not contain the target output, but contains some possible output together with a measure of how good that output is.
in-sample error
E.in(h): The error rate within a sample: the fraction of the data set where h and f disagree.
Example: the mistakes on a practice test.
What is a strong point of the PLA?
It searches an infinitely large set of hypotheses.
What kind of data do you need to use the Perceptron Learning Algorithm?
Linearly separable data.
Give the formula for E.in(h):
In-sample error:
E.in(h) = 1/N * (the number of data points in the sample where h(x) and f(x) disagree).
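A minimal sketch of this formula (the example predictions and labels are illustrative):

```python
import numpy as np

def in_sample_error(h_preds, f_labels):
    # E.in(h) = (1/N) * number of points where h and f disagree
    h_preds = np.asarray(h_preds)
    f_labels = np.asarray(f_labels)
    return float(np.mean(h_preds != f_labels))

print(in_sample_error([1, -1, 1, 1], [1, 1, 1, -1]))  # 2 of 4 disagree -> 0.5
```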
What does the out-of-sample error denote?
How accurately the hypothesis function performs on data it hasn’t seen before.
Example: performance on exam.
What is the deterministic answer to ‘Does the data set D tell us anything outside of D that we didn’t know before?’?
No.
D tells us nothing certain about f outside of D.
What is the probabilistic answer to ‘Does the data set D tell us anything outside of D that we didn’t know before?’?
Yes.
D tells us something likely about f outside of D.
What are the two questions that present the feasibility of learning?
1) Can we make sure that Eout(g) is close enough to Ein(g)?
2) Can we make Ein(g) small enough?
What effect does a more complex H have?
It gives more flexibility in finding some g that fits the data well, leading to small Ein(g).
What effect does a complex f have?
We get a worse value for Ein(g).
Noisy function
A function where the output is not uniquely determined by the input.
How does neighbors-based classification learn?
It does not attempt to construct a general internal model; it simply stores instances of the training data.
What is mu in the marbles-and-vases model?
The proportion of red marbles in the vase.
What is the formula for the line that classifies the datapoints?
w1x1 + w2x2 + b = 0
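A sketch of using this line as a classifier (the coefficients and points are made up for illustration):

```python
def classify(w1, w2, b, x1, x2):
    # the line w1*x1 + w2*x2 + b = 0 splits the plane; the sign of the
    # left-hand side tells which side the point (x1, x2) falls on
    s = w1 * x1 + w2 * x2 + b
    return 1 if s > 0 else -1

print(classify(1, 1, -3, 2, 2))  # 2 + 2 - 3 = 1 > 0 -> class +1
print(classify(1, 1, -3, 0, 0))  # -3 < 0 -> class -1
```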
What is the unknown in the marbles/vases model?
Mu, the proportion of red marbles in the vase.
Unsupervised learning
The training data does not contain any output information at all.
Data mining
A practical field that focuses on finding patterns.
What is nu in the marbles-vases model?
The fraction of red marbles within a random sample of N marbles that you pick from the vase.
What does the Hoeffding Inequality denote?
The maximum probability that the sum of bounded independent random variables deviates from its expected value by more than a given amount.
Example: for a fixed N, allowing a larger tolerance epsilon between E.in & E.out makes the probability of exceeding it lower.
Why is the Hoeffding Inequality important for machine learning?
Through the Hoeffding Inequality, learning (generalizing to unknown data) is made possible without knowing the target function.
This is because the bound depends only on epsilon and N, not on the unknown mu.
What is the Hoeffding Inequality used for?
It quantifies the relationship between nu and mu.
Give the Hoeffding Inequality:
P[|nu - mu| > epsilon] <= 2e^(-2 * epsilon^2 * N)
The probability that the difference between the fraction of red marbles in the random sample and the actual fraction of red marbles in the vase is bigger than epsilon,
is smaller than or equal to the right-side.
Epsilon is a positive tolerance we choose: how much nu and mu are allowed to differ.
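The bound can be checked numerically with a small simulation of the vase model (mu, N and epsilon are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, N, eps, trials = 0.6, 100, 0.1, 20_000

# nu for each trial: the red fraction in a random sample of N marbles
nus = rng.binomial(N, mu, size=trials) / N
empirical = float(np.mean(np.abs(nus - mu) > eps))
bound = 2 * np.exp(-2 * eps**2 * N)  # right-hand side of the Hoeffding Inequality

print(empirical <= bound)  # True: the empirical frequency respects the bound
```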
What is always true about E.in(h), E.out(h) and epsilon?
P[|E.in(h) - E.out(h)| > epsilon] <= 2e^(-2 * epsilon^2 * N), for any fixed h and any epsilon > 0.
If event B1 implies event B2, thus
B1 -> B2,
then…
The probability of event B1 is smaller or equal to the probability of event B2, thus
P[B1] <= P[B2]
Describe the nearest-neighbors algorithm:
Given an input, search for the closest input we have seen and copy that output.
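A minimal 1-nearest-neighbor sketch of this idea (the training data is illustrative):

```python
import numpy as np

def nearest_neighbor_predict(X_train, y_train, x):
    # store-and-lookup: find the closest stored input and copy its label
    dists = np.linalg.norm(X_train - x, axis=1)
    return y_train[np.argmin(dists)]

X_train = np.array([[0.0, 0.0], [5.0, 5.0]])
y_train = np.array([-1, 1])
print(nearest_neighbor_predict(X_train, y_train, np.array([1.0, 1.0])))  # closest to (0,0) -> -1
```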