Introduction to statistical learning Flashcards
What does learning mean?
Using data to obtain models.
Types of statistical learning
- Supervised learning
science of inferring a function from a set of labeled data
given {y1, y2, …, yN} corresponding to {u1, …, uN}, find f such that y = f(u)
- Unsupervised learning
science of inferring a function to describe hidden structure in unlabeled data
given {u1, …, uN}, find f such that y = f(u)
y is not an output but a property of the data
- Semi-supervised learning
learn a function from a mix of labeled and unlabeled data
- Reinforcement learning
learn a task based on a reward
Framework of a supervised learning problem
- input x: vector of attributes, x ∈ X
- output y: y ∈ Y
- target function f: X → Y
- data: (x1,y1),…,(xN,yN)
- hypothesis g: X → Y, g ∈ H
- hypothesis set: H
The data are generated by an unknown target function f. From this data set, a learning algorithm selects a function g from the hypothesis set H as an approximation of f. H is chosen based on physical knowledge of the problem.
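The framework above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the notes: the target function is taken to be a sine wave, the hypothesis set H is taken to be the polynomials of degree at most 3, and the learning algorithm is an ordinary least-squares fit with NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical target function f: X -> Y (unknown in a real problem).
def f(x):
    return np.sin(2 * np.pi * x)

# Data: (x1, y1), ..., (xN, yN) generated by the target function.
N = 30
x = rng.uniform(0.0, 1.0, size=N)
y = f(x)

# Hypothesis set H: polynomials of degree <= 3 (an assumed choice).
degree = 3

# Learning algorithm: least squares selects the hypothesis g in H
# that best fits the data.
coeffs = np.polyfit(x, y, deg=degree)
g = np.poly1d(coeffs)

# In-sample mean squared error of the selected hypothesis.
E_in = float(np.mean((g(x) - y) ** 2))
print(f"in-sample error of g: {E_in:.4f}")
```

Changing `degree` changes H, which is where knowledge of the problem would enter in practice.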
When is statistical Learning meaningful?
Statistical Learning is meaningful if:
1) a pattern exists
2) it is not possible to pin it down mathematically
3) data are available
In which sense is learning feasible?
From a deterministic point of view, learning is not feasible: a finite set of examples does not determine the function outside those examples, so it is not possible to generalize with certainty.
Learning is feasible only from a statistical point of view.
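A small enumeration may make the deterministic argument concrete. The sketch below assumes a toy problem that is not in the notes: Boolean functions of 3 bits, with made-up labels on 5 of the 8 possible inputs. It lists every function that agrees with those examples and counts how they vote on the unseen inputs.

```python
from itertools import product

# All 8 possible inputs x in {0,1}^3.
inputs = list(product([0, 1], repeat=3))

# Hypothetical labeled examples: outputs known on 5 of the 8 inputs.
data = {
    (0, 0, 0): 0,
    (0, 0, 1): 1,
    (0, 1, 0): 1,
    (0, 1, 1): 0,
    (1, 0, 0): 1,
}
unseen = [x for x in inputs if x not in data]

# Enumerate every Boolean function on the 8 inputs (2^8 truth tables)
# and keep those that agree with all the examples.
consistent = []
for outputs in product([0, 1], repeat=len(inputs)):
    table = dict(zip(inputs, outputs))
    if all(table[x] == y for x, y in data.items()):
        consistent.append(table)

print("functions consistent with the data:", len(consistent))

# For each unseen input, count how many consistent functions output 1 and 0.
for x in unseen:
    ones = sum(table[x] for table in consistent)
    print(x, "ones:", ones, "zeros:", len(consistent) - ones)
```

Each of the 8 consistent functions fits the examples perfectly, yet on every unseen input half of them output 0 and half output 1, so the data alone cannot single out the right answer.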
Hoeffding’s inequality
P[|p^ - p| > ε] ≤ 2e^(-2ε²N)
where
p^ is the frequency observed in a sample of N independent data points
p is the true (unknown) probability
ε is an arbitrary positive tolerance
- the statement p^ = p is probably approximately correct (PAC)
- for a fixed bound, reducing ε requires more data: N must grow like 1/ε²
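The inequality can be checked numerically. The following is a minimal simulation sketch, assuming a Bernoulli experiment with made-up values for p, N and ε (none of them come from the notes). It repeats the experiment many times and compares how often the observed frequency misses p by more than ε with the Hoeffding bound.

```python
import numpy as np

rng = np.random.default_rng(0)

p = 0.3          # true probability (assumed for the simulation)
N = 200          # sample size
eps = 0.1        # tolerance
trials = 20000   # number of repeated experiments

# Each trial draws N Bernoulli(p) samples and records the observed frequency.
p_hat = rng.binomial(N, p, size=trials) / N

# Fraction of trials in which the observed frequency misses p by more than eps.
empirical = float(np.mean(np.abs(p_hat - p) > eps))

# Hoeffding bound: 2 * exp(-2 * eps^2 * N).
bound = float(2 * np.exp(-2 * eps**2 * N))

print(f"empirical P[|p_hat - p| > eps] = {empirical:.4f}")
print(f"Hoeffding bound                = {bound:.4f}")
```

The bound does not depend on p, which is why it is usually loose for any particular p; the empirical frequency should stay below it.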
Probability of a generalization error
Given a model g in a set H of M models
Ein(g) is the in-sample error
Eout(g) is the out-of-sample error
P[|Ein(g) - Eout(g)| > ε] ≤ 2Me^(-2ε²N)
where M is the number of models in H and N is the number of data points
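One way to read the bound is as a sample-size requirement. Setting the right-hand side to a chosen failure probability δ and solving for N gives N ≥ ln(2M/δ) / (2ε²). The helper names and the symbol δ below are illustrative choices, not part of the notes.

```python
import math

def union_bound(M: int, N: int, eps: float) -> float:
    """Right-hand side 2 * M * exp(-2 * eps^2 * N) of the bound."""
    return 2 * M * math.exp(-2 * eps**2 * N)

def samples_needed(M: int, eps: float, delta: float) -> int:
    """Smallest N for which the bound is at most delta.

    Solving 2 * M * exp(-2 * eps^2 * N) <= delta for N gives
    N >= ln(2 * M / delta) / (2 * eps^2).
    """
    return math.ceil(math.log(2 * M / delta) / (2 * eps**2))

# The required N grows only logarithmically with the number of models M.
for M in (1, 100, 10_000):
    N = samples_needed(M, eps=0.05, delta=0.05)
    print(f"M={M:>6}  N={N:>5}  bound={union_bound(M, N, eps=0.05):.4f}")
```

A larger hypothesis set therefore costs data, but only through ln(M); tightening ε is far more expensive because N scales with 1/ε².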
General objective of learning
Find a model g that
- minimizes the fitting error Ein(g)
- minimizes the generalization error |Ein(g) - Eout(g)|
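The two objectives can pull in opposite directions, which a small experiment can illustrate. The sketch below rests on assumptions not found in the notes: a sine target with Gaussian noise, polynomial hypothesis sets of increasing degree, and a large fresh sample standing in as an estimate of Eout (the true Eout is not computable from data).

```python
import numpy as np

rng = np.random.default_rng(1)

def target(x):
    # Hypothetical target function, used only to generate synthetic data.
    return np.sin(2 * np.pi * x)

def sample(n):
    x = rng.uniform(0.0, 1.0, size=n)
    y = target(x) + 0.2 * rng.normal(size=n)  # assumed noise level
    return x, y

x_train, y_train = sample(20)    # data used for fitting (gives E_in)
x_test, y_test = sample(5000)    # fresh data, used to estimate E_out

def mse(g, x, y):
    return float(np.mean((g(x) - y) ** 2))

results = {}
for degree in (1, 3, 9):
    # Least-squares fit within the polynomials of the given degree.
    g = np.poly1d(np.polyfit(x_train, y_train, deg=degree))
    e_in = mse(g, x_train, y_train)
    e_out = mse(g, x_test, y_test)
    results[degree] = (e_in, e_out)
    print(f"degree {degree}: E_in={e_in:.3f}  E_out~{e_out:.3f}  "
          f"gap={abs(e_in - e_out):.3f}")
```

Because the hypothesis sets are nested, raising the degree can only lower Ein; the gap to the held-out error typically behaves differently, and the exact numbers depend on the seed and the noise level.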