Lesson 4 - Approximation and Bounds on Probability Flashcards

1
Q

What does PAC mean?

A
  • Probably Approximately Correct
  • We write σ for the sample probability and π for the true probability
    • in a large sample (large N), σ is likely close to π
    • P(|σ − π| > ε) ≤ 2e^(−2Nε²) [Hoeffding’s inequality]
    • the statement σ ≈ π is PAC: an approximation that is correct (close to the true value) and probable
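Hoeffding’s inequality can be sanity-checked numerically. A minimal Monte Carlo sketch for a biased coin (the bias `pi = 0.6` and the other parameter values are my own illustrative choices, not from the lesson):

```python
import math
import random

def hoeffding_check(pi=0.6, N=1000, eps=0.05, trials=2000, seed=0):
    """Monte Carlo check of Hoeffding's inequality for a biased coin.
    sigma = sample frequency of heads, pi = true head probability.
    Compares the observed rate of |sigma - pi| > eps with the bound
    2 * exp(-2 * N * eps**2)."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        heads = sum(rng.random() < pi for _ in range(N))
        if abs(heads / N - pi) > eps:
            bad += 1
    return bad / trials, 2 * math.exp(-2 * N * eps ** 2)
```

The observed deviation rate comes out well under the bound, as expected: Hoeffding holds for any distribution, so it is usually loose for a specific one.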
2
Q

PAC learning

A
  • the learner receives samples and must select a generalization function (called the hypothesis) from a certain class of possible functions
    • low generalization error
  • Hoeffding’s inequality is applied with the in-sample error (on the training set) and the out-of-sample error (the true error) playing the roles of σ and π
3
Q

Union bound

A
  • for any finite or countable set of events, the probability that at least one of them happens is no greater than the sum of their individual probabilities
  • P(A₁ ∪ A₂ ∪ …) ≤ P(A₁) + P(A₂) + … — an upper bound on the probability of at least one of the events, in terms of the individual probabilities
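The bound is easy to see numerically on overlapping events. A sketch (the three events over a uniform draw are an arbitrary illustration of mine):

```python
import random

def union_bound_demo(trials=20000, seed=1):
    """Monte Carlo illustration of the union bound on three overlapping
    events over a uniform draw x in [0, 1). Returns the estimated
    P(A1 or A2 or A3) and the sum P(A1) + P(A2) + P(A3)."""
    rng = random.Random(seed)
    events = [lambda x: x < 0.3,          # P = 0.3
              lambda x: x > 0.8,          # P = 0.2
              lambda x: 0.2 < x < 0.5]    # P = 0.3, overlaps the first
    union_hits = 0
    single_hits = 0
    for _ in range(trials):
        x = rng.random()
        hits = [e(x) for e in events]
        union_hits += any(hits)
        single_hits += sum(hits)
    return union_hits / trials, single_hits / trials
```

Because the first and third events overlap, the union probability (exactly 0.7 here) is strictly below the sum of the individual probabilities (0.8); the two coincide only for disjoint events.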
4
Q

What is Shattering, formulation and what is it used for?

A
  • S is a set of samples; S is shattered by the hypothesis space H if for every possible dichotomy of S there exists a hypothesis h that implements it
  • Formally: given S ⊆ X, S is shattered by the hypothesis space H iff ∀S′ ⊆ S, ∃h ∈ H such that ∀x ∈ S, h(x) = 1 ⇔ x ∈ S′ (H is able to implement all possible dichotomies of S)
  • used to measure the complexity of a hypothesis space
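The definition can be checked by brute force on a tiny hypothesis space. A sketch (the threshold classifiers and the sample points are my own illustration): 1-D thresholds h_t(x) = 1 iff x ≥ t shatter any single point, but no pair of points, since the left point can never be labeled 1 while the right point is labeled 0.

```python
from itertools import combinations

def shatters(points, hypotheses):
    """Brute-force shattering test: for every dichotomy (subset S' of
    `points`), look for a hypothesis labeling exactly S' as 1.
    Each hypothesis is a function x -> 0/1."""
    pts = list(points)
    for r in range(len(pts) + 1):
        for subset in combinations(pts, r):
            target = [int(x in subset) for x in pts]
            if not any([h(x) for x in pts] == target for h in hypotheses):
                return False  # this dichotomy has no implementing hypothesis
    return True

# Illustrative class: thresholds on the real line, h_t(x) = 1 iff x >= t
H = [lambda x, t=t: int(x >= t) for t in (-1.0, 0.5, 1.5, 2.5)]
```

`shatters([1.0], H)` succeeds while `shatters([1.0, 2.0], H)` fails on the dichotomy {1.0}, which no threshold can implement.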
5
Q

What is the VC-dimension?

A
  • Vapnik–Chervonenkis dimension
  • a measure of the capacity of a hypothesis space (a class of functions)
  • The VC-dimension of a hypothesis space H defined over an instance space X is the size of the largest finite subset of X shattered by H: VC(H) = max { |S| : S ⊆ X, S is shattered by H }
  • If arbitrarily large finite sets of X can be shattered by H, then VC(H) = ∞
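As a worked instance (my own example, not from the lesson): for interval classifiers on the real line, h_{a,b}(x) = 1 iff a ≤ x ≤ b, some 2-point set is shattered but no 3-point set is — the dichotomy that keeps the two outer points and drops the middle one is impossible — so VC = 2. A brute-force sketch:

```python
from itertools import combinations

def shattered_by_intervals(points):
    """Check whether interval classifiers h_{a,b}(x) = 1 iff a <= x <= b
    shatter `points`: enumerate every dichotomy and every candidate
    interval built from the gaps between the sorted points."""
    pts = sorted(points)
    cuts = ([pts[0] - 1.0]
            + [(pts[i] + pts[i + 1]) / 2 for i in range(len(pts) - 1)]
            + [pts[-1] + 1.0])
    intervals = [(a, b) for a in cuts for b in cuts if a <= b]
    intervals.append((1.0, 0.0))  # empty interval: labels every point 0
    for r in range(len(pts) + 1):
        for subset in combinations(pts, r):
            if not any(all((a <= x <= b) == (x in subset) for x in pts)
                       for a, b in intervals):
                return False
    return True
```

Here `shattered_by_intervals([0.0, 1.0])` is true while `shattered_by_intervals([0.0, 1.0, 2.0])` is false, matching VC(intervals) = 2.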
6
Q

Generalization Error

A
  • upper bound on the true (ideal) error, valid with probability 1 − δ
    • δ can be made arbitrarily small
  • error(g) ≤ error_S(g) + F(VC(H)/n, δ)
    • depends on the chosen hypothesis g
    • F is the VC-confidence term, which is
      • inversely related to the training size n
      • directly related to the VC-dimension
      • inversely related to the confidence parameter δ
  • Problem: as the VC-dimension grows, the empirical risk error_S(g) decreases but the VC-confidence increases
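The lesson leaves F abstract. One standard concrete form is the Vapnik-style confidence term — this particular formula is an assumption on my part, chosen because it exhibits exactly the three dependencies listed above:

```python
import math

def vc_bound(emp_risk, d, n, delta=0.05):
    """Bound on the true risk, valid with probability >= 1 - delta,
    using a standard Vapnik-style VC-confidence term (an assumed form;
    the lesson only writes F(VC(H)/n, delta)):
        F = sqrt( (d * (ln(2n/d) + 1) + ln(4/delta)) / n )
    where d is the VC-dimension and n the training-set size."""
    F = math.sqrt((d * (math.log(2 * n / d) + 1) + math.log(4 / delta)) / n)
    return emp_risk + F
```

Evaluating it confirms the trends on the card: the bound shrinks as n grows, and grows with larger d or smaller δ.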
7
Q

Structural Risk Minimization

A
  • a new inductive principle
  • achieves a tradeoff between empirical risk and VC-confidence
  • select the hypothesis with the smallest bound on the true risk
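A minimal sketch of the principle (both the bound formula and the (empirical risk, VC-dimension) pairs below are illustrative assumptions of mine): scan a nested sequence of hypothesis classes and keep the one whose bound, empirical risk plus VC-confidence, is smallest.

```python
import math

def srm_select(models, n, delta=0.05):
    """Structural Risk Minimization sketch: `models` is a list of
    (empirical_risk, vc_dim) pairs for nested hypothesis classes.
    Returns the index minimizing the assumed Vapnik-style bound
    emp + sqrt((d * (ln(2n/d) + 1) + ln(4/delta)) / n)."""
    def bound(emp, d):
        return emp + math.sqrt((d * (math.log(2 * n / d) + 1)
                                + math.log(4 / delta)) / n)
    return min(range(len(models)), key=lambda i: bound(*models[i]))

# Typical SRM pattern: empirical risk falls as capacity grows,
# while the VC-confidence term rises.
models = [(0.30, 2), (0.15, 10), (0.08, 50), (0.05, 200)]
```

With these numbers and n = 1000, the middle-capacity class (VC-dimension 10) wins: the simplest class underfits, and the richest classes pay too much VC-confidence.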