PAC Flashcards
PAC?
Probably Approximately Correct Learning
What is Sample Complexity?
The class of learnable concepts in terms of the number of sample points needed to achieve an approximate solution.
What is “X” or Input Space?
The set of all possible examples or instances.
What is “Y”?
The set of all possible labels or target values.
What is Concept?
A concept c is a mapping rom X to Y - X -> Y. For example all the points which fall into a triangle.
What is Concept Class?
Denoted as C. It is the set of concepts we may wish to learn. E.X be the set of all axis-aligned rectangles.
What is a Hypothesis Set?
Denoted as H. It is a fixed set of possible concepts which may not coincide with C.
What is Generalization Error?
Generalization error of hypothesis h also known as true error or just error is denoted by R(h) (risk).
R(h) = Pr [h(x) != c(x)] = E 1_ h(x) != c(x)
The Generalization error is not directly accessible to the learner since both the distribution D and the target concept c are unknown. GR is the expected error based on the distribution D. The expectation of the empirical error R^(h) is also the generalization error = E[R^(h)] = R(h).
What is Empirical Error?
R^(h) = 1 / m * SUM(i=1…m) 1 if h(xi) != c(xi) else 0
Thus Empirical error is the average error over the sample S.
What is the definition of PAC-learning?
A concept class C is said to be PAC-learnable if there exists an algorithm A and a polynomial function P such that for any epsilon > 0 and delta > 0 for all distributions D on X and for any target concept c WBI C, the following holds for any sample size m >= poly(1/eps, 1/delta, n, size(c)) Pr( [R(h_S) ) <= eps] >= 1 - delta
What is the term epsilon e?
Its the error term. It tells that we are approximately correct. How accurate the algorithm is (1 - eps).
What is the term delta d?
It tells that we are probably right (1 - delta). In other words the confidence of the algorithm.
What are false negatives?
Points that the hypothesis misses that are actually inside the concept class but not inside the hypothesis.
What are false positives?
Points that the hypothesis flags as being class A but in truth are not. So they are not inside the concept class A but are inside the hypothesis.
What is a consistent hypothesis?
It is a hypotheses which has empirical error R^(h) = 0.