Probability Flashcards
Mutually exclusive/disjoint
Two events cannot occur together.
E.g. a single coin flip landing heads or tails.
Finite Geometric Series
Sn=a1(1−r^n)/(1−r), r≠1
Given the geometric sequence 2, 4, 8, 16, … :
To find the common ratio, find the ratio between a term and the term preceding it.
r=4/2=2
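A minimal sketch checking the closed-form sum against a direct term-by-term sum, using the sequence 2, 4, 8, 16 from this card (the function name is mine, for illustration):

```python
# Verify Sn = a1*(1 - r**n)/(1 - r), r != 1, against a direct sum.

def geometric_sum(a1, r, n):
    """Closed-form sum of the first n terms of a geometric series (r != 1)."""
    return a1 * (1 - r**n) / (1 - r)

# The sequence 2, 4, 8, 16, ... has a1 = 2 and common ratio r = 4/2 = 2.
a1, r, n = 2, 2, 4
direct = sum(a1 * r**k for k in range(n))   # 2 + 4 + 8 + 16 = 30
print(geometric_sum(a1, r, n), direct)      # 30.0 30
```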
Infinite Geometric Series
S=a1/(1−r), |r| < 1 (the series diverges otherwise)
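A quick numerical sketch of convergence: for a1 = 1 and r = 1/2 (values chosen for illustration), the partial sums approach a1/(1 − r) = 2.

```python
# The infinite geometric series converges to a1/(1 - r) only when |r| < 1.
a1, r = 1.0, 0.5
S = a1 / (1 - r)                              # closed-form limit: 2.0
partial = sum(a1 * r**k for k in range(50))   # 50 terms is plenty at r = 0.5
print(S, partial)                             # both essentially 2.0
assert abs(S - partial) < 1e-12
```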
Conditional Probability (Prob A occurs, given B is known)
When we know that B has occurred, every outcome that is outside B should be discarded. Thus, our sample space is reduced to the set B. Now the only way that A can happen is when the outcome belongs to the set A ∩ B.
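The reduced-sample-space idea can be sketched with one fair die, where equally likely outcomes give P(A|B) = |A ∩ B| / |B| (the events A and B below are my own illustrative choices):

```python
from fractions import Fraction

# Once B is known, only outcomes in B remain; A can then occur only
# through A ∩ B, so P(A|B) = |A ∩ B| / |B| for a fair die.
omega = {1, 2, 3, 4, 5, 6}        # sample space of one fair die
A = {2, 4, 6}                     # event: roll is even
B = {4, 5, 6}                     # event: roll is greater than 3

p_A_given_B = Fraction(len(A & B), len(B))
print(p_A_given_B)                # 2/3
```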
Conditional Probability Rules
P(A^c|C) = 1 − P(A|C);
P(∅|C) = 0;
P(A|C) ≤ 1;
P(A − B|C) = P(A|C) − P(A ∩ B|C);
P(A ∪ B|C) = P(A|C) + P(B|C) − P(A ∩ B|C);
if A ⊂ B, then P(A|C) ≤ P(B|C)
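A numeric sanity check of these rules on one fair die, using P(E|C) = |E ∩ C| / |C| for equally likely outcomes (a sketch on one example, not a proof; the sets A, B, C are illustrative):

```python
from fractions import Fraction

# Check each conditional-probability rule on a single fair die.
omega = set(range(1, 7))
A, B, C = {2, 4, 6}, {4, 5, 6}, {1, 2, 3, 4, 5}

def P(E, C):
    """Conditional probability of E given C under the uniform distribution."""
    return Fraction(len(set(E) & set(C)), len(C))

assert P(omega - A, C) == 1 - P(A, C)                  # complement rule
assert P(set(), C) == 0                                # P(empty|C) = 0
assert P(A, C) <= 1
assert P(A - B, C) == P(A, C) - P(A & B, C)            # difference rule
assert P(A | B, C) == P(A, C) + P(B, C) - P(A & B, C)  # inclusion-exclusion
print("all rules hold on this example")
```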
Conditional Probability Chain Rule (when we know the Cond Prob)
P(A ∩ B) = P(A)P(B|A) = P(B)P(A|B)
P(A ∩ B ∩ C) = P(A ∩ (B ∩ C)) = P(A)P(B ∩ C|A) = P(A)P(B|A)P(C|A ∩ B)
P(A1 ∩ A2 ∩ ⋯ ∩ An) = P(A1)P(A2|A1)P(A3|A2, A1) ⋯ P(An|An−1, An−2, ⋯, A1)
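The three-event chain rule can be checked by enumeration on two fair-die rolls (the events below are illustrative choices of mine):

```python
from fractions import Fraction
from itertools import product

# Verify P(A ∩ B ∩ C) = P(A) P(B|A) P(C|A ∩ B) on 36 equally likely pairs.
omega = set(product(range(1, 7), repeat=2))            # two fair-die rolls

A = {w for w in omega if w[0] >= 3}                    # first roll at least 3
B = {w for w in omega if w[1] % 2 == 0}                # second roll even
C = {w for w in omega if w[0] + w[1] >= 7}             # total at least 7

def P(E):
    return Fraction(len(E), len(omega))

lhs = P(A & B & C)
rhs = P(A) * (P(A & B) / P(A)) * (P(A & B & C) / P(A & B))  # P(A)P(B|A)P(C|A∩B)
print(lhs == rhs)   # True
```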
If A ⊂ B,
then P(B ∩ A) =
P(A)
P(A −B)
The outcomes in A that are not also in B.
P(A −B) = P(A) −P(A ∩ B).
Closure (math)
In mathematics, a set is closed under an operation when applying the operation to members of the set always produces a result that is defined and is itself a member of the set.
For example, in ordinary arithmetic, addition on real numbers (domain) has closure: whenever one adds two numbers, the answer is a number. The same is true of multiplication.
Division does not have closure, because division by 0 is not defined.
In the natural numbers (domain), subtraction does not have closure, but in the integers (domain), subtraction does have closure. Subtraction of two numbers can produce a negative number, which is not a natural number, but is an integer.
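A small sketch of the subtraction example, using finite ranges as stand-ins for the infinite sets (the helper name and sample sizes are my own):

```python
# Subtraction is closed over the integers but not over the naturals:
# check every pairwise difference from a small sample against each domain.
def closed(sample, domain):
    """True if a - b lies in `domain` for every a, b drawn from `sample`."""
    return all((a - b) in domain for a in sample for b in sample)

sample = set(range(0, 10))            # the numbers we subtract
naturals = set(range(0, 100))         # finite stand-in for the naturals
integers = set(range(-100, 100))      # finite stand-in for the integers

print(closed(sample, naturals))       # False: e.g. 0 - 9 = -9 is not natural
print(closed(sample, integers))       # True: every difference is an integer
```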
Prior
Posterior
Likelihood
Evidence Factor
(DATA603)
P(ωj) - prior: our knowledge, before any measurement, of how likely it is that nature is in state ωj.
P(ωj|x) - posterior: the probability that the state of nature is ωj given that feature value x has been measured.
p(x|ωj) - likelihood: the category ωj for which p(x|ωj) is large is the one more "likely" to be the true category.
p(x) - evidence: a scale factor that guarantees that the posterior probabilities sum to one, as all good probabilities must.
Pattern Classification (p. 48). Wiley. Kindle Edition.
Class-conditional probability density function,
x is taken to be a continuous random variable whose distribution depends on the state of nature and is expressed as p(x|ω).
This is the probability density of measuring a particular feature value x given that the pattern is in category ωj.
Pattern Classification (p. 47). Wiley. Kindle Edition.
(Joint) probability density of finding a pattern that is in category ωj and has feature value x… p(ωj, x). (DATA603)
p(ωj, x) = P(ωj|x)p(x) = p(x|ωj)P(ωj).
Conditional densities p(x|ωj) for j = 1, 2.
Pattern Classification (p. 47). Wiley. Kindle Edition.
Bayes Formula
P(ωj|x) = p(x|ωj)P(ωj) / p(x), where the evidence p(x) = Σj p(x|ωj)P(ωj).
Pattern Classification (p. 48). Wiley. Kindle Edition.
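A two-category sketch of Bayes' formula in code; the priors and likelihoods below are made-up numbers, chosen only to show the mechanics:

```python
# Bayes' formula: P(ωj|x) = p(x|ωj) P(ωj) / p(x), with illustrative values.
priors = {"w1": 0.6, "w2": 0.4}          # P(ωj): prior knowledge
likelihoods = {"w1": 0.2, "w2": 0.7}     # p(x|ωj) at the measured x

# Evidence p(x) = sum_j p(x|ωj) P(ωj): the scale factor.
evidence = sum(likelihoods[w] * priors[w] for w in priors)

posteriors = {w: likelihoods[w] * priors[w] / evidence for w in priors}
print(posteriors)                         # posteriors sum to one
assert abs(sum(posteriors.values()) - 1.0) < 1e-12
```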
How to formally justify posteriors?
If P(ω2|x) > P(ω1|x), we would be inclined to choose ω2. To justify this decision procedure, let us calculate the probability of error whenever we make a decision.
Whenever we observe a particular x, the probability of error is P(error|x) = P(ω1|x) if we decide ω2, and P(ω2|x) if we decide ω1 (Equation (4)).
We can minimize the probability of error by deciding ω1 if P(ω1|x) > P(ω2|x) and ω2 otherwise. Of course, we may never observe exactly the same value of x twice. Will this rule minimize the average probability of error? Yes, because the average probability of error is P(error) = ∫ P(error|x) p(x) dx (Equation (5)).
Pattern Classification (p. 49). Wiley. Kindle Edition.
Bayes Decision Rule for minimizing probability of error:
Decide ω1 if P(ω1|x) > P(ω2|x); otherwise decide ω2 (Equation (6)). Under this rule, P(error|x) becomes P(error|x) = min[P(ω1|x), P(ω2|x)] (Equation (7)).
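A minimal sketch of the decision rule and its per-x error; the posterior values 0.3 and 0.7 are illustrative, not from the source:

```python
# Bayes decision rule: decide ω1 if P(ω1|x) > P(ω2|x), else ω2,
# giving P(error|x) = min[P(ω1|x), P(ω2|x)].
def bayes_decide(post_w1, post_w2):
    """Return the chosen category and the resulting P(error|x)."""
    decision = "w1" if post_w1 > post_w2 else "w2"
    return decision, min(post_w1, post_w2)

decision, p_error = bayes_decide(0.3, 0.7)
print(decision, p_error)    # w2 0.3: no other rule gives lower error at this x
```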