Probability Flashcards

1
Q

Mutually exclusive/disjoint

A

Two events are mutually exclusive (disjoint) if they cannot occur together.

E.g. a single coin flip coming up heads or tails.

2
Q

Finite Geometric Series

A

S_n = a_1(1 − r^n)/(1 − r), r ≠ 1

Given the geometric sequence 2, 4, 8, 16, …

To find the common ratio, take the ratio of a term to the term preceding it:

r = 4/2 = 2
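The formula can be sanity-checked numerically (a minimal sketch, using the 2, 4, 8, 16 example from this card):

```python
# Check S_n = a1*(1 - r**n)/(1 - r) against a direct sum of the terms.
def geometric_sum(a1, r, n):
    """Sum of the first n terms of a geometric sequence (requires r != 1)."""
    return a1 * (1 - r**n) / (1 - r)

a1, r, n = 2, 2, 4                        # the sequence 2, 4, 8, 16
direct = sum(a1 * r**k for k in range(n))  # 2 + 4 + 8 + 16
print(geometric_sum(a1, r, n), direct)     # both equal 30
```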

3
Q

Infinite Geometric Series

A

S = a_1/(1 − r), valid only when |r| < 1 (otherwise the series diverges)
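A quick numerical sketch (values chosen for illustration) showing the partial sums approaching a_1/(1 − r) when |r| < 1:

```python
# Partial sums of a geometric series with |r| < 1 approach a1 / (1 - r).
a1, r = 1.0, 0.5
limit = a1 / (1 - r)                         # 2.0
partial = sum(a1 * r**k for k in range(50))  # 50 terms is plenty here
print(limit, partial)                        # partial is within ~1e-15 of 2.0
```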

4
Q

Conditional Probability (Prob A occurs, given B is known)

A

When we know that B has occurred, every outcome outside B is discarded, so the sample space is reduced to the set B. The only way A can then happen is when the outcome belongs to the set A ∩ B. This gives P(A|B) = P(A ∩ B)/P(B), for P(B) > 0.

5
Q

Conditional Probability Rules

A

P(Aᶜ|C) = 1 − P(A|C);
P(∅|C) = 0;
P(A|C) ≤ 1;
P(A − B|C) = P(A|C) − P(A ∩ B|C);
P(A ∪ B|C) = P(A|C) + P(B|C) − P(A ∩ B|C);
if A ⊂ B, then P(A|C) ≤ P(B|C)
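These rules can be verified by brute-force enumeration (a sketch; the die and the events A, B, C are made up for illustration):

```python
from fractions import Fraction

# Verify two of the conditional-probability rules on a fair six-sided die.
omega = set(range(1, 7))
A = {2, 4, 6}          # even
B = {4, 5, 6}          # greater than 3
C = {1, 2, 3, 4, 5}    # not a six

def P(E, given=None):
    """P(E | given) with equally likely outcomes."""
    cond = omega if given is None else given
    return Fraction(len(E & cond), len(cond))

# Complement rule: P(A^c | C) = 1 - P(A | C)
assert P(omega - A, given=C) == 1 - P(A, given=C)
# Inclusion-exclusion: P(A ∪ B | C) = P(A|C) + P(B|C) - P(A ∩ B | C)
assert P(A | B, given=C) == P(A, given=C) + P(B, given=C) - P(A & B, given=C)
```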

6
Q

Conditional Probability Chain Rule (when we know the Cond Prob)

A

P(A ∩ B) = P(A)P(B|A) = P(B)P(A|B)

P(A ∩ B ∩ C) = P(A ∩ (B ∩ C)) = P(A)P(B ∩ C|A) = P(A)P(B|A)P(C|A ∩ B)

P(A1 ∩ A2 ∩ ⋯ ∩ An) = P(A1)P(A2|A1)P(A3|A2, A1) ⋯ P(An|An−1, An−2, ⋯, A1)
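The two-event form of the chain rule can be checked by enumeration (a sketch; the die and events are made up for illustration):

```python
from fractions import Fraction

# Check P(A ∩ B) = P(A) P(B|A) = P(B) P(A|B) on a fair six-sided die.
omega = set(range(1, 7))
A = {2, 4, 6}      # even
B = {4, 5, 6}      # greater than 3

P = lambda E: Fraction(len(E), len(omega))
P_B_given_A = Fraction(len(A & B), len(A))
P_A_given_B = Fraction(len(A & B), len(B))

assert P(A & B) == P(A) * P_B_given_A == P(B) * P_A_given_B
print(P(A & B))   # 1/3
```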

7
Q

If A ⊂ B,

then P(B ∩ A) =

A

P(A)

8
Q

P(A −B)

A

The outcomes in A that are not also in B:

P(A − B) = P(A) − P(A ∩ B).

9
Q

Closure (math)

A

In mathematics, a set is closed under an operation when applying the operation to members of the set always produces a result that is defined and belongs to the same set.

For example, in ordinary arithmetic, addition on real numbers (domain) has closure: whenever one adds two numbers, the answer is a number. The same is true of multiplication.

Division does not have closure, because division by 0 is not defined.

In the natural numbers (domain), subtraction does not have closure, but in the integers (domain), subtraction does have closure. Subtraction of two numbers can produce a negative number, which is not a natural number, but is an integer.

10
Q

Prior

Posterior

Likelihood

Evidence Factor

(DATA603)

A

Prior, P(ωj) — prior knowledge of how likely the state of nature ωj is, before any measurement.

Posterior, P(ωj|x) — the probability that the state of nature is ωj given that feature value x has been measured.

Likelihood, p(x|ωj) — the category ωj for which p(x|ωj) is large is the more "likely" true category, other things being equal.

Evidence, p(x) — a scale factor that guarantees the posterior probabilities sum to one, as all good probabilities must.

Pattern Classification (p. 48). Wiley. Kindle Edition.

11
Q

Class-conditional probability density function,

A

Let x be a continuous random variable whose distribution depends on the state of nature; this is expressed as p(x|ωj).

It is the probability density of measuring a particular feature value x given that the pattern is in category ωj.

Pattern Classification (p. 47). Wiley. Kindle Edition.

12
Q

(Joint) probability density of finding a pattern that is in category ωj and has feature value x… p(ωj, x). (DATA603)

A

p(ωj, x) = P(ωj|x)p(x) = p(x|ωj)P(ωj).

Conditional densities p(x|ωj) for j = 1, 2.

Pattern Classification (p. 47). Wiley. Kindle Edition.

13
Q

Bayes Formula

A

P(ωj|x) = p(x|ωj)P(ωj) / p(x), where p(x) = Σj p(x|ωj)P(ωj)

(posterior = likelihood × prior / evidence)

Pattern Classification (p. 48). Wiley. Kindle Edition.
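Bayes' formula can be computed numerically (a sketch; the two-class priors and likelihood values are made up for illustration):

```python
# Posteriors P(ωj|x) = p(x|ωj) P(ωj) / p(x), with p(x) = Σ_j p(x|ωj) P(ωj).
priors = [0.6, 0.4]          # P(ω1), P(ω2) — assumed values
likelihoods = [0.2, 0.5]     # p(x|ω1), p(x|ω2) at the observed x — assumed

evidence = sum(l * p for l, p in zip(likelihoods, priors))
posteriors = [l * p / evidence for l, p in zip(likelihoods, priors)]
print(posteriors, sum(posteriors))   # posteriors sum to ≈ 1
```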

14
Q

How to formally justify posteriors?

A

If P(ω2|x) > P(ω1|x), we would be inclined to choose ω2. To justify this decision procedure, let us calculate the probability of error whenever we make a decision.

Whenever we observe a particular x, the probability of error is

P(error|x) = P(ω1|x) if we decide ω2; P(ω2|x) if we decide ω1.   (4)

We can minimize the probability of error by deciding ω1 if P(ω1|x) > P(ω2|x) and ω2 otherwise. Of course, we may never observe exactly the same value of x twice. Will this rule minimize the average probability of error? Yes, because the average probability of error is

P(error) = ∫ P(error|x) p(x) dx.   (5)

Pattern Classification (p. 49). Wiley. Kindle Edition.

15
Q

Bayes decision rule for minimizing the probability of error:

A

Decide ω1 if P(ω1|x) > P(ω2|x); otherwise decide ω2.   (6)

Under this rule, P(error|x) becomes

P(error|x) = min[ P(ω1|x), P(ω2|x) ].   (7)
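A numerical sketch of the rule (the posterior values are made up for illustration):

```python
# Under the Bayes decision rule, decide the class with the larger posterior;
# the probability of error at x is then the smaller posterior.
def bayes_decide(post1, post2):
    """Return (decision, P(error|x)) for two-class posteriors."""
    decision = 1 if post1 > post2 else 2
    p_error = min(post1, post2)
    return decision, p_error

print(bayes_decide(0.375, 0.625))   # (2, 0.375)
```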

16
Q

If for some x we have p(x|ω1) = p(x|ω2), then

A

that particular observation gives us no information about the state of nature;

in this case, the decision hinges entirely on the prior probabilities.

Pattern Classification (p. 51). Wiley. Kindle Edition.

17
Q

If P(ω1) = P(ω2), then

A

the states of nature are equally probable; in this case the decision is based entirely on the likelihoods p(x|ωj).

Pattern Classification (p. 51). Wiley. Kindle Edition.

18
Q

Formally, the loss function λ(αi|ωj) states

A

exactly how costly each action is, and is used to convert a probability determination into a decision.

describes the loss incurred for taking action αi when the state of nature is ωj

Pattern Classification (p. 52). Wiley. Kindle Edition.

19
Q

In decision-theoretic terminology, an expected loss is called a _____, and R(αi|x) is called the _________.

Pattern Classification (p. 53). Wiley. Kindle Edition.

A

risk

conditional risk

we can minimize our expected loss by selecting the action that minimizes the conditional risk.

Pattern Classification (p. 53). Wiley. Kindle Edition.
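Selecting the action that minimizes the conditional risk R(αi|x) = Σj λ(αi|ωj)P(ωj|x) can be sketched as follows (the loss matrix and posteriors are made up for illustration):

```python
# Pick the action minimizing the conditional risk R(αi|x) = Σ_j λ(αi|ωj) P(ωj|x).
lam = [[0.0, 2.0],        # λ(α1|ω1), λ(α1|ω2) — assumed losses
       [1.0, 0.0]]        # λ(α2|ω1), λ(α2|ω2)
posteriors = [0.3, 0.7]   # P(ω1|x), P(ω2|x) — assumed posteriors

risks = [sum(row[j] * posteriors[j] for j in range(2)) for row in lam]
best = min(range(2), key=lambda i: risks[i])
print(risks, "choose action", best + 1)   # risks ≈ [1.4, 0.3] → action 2
```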

20
Q

A general decision rule is a function α(x) that tells us

A

which action to take for every possible observation. To be more specific, for every x the decision function α(x) assumes one of the a values α1,…, αa.

The overall risk R is the expected loss associated with a given decision rule.

Pattern Classification (p. 53). Wiley. Kindle Edition.

21
Q

Because R(αi|x) is the conditional risk associated with action αi and because the decision rule specifies the action, the overall risk is given by

Pattern Classification (p. 53). Wiley. Kindle Edition.

A

R = ∫ R(α(x)|x) p(x) dx,

where the integral extends over the entire feature space. The Bayes decision rule minimizes this overall risk by choosing, for every x, the action αi that minimizes the conditional risk R(αi|x).