Intermediate Probability Flashcards

1
Q

conditional probability

A

How does the probability of an event change if we have extra information?

The conditional probability of 𝐴 knowing that 𝐵 occurred is written 𝑃(𝐴|𝐵).

This is read as ‘the conditional probability of 𝐴 given 𝐵’.

2
Q

formal definition of conditional probability

A

𝑃(𝐴|𝐵) = 𝑃(𝐴 ∩ 𝐵) / 𝑃(𝐵), provided 𝑃(𝐵) ≠ 0.

3
Q

multiplication rule

A

𝑃(𝐴 ∩ 𝐵) = 𝑃(𝐴|𝐵) ⋅ 𝑃(𝐵).

This is simply a rewriting of the definition of conditional probability, 𝑃(𝐴|𝐵) = 𝑃(𝐴 ∩ 𝐵) / 𝑃(𝐵).

We will see that the multiplication rule is used in much the same way as the rule of product in counting.

4
Q

law of total probability

A

Suppose the sample space Ω is divided into 3 disjoint events 𝐵1, 𝐵2, 𝐵3 (see the figure below). Then for any event 𝐴:

𝑃(𝐴) = 𝑃(𝐴 ∩ 𝐵1) + 𝑃(𝐴 ∩ 𝐵2) + 𝑃(𝐴 ∩ 𝐵3)

Applying the multiplication rule to each term gives

𝑃(𝐴) = 𝑃(𝐴|𝐵1) 𝑃(𝐵1) + 𝑃(𝐴|𝐵2) 𝑃(𝐵2) + 𝑃(𝐴|𝐵3) 𝑃(𝐵3)

5
Q

An urn contains 5 red balls and 2 green balls. Two balls are drawn one after
the other. What is the probability that the second ball is red?

A

The sample space is Ω = {rr, rg, gr, gg}.
Let 𝑅1 be the event ‘the first ball is red’, 𝐺1 = ‘first ball is green’, 𝑅2 = ‘second ball is red’, 𝐺2 = ‘second ball is green’. We are asked to find 𝑃(𝑅2).

We compute this using the law of total probability. First, we find the conditional probabilities by counting: once the first ball has been removed, 6 balls remain.
𝑃(𝑅2|𝑅1) = 4/6, 𝑃(𝑅2|𝐺1) = 5/6.

Since 𝑅1 and 𝐺1 partition Ω, the law of total probability says
𝑃(𝑅2) = 𝑃(𝑅2|𝑅1)𝑃(𝑅1) + 𝑃(𝑅2|𝐺1)𝑃(𝐺1) = (4/6)(5/7) + (5/6)(2/7) = 30/42 = 5/7
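
The arithmetic for this urn can be checked with exact fractions. The sketch below is illustrative only: the helper name `total_probability` and its argument layout are assumptions made for this example, not something defined on the cards.

```python
from fractions import Fraction


def total_probability(priors, conditionals):
    """Return P(A) = sum of P(A|B_i) * P(B_i) over a partition B_1..B_n.

    `priors` holds P(B_i) and `conditionals` holds P(A|B_i), in the same order.
    (Hypothetical helper written for this illustration.)
    """
    if sum(priors) != 1:
        raise ValueError("the events B_i must partition the sample space")
    return sum(c * p for c, p in zip(conditionals, priors))


# Urn with 5 red and 2 green balls, two draws without replacement.
p_r1, p_g1 = Fraction(5, 7), Fraction(2, 7)   # P(R1), P(G1)
p_r2_given_r1 = Fraction(4, 6)                # 4 red left among 6 balls
p_r2_given_g1 = Fraction(5, 6)                # 5 red left among 6 balls

p_r2 = total_probability([p_r1, p_g1], [p_r2_given_r1, p_r2_given_g1])
print(p_r2)  # 5/7
```

Using `Fraction` rather than floats keeps the result exact, so it can be compared directly with the 5/7 on the card.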

6
Q

probability urns

A

In probability and statistics, an urn problem is an idealized mental exercise
in which some objects of real interest (such as atoms, people, cars, etc.) are
represented as colored balls in an urn or other container. One pretends to draw (remove) one or more balls from the urn; the goal is to determine the probability of drawing one color or another, or some other properties. A key parameter is whether each ball is returned to the urn after each draw.

7
Q

An urn contains 5 red balls and 2 green balls. A ball is drawn. If it’s green a red ball is added to the urn and if it’s red a green ball is added to the urn. (The original ball is not returned to the urn.) Then a second ball is drawn. What is the probability the second ball is red?

A

The law of total probability says that 𝑃(𝑅2) can be computed from 𝑅1 and 𝐺1 exactly as in the simpler urn problem; only the conditional probabilities change. After the first draw the urn again holds 7 balls: 4 red and 3 green if the first ball was red, 6 red and 1 green if it was green. We have
𝑃(𝑅2|𝑅1) = 4/7, 𝑃(𝑅2|𝐺1) = 6/7.

Therefore, 𝑃(𝑅2) = 𝑃(𝑅2|𝑅1)𝑃(𝑅1) + 𝑃(𝑅2|𝐺1)𝑃(𝐺1) = (4/7)(5/7) + (6/7)(2/7) = 32/49
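
As an independent sanity check, the game can also be simulated. This is a rough Monte Carlo sketch, not part of the original cards; the function name, the seed and the number of trials are arbitrary choices for illustration.

```python
import random


def second_ball_is_red(rng):
    """Simulate one round of the modified urn; return True if draw 2 is red."""
    urn = ["red"] * 5 + ["green"] * 2
    first = urn.pop(rng.randrange(len(urn)))   # draw without returning it
    # Add a ball of the opposite colour to the one just drawn.
    urn.append("green" if first == "red" else "red")
    return rng.choice(urn) == "red"


rng = random.Random(0)     # fixed seed, chosen arbitrarily
trials = 200_000
hits = sum(second_ball_is_red(rng) for _ in range(trials))
estimate = hits / trials
print(estimate, 32 / 49)   # the two values should be close
```

A simulation only approximates the answer, so it complements the exact calculation rather than replacing it.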

8
Q

trees to organize computation

A

Consider the urn with 5 red balls and 2 green balls. If the first ball drawn is green, a red ball is added to the urn, and if the first ball drawn is red, a green ball is added to the urn. What is the probability of drawing a red ball second?

9
Q

probability tree

A

node - Each dot is called a node.
levels - The tree is organized by levels.

The top node (root node) is at level 0. The next layer down is level 1 and so on. Each level shows the outcomes at one stage of the game. Level 1 shows the possible outcomes of the first draw. Level 2 shows the possible outcomes of the second draw starting from each node in level 1.

Probabilities are written along the branches. The probability of 𝑅1 (red on the first draw) is 5/7. It is written along the branch from the root node to the one labeled 𝑅1. At the next level we put in conditional probabilities. The probability along the branch from 𝑅1 to 𝑅2 is 𝑃(𝑅2|𝑅1) = 4/7. It represents the probability of going to node 𝑅2 given that you are already at 𝑅1.

The multiplication rule says that the probability of getting to any node is just the product of the probabilities along the path to get there. For example, the node labeled 𝑅2 at the far left really represents the event 𝑅1 ∩ 𝑅2 because it comes from the 𝑅1 node. The multiplication rule now says

𝑃(𝑅1 ∩ 𝑅2) = 𝑃(𝑅1) ⋅ 𝑃(𝑅2|𝑅1) = 5/7 ⋅ 4/7 = 20/49

The law of total probability is just the statement that 𝑃(𝑅2) is the sum of the probabilities of all paths leading to 𝑅2 (the two circled nodes in the figure).
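
The tree for the modified urn can be written down as a small data structure to make the two rules concrete. This is a minimal sketch: the nested-dict layout and the name `path_probabilities` are assumptions made for illustration, and the 𝐺2 branch values (3/7 and 1/7) are derived here as complements of the 𝑅2 branches rather than quoted from the card.

```python
from fractions import Fraction

# Each node maps an outcome label to (branch probability, subtree).
tree = {
    "R1": (Fraction(5, 7), {"R2": (Fraction(4, 7), {}), "G2": (Fraction(3, 7), {})}),
    "G1": (Fraction(2, 7), {"R2": (Fraction(6, 7), {}), "G2": (Fraction(1, 7), {})}),
}


def path_probabilities(subtree, path=(), prob=Fraction(1)):
    """Yield (path, probability) for every leaf, multiplying along the path."""
    for label, (branch_prob, children) in subtree.items():
        new_path, new_prob = path + (label,), prob * branch_prob
        if children:
            yield from path_probabilities(children, new_path, new_prob)
        else:
            yield new_path, new_prob


leaves = dict(path_probabilities(tree))
# Law of total probability: add up every path that ends in R2.
p_r2 = sum(p for path, p in leaves.items() if path[-1] == "R2")
print(leaves[("R1", "R2")])  # 20/49
print(p_r2)                  # 32/49
```

The product along a path is the multiplication rule, and the sum over the leaves labeled R2 is the law of total probability.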

10
Q

independence

A

Two events are independent if knowledge that one occurred does not change the probability that the other occurred.

Informally, events are independent if they do not influence one another.

11
Q

formulation of independence

A

Two events 𝐴 and 𝐵 are independent if

𝑃(𝐴 ∩ 𝐵) = 𝑃(𝐴) ⋅ 𝑃(𝐵)

  1. If 𝑃(𝐵) ≠ 0 then 𝐴 and 𝐵 are independent if and only if 𝑃(𝐴|𝐵) = 𝑃(𝐴).
  2. If 𝑃(𝐴) ≠ 0 then 𝐴 and 𝐵 are independent if and only if 𝑃(𝐵|𝐴) = 𝑃(𝐵).
12
Q

Toss a fair coin twice. Let 𝐻1 = ‘heads on first toss’ and let 𝐻2 = ‘heads on second toss’. Are 𝐻1 and 𝐻2 independent?

A

Since 𝐻1 ∩ 𝐻2 is the event ‘both tosses are heads’ we have

𝑃(𝐻1 ∩ 𝐻2) = 1/4 = 1/2 ⋅ 1/2 = 𝑃(𝐻1)𝑃(𝐻2).

Therefore the events are independent.

13
Q

Toss a fair coin 3 times. Let 𝐻1 = ‘heads on first toss’ and 𝐴 = ‘two heads total’. Are 𝐻1 and 𝐴 independent?

A

We know that 𝑃(𝐴) = 3/8 and 𝑃(𝐻1) = 1/2. Since 𝑃(𝐻1) is not 0 we can check whether 𝑃(𝐴|𝐻1) = 𝑃(𝐴) holds.

Now, 𝐻1 = {HHH, HHT, HTH, HTT} contains exactly two outcomes (HHT, HTH) from 𝐴, so we have 𝑃(𝐴|𝐻1) = 2/4.

Since 𝑃(𝐴|𝐻1) ≠ 𝑃(𝐴) these events are not independent.
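
The same conclusion follows from the product form of independence by listing all 8 outcomes. A small sketch, with helper names (`prob`, `h1`, `a`) chosen for illustration only:

```python
from fractions import Fraction
from itertools import product

# All 8 equally likely outcomes of 3 tosses, e.g. ('H', 'T', 'H').
omega = list(product("HT", repeat=3))


def prob(event):
    """Probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))


def h1(w):
    return w[0] == "H"           # heads on first toss


def a(w):
    return w.count("H") == 2     # exactly two heads in total


p_both = prob(lambda w: h1(w) and a(w))
print(prob(h1), prob(a), p_both)       # 1/2 3/8 1/4
print(p_both == prob(h1) * prob(a))    # False, so not independent
```

Here 𝑃(𝐻1)𝑃(𝐴) = 3/16 while 𝑃(𝐻1 ∩ 𝐴) = 1/4, which is the product-rule version of the mismatch between 𝑃(𝐴|𝐻1) and 𝑃(𝐴).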

14
Q

Draw one card from a standard deck of playing cards. Let’s examine the independence of 3 events: ‘the card is an ace’, ‘the card is a heart’ and ‘the card is red’.

Define the events as 𝐴 = ‘ace’, 𝐻 = ‘hearts’, 𝑅 = ‘red’.

A

(a) We know that 𝑃(𝐴) = 4/52 = 1/13 (4 out of 52 cards are aces) and 𝑃(𝐴|𝐻) = 1/13 (1 out of 13 hearts is an ace). Since 𝑃(𝐴) = 𝑃(𝐴|𝐻) we have that 𝐴 is independent of 𝐻.

(b) 𝑃(𝐴|𝑅) = 2/26 = 1/13 = 𝑃(𝐴). So 𝐴 is independent of 𝑅. That is, whether the card is an ace is independent of whether it is red.

(c) Finally, what about 𝐻 and 𝑅? Since 𝑃(𝐻) = 1/4 and 𝑃(𝐻|𝑅) = 1/2, 𝐻 and 𝑅 are not independent. We could also see this the other way around: 𝑃(𝑅) = 1/2 and 𝑃(𝑅|𝐻) = 1, so 𝐻 and 𝑅 are not independent. That is, the suit of a card is not independent of the color of the card’s suit.
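
These three comparisons can be reproduced by building the deck as a set and counting. The sketch below is only an illustration; the card encoding and the helper `cond` are assumptions, not part of the original cards.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = set(product(ranks, suits))              # 52 (rank, suit) pairs

aces = {c for c in deck if c[0] == "A"}
hearts = {c for c in deck if c[1] == "hearts"}
reds = {c for c in deck if c[1] in ("hearts", "diamonds")}


def cond(a, b):
    """P(a | b) for equally likely cards: |a ∩ b| / |b|."""
    return Fraction(len(a & b), len(b))


# Conditioning on the whole deck gives the unconditional probability.
print(cond(aces, hearts), cond(aces, deck))    # 1/13 1/13 -> independent
print(cond(aces, reds), cond(aces, deck))      # 1/13 1/13 -> independent
print(cond(hearts, reds), cond(hearts, deck))  # 1/2 1/4   -> not independent
```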

15
Q

paradoxes of independence

A

An event 𝐴 with probability 0 is independent of itself, since in this case both sides of the independence condition 𝑃(𝐴 ∩ 𝐴) = 𝑃(𝐴) ⋅ 𝑃(𝐴) are 0. This appears paradoxical because knowledge that 𝐴 occurred certainly gives information about whether 𝐴 occurred. We resolve the paradox by noting that since 𝑃(𝐴) = 0 the statement ‘𝐴 occurred’ is vacuous.

Think: For what other value(s) of 𝑃(𝐴) is 𝐴 independent of itself?

16
Q

Bayes’ theorem

A

For two events 𝐴 and 𝐵, Bayes’ theorem (also called Bayes’ rule and Bayes’ formula) says

𝑃(𝐵|𝐴) = 𝑃(𝐴|𝐵) ⋅ 𝑃(𝐵) / 𝑃(𝐴)

  1. Bayes’ rule tells us how to ‘invert’ conditional probabilities, i.e. to find 𝑃(𝐵|𝐴) from 𝑃(𝐴|𝐵).
  2. In practice, 𝑃(𝐴) is often computed using the law of total probability.
17
Q

proof of Bayes’ theorem

A

The key point is that 𝐴 ∩ 𝐵 is symmetric in 𝐴 and 𝐵. So the multiplication rule says
𝑃(𝐵|𝐴) ⋅ 𝑃(𝐴) = 𝑃(𝐴 ∩ 𝐵) = 𝑃(𝐴|𝐵) ⋅ 𝑃(𝐵).
Now divide through by 𝑃(𝐴) to get Bayes’ rule.
A common mistake is to confuse the meanings of 𝑃(𝐴|𝐵) and 𝑃(𝐵|𝐴). They can be very different. This is illustrated in the next example.

18
Q

Toss a fair coin 5 times. Let 𝐻1 = ‘first toss is heads’ and let 𝐻𝐴 = ‘all 5 tosses are heads’. Then 𝑃(𝐻1|𝐻𝐴) = 1 but 𝑃(𝐻𝐴|𝐻1) = 1/16.

Use Bayes’ theorem to compute 𝑃(𝐻1|𝐻𝐴) using 𝑃(𝐻𝐴|𝐻1).

A

The terms are 𝑃(𝐻𝐴|𝐻1) = 1/16, 𝑃(𝐻1) = 1/2, 𝑃(𝐻𝐴) = 1/32. So

𝑃(𝐻1|𝐻𝐴) = 𝑃(𝐻𝐴|𝐻1) ⋅ 𝑃(𝐻1) / 𝑃(𝐻𝐴) = (1/16) ⋅ (1/2) / (1/32) = 1.

19
Q

base rate fallacy

A

The base rate fallacy is one of many examples showing that it’s easy to confuse the meaning of 𝑃(𝐵|𝐴) and 𝑃(𝐴|𝐵) when a situation is described in words. This is one of the key examples from probability and it will inform much of our practice and interpretation of statistics. You should strive to understand it thoroughly.

20
Q

Consider a routine screening test for a disease. Suppose the frequency of the disease in the population (base rate) is 0.5%. The test is fairly accurate with a 5% false positive rate and a 10% false negative rate.

You take the test and it comes back positive. What is the probability that you have the disease?

A

𝐷+ = ‘you have the disease’
𝐷− = ‘you do not have the disease’
𝑇+ = ‘you tested positive’
𝑇− = ‘you tested negative’.

We are given 𝑃(𝐷+) = 0.005 and therefore 𝑃(𝐷−) = 0.995. The false positive and false negative rates are (by definition) conditional probabilities.

𝑃(false positive) = 𝑃(𝑇+|𝐷−) = 0.05 and 𝑃(false negative) = 𝑃(𝑇−|𝐷+) = 0.1.

The complementary probabilities are known as the true negative and true positive rates:

𝑃(𝑇−|𝐷−) = 1 − 𝑃(𝑇+|𝐷−) = 0.95
𝑃(𝑇+|𝐷+) = 1 − 𝑃(𝑇−|𝐷+) = 0.9.

Bayes’ theorem, with 𝑃(𝑇+) computed by the law of total probability, gives

𝑃(𝐷+|𝑇+) = 𝑃(𝑇+|𝐷+) 𝑃(𝐷+) / [𝑃(𝑇+|𝐷+) 𝑃(𝐷+) + 𝑃(𝑇+|𝐷−) 𝑃(𝐷−)] = (0.9 × 0.005) / (0.9 × 0.005 + 0.05 × 0.995) ≈ 0.083.

So a positive test means only about an 8.3% chance of having the disease.
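
The screening numbers can be run through Bayes’ theorem in a few lines. This is a minimal sketch; the function name `posterior` and its parameter names are assumptions made for this illustration.

```python
from fractions import Fraction


def posterior(prior, true_positive_rate, false_positive_rate):
    """P(D+ | T+) via Bayes' theorem, with P(T+) from total probability."""
    p_positive = (true_positive_rate * prior
                  + false_positive_rate * (1 - prior))
    return true_positive_rate * prior / p_positive


p = posterior(
    prior=Fraction(5, 1000),              # base rate 0.5%
    true_positive_rate=Fraction(9, 10),   # 1 - false negative rate
    false_positive_rate=Fraction(5, 100),
)
print(p, float(p))  # 18/217, roughly 0.083
```

Changing `prior` while keeping the two rates fixed shows how strongly the answer depends on the base rate.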

21
Q

remarks on base rate fallacy

A

This is called the base rate fallacy because the base rate of the disease in the population is so low that the vast majority of the people taking the test are healthy, and even with an accurate test most of the positives will be healthy people. Ask your doctor for his/her guess at the odds.

To summarize the base rate fallacy with specific numbers: ‘95% of all tests are accurate’ does not imply ‘95% of positive tests are accurate’.

We will refer back to this example frequently. It and similar examples are at the heart of many statistical misunderstandings.

22
Q

Toss a fair coin 3 times.

What is the probability of 3 heads?

A

Sample space Ω = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}.

All outcomes are equally probable, so 𝑃(3 heads) = 1/8.

23
Q

Toss a fair coin 3 times.

Suppose the first toss was heads.

Given this information, how should we compute the probability of 3 heads?

A

Condition on the first toss being heads: the sample space shrinks to {HHH, HHT, HTH, HTT}. These 4 outcomes are equally probable and only HHH has 3 heads, so the probability is 1/4.

24
Q

conditional probability

A

takes into account additional conditions

25
Q

The conditional probability of 𝐴 knowing that 𝐵 occurred is written

A

𝑃(𝐴|𝐵)

‘the conditional probability of 𝐴 given 𝐵’

26
Q

Draw two cards from a deck. Define the events: 𝑆1 = ‘first card is a spade’ and 𝑆2 = ‘second card is a spade’. What is 𝑃(𝑆2|𝑆1)?

A

The other cards compute 𝑃(𝑆1) = 1/4 and 𝑃(𝑆1 ∩ 𝑆2) = 3/51. Using the definition of conditional probability we get

𝑃(𝑆2|𝑆1) = 𝑃(𝑆1 ∩ 𝑆2) / 𝑃(𝑆1) = (3/51) / (1/4) = 12/51.

27
Q

Draw two cards from a deck. Define the events: 𝑆1 = ‘first card is a spade’ and 𝑆2 = ‘second card is a spade’.

What is 𝑃(𝑆1 ∩ 𝑆2)?

A

We compute 𝑃(𝑆1 ∩ 𝑆2) by counting:

Number of ways to draw a spade followed by a second spade: 13 ⋅ 12.

Number of ways to draw any card followed by any other card: 52 ⋅ 51.

Thus 𝑃(𝑆1 ∩ 𝑆2) = (13 ⋅ 12) / (52 ⋅ 51) = 3/51.

28
Q

Draw two cards from a deck. Define the events: 𝑆1 = ‘first card is a spade’ and 𝑆2 = ‘second card is a spade’. What is 𝑃(𝑆2)?

A

Every card in the deck has equal probability of being the second card. Since 13 of the 52 cards are spades we get 𝑃(𝑆2) = 13/52 = 1/4.

This may seem surprising since the value of the first card certainly affects the probabilities for the second card. However, if we look at all possible two-card sequences, each of the 52 cards appears in the second position equally often.

Another way to say this is: if we are not given the value of the first card then we have to consider all possibilities for the second card.
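
The claim that 𝑃(𝑆2) = 1/4 can be confirmed by listing every ordered pair of distinct cards. The sketch below is a brute-force check written for illustration; the card encoding and the helper `prob` are assumptions, not part of the cards.

```python
from fractions import Fraction
from itertools import permutations

# 52 cards as (rank, suit); only the suit matters here.
deck = [(rank, suit) for suit in "SHDC" for rank in range(1, 14)]
draws = list(permutations(deck, 2))     # 52 * 51 ordered pairs


def prob(event):
    """Probability of an event given as a predicate on (first, second)."""
    return Fraction(sum(1 for d in draws if event(d)), len(draws))


p_s1 = prob(lambda d: d[0][1] == "S")
p_s2 = prob(lambda d: d[1][1] == "S")
p_both = prob(lambda d: d[0][1] == "S" and d[1][1] == "S")

print(p_s1, p_s2)       # 1/4 1/4
print(p_both)           # 1/17, i.e. 3/51
print(p_both / p_s1)    # 4/17, i.e. 12/51 = P(S2|S1)
```

`Fraction` reduces to lowest terms, so 3/51 prints as 1/17 and 12/51 as 4/17.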

29
Q

Draw two cards from a deck. Define the events: 𝑆1 = ‘first card is a spade’ and 𝑆2 = ‘second card is a spade’. What is 𝑃(𝑆1)?

A

We know that 𝑃(𝑆1) = 1/4 because there are 52 equally probable ways to draw the first card and 13 of them are spades.

30
Q

when does the law of total probability hold?

A

The law holds if we divide Ω into any number of events, so long as they are disjoint and cover all of Ω. Such a division is often called a partition of Ω.

31
Q

An urn contains 5 red balls and 2 green balls. A ball is drawn. If it’s green a red ball is added to the urn and if it’s red a green ball is added to the urn. (The original ball is not returned to the urn.) Then a second ball is drawn.

What does the tree diagram look like?

A
32
Q

What does independence mean for conditional probability?

A

𝐴 is independent of 𝐵 if 𝑃(𝐴|𝐵) = 𝑃(𝐴).

33
Q

base rate fallacy tree

A