Probability & Normal Curves Flashcards
for any random experiment, we need to define
- a list of all possible outcomes of the experiment
- a probability for each outcome
sample space
a sample space (or state space), S, of an experiment is the set of all possible experimental outcomes
event
an event is any collection (subset) of outcomes contained in the sample space, S
venn diagrams
sample spaces and events are often represented by Venn diagrams
- each dot represents a single outcome
- A and B consist of more than one outcome
rules of probability
rules of probability:
- the probability P(A) of any event A satisfies 0 ≤ P(A) ≤ 1
- probabilities are always expressed as values from 0 to 1
P(A) = 0: the event A will never happen
P(A) = 1: the event A will always happen
- if S is the sample space in a probability model, P(S) = 1
- since some outcome must occur on every trial
complement of an event
- an event is just a set, so relationships and results from elementary set theory can be used to study events
- the complement of an event A is the set of all outcomes in S that are not contained in A (i.e., A does not occur)
- the probabilities of A and not A add to 1
P(A) + P(not A) = 1
- in many cases, it’s easier to find the probability of the complement of an event and use the above rule to find the probability of the event
example: a loaded (i.e., biased) coin lands on “heads” 4 times out of 10. what is the probability of landing on “tails” for this coin?
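sketch: the coin question can be worked with the complement rule. this assumes the coin can only land on heads or tails, so "tails" is exactly "not heads"; the variable names are illustrative.

```python
from fractions import Fraction

# from the example: the coin lands on heads 4 times out of 10
p_heads = Fraction(4, 10)

# complement rule: P(not A) = 1 - P(A); tails is the complement of heads
p_tails = 1 - p_heads

print(float(p_tails))  # 0.6
```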
mutually exclusive or disjoint
- two events A and B are said to be mutually exclusive or disjoint if they have no outcomes in common
- that is, A and B do not intersect in Venn diagram
- by definition, A and not A are mutually exclusive
example: A = {1, 3, 5, 7, 9} B = {2, 4, 6, 8}
- mutually exclusive events cannot occur on the same trial
- if A and B are mutually exclusive events, P(A or B) = P(A) + P(B)
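sketch: the sets A and B from the example can be checked for disjointness with Python sets. the sample space S = {1, ..., 9} with equally likely outcomes is an assumption added here for illustration; the flashcard does not state it.

```python
from fractions import Fraction

A = {1, 3, 5, 7, 9}
B = {2, 4, 6, 8}

# assumption (not in the example): the sample space is 1..9, all equally likely
S = set(range(1, 10))

def prob(event):
    """probability of an event as (outcomes in event) / (outcomes in S)"""
    return Fraction(len(event), len(S))

# mutually exclusive means the events share no outcomes
print(A.isdisjoint(B))                    # True

# addition rule for mutually exclusive events: P(A or B) = P(A) + P(B)
print(prob(A | B) == prob(A) + prob(B))   # True
```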
example: the graph summarizes the probabilities of outcomes of a roll of two dice
what is the probability of a sum greater than 9 when rolling two dice?
P(>9) = P(10) + P(11) + P(12)
= 0.0833 + 0.0556 + 0.0278
= 0.1667
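sketch: the same probability can be found by listing all 36 outcomes of two dice instead of reading them off the graph. this assumes both dice are fair; `p_sum` is an illustrative helper name.

```python
from fractions import Fraction
from itertools import product

# sample space: all 36 equally likely (die1, die2) pairs, assuming fair dice
sample_space = list(product(range(1, 7), repeat=2))

def p_sum(total):
    """probability that the two dice add up to `total`"""
    favorable = [o for o in sample_space if sum(o) == total]
    return Fraction(len(favorable), len(sample_space))

# the sums 10, 11 and 12 are mutually exclusive, so their probabilities add
p_gt_9 = p_sum(10) + p_sum(11) + p_sum(12)

print(p_gt_9, round(float(p_gt_9), 4))  # 1/6 0.1667
```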
independent & dependent events
- two events A and B are independent if the occurrence of one has no effect on the occurrence of the other event
- in other words, the probability of event B occurring is not affected by whether or not event A has occurred, and vice versa
- otherwise, the events are dependent
- if A and B are independent, the probability of both A and B occurring is the product of their individual probabilities: P(A and B) = P(A) x P(B)
imagine rolling two dice and flipping a fair coin: are flipping “heads” and rolling a sum greater than 9 independent?
the dice roll and coin flip are independent events
the probability of flipping heads and rolling a sum greater than 9 is
P(A and B) = P(A) x P(B)
= 0.5 x 0.1667
= 0.0833
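sketch: independence of the coin flip and the dice roll can be checked by enumerating the joint sample space and comparing P(A and B) with P(A) x P(B). this assumes a fair coin and fair dice; the helper and event names are illustrative.

```python
from fractions import Fraction
from itertools import product

# joint sample space: (coin, die1, die2), assuming a fair coin and fair dice
outcomes = list(product("HT", range(1, 7), range(1, 7)))

def prob(event):
    """probability of an event given as a true/false test on one outcome"""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def heads(o):
    return o[0] == "H"

def sum_gt_9(o):
    return o[1] + o[2] > 9

p_a = prob(heads)
p_b = prob(sum_gt_9)
p_a_and_b = prob(lambda o: heads(o) and sum_gt_9(o))

# independent events satisfy P(A and B) = P(A) x P(B)
print(p_a_and_b == p_a * p_b, p_a_and_b)  # True 1/12
```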
if one event is contained inside another event
- if one event is contained inside another event, then the ‘subset’ cannot have a higher probability than the encompassing event
- event A is contained in event B if the ways A can occur are a subset of the ways B can occur; in that case P(A) ≤ P(B)
example: P(world ends due to meteorite or nuclear war in 100 years) ≤ P(world ends in 100 years)
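sketch: the subset rule can be illustrated with two fair dice, where "sum is 12" is contained in "sum is greater than 9". the choice of events is an illustration added here, not part of the flashcard.

```python
from fractions import Fraction
from itertools import product

# sample space: all 36 equally likely (die1, die2) pairs, assuming fair dice
sample_space = set(product(range(1, 7), repeat=2))

A = {o for o in sample_space if sum(o) == 12}   # contained event
B = {o for o in sample_space if sum(o) > 9}     # encompassing event

def prob(event):
    """probability of an event as (outcomes in event) / (outcomes in S)"""
    return Fraction(len(event), len(sample_space))

# A is a subset of B, so P(A) cannot exceed P(B)
print(A <= B, prob(A) <= prob(B))  # True True
```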
marginal distribution
the marginal distribution of one of the categorical variables in a two-way table is the distribution of values of that variable among all individuals in the table
- the percents for marginal distributions are found by dividing each row total or column total by the table total
conditional distribution
the conditional distribution of a variable describes the values of that variable among individuals who have a given value of another variable
- the percents for conditional distributions are found by dividing each row entry by its row total, or each column entry by its column total
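sketch: marginal and conditional distributions computed from a small two-way table. the table, its labels, and its counts are made up for illustration and do not come from the flashcards.

```python
# hypothetical two-way table of counts (made-up numbers for illustration):
# rows are one categorical variable, columns are another
table = {
    "first-year": {"yes": 30, "no": 20},
    "second-year": {"yes": 10, "no": 40},
}

table_total = sum(sum(row.values()) for row in table.values())

# marginal distribution of the row variable: each row total / table total
marginal_rows = {r: sum(c.values()) / table_total for r, c in table.items()}

def conditional_given_row(row_name):
    """conditional distribution of the column variable within one row:
    each entry in that row / that row's total"""
    row = table[row_name]
    row_total = sum(row.values())
    return {col: count / row_total for col, count in row.items()}

print(marginal_rows)                        # {'first-year': 0.5, 'second-year': 0.5}
print(conditional_given_row("first-year"))  # {'yes': 0.6, 'no': 0.4}
```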
empirical rules
- if the data is normal (i.e. bell-shaped, symmetrical with a single peak), with mean μ and standard deviation σ, then
1. roughly 68% of the values are within (plus or minus) one standard deviation of the mean [μ − σ, μ + σ]
2. roughly 95% of the values are within (plus or minus) two standard deviations of the mean [μ − 2σ, μ + 2σ]
3. roughly 99.7% of the values are within (plus or minus) three standard deviations of the mean [μ − 3σ, μ + 3σ]
student height is normally distributed with mean of 165 cm, and standard deviation of 10 cm
a) what proportion of students have a height between 145 cm and 185 cm?
b) what proportion of students have a height between 145 cm and 175 cm?
solution:
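a) z145 = (145 − 165) / 10 = −2
z185 = (185 − 165) / 10 = 2
roughly 95% of the values are within two standard deviations of the mean, so the proportion is about 0.95
b) z175 = (175 − 165) / 10 = 1
0.68 / 2 = 0.34 (mean to 175 cm)
0.95 / 2 = 0.475 (145 cm to mean)
proportion -> 0.475 + 0.34 = 0.815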
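sketch: the empirical rule gives rounded answers (about 0.95 for part a and 0.815 for part b). they can be compared with the exact normal-curve areas using Python's standard-library `statistics.NormalDist`; `proportion_between` is an illustrative helper name.

```python
from statistics import NormalDist

# student heights from the example: mean 165 cm, standard deviation 10 cm
heights = NormalDist(mu=165, sigma=10)

def proportion_between(low, high):
    """area under the normal curve between `low` and `high`"""
    return heights.cdf(high) - heights.cdf(low)

part_a = proportion_between(145, 185)  # within two standard deviations
part_b = proportion_between(145, 175)  # from -2 to +1 standard deviations

print(round(part_a, 3), round(part_b, 3))  # 0.954 0.819
```

the exact areas differ slightly from 0.95 and 0.815 because the 68% and 95% figures in the empirical rule are themselves rounded.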