Exam #2 - Chs. 4.4, 5, 6, 7 Flashcards
simulation
technique used to recreate a random/unpredictable event
–> tactile or virtual
–> goal: measure how often some outcome occurs
probability
long-term proportion of some outcome
–> Law of Large Numbers: as number of repetitions increases, proportion of an outcome approaches its probability
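A minimal simulation sketch of the Law of Large Numbers, assuming a fair coin (probability of heads 0.5) and a fixed seed so the run is repeatable. The function name `proportion_of_heads` is only illustrative.

```python
import random

def proportion_of_heads(num_flips, p_heads=0.5, seed=0):
    """Simulate num_flips coin flips and return the proportion of heads."""
    rng = random.Random(seed)  # fixed seed: same result on every run
    heads = sum(1 for _ in range(num_flips) if rng.random() < p_heads)
    return heads / num_flips

# The proportion should settle near 0.5 as the number of flips grows.
for n in (10, 100, 10_000, 100_000):
    print(n, proportion_of_heads(n))
```

With few flips the proportion can be far from 0.5; with many flips it should stay close to it.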
experiment
any repeatable situation with uncertain results
normal distribution
a continuous random variable is normally distributed if its relative frequency histogram has the shape of a normal curve
General Multiplication Rule
for any two events E and F, P(E and F) = P(E)*P(F|E)
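A small worked sketch of the rule, assuming two cards are drawn without replacement from a standard 52-card deck (the card example is an illustration, not part of the original card). E = first card is an ace, F = second card is an ace.

```python
from fractions import Fraction

p_e = Fraction(4, 52)          # P(E): 4 aces among 52 cards
p_f_given_e = Fraction(3, 51)  # P(F|E): 3 aces left among 51 cards

p_e_and_f = p_e * p_f_given_e  # General Multiplication Rule
print(p_e_and_f)  # 1/221
```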
bell-shaped graphs?
for a fixed p, as n increases, the binomial distribution becomes more bell-shaped
–> the distribution of a binomial random variable X is approximately bell-shaped if np(1-p) ≥ 10
combination
an unordered arrangement without replacement
–> number of arrangements of r objects from n objects, where r ≤ n and all n objects are distinct, = (nCr) = (n!)/(r!(n-r)!)
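A quick check of the combination formula, assuming the example of choosing 2 objects from 5. The helper name `n_choose_r` is illustrative; `math.comb` is the standard-library equivalent (Python 3.8+).

```python
import math

def n_choose_r(n, r):
    """nCr = n! / (r! (n-r)!), for 0 <= r <= n."""
    return math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

print(n_choose_r(5, 2))  # 10
print(math.comb(5, 2))   # 10
```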
standard deviation of binomial random variable
√(np(1-p))
area under the curve
represents the proportion of the population with a characteristic within that interval
OR
probability of a randomly selected individual having a characteristic within that interval
event (E)
any collection of outcomes
–> can be one or multiple
–> simple event (ei) represents only one outcome
probability model
list of all possible outcomes and their probabilities for any given experiment
probability density function (pdf)
equation used to measure probabilities for continuous random variables
–> total area under curve = 1
–> height of curve at all points ≥ 0
permutation
an ordered arrangement without replacement
–> number of arrangements of r objects from n objects, where r ≤ n and all n objects are distinct, = (nPr) = (n!)/((n-r)!)
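A quick check of the permutation count n!/((n-r)!), assuming the example of arranging 2 objects out of 5. The helper name `n_permute_r` is illustrative; `math.perm` is the standard-library equivalent (Python 3.8+).

```python
import math

def n_permute_r(n, r):
    """nPr = n! / (n-r)!, for 0 <= r <= n."""
    return math.factorial(n) // math.factorial(n - r)

print(n_permute_r(5, 2))  # 20
print(math.perm(5, 2))    # 20
```

Order matters here, so 5P2 = 20 is twice 5C2 = 10: each unordered pair can be arranged in 2! ways.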
normal approximation to the binomial pdf
if np(1-p) ≥ 10, the binomial random variable X is approximately normally distributed, with a mean μ = np and standard deviation √(np(1-p))
–> to approximate probabilities, we must correct for continuity (add/subtract 0.5 from x value)
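A sketch comparing an exact binomial probability with its normal approximation, assuming n = 100 and p = 0.5 (so np(1-p) = 25 ≥ 10). These values are chosen only for illustration. To approximate P(X ≤ 45), the continuity correction uses the area to the left of 45.5.

```python
import math
from statistics import NormalDist

n, p = 100, 0.5  # assumed example values

# Exact P(X <= 45): sum the binomial probabilities for x = 0..45
exact = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(46))

# Normal approximation with mean np and standard deviation sqrt(np(1-p))
approx_dist = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))
approx = approx_dist.cdf(45.5)  # continuity correction: 45 + 0.5

print(round(exact, 4), round(approx, 4))
```

The two values should agree closely; dropping the 0.5 correction gives a noticeably worse match.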
standard deviation for discrete random variable
√(∑[(x-μ)^2·P(x)])
= √(∑[x^2·P(x)] - μ^2)
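A short sketch showing that both forms of the formula give the same result, assuming a fair six-sided die as the discrete random variable (an illustrative choice). The mean μ = ∑[x·P(x)] is computed first because both forms need it.

```python
import math
from fractions import Fraction

# Assumed example: fair die, each face has probability 1/6
dist = {x: Fraction(1, 6) for x in range(1, 7)}

mu = sum(x * p for x, p in dist.items())                           # mean
var_definition = sum((x - mu) ** 2 * p for x, p in dist.items())   # sum of (x-mu)^2 P(x)
var_shortcut = sum(x ** 2 * p for x, p in dist.items()) - mu ** 2  # sum of x^2 P(x), minus mu^2

sigma = math.sqrt(var_definition)
print(mu, var_definition, var_shortcut, round(sigma, 3))
```

Using `Fraction` keeps the arithmetic exact until the final square root.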
Classical (Theoretical) Approach
P(E) = (number of possibilities of E)/(number of total possibilities)
–> requires that all outcomes are equally likely
rules of probability
–> for any event E, 0 ≤ P(E) ≤ 1
–> for S = {e1, e2, …, en}, P(e1) + P(e2) + … + P(en) = 1
associations between variables
as (x) changes, it becomes clear that (y) changes in some pattern
–> can help adjust predictions once made clear
–> relative frequencies for all categories, given some condition, will be equal if an association does not exist
sample space (S)
collection of all possible outcomes
Complement Rule for Z
area to the right of z = 1 - area to the left
–> can be used to find percentile rank (kth percentile is where k% of area is to the left)
random variable
numerical measure of the outcome of a probability experiment (X)
–> discrete: finite/countable number of values
–> continuous: infinitely many (uncountable) values, such as any value in an interval
impossible probability
P(E) = 0
independence
if occurrence of some event E does not affect the probability of occurrence for some event F
–> disjoint events (each with nonzero probability) are not independent: if one occurs, the other cannot
–> two events E and F are independent if P(E|F) = P(E)
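A sketch of the independence check P(E|F) = P(E), assuming one roll of a fair die. The events and the helper names `prob` and `cond_prob` are illustrative.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}  # sample space for one fair die roll

def prob(event):
    """Classical probability: outcomes in the event / outcomes in S."""
    return Fraction(len(event & S), len(S))

def cond_prob(e, f):
    """P(E|F) = P(E and F) / P(F)."""
    return prob(e & f) / prob(f)

E = {2, 4, 6}     # roll is even
F = {1, 2, 3, 4}  # roll is at most 4
print(cond_prob(E, F) == prob(E))  # True: E and F are independent

G = {1, 2}
H = {3, 4}        # G and H are disjoint
print(cond_prob(G, H) == prob(G))  # False: disjoint events are not independent
```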
conditional distribution
relative frequency of each value of a response variable for some specific value of explanatory variable
–> can show associations between variables
properties of normal density curve
–> symmetric about the mean μ
–> single peak at x = μ (mean = median = mode)
–> inflection points at x = μ-σ and x = μ+σ
–> area under curve = 1
–> area under curve to left of μ = 1/2 = area to right
disjoint events
two events with no outcomes in common
z α (z sub alpha)
z-score such that the area under the curve to the right of z = α (to the left, 1 - α)
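A sketch of finding z α without a table, using the standard library's `statistics.NormalDist` (Python 3.8+). Since the area to the left of z α is 1 - α, the inverse cumulative function is evaluated at 1 - α. The function name `z_alpha` is illustrative.

```python
from statistics import NormalDist

def z_alpha(alpha):
    """z-score with area alpha to its right under the standard normal curve."""
    return NormalDist().inv_cdf(1 - alpha)

print(round(z_alpha(0.05), 3))   # 1.645
print(round(z_alpha(0.025), 2))  # 1.96
```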
subjective probability
probability determined on the basis of personal judgment (can still be valid)
General Addition Rule
for any two events E and F, P(E or F) = P(E) + P(F) - P(E and F)
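A small check of the rule, assuming one roll of a fair die with E = roll is even and F = roll is greater than 3 (illustrative events). The outcomes 4 and 6 are in both events, so they must be subtracted once.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}
E = {2, 4, 6}  # even
F = {4, 5, 6}  # greater than 3

def prob(event):
    return Fraction(len(event), len(S))

by_rule = prob(E) + prob(F) - prob(E & F)  # General Addition Rule
by_counting = prob(E | F)                  # count outcomes in "E or F" directly
print(by_rule, by_counting)  # 2/3 2/3
```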
Multiplication Rule of Counting
if a sequence of choices has p options for the first choice, q options for the second choice, and r options for the third choice, and the number of options at each step does not depend on earlier choices, the number of possibilities = pqr
Complement Rule
if E^c represents all outcomes in S that are not in some event E, P(E^c) = 1-P(E)
inflection points
points where curvature of the graph changes; occur at x = μ-σ and x = μ+σ
mean for discrete random variable
∑[x*P(x)]
–> mean value of n trials of the experiment will approach the overall mean as n increases
binomial probability distribution
discrete probability distribution with two mutually exclusive outcomes (typically, success vs. failure)
–> fixed number of independent trials
–> each trial must have two disjoint or mutually exclusive outcomes
–> probability of success must be fixed for all trials
random process
situation where the outcome of any particular trial is unknown, but the proportion of trials in which a given outcome is observed approaches some value as the number of trials increases
end behavior of normal curve
as x approaches positive and negative infinity, the graph approaches - but never reaches - the horizontal axis
–> probabilities computed as 0 are reported as < 0.001; computed as 1, reported as > 0.999
conditional probability
P(F|E) = probability of F occurring, given that E has already occurred
–> if E and F are any two events, P(F|E) = (P(E and F))/(P(E))
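A worked sketch of the formula, assuming one card drawn from a standard 52-card deck with E = card is red and F = card is a heart (an illustrative example). Every heart is red, so P(E and F) = P(F).

```python
from fractions import Fraction

p_e = Fraction(26, 52)        # P(E): 26 red cards
p_e_and_f = Fraction(13, 52)  # P(E and F): 13 hearts, all of them red

p_f_given_e = p_e_and_f / p_e  # P(F|E) = P(E and F) / P(E)
print(p_f_given_e)  # 1/2
```

Knowing the card is red raises the probability of a heart from 1/4 to 1/2.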
Addition Rule for Disjoint Events
if E and F are disjoint or mutually exclusive, P(E or F) = P(E) + P(F)
mutually exclusive events
occurrence of one event prohibits the occurrence of the other
marginal distribution
frequency or relative frequency distribution for the row/column variable
–> removes effect of other variable
mean of binomial random variable
μ = np
binomial random variable
when random variable X represents the number of successes in n trials
–> p = probability of success
–> (1-p) = probability of failure
standard normal distribution
if the normal random variable X has mean μ and standard deviation σ, it can be standardized to Z = (X-μ)/σ
–> Z has μ = 0 and σ = 1
–> table V gives areas under the normal curve for some value of z (z-score)
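A sketch of standardizing a value and finding the area to its left, using `statistics.NormalDist` in place of a printed table. The values μ = 100, σ = 15, and x = 130 are assumed for illustration.

```python
from statistics import NormalDist

mu, sigma = 100, 15  # assumed example parameters
x = 130

z = (x - mu) / sigma             # z-score
area_left = NormalDist().cdf(z)  # area to the left of z under the standard normal curve
area_right = 1 - area_left       # complement rule

# Same area computed directly from the unstandardized distribution
direct = NormalDist(mu, sigma).cdf(x)

print(z, round(area_left, 4), round(area_right, 4))
```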
Empirical (Experimental) Approach
P(E) ≈ relative frequency of E
P(E) ≈ (frequency of E)/(number of trials)
binomial probability distribution function (pdf)
probability of x successes in n trials = (nCx)·p^x·(1-p)^(n-x)
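A minimal sketch of the formula P(x) = (nCx)·p^x·(1-p)^(n-x) as a function, assuming n = 4 trials with p = 0.5 for the example. The name `binomial_pmf` is illustrative.

```python
import math

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

print(binomial_pmf(2, 4, 0.5))  # 0.375

# The probabilities over x = 0..n sum to 1, and the mean equals np.
total = sum(binomial_pmf(x, 4, 0.5) for x in range(5))
mean = sum(x * binomial_pmf(x, 4, 0.5) for x in range(5))
print(total, mean)
```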
Simpson’s Paradox
when association between two variables inverts or disappears completely after a third variable is introduced
certain probability
P(E) = 1
unusual probability
P(E) < 0.05 (< 5%)
Multiplication Rule for Independent Events
if E and F are independent, P(E and F) = P(E)*P(F)
permutations with non-distinct items
if, of n objects, n1 are of one kind, n2 are of a second kind, …, and nk are of a kth kind, the total number of distinct arrangements = (n!)/(n1!·n2!·…·nk!), where n = n1 + n2 + … + nk
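A sketch of the count n!/(n1!·n2!·…·nk!), assuming the letters of the word MISSISSIPPI as the example (1 M, 4 I, 4 S, 2 P). The function name `distinct_arrangements` is illustrative.

```python
import math
from collections import Counter

def distinct_arrangements(items):
    """n! divided by the factorial of each kind's count."""
    counts = Counter(items)            # how many of each kind
    total = math.factorial(len(items))
    for c in counts.values():
        total //= math.factorial(c)    # each division is exact
    return total

print(distinct_arrangements("MISSISSIPPI"))  # 34650
```

When every item is distinct, all counts are 1 and the result reduces to n!.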
stacked/segmented bar graph
one bar for each value of explanatory variable –> split into proportions corresponding to each response
contingency table
relates two categorical variables (row variable and column variable)