Ch1 Probability Fundamentals Flashcards
Discrete probability characterized by:
pmf = p(x)
pmf = p(x) satisfies:
Kolmogorov axioms
- p(x) = Pr(X = x)
- p(x) ≥ 0 for all x
- Σ p(x) = 1, summing over all possible x
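A quick numeric check of these axioms, a minimal sketch assuming NumPy and a hypothetical fair-die pmf:

```python
import numpy as np

# Hypothetical pmf: a fair six-sided die.
support = np.arange(1, 7)
p = np.full(6, 1 / 6)

assert np.all(p >= 0)            # axiom: p(x) >= 0 for all x
assert np.isclose(p.sum(), 1.0)  # axiom: the p(x) sum to 1
print("valid pmf over", support)
```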
cdf =
cumulative distribution function
F(x) = Pr(X ≤ x)
F(x) = Σ_{t ≤ x} p(t)
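The discrete cdf is just a running sum of the pmf; a sketch reusing the hypothetical die pmf:

```python
import numpy as np

# cdf of a fair die: cumulative sums of the pmf.
support = np.arange(1, 7)
p = np.full(6, 1 / 6)
F = np.cumsum(p)

print(dict(zip(support, F)))     # F(3) = Pr(X <= 3) = 0.5
assert np.isclose(F[-1], 1.0)    # the cdf reaches 1 at the top of the support
```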
E[X]=
Σ x·p(x) over all x (use an integral in the continuous case)
E[X]= words
mean of X; long-run average; the 1st raw moment (moment about the origin)
Var(X)=
= E[(X − E[X])²]
= E[X²] − (E[X])²
Var(X)= (computational form)
E[X²] − (E[X])²
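Both cards in code: the expectation from the pmf, then the definitional and computational forms of the variance agreeing. The pmf is hypothetical; assumes NumPy.

```python
import numpy as np

# Hypothetical pmf over {0, 1, 2, 3}.
x = np.array([0, 1, 2, 3])
p = np.array([0.1, 0.2, 0.3, 0.4])

EX  = np.sum(x * p)                   # E[X] = sum x * p(x)
EX2 = np.sum(x**2 * p)                # E[X^2]
var_def  = np.sum((x - EX)**2 * p)    # E[(X - E[X])^2]
var_comp = EX2 - EX**2                # E[X^2] - (E[X])^2

print(EX, var_def, var_comp)          # 2.0, 1.0, 1.0
assert np.isclose(var_def, var_comp)
```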
rth raw moment (about the origin)
E[X^r]
rth central moment
E[(X − µ)^r]
Var(X)=
σ² = E[(X − µ)²], the 2nd central moment
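Raw and central moments computed for the same hypothetical pmf; the r = 2 case recovers Var(X):

```python
import numpy as np

x = np.array([0, 1, 2, 3])
p = np.array([0.1, 0.2, 0.3, 0.4])
mu = np.sum(x * p)

def raw_moment(r):       # E[X^r]
    return np.sum(x**r * p)

def central_moment(r):   # E[(X - mu)^r]
    return np.sum((x - mu)**r * p)

print(raw_moment(1), central_moment(2))                      # mean, variance
assert np.isclose(central_moment(2), raw_moment(2) - mu**2)  # Var = E[X^2] - mu^2
```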
marginal pmf
p_X(x) = Σ_y p(x, y): sum the joint pmf over all values of the other variable
R.V.’s are independent iff
p(x, y) = p_X(x) · p_Y(y) for all x, y
Cov(X,Y)=
E[(X − µ_X)(Y − µ_Y)]
= E[XY] − E[X]E[Y]
(independence implies Cov(X,Y) = 0; the converse need not hold)
Corr(X,Y)=
ρ = Cov(X,Y) / (σ_X σ_Y), always in [−1, 1]
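One small joint pmf table exercises all four cards; the table itself is hypothetical:

```python
import numpy as np

# Hypothetical joint pmf for (X, Y); rows index x = 0, 1, columns y = 0, 1.
joint = np.array([[0.10, 0.20],
                  [0.30, 0.40]])
x = np.array([0, 1])
y = np.array([0, 1])

px = joint.sum(axis=1)   # marginal pmf of X: sum over y
py = joint.sum(axis=0)   # marginal pmf of Y: sum over x

EX, EY = np.sum(x * px), np.sum(y * py)
EXY = sum(joint[i, j] * x[i] * y[j] for i in range(2) for j in range(2))
cov = EXY - EX * EY      # E[XY] - E[X]E[Y]

sx = np.sqrt(np.sum((x - EX)**2 * px))
sy = np.sqrt(np.sum((y - EY)**2 * py))
print("Cov:", cov, "Corr:", cov / (sx * sy))

# Independence check: is the joint the outer product of the marginals? Not here.
print(np.allclose(joint, np.outer(px, py)))
```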
Bernoulli distribution
X ∈ {0, 1} with Pr(X = 1) = p; E[X] = p, Var(X) = p(1 − p)
Binomial distribution
X = number of successes in n independent Bernoulli(p) trials;
p(x) = C(n, x) p^x (1 − p)^(n − x), x = 0, 1, …, n; E[X] = np, Var(X) = np(1 − p)
Poisson distribution
p(x) = e^(−λ) λ^x / x!, x = 0, 1, 2, …; E[X] = Var(X) = λ
Normal distribution
f(x) = (1 / (σ√(2π))) exp(−(x − µ)² / (2σ²)), −∞ < x < ∞;
E[X] = µ, Var(X) = σ²; Z = (X − µ)/σ ~ N(0, 1)
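Checking the mean/variance formulas against SciPy's frozen distributions; the parameter values below are arbitrary:

```python
from scipy import stats

n, p, lam = 10, 0.3, 4.0   # arbitrary parameter choices

print(stats.bernoulli(p).mean(), p)                          # both 0.3
print(stats.binom(n, p).var(), n * p * (1 - p))              # both 2.1
print(stats.poisson(lam).mean(), stats.poisson(lam).var())   # both 4.0
print(stats.norm(loc=2, scale=3).var())                      # sigma^2 = 9.0
```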
CLT
If X₁, …, Xₙ are iid with mean µ and variance σ² < ∞, then
(X̄ − µ) / (σ/√n) → N(0, 1) in distribution as n → ∞
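A simulation sketch of the CLT: standardized means of (decidedly non-normal) exponential draws behave like N(0, 1). The sample sizes and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 50, 20000
mu = sigma = 1.0                      # Exponential(1): mean = sd = 1

xbar = rng.exponential(1.0, size=(reps, n)).mean(axis=1)
z = (xbar - mu) / (sigma / np.sqrt(n))

print(z.mean(), z.std())              # close to 0 and 1
print(np.mean(np.abs(z) < 1.96))      # close to 0.95
```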
normal approximation to binomial
If X ~ Bin(n, p) with np and n(1 − p) both large, then X ≈ N(np, np(1 − p))
continuity correction for continuous approx to discrete distribution
e.g. normal approx to binomial:
Pr(X ≤ x) ≈ Φ((x + 0.5 − np) / √(np(1 − p))); shift each discrete boundary by 0.5
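Comparing the exact binomial cdf with the plain and continuity-corrected normal approximations; n, p, and x below are arbitrary:

```python
import numpy as np
from scipy import stats

n, p, x = 40, 0.4, 18
mu, sd = n * p, np.sqrt(n * p * (1 - p))

exact     = stats.binom.cdf(x, n, p)
plain     = stats.norm.cdf(x, mu, sd)
corrected = stats.norm.cdf(x + 0.5, mu, sd)   # continuity correction

print(exact, plain, corrected)   # the corrected value lands closer to exact
```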
chi-square
χ²_k = Z₁² + … + Z_k² for Z₁, …, Z_k iid N(0, 1); E = k (the df), Var = 2k
t-distribution
T = Z / √(V/ν), with Z ~ N(0, 1) and V ~ χ²_ν independent;
symmetric about 0, heavier tails than the normal, → N(0, 1) as ν → ∞
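Building t-variates from the definition and checking a quantile against SciPy; the df and seed are arbitrary:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
df, reps = 5, 200000

z = rng.standard_normal(reps)
v = rng.chisquare(df, reps)
t = z / np.sqrt(v / df)        # T = Z / sqrt(V / df)

print(np.quantile(t, 0.975), stats.t.ppf(0.975, df))   # both around 2.57
```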
θ̂ is an unbiased estimator of θ iff
E[θ̂] = θ
θ̂ is a weakly consistent estimator of θ iff
for any small positive constant ε,
Pr(|θ̂ − θ| < ε) → 1 as n → ∞
also called convergence in probability:
θ̂ →p θ
If Bias(θ̂) → 0 and Var(θ̂) → 0 as n → ∞,
then θ̂ is a consistent estimator of θ
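Consistency of the sample mean by simulation: the probability of landing within ε of µ climbs toward 1 as n grows. ε, µ, and the sample sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
mu, eps, reps = 5.0, 0.1, 2000

for n in (10, 100, 1000, 4000):
    xbar = rng.normal(mu, 2.0, size=(reps, n)).mean(axis=1)
    print(n, np.mean(np.abs(xbar - mu) < eps))   # rises toward 1
```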
MSE(θ̂) =
E[(θ̂ − θ)²] = Var(θ̂) + Bias²(θ̂)
Relative efficiency
RE(θ̂₁, θ̂₂) = MSE(θ̂₂) / MSE(θ̂₁)
RE(X, Y) < 1 means
X is less efficient than Y (Y has the smaller MSE)
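A simulation check of the MSE decomposition, using a deliberately biased estimator of a normal mean; the +0.2 shift is made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
theta, n, reps = 5.0, 25, 200000

# Hypothetical biased estimator: sample mean plus a fixed 0.2 shift.
theta_hat = rng.normal(theta, 2.0, size=(reps, n)).mean(axis=1) + 0.2

mse  = np.mean((theta_hat - theta) ** 2)
var  = np.var(theta_hat)
bias = np.mean(theta_hat) - theta

print(mse, var + bias**2)   # MSE = Var + Bias^2
```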
Markov’s inequality
For X ≥ 0 and a > 0: Pr(X ≥ a) ≤ E[X]/a
Chebyshev’s inequality (words)
For any distribution with finite mean and variance, no more than 1/k² of the probability lies more than k standard deviations from the mean.
Pr(|X − µ| ≥ kσ)
Pr(|X − µ| ≥ kσ) ≤ 1/k²
The probability that X deviates from its mean µ by at least k standard deviations is at most 1/k².
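Chebyshev's bound next to the actual tail probabilities for an Exponential(1) sample; the distribution and k values are chosen arbitrarily:

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.exponential(1.0, 100000)   # Exponential(1): mean = 1, sd = 1
mu, sigma = x.mean(), x.std()

for k in (2, 3, 4):
    tail = np.mean(np.abs(x - mu) >= k * sigma)
    print(k, tail, "<=", 1 / k**2)   # empirical tail never exceeds 1/k^2
```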
Slutsky’s theorem
If θ̂₁ →p θ₁ and θ̂₂ →p θ₂,
then the sum and product also converge:
θ̂₁ + θ̂₂ →p θ₁ + θ₂ and θ̂₁ · θ̂₂ →p θ₁ · θ₂
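A sketch of the sum/product claims: two sample means converge in probability, and so do their sum and product. The true means, sample sizes, and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(5)
mu1, mu2 = 2.0, 3.0

for n in (100, 10000, 1000000):
    t1 = rng.normal(mu1, 1.0, n).mean()   # consistent for mu1
    t2 = rng.normal(mu2, 1.0, n).mean()   # consistent for mu2
    print(n, t1 + t2, t1 * t2)            # -> 5.0 and 6.0
```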
E[S²]=
Var(S²)=
remember: for a normal sample, ((n − 1)/σ²) · S² ~ chi-squared with df = n − 1
E[S²] = σ²
Var(S²) = 2σ⁴ / (n − 1)
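A simulation check of both results for a normal sample; σ, n, and the seed are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(6)
sigma, n, reps = 2.0, 10, 200000

x = rng.normal(0.0, sigma, size=(reps, n))
s2 = x.var(axis=1, ddof=1)               # sample variance S^2

print(s2.mean(), sigma**2)               # E[S^2] = sigma^2 = 4.0
print(s2.var(), 2 * sigma**4 / (n - 1))  # Var(S^2) ~ 3.56
```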