MTH2006 STATISTICAL MODELLING AND INFERENCE Flashcards
cumulative distribution function (cdf) of a random variable Y
F_Y(y) = Pr(Y ≤ y) where y belongs to the range space of Y
probability mass function (pmf) [if Y is discrete]
f_Y(y) = Pr(Y = y) and F_Y(y) = sum(x: x ≤ y) f_Y(x)
probability density function (pdf) [if Y is continuous]
f_Y(y) = d/dy F_Y(y) and F_Y(y) = integral(−∞ to y) f_Y(x) dx
p-quantile of a random variable Y
the value y_p for which Pr(Y ≤ y_p) = p
Pr(Y > y)
1 − Pr(Y ≤ y)
joint cumulative distribution function (cdf) of a vector (Y1, . . . , Yn)
F_Y(y1, . . . , yn) = Pr(Y1 ≤ y1, . . . , Yn ≤ yn)
If Y1, . . . , Yn are discrete then their joint pmf is defined by
f_Y(y1, . . . , yn) = Pr(Y1 = y1, . . . , Yn = yn)
If Y1, . . . , Yn are continuous then their joint pdf is defined by
f_Y(y1, . . . , yn) = ∂^n F_Y(y1, . . . , yn) / (∂y1 . . . ∂yn)
Y1, . . . , Yn are independent if
f_Y(y1, . . . , yn) = f_Y1(y1) . . . f_Yn(yn) for all y1, . . . , yn
Y1, . . . , Yn are identically distributed if
f_Y1(y) = . . . = f_Yn(y) for all y
if Y1, . . . , Yn are independent and identically distributed (iid) then their joint pdf or pmf is
f_Y(y1, . . . , yn) = f_Y1(y1) . . . f_Y1(yn)
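A minimal Python sketch of this product rule, assuming an Exp(θ) marginal with θ = 2 purely as an example:

```python
import numpy as np
from scipy import stats

# Joint pdf of an iid sample = product of the common marginal pdf
# at each observation. Example marginal (an assumption): Exp(theta), theta = 2.
theta = 2.0
y = np.array([0.3, 1.1, 0.7])

marginals = theta * np.exp(-theta * y)   # f_Y1(y_i) for each i
joint_pdf = np.prod(marginals)           # f_Y(y1, ..., yn)

# Cross-check against scipy's exponential (scipy uses scale = 1/theta)
assert np.isclose(joint_pdf, np.prod(stats.expon.pdf(y, scale=1/theta)))
```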
explanatory variable
the variable plotted on the x-axis; it is the variable manipulated or chosen by the researcher
response variable
the variable plotted on the y-axis; its value depends on the explanatory variable(s)
if Y has a Poisson distribution with parameter µ, then we write Y~Poi(µ) and Y has pmf
f_Y(y) = µ^y e^(−µ) / y! for y = 0, 1, 2, …
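A quick numerical check of the Poisson pmf (µ = 3.5 and y = 2 are arbitrary example values):

```python
import math
from scipy import stats

mu, y = 3.5, 2

# Direct evaluation of mu^y * e^(-mu) / y!
pmf_direct = mu**y * math.exp(-mu) / math.factorial(y)

# Agrees with scipy's Poisson pmf
print(pmf_direct, stats.poisson.pmf(y, mu))  # both ~0.185
```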
if Y has an exponential distribution with parameter θ, then we write Y~Exp(θ) and Y has cdf
F_Y(y; θ) = 1 − e^(−θy) for y > 0
if Y has an exponential distribution with parameter θ, then we write Y~Exp(θ) and Y has pdf
f_Y(y; θ) = d/dy F_Y(y; θ) = θe^(−θy) for y > 0
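A sketch verifying that this pdf is the derivative of the cdf, with θ = 0.5 chosen arbitrarily (note scipy parameterises the exponential by scale = 1/θ):

```python
import numpy as np
from scipy import stats

theta = 0.5
y = np.linspace(0.01, 10, 500)

cdf = 1 - np.exp(-theta * y)        # F_Y(y; theta)
pdf = theta * np.exp(-theta * y)    # f_Y(y; theta) = d/dy F_Y(y; theta)

# pdf matches a numerical derivative of the cdf
assert np.allclose(np.gradient(cdf, y), pdf, atol=1e-2)

# and the cdf matches scipy's (scale = 1/theta)
assert np.allclose(cdf, stats.expon.cdf(y, scale=1/theta))
```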
the p-quantile in terms of the cdf is
F_Y(y_p) = p and y_p = F_Y^(−1)(p)
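For example, inverting the Exp(θ) cdf gives y_p = −ln(1 − p)/θ; a sketch checking this against scipy's ppf (the inverse cdf), with θ = 2 and p = 0.9 as example values:

```python
import numpy as np
from scipy import stats

theta, p = 2.0, 0.9

# Solving F_Y(y_p) = 1 - e^(-theta * y_p) = p gives:
y_p = -np.log(1 - p) / theta

# scipy's ppf is exactly F_Y^-1
assert np.isclose(y_p, stats.expon.ppf(p, scale=1/theta))
print(y_p)  # ~1.151
```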
expectation of g(Y) for discrete Y is
E(g(Y)) = sum(x) Pr(Y = x) g(x) = sum(x) f(x) g(x), where f(x) is the pmf
variance of random variable Y is
Var(Y) = E[(Y − E(Y))^2] = E(Y^2) − [E(Y)]^2
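A sample-based sketch showing the two forms agree (Poisson data with mean 4 is an arbitrary choice; for a Poisson the true variance equals the mean):

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.poisson(4.0, size=100_000)

lhs = np.mean((y - y.mean())**2)     # E[(Y - E(Y))^2], sample version
rhs = np.mean(y**2) - y.mean()**2    # E(Y^2) - E(Y)^2, sample version
assert np.isclose(lhs, rhs)
print(lhs, rhs)                      # both near the true value 4
```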
empirical probability r/n is
r/n ≈ Pr(X ≤ x_(r)), where x_(r) is the r-th smallest of n observations
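A sketch of this idea on simulated standard normal data (the sample size and r below are arbitrary):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
x = np.sort(rng.normal(size=200))  # x_(1) <= ... <= x_(n)
n, r = len(x), 150

# r/n estimates Pr(X <= x_(r)); here the true cdf is the standard normal's
print(r / n, norm.cdf(x[r - 1]))   # the two should be close
```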
simple linear model means
one explanatory variable
an example of a joint distribution for two variables is the…
bivariate normal distribution
if X and Y are independent
f(x, y) = f_X(x) f_Y(y)
covariance formula:
Cov(X, Y) = E[(X − E(X))(Y − E(Y))] = E(XY) − E(X)E(Y)
if independent, covariance formula:
Cov(X, Y) = 0, since E(XY) = E(X)E(Y)
covariance in terms of correlation and variances:
Cov(X, Y) = ρ sqrt(Var(X) Var(Y)), where ρ is the correlation between X and Y
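A numerical check of this identity (the linear construction giving Corr(X, Y) = 0.6 is an example, not from the cards):

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=50_000)
y = 0.6 * x + 0.8 * rng.normal(size=50_000)  # Corr(X, Y) = 0.6 by construction

cov = np.cov(x, y)[0, 1]
rho = np.corrcoef(x, y)[0, 1]

# Cov(X, Y) = rho * sqrt(Var(X) * Var(Y))
assert np.isclose(cov, rho * np.sqrt(np.var(x, ddof=1) * np.var(y, ddof=1)))
```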
an example of a joint distribution for two variables, each with a normal distribution is called
the bivariate normal distribution
the joint pdf of the bivariate normal distribution
f(x, y; θ) = 1 / (2πσ_X σ_Y sqrt(1 − ρ^2)) * exp{ −1 / (2(1 − ρ^2)) * [ (x − µ_X)^2/σ_X^2 + (y − µ_Y)^2/σ_Y^2 − 2ρ(x − µ_X)(y − µ_Y)/(σ_X σ_Y) ] }
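A direct evaluation of this pdf checked against scipy's multivariate normal (all parameter values below are arbitrary examples):

```python
import numpy as np
from scipy.stats import multivariate_normal

mu_x, mu_y, sig_x, sig_y, rho = 1.0, -0.5, 2.0, 1.5, 0.3
x, y = 0.7, 0.2

# The bracketed quadratic form
q = ((x - mu_x)**2 / sig_x**2 + (y - mu_y)**2 / sig_y**2
     - 2 * rho * (x - mu_x) * (y - mu_y) / (sig_x * sig_y))
pdf_direct = np.exp(-q / (2 * (1 - rho**2))) / (
    2 * np.pi * sig_x * sig_y * np.sqrt(1 - rho**2))

# Cross-check: the same density via the covariance matrix
cov = [[sig_x**2, rho * sig_x * sig_y],
       [rho * sig_x * sig_y, sig_y**2]]
print(pdf_direct,
      multivariate_normal.pdf([x, y], mean=[mu_x, mu_y], cov=cov))
```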
a continuous random variable Y defined on (−∞, ∞) with pdf f(y; θ) has expectation denoted by E(Y) and defined as…
E(Y) = integral(−∞ to ∞) y * f(y; θ) dy
a discrete random variable with range space R and pmf f(y; θ), E(Y) is defined as…
E(Y ) = sum(y∈R) [y * f(y; θ)]
for a real valued function g(Y), when continuous, E[g(Y)] is
integral(−∞ to ∞) g(y) * f(y; θ) dy
for a real valued function g(Y), when discrete, E[g(Y)] is
sum(y ∈ R) g(y) * f(y; θ)
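Both forms of E[g(Y)] evaluated numerically; the distributions and g below are example choices (N(0, 1) with g(y) = y^2 gives 1, and Poi(µ) with g(y) = y gives µ):

```python
import numpy as np
from scipy import integrate, stats

# Continuous: E[g(Y)] = integral of g(y) f(y; theta) dy, here Y ~ N(0, 1), g(y) = y^2
val, _ = integrate.quad(lambda y: y**2 * stats.norm.pdf(y), -np.inf, np.inf)
print(val)  # ~1.0

# Discrete: E[g(Y)] = sum over the range space R, here Y ~ Poi(mu), g(y) = y;
# the infinite sum is truncated where the pmf is negligible
mu = 3.0
ys = np.arange(0, 100)
print(np.sum(ys * stats.poisson.pmf(ys, mu)))  # ~3.0 = mu
```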
α-confidence interval is …
an interval estimator that contains the true parameter value θ with probability α, for every θ
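A simulation sketch of this coverage property, assuming (as an illustration) a normal mean with known variance 1, so the interval is Ȳ ± z_0.975/√n for α = 0.95:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
theta, n, reps = 5.0, 30, 10_000
z = stats.norm.ppf(0.975)          # for a 0.95-confidence interval

hits = 0
for _ in range(reps):
    y = rng.normal(theta, 1.0, size=n)
    lo, hi = y.mean() - z / np.sqrt(n), y.mean() + z / np.sqrt(n)
    hits += (lo <= theta <= hi)
print(hits / reps)                 # close to 0.95, the nominal coverage
```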
null hypothesis
H_0 : θ = x (this is also a simple hypothesis: one that completely specifies a probability model by specifying a single parameter value)