Review of Linear Regression Flashcards
In the equation for a linear regression model, what does each term represent?
- y is the vector of responses
- the x’s are the predictors making up X, the data matrix
- β is the vector of parameters
- ε is the vector of independent, identically distributed random errors, with ε_i ~ N(0, σ²)
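A quick simulation of this setup (a hypothetical sketch using NumPy; the sample size and parameter values are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 3                       # observations, predictors (incl. intercept)
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])  # data matrix
beta = np.array([1.0, 2.0, -0.5])   # vector of parameters
sigma = 0.3
eps = rng.normal(0, sigma, size=n)  # iid N(0, sigma^2) errors
y = X @ beta + eps                  # vector of responses
```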
Outline the concept of a simple model
y_i = β_0 + β_1 x_i + ε_i
If the model is linear then µ = E[y] = Xβ, i.e. each mean µ_i = x_i’β is a linear combination of the predictors
What’s our MLE for estimating our parameters?
Under normal errors, maximising the likelihood is equivalent to least squares: we minimise ∑ ( y_i - x_i’ β)^2 with respect to β
Working through, we end up with ^β = (X’X)^-1 X’ y, and the fitted values are ^y = X ^β
^β ~ N ( β, σ^2( X’X)^-1)
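The closed-form estimate above can be sketched in NumPy (hypothetical data; a linear solve is used rather than an explicit inverse, for numerical stability):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta = np.array([1.0, 2.0])
y = X @ beta + rng.normal(0, 0.1, size=n)

# beta_hat = (X'X)^{-1} X'y, computed by solving (X'X) b = X'y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
y_hat = X @ beta_hat                # fitted values
```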
Define residuals
e_i = y_i - ^y_i
This is the difference between the observed and fitted values. A linear model is valid only if our assumptions hold, which we check through the residuals: they should behave like iid errors ~ N(0, σ²)
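A minimal residual check (a sketch with simulated data, assuming NumPy): with an intercept in the model, OLS residuals are orthogonal to the columns of X, so their mean and their correlation with each predictor should be essentially zero.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([0.5, 1.5]) + rng.normal(0, 1.0, size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ beta_hat                   # residuals: observed minus fitted

print(e.mean())                        # ~0 (intercept included)
print(np.corrcoef(X[:, 1], e)[0, 1])   # ~0: orthogonal to the predictor
```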
What does it mean for a pdf/pmf to be from the exponential family?
f (y ; θ, ϕ) = exp { (yθ - b(θ)) / a(ϕ) + c (y, ϕ) }
Where θ and ϕ are the natural and dispersion parameters respectively, and a, b and c are known functions.
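For example, the Poisson pmf can be rewritten in this form (a standard worked case, with λ the Poisson mean):

```latex
f(y; \lambda) = \frac{e^{-\lambda}\lambda^{y}}{y!}
            = \exp\{\, y\log\lambda - \lambda - \log y! \,\}
```

so θ = log λ, b(θ) = e^θ, a(ϕ) = 1 and c(y, ϕ) = -log y!.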
State what is meant by Fisher Info
For a pdf/pmf it is i(θ) = b’’(θ) / a(ϕ), and it measures the amount of information the data carry about θ. Formally, it is the variance of the score, which equals the expected value of the observed information
What’s the point of Fisher Scoring?
We aim to fit a parametric distribution to data so that ^θ is close to the true θ. We maximise the likelihood by solving the score equation U(θ) = ∂/∂θ l(θ; y) = 0, which in the exponential family becomes n ( ȳ - b’(θ)) / a(ϕ) = 0
There is not always an analytical solution to this equation, in which case we use the Newton-Raphson method.
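As a sketch of Newton-Raphson on the score equation, take the Poisson case from above (θ = log λ, so U(θ) = n(ȳ - e^θ) and U’(θ) = -n e^θ); the data values here are made up, and the root is known analytically to be log ȳ, which lets us check the iteration:

```python
import numpy as np

def newton_raphson(y, theta0=0.0, tol=1e-10, max_iter=50):
    """Solve U(theta) = 0 for a Poisson sample, theta = log(lambda).

    Score: U(theta)  = n * (ybar - exp(theta))
    Slope: U'(theta) = -n * exp(theta)
    Update: theta <- theta - U(theta) / U'(theta)
    """
    ybar, theta = y.mean(), theta0
    for _ in range(max_iter):
        step = (ybar - np.exp(theta)) / np.exp(theta)  # equals -U/U'
        theta += step
        if abs(step) < tol:
            break
    return theta

y = np.array([2, 4, 3, 5, 1, 3])
theta_hat = newton_raphson(y)       # analytically, theta_hat = log(ybar)
```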
Outline the Fisher Expected Info Matrix
I(θ) = - E [ ∂^2/∂θ_i∂θ_j l( θ; y) ]
This is the log-likelihood differentiated twice, with respect to θ_i and θ_j for the (i, j) entry, then negated and the expectation taken
State the Fisher Scoring Algorithm. When does it equal a different Algorithm?
The FSA is θ^(k+1) = θ^(k) + I(θ^(k))^-1 U(θ^(k))
FSA = NR when the observed information equals the expected information, e.g. for an exponential family in its natural parameter, where the update becomes θ^(k+1) = θ^(k) + ( ȳ - b’(θ^(k)) ) / b’’(θ^(k))
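A sketch of the Fisher scoring update for a one-parameter exponential family, with b’ and b’’ passed in as functions (the a(ϕ) terms cancel between U and I); the Poisson check value is made up:

```python
import numpy as np

def fisher_scoring(ybar, b_prime, b_double_prime, theta0=0.0, n_iter=25):
    """Fisher scoring for a one-parameter exponential family.

    Update: theta <- theta + I(theta)^{-1} U(theta)
                  = theta + (ybar - b'(theta)) / b''(theta)
    """
    theta = theta0
    for _ in range(n_iter):
        theta += (ybar - b_prime(theta)) / b_double_prime(theta)
    return theta

# Poisson: b(theta) = exp(theta), so b' = b'' = exp, and theta_hat = log(ybar)
theta_hat = fisher_scoring(2.5, np.exp, np.exp)
```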