Gaussian linear models (GLMs) Flashcards
Assumptions of GLMs
- Linearity: E[Yi|x1i, …, xki] = μi = β0 + β1x1i + … + βkxki ∀ i
- Homoscedasticity: Var(Yi|x1i, …, xki) = σ² ∀ i
- Conditional (linear) uncorrelation: Cov(Yi, Yh | x1i, …, xki, x1h, …, xkh) = 0 ∀ i ≠ h
- Normality: Yi|x1i, …, xki ~ N(μi, σ²), which, together with the uncorrelation above, gives the conditional independence assumption (see the simulation sketch after this list).
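A minimal simulation sketch of data satisfying all four assumptions; the single covariate x1 and the values β0 = 1, β1 = 2, σ = 0.5 are hypothetical choices for illustration, not taken from the cards.
set.seed(1)                          # reproducible toy example
n  <- 100
x1 <- runif(n)                       # one covariate, for illustration
mu <- 1 + 2 * x1                     # linearity: E[Y|x1] is linear in x1
y  <- rnorm(n, mean = mu, sd = 0.5)  # normality + homoscedasticity + independence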
GLM definition
Y | X ~ Nn(Xβ, σ²In)
Or, alternatively: Yi = β0 + β1x1i + … + βkxki + εi, with εi|x1i, …, xki ~ N(0, σ²)
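In matrix form the OLS/ML estimate is β̂ = (X'X)⁻¹X'Y; a sketch using the simulated x1 and y above:
X <- cbind(1, x1)                          # design matrix with an intercept column
beta_hat <- solve(t(X) %*% X, t(X) %*% y)  # (X'X)^(-1) X'y
mu_hat   <- X %*% beta_hat                 # fitted values X beta_hat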
Fitting GLMs in R
lm(y~x1+x2) # based on OLS
glm(y~x1+x2, family=gaussian) # same point estimates as lm; Gaussian family with identity link
- Std. Error = √([σ̂²(X'X)⁻¹]jj), the square root of the j-th diagonal element of σ̂²(X'X)⁻¹
- Residual standard error = σ̂ = √(RSS/(n − p)), the unbiased-variance version, not the MLE √(RSS/n)
- glm shows the Null deviance (model with only β0, df = n − 1) and the Residual deviance (df = n − p); see the extraction sketch below
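A sketch of how these quantities can be read off the fitted objects (reusing the simulated x1 and y above; with two covariates replace y ~ x1 by y ~ x1 + x2):
fit <- lm(y ~ x1)                          # also used in the diagnostic plots below
summary(fit)$coefficients[, "Std. Error"]  # sqrt of the diagonal of sigma2_hat * (X'X)^(-1)
summary(fit)$sigma                         # residual standard error = sqrt(RSS / (n - p))
gfit <- glm(y ~ x1, family = gaussian)
c(gfit$null.deviance, gfit$df.null)        # null deviance and its df = n - 1
c(gfit$deviance, gfit$df.residual)         # residual deviance and its df = n - p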
Linearity assumption checking in R
plot(fit, which=1)
- Residuals VS Fitted
- Red line (local average of the residuals) roughly flat around 0: no systematic difference between fitted and observed values among units sharing roughly the same covariate pattern (a manual version is sketched below)
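Equivalently, a manual sketch of the same diagnostic (fit is the lm object fitted above):
plot(fitted(fit), residuals(fit))                        # residuals vs fitted values
lines(lowess(fitted(fit), residuals(fit)), col = "red")  # the red local-average line
abline(h = 0, lty = 2)                                   # reference line at 0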
Normality assumption checking in R
plot(fit, which=2)
- QQ-Plot
- Points close to the bisector (45° line): the empirical quantiles of the standardized residuals match the theoretical normal quantiles (a manual version is sketched below)
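A manual sketch of the same check on the standardized residuals:
z <- rstandard(fit)  # internally standardized residuals
qqnorm(z)            # empirical vs theoretical normal quantiles
qqline(z)            # reference line: points should lie close to it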
Homoscedasticity assumption checking in R
plot(fit, which=3)
- Scale-Location plot
- Red line (local average of √|standardized residuals|) roughly flat: the spread of the residuals does not change with the fitted values (a manual version is sketched below)
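A manual sketch of the same plot; the red line should stay roughly flat:
s <- sqrt(abs(rstandard(fit)))              # sqrt of |standardized residuals|
plot(fitted(fit), s)                        # scale-location by hand
lines(lowess(fitted(fit), s), col = "red")  # local average of the spread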