Model Selection Flashcards
Cochran’s theorem?
If H_0 (all group means are equal) is true,
then an F statistic can be formed from:
(SSB) group sum of squares, (SSW) within-group sum of squares
F= (SSB/df_between)/(SSW/df_within)
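As a quick check of the ratio above, here is a minimal sketch of the one-way F statistic in plain Python. It assumes the usual setting of independent, normally distributed observations with a common variance, where under H_0 SSB/σ^2 and SSW/σ^2 are independent chi-squared variables on k-1 and n-k degrees of freedom. The function name `one_way_f` and the toy data are illustrative only.

```python
from statistics import mean


def one_way_f(groups):
    """Return (F, df_between, df_within) for a list of samples, one per group."""
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = mean(x for g in groups for x in g)

    # SSB: spread of the group means around the grand mean, weighted by group size
    ssb = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # SSW: spread of the observations around their own group mean
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between, df_within = k - 1, n - k
    f_stat = (ssb / df_between) / (ssw / df_within)
    return f_stat, df_between, df_within


# Toy data (made up for illustration): three groups of three observations
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_stat, df_between, df_within = one_way_f(groups)
print(f_stat, df_between, df_within)
```

The statistic would then be compared with the upper α point of the F distribution on (df_between, df_within) degrees of freedom.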
When to transform observations?
If model checking suggests variance is not constant
Commonly used transformations of observations
And 1/y
Box-Cox transformations
These estimate the λ that minimises the standard deviation of the standardised transformed variable
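A minimal sketch of that idea, assuming the usual standardised form z(λ) = (y^λ - 1) / (λ g^(λ-1)), with z(0) = g ln(y), where g is the geometric mean of the y_i. It covers only the simplest case of a single sample with no explanatory variables; in a regression one would minimise the residual sum of squares of z(λ) instead. The function names and the grid of candidate λ values are illustrative choices, not a fixed recipe.

```python
import math
from statistics import pstdev


def standardised_boxcox(y, lam):
    """Box-Cox transform scaled by the geometric mean, so SDs are comparable across lambda."""
    if any(v <= 0 for v in y):
        raise ValueError("all y_i must be > 0")
    geo_mean = math.exp(sum(math.log(v) for v in y) / len(y))
    if lam == 0:
        # limiting case of the power transform as lambda -> 0
        return [geo_mean * math.log(v) for v in y]
    return [(v ** lam - 1) / (lam * geo_mean ** (lam - 1)) for v in y]


def best_lambda(y, grid=None):
    """Grid search for the lambda giving the smallest SD of the standardised transform."""
    if grid is None:
        grid = [i / 10 for i in range(-20, 21)]   # -2.0, -1.9, ..., 2.0 (assumed range)
    return min(grid, key=lambda lam: pstdev(standardised_boxcox(y, lam)))


# Toy data (made up): values whose logs are symmetric, so ln(y) should be favoured
y = [math.exp(u) for u in (-2, -1, 0, 1, 2)]
print(best_lambda(y))
```

A selected λ close to 0 points to ln(y), close to 0.5 to the square root, and close to -1 to 1/y.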
First transformation of the observations to try?
ln(y)
Important thing to remember when transforming observations
All y_i must be >0
If all other transformations fail, try?
Inverse trig functions, in particular:
sin^-1(y) or tan^-1(y)
F test for deletion of subset of variables:
Extra sum of squares?
where β_q, …, β_(p-1) are the coefficients of the variables that may be removed
F test for deletion of subset of variables:
How to separate variables in vectors
F test for deletion of subset of variables:
Find SS_extra in vectors
F test for deletion of subset of variables:
Null hypothesis? H_1?
where β_q, …, β_(p-1) are the coefficients of the variables to be removed
F test for deletion of subset of variables:
Form F test stat and reject H_0 at α level
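For reference, one standard way to write this test is sketched below. The notation is an assumption, not necessarily the one used in the course: n observations, a full model with p parameters β_0, …, β_(p-1), a reduced model keeping only β_0, …, β_(q-1), and SS_E for a residual sum of squares.

```latex
\begin{aligned}
H_0 &: \beta_q = \beta_{q+1} = \dots = \beta_{p-1} = 0
\qquad
H_1 : \beta_j \neq 0 \ \text{for at least one } j \in \{q, \dots, p-1\} \\[4pt]
SS_{\text{extra}} &= SS_E(\text{reduced}) - SS_E(\text{full}) \\[4pt]
F &= \frac{SS_{\text{extra}} / (p - q)}{SS_E(\text{full}) / (n - p)}
\;\sim\; F_{p-q,\; n-p} \quad \text{under } H_0 \\[4pt]
\text{Reject } H_0 &\text{ at level } \alpha \text{ if } F > F_{p-q,\; n-p;\; \alpha}
\end{aligned}
```

The numerator has p - q degrees of freedom because that is the number of coefficients set to zero under H_0.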
When to use all subsets regression
If there is no natural ordering to explanatory variables
Given p-1 explanatory variables, how many possible models are there?
2^(p-1)
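The count can be checked by listing the subsets directly. This is a small sketch in which each explanatory variable is either in or out of the model and the intercept is always kept; the variable names `x1`, `x2`, `x3` are placeholders.

```python
from itertools import combinations


def all_subsets(variables):
    """Yield every subset of the explanatory variables, including the empty one."""
    for size in range(len(variables) + 1):
        for subset in combinations(variables, size):
            yield subset


# Three explanatory variables, so p - 1 = 3
variables = ["x1", "x2", "x3"]
models = list(all_subsets(variables))
print(len(models), 2 ** len(variables))
```

The empty subset corresponds to the intercept-only model. The count doubles with every added variable, so exhaustive search becomes expensive quickly.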
Usual statistics used to compare models?
MS_E
R^2
C_p
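A minimal sketch of the three statistics, under assumptions that may differ from the course's conventions: p counts all fitted parameters including the intercept, SS_E is the residual sum of squares and SS_T the total sum of squares, and C_p is taken to be Mallows's C_p with σ^2 estimated by MS_E of the full model. The function names are illustrative.

```python
def ms_e(ss_e, n, p):
    """Residual mean square: SS_E divided by its degrees of freedom."""
    return ss_e / (n - p)


def r_squared(ss_e, ss_t):
    """Proportion of the total sum of squares explained by the model."""
    return 1 - ss_e / ss_t


def mallows_cp(ss_e, n, p, ms_e_full):
    """Mallows's C_p, using the full model's MS_E as the estimate of sigma^2."""
    return ss_e / ms_e_full - n + 2 * p


# Made-up numbers for illustration: n = 20, full model with 4 parameters
n = 20
ms_e_full = ms_e(32.0, n, 4)
print(ms_e_full, r_squared(32.0, 128.0), mallows_cp(32.0, n, 4, ms_e_full))
```

With this definition C_p for the full model always equals its own p, so candidate sub-models are usually judged by a small MS_E, a large R^2, and a C_p close to p.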