4. F-Tests and Standardisation Flashcards
How do we test the significance of the overall model?
F-test
What is an F-Test?
The process of testing whether the test statistic (the F-ratio) is statistically significant
What is not sufficient to test the significance of the overall model?
Testing individual predictors and R²
What is the F-ratio?
Ratio of explained variance to unexplained variance
F = (SS model / df model) / (SS residual / df residual)
so
F = MS model / MS residual
It tests the null hypothesis that all the regression slopes in the model are zero, i.e. that our predictors tell us nothing about our outcome (they explain no variance)
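Worked example (a minimal sketch, not part of the original cards): the F-ratio formula above computed by hand for a simple regression with one predictor. The data values and variable names are made up for illustration.

```python
# Toy data (invented for illustration): one predictor x, one outcome y.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(y)  # sample size
k = 1       # number of predictors (slopes) in the model

mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares slope and intercept for a one-predictor regression.
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
s_xx = sum((xi - mean_x) ** 2 for xi in x)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x
fitted = [intercept + slope * xi for xi in x]

# Sums of squares: total = model + residual.
ss_total = sum((yi - mean_y) ** 2 for yi in y)
ss_residual = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
ss_model = ss_total - ss_residual

# Degrees of freedom.
df_model = k
df_residual = n - k - 1

# Mean squares and the F-ratio.
ms_model = ss_model / df_model
ms_residual = ss_residual / df_residual
f_ratio = ms_model / ms_residual

print(f"F({df_model}, {df_residual}) = {f_ratio:.2f}")  # F(1, 3) = 4.50
```

Here SS model = 3.6 and SS residual = 2.4, so MS model = 3.6 / 1 and MS residual = 2.4 / 3 = 0.8, giving F = 4.5.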
What are mean squares?
Mean squares are sums of squares divided by their associated degrees of freedom.
How is the null hypothesis explained in terms of F-test?
The null hypothesis for the model says that the best guess of any individual's y value is the mean of y plus error.
Or, that the x variables collectively carry no information about y.
i.e. all the slopes = 0
What would the different results from a F-Test demonstrate?
Big F-ratio = stronger evidence that the model is significant, as there is more model variance than residual variance
F > 1 = more model variance than residual variance
F is close to 1 if the null is true
How do you test if your F-test is significant?
- Select alpha level
- Calculate critical value of F
- Compare the observed F-ratio to the critical value
- The F-ratio is evaluated against an F-distribution with df model and df residual & the pre-defined alpha
If it is more extreme (larger) than the critical value, then we reject the null
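Worked example (a rough sketch, not part of the original cards): the steps above, with the critical value approximated by simulation instead of being read from an F table. The function names, the number of draws, and the seed are all assumptions made for illustration; the simulation relies on the fact that, under the null, F behaves like a ratio of two independent chi-square variables, each divided by its df.

```python
import random


def simulate_null_f(df_model, df_residual, rng):
    """Draw one F value under the null: a ratio of two scaled chi-square draws."""
    numerator = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df_model)) / df_model
    denominator = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df_residual)) / df_residual
    return numerator / denominator


def approx_critical_f(df_model, df_residual, alpha=0.05, draws=20000, seed=1):
    """Approximate the critical value as the (1 - alpha) quantile of simulated null F values."""
    rng = random.Random(seed)
    sims = sorted(simulate_null_f(df_model, df_residual, rng) for _ in range(draws))
    return sims[int((1 - alpha) * draws)]


def reject_null(f_ratio, df_model, df_residual, alpha=0.05):
    """Reject the null when the observed F-ratio exceeds the (approximate) critical value."""
    return f_ratio > approx_critical_f(df_model, df_residual, alpha)


# Hypothetical result: F = 10.0 from a model with 2 predictors and n = 30,
# so df model = 2 and df residual = 30 - 2 - 1 = 27.
critical = approx_critical_f(2, 27)
print(f"approximate critical F: {critical:.2f}")
print("reject null:", reject_null(10.0, 2, 27))
```

The simulated critical value is only an approximation. In practice an exact quantile function would normally be used, for example `scipy.stats.f.ppf(1 - alpha, df_model, df_residual)` if SciPy is available.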
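What is an F-distribution?
The probability distribution of a ratio of two variance estimates, each divided by its degrees of freedom; it is used, for example, to test equality of variances from two normal populations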
What are degrees of freedom?
Number of independent values associated with the different calculations
Df are typically the combination of sample size and the number of things you need to calculate/estimate.
What are the three different types of degrees of freedom? (Name only)
Residual DF
Total DF
Model DF
What are residual degrees of freedom?
The remaining dimensions that you could use to generate a new data set that looks like the current data set
n-k-1
SS residual calculation is based on our model, in which we estimate k β terms (-k) and an intercept (-1)
What are total DF?
SS total calculation based on observed yi & mean of y
n-1
What is model DF?
Number of slope parameters in the model that are estimated from the data = k
SS model depends on the estimated slopes (β)
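Worked example (a small sketch, not part of the original cards): the three df formulas for a hypothetical model with n = 100 observations and k = 3 predictors. The function name is made up for illustration. It also shows that model df and residual df add up to total df.

```python
def degrees_of_freedom(n, k):
    """Return (model, residual, total) df for a regression with n observations and k predictors."""
    df_model = k             # one per estimated slope
    df_residual = n - k - 1  # lose k slopes and 1 intercept
    df_total = n - 1         # lose 1 for the mean of y
    return df_model, df_residual, df_total


df_model, df_residual, df_total = degrees_of_freedom(n=100, k=3)
print(df_model, df_residual, df_total)  # 3 96 99

# Model df and residual df partition the total df.
assert df_model + df_residual == df_total
```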
What are unstandardized coefficients?
Coefficients expressed in the original units in which the data were collected
They are useful when the units are meaningful