4. F-Tests and Standardisation Flashcards

1
Q

How do we test the significance of the overall model?

A

F-test

2
Q

What is an F-Test?

A

The process of testing whether the test statistic (the F-ratio) is statistically significant

3
Q

What is not sufficient to test the significance of the overall model?

A

Testing the individual predictors and looking at R²

4
Q

What is the F-ratio?

A

The ratio of explained variance to unexplained variance

F = (SS model / df model) / (SS residual / df residual)

so

F = MS model / MS residual

It tests the null hypothesis that all the regression slopes in the model are zero, i.e. that our predictors tell us nothing about our outcome (they explain no variance)
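
The formula above can be sketched in a few lines of Python. This is a minimal illustration, not a full regression routine: the function name `f_ratio` and the input values (SS model = 120, SS residual = 270, n = 30, k = 2) are made up for the example.

```python
def f_ratio(ss_model, ss_residual, n, k):
    """F-ratio from sums of squares, sample size n and k predictors."""
    df_model = k              # one df per estimated slope
    df_residual = n - k - 1   # n minus k slopes minus the intercept
    ms_model = ss_model / df_model            # mean square for the model
    ms_residual = ss_residual / df_residual   # mean square for the residuals
    return ms_model / ms_residual


# Hypothetical values, chosen only so the arithmetic is easy to follow:
# MS model = 120 / 2 = 60, MS residual = 270 / 27 = 10, so F = 6.
f_value = f_ratio(ss_model=120.0, ss_residual=270.0, n=30, k=2)
print(f_value)  # 6.0
```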

5
Q

What are mean squares?

A

Mean squares are sums of squares calculations divided by the associated degrees of freedom.

6
Q

How is the null hypothesis explained in terms of the F-test?

A

The null hypothesis for the model says that the best guess of any individual's y value is the mean of y plus error.

Or, that the x variables collectively carry no information about y.

i.e. all the slopes = 0

7
Q

What do the different results of an F-test indicate?

A

A big F-ratio = stronger evidence that the model is significant, because the model variance is large relative to the residual variance

F > 1 = more model variance than residual variance

F = close to 1 if the null is true

8
Q

How do you test if your F-test is significant?

A
  1. Select an alpha level
  2. Calculate the critical value of F from the F-distribution with df model and df residual at that alpha
  3. Compare the observed F-ratio to the critical value

If the observed F-ratio is more extreme (larger) than the critical value, then we reject the null
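
The steps above can be sketched in Python. To stay dependency-free, this sketch covers only the special case of two predictors (df model = 2), where the upper tail of the F-distribution has a closed form; for other df model values a statistics library such as `scipy.stats.f` would normally be used instead. The function names and the numbers (observed F = 6.0, df residual = 27) are hypothetical.

```python
def f_critical_df1_2(alpha, df_residual):
    """Critical value of F(2, df_residual). Valid ONLY when df model = 2."""
    return (df_residual / 2) * (alpha ** (-2 / df_residual) - 1)


def f_p_value_df1_2(f_value, df_residual):
    """Upper-tail p-value for F(2, df_residual). Valid ONLY when df model = 2."""
    return (1 + 2 * f_value / df_residual) ** (-df_residual / 2)


alpha = 0.05        # step 1: select an alpha level
df_residual = 27    # e.g. n = 30 observations, k = 2 predictors
f_observed = 6.0    # hypothetical observed F-ratio

critical = f_critical_df1_2(alpha, df_residual)       # step 2
p_value = f_p_value_df1_2(f_observed, df_residual)
reject_null = f_observed > critical                   # step 3

print(round(critical, 2), reject_null)
```

Comparing the observed F to the critical value and comparing the p-value to alpha are equivalent ways of reaching the same decision.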

9
Q

What is an F-distribution?

A

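The probability distribution of a ratio of two independent variance estimates from normal populations, with its shape set by the two degrees of freedom

It is the reference distribution used to test the equality of variances from two normal populations, and to evaluate the F-ratio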

10
Q

What are degrees of freedom?

A

The number of independent values associated with a given calculation

Df are typically the sample size minus the number of things you need to calculate/estimate.

11
Q

What are the three different types of degrees of freedom? (Name only)

A

Residual DF

Total DF

Model DF

12
Q

What are residual degrees of freedom?

A

The remaining dimensions that you could use to generate a new data set that looks like the current data set

n - k - 1

The SS residual calculation is based on our model, in which we estimate k β terms (-k) and an intercept (-1)

13
Q

What are total DF?

A

The SS total calculation is based on the observed yi and the mean of y, so only one quantity (the mean) is estimated (-1)

n - 1

14
Q

What is model DF?

A

The number of slope parameters in the model that are estimated from the data = k (the intercept is not counted)

SS model depends on the slopes (β)
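
As a quick check on the three definitions, the degrees of freedom add up in the same way as the sums of squares (SS total = SS model + SS residual for a least-squares model with an intercept). The worked numbers n = 30 and k = 2 are hypothetical.

```latex
df_{\text{total}} = df_{\text{model}} + df_{\text{residual}}
\quad\Longleftrightarrow\quad
n - 1 = k + (n - k - 1)

% Hypothetical example with n = 30 and k = 2:
29 = 2 + 27
```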

15
Q

What are unstandardized coefficients?

A

Coefficients expressed in the original units in which the data were collected

They are useful when the units are meaningful

16
Q

What are standardized coefficients?

A

Coefficients from a model in which the variables have been z-scored (each value's deviation from the mean divided by the standard deviation)

The interpretation of the coefficients becomes the change in y in standard deviation units for every one standard deviation increase in x
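
Written out as formulas, and assuming sample means $\bar{x}$, $\bar{y}$ and sample standard deviations $s_x$, $s_y$ (symbols introduced here for illustration), the z-scores and the resulting rescaling of an unstandardized slope $\hat{\beta}_1$ look like this:

```latex
z_{x_i} = \frac{x_i - \bar{x}}{s_x}, \qquad
z_{y_i} = \frac{y_i - \bar{y}}{s_y}, \qquad
\hat{\beta}^*_1 = \hat{\beta}_1 \cdot \frac{s_x}{s_y}
```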

17
Q

Why is standardization useful?

A

Useful for comparison if variables are on different scales

Useful if scales are arbitrary

18
Q

What happens to R², the F-test, the t-tests and b0 when the variables are standardized?

A

R², the F-test and the t-tests for the slopes stay the same

b0 = 0 when standardised

19
Q

Why should we be cautious in using standardization?

A

Just because you can put regression coefficients on a common metric doesn’t mean they can be meaningfully compared.

The SD is a poor measure of spread for skewed distributions, so be cautious about using standardized coefficients with skewed variables

20
Q

What does standardization do to the correlation?

A

For a linear model with a single continuous predictor, the standardized slope ($\hat{\beta}^*_1$) = the correlation coefficient (r).

They are the same:

r is a standardized measure of linear association
$\hat{\beta}^*_1$ is a standardized measure of the linear slope.

Something similar is true for linear models with multiple predictors.

Each standardized slope is closely related to (though not in general identical to) the part (semi-partial) correlation coefficient for that predictor
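
The single-predictor case can be checked numerically. This is a minimal sketch using only the Python standard library; the helper names and the small data set are made up for illustration.

```python
from statistics import mean, stdev


def z_scores(values):
    """Subtract the mean and divide by the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def simple_ols(x, y):
    """Least-squares (intercept, slope) for a single predictor."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def pearson_r(x, y):
    """Pearson correlation coefficient."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 2.9, 4.2, 4.8, 6.5, 6.9]

intercept, slope = simple_ols(z_scores(x), z_scores(y))
r = pearson_r(x, y)

# The standardized slope matches r, and the intercept is zero,
# up to floating-point error.
print(abs(slope - r) < 1e-9, abs(intercept) < 1e-9)  # prints: True True
```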

21
Q

What is the value of the intercept when continuous variables are standardised?

A

0