Linear Model Evaluation/Diagnostics Flashcards

Question 1

Q

What is the R^2 statistic?

Answer

A

The proportion of variance explained by the model
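
As a minimal sketch of this definition, R^2 can be computed as one minus the ratio of the residual sum of squares to the total sum of squares. The function name `r_squared` and the toy data are illustrative assumptions, not part of the original cards.

```python
def r_squared(y, y_hat):
    """Proportion of the variance in y explained by the fitted values y_hat."""
    mean_y = sum(y) / len(y)
    # Total sum of squares: variation of y around its mean.
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    # Residual sum of squares: variation left over after the fit.
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    return 1.0 - ss_res / ss_tot


y = [1.0, 2.0, 3.0, 4.0]
perfect = r_squared(y, y)             # fitted values reproduce the data
mean_only = r_squared(y, [2.5] * 4)   # predicting the mean explains nothing
```

A perfect fit gives an R^2 of 1, while a model that only predicts the mean gives 0.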

Question 2

Q

Why is R^2 not always the best indicator of predictive power?

Answer

A

An overfitted model can have a very high R^2 on the data used to fit it but poor predictive power on new data
If the errors have high variance, R^2 will be low even when the model is correct
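
The second point can be illustrated with a small simulation, sketched here under the assumption of a simple straight-line model. The same correct model is fitted to quiet and to noisy data. The helper names, the seed, and the noise levels are all illustrative choices.

```python
import random


def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x (closed form)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return my - b1 * mx, b1


def r_squared_of_line(x, y):
    """R^2 of the least squares line fitted to (x, y)."""
    b0, b1 = fit_line(x, y)
    my = sum(y) / len(y)
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


rng = random.Random(1)  # fixed seed so the sketch is repeatable
x = [i / 10 for i in range(100)]
# The true relationship is linear in both cases; only the error variance differs.
y_quiet = [1 + 2 * xi + rng.gauss(0, 0.5) for xi in x]
y_noisy = [1 + 2 * xi + rng.gauss(0, 20) for xi in x]

r2_quiet = r_squared_of_line(x, y_quiet)
r2_noisy = r_squared_of_line(x, y_noisy)
```

Both fits use the correct model form, yet `r2_noisy` comes out far lower than `r2_quiet` because most of the variation in the noisy response is error rather than signal.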

Question 3

Q

What are the assumptions we make about the errors in a model?

Answer

A

well described by a normal distribution
have constant variance
are independent of each other

Question 4

Q

What do we expect to see when we remove the signal from the model?

Answer

A

Residuals that are normally distributed

Question 5

Q

What are the two qualitative ways to assess normality?

Answer

A

A histogram of the residuals
A QQ Norm plot of the residuals

Question 6

Q

What are the two quantitative ways to assess normality?

Answer

A

Shapiro-Wilk test for normality
Kolmogorov-Smirnov test for normality

Question 7

Q

Describe QQ Norm plots

Answer

A

Plot the quantiles of two sets of data against each other
If their shapes are similar (for a QQ Norm plot, if the data are roughly normally distributed), the points tend to fall on a straight line
A QQ Norm plot shows the residuals, sorted in order, against the standardised quantiles of the distribution of interest
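
The construction can be sketched using only the Python standard library. The plotting position (i - 0.5)/n used for the theoretical quantiles is one common convention and is an assumption here, since other conventions exist. The function name `qq_norm_points` is illustrative.

```python
from statistics import NormalDist


def qq_norm_points(residuals):
    """Return (theoretical quantile, ordered residual) pairs for a QQ Norm plot."""
    n = len(residuals)
    ordered = sorted(residuals)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Assumed plotting position: (i - 0.5) / n for i = 1..n.
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


points = qq_norm_points([0.3, -1.2, 0.8, -0.1, 0.2])
```

Plotting the second element of each pair against the first gives the QQ Norm plot. Roughly normal residuals should fall close to a straight line.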

Question 8

Q

What is the ith point of QQ Norm plots typically given by?

Question 9

Q

Describe the Shapiro-Wilk test

Answer

A

Produces a statistic which relates to the straightness of the QQ plot
Null hypothesis, H0: data are normally distributed

Question 10

Q

What will happen if the assumption that the errors are independent is violated?

Answer

A

Standard errors and p-values are systematically too small, so we risk drawing the wrong conclusions about model covariates

Question 11

Q

What can the null hypothesis of uncorrelated errors be formally tested by?

Answer

A

Durbin-Watson test
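
As a rough sketch, the Durbin-Watson statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. Values near 2 suggest no first-order autocorrelation, values towards 0 suggest positive autocorrelation, and values towards 4 suggest negative autocorrelation. The code below computes only the statistic, not a p-value, and the example residuals are made up for illustration.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals listed in observation order."""
    # Squared differences between each residual and the one before it.
    num = sum(
        (residuals[t] - residuals[t - 1]) ** 2
        for t in range(1, len(residuals))
    )
    den = sum(e ** 2 for e in residuals)
    return num / den


# Residuals that flip sign every step: negative autocorrelation.
dw_alternating = durbin_watson([1, -1, 1, -1, 1, -1])
# Residuals that come in runs of the same sign: positive autocorrelation.
dw_runs = durbin_watson([1, 1, 1, -1, -1, -1])
```

Here `dw_alternating` is above 2 and `dw_runs` is below 2, matching the interpretation given above.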

Question 12

Q

What can independence also be violated by?

Answer

A

In more conceptual, design-related ways, such as pseudoreplication

Question 13

Q

What is the practical consequence of falsely assuming independence?

Answer

A

We can wrongly conclude that one or more unrelated variables are genuinely related to the response

Question 14

Q

What can we do if we have correlation in the residuals?

Answer

A

Ignore the correlation in residuals
Try to remove the correlation in model residuals by sub-setting the data
Account for the correlation using a generalised least squares (GLS) model

Question 15

Q

What do we use partial residual plots for?

Answer

A

To assess whether non-linearity is caused by the predictor in question or by another, unknown one

Question 16

Q

What do partial residual plots show?

Answer

A

The residuals, and the relationship between y and an individual x after adjusting for the other x’s
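
A minimal sketch of the usual definition: the partial residual for predictor j is the ordinary residual plus that predictor's fitted contribution, e_i + b_j * x_ij. The function names and the toy numbers below are illustrative assumptions. The residuals were chosen by hand to sum to zero and to be uncorrelated with x, as they would be after a least squares fit.

```python
def partial_residuals(residuals, x_j, beta_j):
    """Partial residuals for one predictor: residual plus beta_j * x_j."""
    return [e + beta_j * x for e, x in zip(residuals, x_j)]


def slope(x, y):
    """Least squares slope of y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


x_j = [1.0, 2.0, 3.0]
residuals = [0.5, -1.0, 0.5]  # hand-made, orthogonal to x_j
beta_j = 2.0                  # assumed fitted coefficient for x_j

pr = partial_residuals(residuals, x_j, beta_j)
pr_slope = slope(x_j, pr)
```

Plotting `pr` against `x_j` gives the partial residual plot for that predictor. In this toy case the slope of the plotted points equals `beta_j`.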

Question 17

Q

What are the useful diagnostic properties of partial residuals?

Answer

A

The slope of the line is the regression coefficient
The extent of the scatter tells us about the support for the function
We can identify large residuals
Curved plots signal non-linear relationships

Question 18

Q

What do we do if we have error distribution shape problems?

Answer

A

Try transforming things to address the distributional shape problems
Move to other models like generalised linear models
Bootstrap your way to glory

Question 19

Q

What do we do if we have independence problems?

Answer

A

Move to other models/methods such as mixed models (LMM, GLMM), or use generalised estimating equations (GEE)

Question 20

Q

What do we do if we have signal problems?

Answer

A

Use complex linear models or generalised additive models (GAM)

Question 21

Q

How do we bootstrap?

Answer

A

We make a new dataset of the same dimensions by sampling the rows of the data with replacement
We repeat this many times, fitting the model to each resampled dataset
This shows roughly how much things might change if we were to collect another sample of data
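
The steps above can be sketched for a simple straight-line model using only the standard library. The simulated data, the seeds, the number of resamples, and the percentile interval at the end are illustrative assumptions rather than part of the original cards.

```python
import random


def fit_slope(rows):
    """Least squares slope for rows of (x, y) pairs."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    sxy = sum((x - mx) * (y - my) for x, y in rows)
    sxx = sum((x - mx) ** 2 for x, _ in rows)
    return sxy / sxx


def bootstrap_slopes(rows, n_boot=500, seed=0):
    """Refit the slope on n_boot datasets resampled by row with replacement."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_boot):
        # New dataset of the same size, rows drawn with replacement.
        resample = rng.choices(rows, k=len(rows))
        slopes.append(fit_slope(resample))
    return slopes


# Simulated data with a true slope of 2 (an assumption for the demo).
data_rng = random.Random(42)
rows = [(i / 5, 1 + 2 * (i / 5) + data_rng.gauss(0, 1)) for i in range(50)]

slopes = sorted(bootstrap_slopes(rows))
boot_mean = sum(slopes) / len(slopes)
# Rough 95% percentile interval from the sorted bootstrap slopes.
lower = slopes[int(0.025 * len(slopes))]
upper = slopes[int(0.975 * len(slopes)) - 1]
```

The spread of `slopes` between `lower` and `upper` indicates how much the fitted slope might vary from sample to sample.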

(21 cards)