Regression diagnostics Flashcards

1
Q

linear regression assumptions

A

The distribution of the errors has mean zero, the distribution of the errors has constant variance, the distribution of the errors is normal, and the errors are independent
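In symbols, the four assumptions for simple linear regression (one predictor) can be sketched as below. The notation is an assumption made here and does not come from the cards: \beta_0 and \beta_1 are the intercept and slope, \varepsilon_i is the error for observation i, and \sigma^2 is the common error variance.

```latex
y_i = \beta_0 + \beta_1 x_i + \varepsilon_i,
\qquad
\varepsilon_i \overset{\text{iid}}{\sim} N(0, \sigma^2)
```

Mean zero, constant variance and normality are carried by N(0, \sigma^2); independence is carried by "iid".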

2
Q

Mean of distribution of error is 0

A

The mean of the response is a linear function of x. Graphically, we expect that as x increases the average value of y increases or decreases, and does so linearly (a quadratic, logarithmic, or exponential pattern would suggest the linear model is not a good one)

3
Q

Distribution of error has constant variance

A

Just like the equal-variance assumption in analysis of variance and the t test: the variance of the response variable is the same regardless of the value of x. The spread of y should not change with the value of x; it should be roughly constant

4
Q

Errors are independent

A

Errors independent: for every observation (subject/sample in the study), the deviation from the regression line is independent from one subject to the next (the error in overestimating one person's weight is independent of the error in estimating another person's weight)

5
Q

when is the independent-errors assumption not met

A

When the same unit is measured repeatedly. Measuring the same stock price from one day to the next gives temporal correlation. A BP measurement taken on one day and another on the same subject the following day are not independent. Data where time comes in are the most common place where this assumption is suspect.
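One rough way to see temporal correlation is the lag-1 autocorrelation of a time-ordered series of errors: how strongly each value resembles the one before it. This is a minimal sketch, not a method named in the cards; the function name and both series are made up for illustration.

```python
def lag1_autocorrelation(values):
    """Correlation between each value and the value just before it."""
    n = len(values)
    m = sum(values) / n
    num = sum((values[t] - m) * (values[t - 1] - m) for t in range(1, n))
    den = sum((v - m) ** 2 for v in values)
    return num / den


# Hypothetical time-ordered errors (made-up numbers).
drifting = [-3, -2, -1, 0, 1, 2, 3]   # neighbours resemble each other
alternating = [1, -1, 1, -1, 1, -1]   # neighbours have opposite signs

r_drifting = lag1_autocorrelation(drifting)        # clearly positive
r_alternating = lag1_autocorrelation(alternating)  # clearly negative
```

Values far from 0 in either direction suggest that consecutive errors are not independent.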

6
Q

how to evaluate if assumptions hold

A

The true errors are not observed, so we estimate them; the estimated errors are called residuals, and we examine those

7
Q

residual formula

A

Residual = observed value of y minus the fitted value on the regression line (e.g. observed 11, fitted 10, residual 11 - 10 = 1)
Can be positive or negative
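A minimal sketch of the residual calculation for a one-predictor least-squares line. The function names and the data are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def residuals(xs, ys):
    """Residual = observed y minus fitted y, one per observation."""
    intercept, slope = fit_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
res = residuals(xs, ys)
```

Some entries of res are positive and some negative, and for a least-squares line with an intercept they sum to zero (up to rounding).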

8
Q

residual useful for

A

o Diagnostics: techniques for checking assumptions of the regression model
o Understanding the variation in Y that is unexplained by the regression model
o Identifying possible outliers

9
Q

how to look at residuals

A

o Plot residuals vs. Xi values
o Plot residuals vs. predicted values (ŷi)
o Plot histogram or stem-and-leaf of residuals
o Q-Q plot of residuals

10
Q

residual plot for functional form

A

Bad if the residuals form a nonlinear pattern, such as an arch; this suggests the relationship between x and the mean of y is not linear
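A small sketch of how the arch appears: fit a straight line to data that are actually curved and look at the signs of the residuals in order of x. The data and the helper name are made up for illustration.

```python
def ols_residuals(xs, ys):
    """Residuals (observed minus fitted) from a least-squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Made-up curved relationship: y = 20x - x^2 (not linear in x).
xs = list(range(1, 10))
ys = [20 * x - x ** 2 for x in xs]

res = ols_residuals(xs, ys)
# Signs of the residuals from the smallest x to the largest x.
signs = "".join("+" if r > 0 else "-" for r in res)
```

Here signs is "--+++++--": negative at both ends and positive in the middle, which is the arch shape. Under a correct linear model the signs would show no systematic pattern.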

11
Q

residual plot for equal variance

A

Bad if the residuals are fan shaped

For example, for small values of x the spread of the residuals is small, and as x increases the spread gets larger
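A rough numeric counterpart to eyeballing the fan: compare the typical residual size for the smaller x values with that for the larger x values. This is only an illustrative sketch, not a formal test; the data are made up so that the scatter grows with x, and the helper names are hypothetical.

```python
def ols_residuals(xs, ys):
    """Residuals (observed minus fitted) from a least-squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


def mean_abs(values):
    """Average absolute value, used here as a simple measure of spread."""
    return sum(abs(v) for v in values) / len(values)


# Made-up data: scatter around the line y = 2x that grows with x.
xs = list(range(1, 11))
noise = [0.2 * x if x % 2 else -0.2 * x for x in xs]
ys = [2 * x + e for x, e in zip(xs, noise)]

res = ols_residuals(xs, ys)
half = len(xs) // 2
spread_low = mean_abs(res[:half])    # residual spread for the smaller x
spread_high = mean_abs(res[half:])   # residual spread for the larger x
```

With these numbers spread_high is clearly larger than spread_low, which is the fan shape in numeric form.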

12
Q

q-q plot

A

used to assess whether errors follow normal distribution

A Q-Q plot graphs the quantiles of the residuals against the expected quantiles for a sample from a normal distribution
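A minimal sketch of how the Q-Q points can be computed with the Python standard library. The plotting positions (i + 0.5) / n are one common convention and are an assumption here, as are the function name and the made-up residuals.

```python
from statistics import NormalDist, mean, stdev


def qq_points(residuals):
    """Pair each ordered residual with its expected normal quantile."""
    n = len(residuals)
    ordered = sorted(residuals)
    # Normal distribution with the same mean and spread as the residuals.
    normal = NormalDist(mean(residuals), stdev(residuals))
    # Assumed plotting positions: (i + 0.5) / n for i = 0, ..., n - 1.
    expected = [normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(expected, ordered))


# Made-up residuals for illustration only.
res = [0.06, -0.13, 0.18, -0.21, 0.10]
points = qq_points(res)
```

Plotting the second coordinate of each pair against the first gives the Q-Q plot.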

13
Q

what should Q-Q plot look like

A

Ideally, a Q-Q plot will be a straight line. Deviations from linearity indicate how the distribution of errors differs from normality

14
Q

skewed residuals mean

A

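The large deviations fall mostly on one side of the regression line: for example, when a point is far from the regression line it tends to be far above it rather than far below it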

15
Q

what to do if plots indicate problem

A

o Add or remove variables
o Transform variables or recode categorical variables
o Remove outliers (but be careful!)
o Use a different analytic approach
