case diagnostics Flashcards
what do regression/model outliers have?
a large residual $\hat{\epsilon}_i$
$\hat{\epsilon}_i$
the residual for case i: the discrepancy between the observed value ($y_i$) and the value predicted by the model ($\hat{y}_i$)
how to calculate standardised residuals
divide the residual $\hat{\epsilon}_i$ by an estimate of the standard deviation of the residuals, which converts it to z-score units (the estimate is calculated from all cases, including the potential outlier)
computed by the rstandard() function in R
how to calculate studentised residuals
divide the residual $\hat{\epsilon}_i$ by an estimate of the standard deviation of the residuals calculated with case i excluded
gives a version of the standardised residual whose scaling excludes the potential outlier
computed by the rstudent() function in R
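The two residual cards above can be checked by hand. The following is a minimal sketch, assuming a simple regression on R's built-in mtcars data (chosen only for illustration). Note that R also scales each residual by sqrt(1 - leverage), a detail the cards leave out.

```r
# Illustrative model on a built-in dataset (not from the original notes)
model <- lm(mpg ~ wt, data = mtcars)

res   <- residuals(model)         # raw residuals: observed minus predicted
h     <- hatvalues(model)         # leverage (hat value) of each case
s_all <- summary(model)$sigma     # residual SD estimated from all cases
s_loo <- influence(model)$sigma   # residual SD re-estimated with case i left out

# Standardised: the scaling includes the potential outlier
standardised <- res / (s_all * sqrt(1 - h))

# Studentised: the scaling excludes case i
studentised <- res / (s_loo * sqrt(1 - h))

# Compare the by-hand versions with the built-in functions
all.equal(standardised, rstandard(model))
all.equal(studentised, rstudent(model))
```

Because an outlier inflates the all-cases standard deviation, its studentised residual is larger in absolute value than its standardised residual.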
high leverage cases
cases with an unusual value of a predictor ($x_i$) or an unusual combination of predictor values
have the potential to influence the intercept ($\hat{\beta}_0$) or slope ($\hat{\beta}_1$) of the regression model
lie far from the mean of x, so they can increase the variance of x
what values are used to assess leverage
hat values
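As a rough sketch of how hat values might be inspected in R, assuming the same illustrative mtcars model as a stand-in for real data: hatvalues() returns one leverage value per case. The cut-off of twice the mean hat value is a common rule of thumb, not something stated in these notes.

```r
# Illustrative model on a built-in dataset (not from the original notes)
model <- lm(mpg ~ wt, data = mtcars)

h <- hatvalues(model)   # one leverage value per case

# Hat values sum to the number of estimated coefficients (2 here:
# intercept and slope), so their mean is 2 / n
sum(h)

# Assumed rule of thumb: flag cases with more than twice the mean leverage
high_leverage <- h[h > 2 * mean(h)]
high_leverage
```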
high influence cases
a case that has high leverage and is also an outlier; such a case has a large influence on the estimated regression model
can have a strong effect on the $\hat{\beta}$ coefficients, so they would change if the case were deleted
-> the degree of change is a way to judge the magnitude of influence
what does Cook's distance use to assess influence
combines leverage (hat values) with outlying-ness to capture influence
($D_i$ = outlying-ness × leverage)
Cook's distance refers to…
the average distance the $\hat{y}$ values will move if a given case is removed
- if removing the case changes the predicted values a lot (moves the regression line), then that case is influencing our results
a single value which summarises the total influence of a case
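The "outlying-ness × leverage" idea can be made concrete. This sketch assumes the illustrative mtcars model again and rebuilds Cook's distance from the standardised residual (outlying-ness) and the hat value (leverage), then compares it with R's cooks.distance().

```r
# Illustrative model on a built-in dataset (not from the original notes)
model <- lm(mpg ~ wt, data = mtcars)

r <- rstandard(model)       # outlying-ness: standardised residuals
h <- hatvalues(model)       # leverage: hat values
p <- length(coef(model))    # number of estimated coefficients

# Cook's distance as (outlying-ness term) * (leverage term)
d_by_hand <- (r^2 / p) * (h / (1 - h))

# Compare with the built-in function
all.equal(d_by_hand, cooks.distance(model))

# The three most influential cases by this measure
head(sort(cooks.distance(model), decreasing = TRUE), 3)
```

A case needs both a sizeable residual and sizeable leverage to get a large value; if either term is near zero, the product stays small.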
DFFit
difference between the predicted outcome value for a case when the model is fitted with and without that case
DFbeta
difference between the value of a coefficient when the model is fitted with and without a given case
DFbetas
a standardised version of DFbeta
obtained by dividing by an estimate of the standard error of the regression coefficients with the case removed
which diagnostics are used to look at linear models with 2+ predictors in more detail
DFFit, DFbeta, DFbetas
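A minimal sketch of these three diagnostics in R, assuming an illustrative two-predictor model on the built-in mtcars data. One caveat: R's dffits() returns a scaled version of the change in fit (the studentised residual multiplied by a leverage term) rather than the raw difference described on the DFFit card.

```r
# Illustrative two-predictor model (not from the original notes)
model <- lm(mpg ~ wt + hp, data = mtcars)

fit_change  <- dffits(model)    # one value per case
coef_change <- dfbeta(model)    # matrix: one row per case, one column per coefficient
coef_scaled <- dfbetas(model)   # same layout, standardised

dim(coef_change)

# dffits() rebuilt by hand from the studentised residual and the hat value
h <- hatvalues(model)
all.equal(fit_change, rstudent(model) * sqrt(h / (1 - h)))

# Cases with the largest standardised effect on the wt coefficient
head(sort(abs(coef_scaled[, "wt"]), decreasing = TRUE), 3)
```

Cook's distance gives one summary number per case, whereas DFbeta and DFbetas show which coefficient a case is moving.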
what measures a case's influence on standard errors
COVRATIO
COVRATIO
measures the effect of an observation on the covariance matrix of the parameter estimates
- an observation’s influence on standard error
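One way to see what COVRATIO measures is to rebuild it for a single case. This sketch assumes the illustrative mtcars model and refits it without case 1 (an arbitrary choice). The ratio compares the determinant of the coefficient covariance matrix without the case to the determinant with all cases.

```r
# Illustrative two-predictor model (not from the original notes)
model <- lm(mpg ~ wt + hp, data = mtcars)

cr <- covratio(model)   # one value per case

# Rebuild the value for one case by refitting without it
i <- 1                  # arbitrary case chosen for the check
refit <- lm(mpg ~ wt + hp, data = mtcars[-i, ])
ratio_by_hand <- det(vcov(refit)) / det(vcov(model))
all.equal(unname(cr[i]), ratio_by_hand)

# Values below 1: the case inflates the standard errors
# Values above 1: the case makes the estimates more precise
range(cr)

# influence.measures() tabulates DFbetas, DFFit, COVRATIO,
# Cook's distance and hat values together
summary(influence.measures(model))
```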