final Flashcards
What happens if residuals are not normally distributed?
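Coefficient estimates are unaffected, but SEs (and the t-tests and confidence intervals built on them) can be unreliable, especially in small samples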
what happens if residuals are heteroskedastic?
Heteroskedasticity does not bias coefficient estimates, but it can produce incorrect (typically downwardly biased) standard errors. This leads to incorrect confidence intervals and inflated Type I error rates in hypothesis tests (rejecting the null when the null is true).
what is a strategy to remedy heteroskedasticity?
robust standard errors
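A minimal sketch of what robust standard errors do, assuming a bivariate regression and the HC1 (White) estimator; the function name and the simulated data are illustrative only and not part of the deck. The error SD is made to grow with x, so the classical SE of the slope should come out smaller than the robust one.

```python
import math
import random

def ols_slope_with_ses(x, y):
    """Bivariate OLS slope with classical and HC1 robust standard errors."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    dx = [xi - xbar for xi in x]
    sxx = sum(d * d for d in dx)
    b = sum(d * (yi - ybar) for d, yi in zip(dx, y)) / sxx
    a = ybar - b * xbar
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # Classical SE: one common error variance for every observation.
    se_classical = math.sqrt(sum(e * e for e in resid) / (n - 2) / sxx)
    # HC1 robust SE: each squared residual stands in for its own error variance.
    meat = sum((d * e) ** 2 for d, e in zip(dx, resid))
    se_robust = math.sqrt(n / (n - 2) * meat) / sxx
    return b, se_classical, se_robust

# Hypothetical data: true slope 0.5, error SD proportional to x (heteroskedastic).
rng = random.Random(0)
x = [rng.uniform(0, 10) for _ in range(20000)]
y = [1.0 + 0.5 * xi + rng.gauss(0, 0.5 * xi) for xi in x]

b, se_classical, se_robust = ols_slope_with_ses(x, y)
print(f"b = {b:.3f}, classical SE = {se_classical:.4f}, robust SE = {se_robust:.4f}")
```

Note that b is identical under both approaches; only the standard error changes.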
what is a strategy to remedy non-normality in residuals
If the residuals are not normally distributed, a transformation of the dependent variable might remedy the problem (for example, taking the natural log).
what happens if the relationship between x and y is non-linear?
Violations of linearity lead to bias in coefficient estimates. If the true relationship between x and y is curvilinear then characterizing it as linear leads to erroneous conclusions about the relationship.
how do you remedy non-linearity?
To remedy nonlinearities one can respecify the model as a polynomial, spline or some other nonlinear transformation of x.
What is spline?
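A spline is a piecewise polynomial joined smoothly at knots. Spline regression is a flexible, non-parametric technique that can be seen as an extension of linear models for capturing nonlinearities (adaptive variants such as MARS also model interactions between variables automatically).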
Cook’s D is a measure of overall _____
influence
what is a highly leveraged observation?
observation with x value over about 3 standard deviations from mean of x
what is an outlier?
Observation with y value over about 3 standard deviations from mean of y
In regression, observation whose residual (y - yhat) is more than about 3 standard deviations (of the residuals) from zero
What do hat values measure?
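leverage: how much an observation's squared deviation from the mean of x contributes to the total sum of squares of x
bivariate case: h_i = 1/n + (x_i - xbar)^2 / sum_j (x_j - xbar)^2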
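A short sketch of hat values in the bivariate case, assuming the standard definition h_i = 1/n + (x_i - xbar)^2 / sum_j (x_j - xbar)^2; the function name and the x values are made up for illustration. With one slope and an intercept the hat values sum to 2, and the observation farthest from the mean of x gets the largest one.

```python
def hat_values(x):
    """Leverage (hat value) of each observation in a bivariate regression."""
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)  # total sum of squares of x
    return [1 / n + (xi - xbar) ** 2 / sxx for xi in x]

# Hypothetical x values; the last one sits far from the rest.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 20.0]
h = hat_values(x)

for xi, hi in zip(x, h):
    print(f"x = {xi:5.1f}  hat = {hi:.3f}")
```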
High leverage associated with _____ error variance
small
these observations pull the regression line closer to them
what happens when there is measurement error in y?
b will be the same, but standard error will grow, and r squared will shrink
standard error of the model grows
when there’s measurement error in x?
r squared shrinks because there is poor model fit
SEb is smaller because the variance in x (denominator of SE equation) is artificially large
b will attenuate (attenuation bias!)
This is only true for bivariate!!!!!!
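A small simulation sketch of attenuation bias in the bivariate case, assuming classical (random, mean-zero) measurement error in x; all names and numbers here are illustrative. The true slope is 2, and the noise added to x has the same variance as x itself, so the slope on the noisy x should shrink toward roughly half its true value.

```python
import random

def slope(x, y):
    """Bivariate OLS slope."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    return sxy / sxx

rng = random.Random(42)
n = 5000
x_true = [rng.gauss(0, 1) for _ in range(n)]
y = [2.0 * xi + rng.gauss(0, 1) for xi in x_true]  # true slope is 2
x_noisy = [xi + rng.gauss(0, 1) for xi in x_true]  # x measured with error

b_clean = slope(x_true, y)
b_noisy = slope(x_noisy, y)
print(f"slope with true x:  {b_clean:.3f}")
print(f"slope with noisy x: {b_noisy:.3f}")
```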
What do Cook’s D and Dfits measure?
both measure overall influence of a data point (leverage AND outlyingness)
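Dfits = E* x sqrt(h / (1 - h)), where E* is the studentized residual and h is the hat value
Cook's D: approximately Dfits^2 / (k + 1), which basically standardizes Dfits by the number of coefficients (k slopes plus the intercept)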
both measure how much a given observation affects b
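A rough sketch of both measures for a bivariate regression (k = 1), assuming the usual textbook definitions: Dfits uses the externally studentized residual, while Cook's D uses the internally studentized one, which is why the Dfits^2 / (k + 1) relationship is only approximate. The function name and data are hypothetical; the last observation has high leverage and falls well below the trend.

```python
import math

def influence_measures(x, y):
    """Cook's D and Dfits for each observation in a bivariate regression."""
    n, p = len(x), 2  # p = k + 1 coefficients (one slope plus the intercept)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    a = ybar - b * xbar
    e = [yi - (a + b * xi) for xi, yi in zip(x, y)]   # residuals
    h = [1 / n + (xi - xbar) ** 2 / sxx for xi in x]  # hat values
    ssr = sum(ei ** 2 for ei in e)
    s2 = ssr / (n - p)
    cooks, dfits = [], []
    for ei, hi in zip(e, h):
        r2 = ei ** 2 / (s2 * (1 - hi))                # squared internally studentized residual
        cooks.append(r2 / p * hi / (1 - hi))
        s2_del = (ssr - ei ** 2 / (1 - hi)) / (n - p - 1)  # error variance with this case deleted
        t = ei / math.sqrt(s2_del * (1 - hi))         # externally studentized residual (E*)
        dfits.append(t * math.sqrt(hi / (1 - hi)))
    return cooks, dfits

# Hypothetical data: roughly y = 2x, except the last point (x = 20, y = 30).
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20]
y = [2.5, 3.1, 7.0, 7.2, 10.9, 11.3, 14.6, 15.4, 18.9, 19.5, 30.0]

cooks, dfits = influence_measures(x, y)
for i, (d, f) in enumerate(zip(cooks, dfits)):
    print(f"obs {i:2d}  Cook's D = {d:8.3f}  Dfits = {f:8.3f}")
```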
In layman’s terms, what is DFBETA?
difference between the coefficient estimate we get with a given observation included and the estimate we would get if we excluded that observation (usually scaled by the coefficient's standard error)
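A minimal leave-one-out sketch of the idea, assuming a bivariate regression and reporting the raw (unscaled) change in the slope; software typically divides this by a standard error. The function names and data are illustrative only.

```python
def slope(x, y):
    """Bivariate OLS slope."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx

def raw_dfbeta(x, y):
    """Slope with all observations minus slope with observation i dropped."""
    b_full = slope(x, y)
    return [b_full - slope(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
            for i in range(len(x))]

# Hypothetical data: roughly y = 2x, except the last point (x = 20, y = 30).
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20]
y = [2.5, 3.1, 7.0, 7.2, 10.9, 11.3, 14.6, 15.4, 18.9, 19.5, 30.0]

dfb = raw_dfbeta(x, y)
for i, d in enumerate(dfb):
    print(f"obs {i:2d}  change in slope = {d:+.3f}")
```

A negative value means that including the observation pulls the slope down.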
what is the difference between Cook’s D/dfits and dfbeta?
Cook’s D/dfits measure the overall influence of an observation, while dfbeta measures the influence on an individual coefficient
If residuals are NOT normally distributed, is there an effect on b?
nope!
what plot do you use to diagnose linearity?
component plus residual plot
what plot do you use to diagnose heteroskedasticity?
rvf plot (residual-versus-fitted plot)