Regression analyses Flashcards
Regression methods
Study the functional relationships between the response variable/outcome and the explanatory variables.
Explanatory variables & outcome should be linked by some mechanism (e.g., theory, causal pathway); otherwise the fitted relationship is not meaningful.
Simple regression model
y = β0 + β1x1 + ε
β0 = model intercept (mean of y when x1 = 0)
β1 = slope relating x1 to y (change in mean of y per one-unit increase in x1)
x1 = predictor
ε = error term (estimated by the residuals)
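A minimal sketch of fitting the simple model by ordinary least squares, using the closed-form estimates b1 = Sxy / Sxx and b0 = ybar - b1 * xbar. The function name `fit_simple_ols` and the data are made up for illustration.

```python
def fit_simple_ols(x, y):
    """Return (b0, b1), the intercept and slope that minimize squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Illustrative data (not from the source)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_simple_ols(x, y)
# Residuals are the observed estimates of the error term
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
print(b0, b1)
```

With an intercept in the model, the residuals sum to zero (up to rounding).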
Assumptions of OLS
1) Linear relationship between predictor & outcome
2) Error terms have constant variance
3) Error terms are independent of one another
4) Error terms are normally distributed
Linearity
Scatterplot
LOWESS (locally weighted scatterplot smoothing): fits a smooth curve to the data; if the relationship is linear, the curve is close to a straight line
Constant error variance
Plot squared residuals vs predicted (fitted) values; a trend indicates non-constant error variance (heteroskedasticity)
Breusch-Pagan statistical test
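A rough sketch of the idea behind the test, assuming a single predictor and the studentized (Koenker) variant: regress the squared residuals on the predictor and use LM = n * R² from that auxiliary regression, compared against a chi-square distribution with 1 degree of freedom. The helper names and the data are hypothetical; a statistics package would normally be used instead.

```python
import math


def simple_ols(x, y):
    """Intercept and slope of a one-predictor least-squares fit."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - b1 * x_bar, b1


def r_squared(x, y):
    """Share of the variance in y explained by a linear fit on x."""
    b0, b1 = simple_ols(x, y)
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def breusch_pagan_lm(x, y):
    """Return (LM statistic, p-value) for one predictor, Koenker variant."""
    b0, b1 = simple_ols(x, y)
    squared_residuals = [(yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)]
    # Guard against a tiny negative R² caused by rounding
    lm = max(0.0, len(x) * r_squared(x, squared_residuals))
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(lm / 2.0))
    return lm, p_value


# Illustrative data whose scatter grows with x (not from the source)
x = [float(i) for i in range(1, 11)]
y = [2.0 * i + 0.3 * (-1) ** i * i for i in range(1, 11)]

lm, p_value = breusch_pagan_lm(x, y)
print(lm, p_value)
```

A small p-value suggests the error variance changes with the predictor.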
Normality of error term
Q-Q plot
Shapiro-Wilk test
If errors are not normally distributed, consider transforming the response variable
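A minimal sketch of the points behind a normal Q-Q plot: sorted standardized residuals paired with theoretical standard-normal quantiles. The plotting positions (i + 0.5) / n are one common convention and an assumption here, as are the function name and the residual values.

```python
from statistics import NormalDist, mean, stdev


def normal_qq_points(residuals):
    """Return (theoretical quantile, standardized residual) pairs in order."""
    n = len(residuals)
    m, s = mean(residuals), stdev(residuals)
    standardized = sorted((r - m) / s for r in residuals)
    standard_normal = NormalDist()
    theoretical = [standard_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, standardized))


# Illustrative residuals (not from the source)
sample_residuals = [-1.2, 0.4, 0.1, -0.3, 0.9, -0.6, 1.5, -0.8, 0.2, -0.2]

points = normal_qq_points(sample_residuals)
for theoretical_q, observed_q in points:
    print(round(theoretical_q, 3), round(observed_q, 3))
```

If the residuals are roughly normal, the pairs fall close to the line y = x.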
Multiple regression analysis
y = β0 + β1x1 + β2x2 + … + βkxk + ε
Same assumptions about error terms.
Linearity (unless a nonlinear functional form is specified)
Predictors enter the model as additive terms
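A minimal sketch of fitting the multiple regression model by solving the normal equations (X'X)b = X'y, assuming a small, well-conditioned problem. The function names and the data are hypothetical; real analyses would use a numerical library.

```python
def solve_linear_system(a, b):
    """Solve a * coef = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left unchanged
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - known) / m[r][r]
    return coef


def fit_multiple_ols(predictor_rows, y):
    """predictor_rows holds [x1, ..., xk] per observation; returns [b0, ..., bk]."""
    design = [[1.0] + list(row) for row in predictor_rows]  # leading 1 = intercept
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Illustrative noise-free data generated from y = 1 + 2*x1 - 0.5*x2
rows = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0], [6.0, 5.0]]
y = [1.0 + 2.0 * x1 - 0.5 * x2 for x1, x2 in rows]

coefs = fit_multiple_ols(rows, y)
print(coefs)
```

Because the illustrative data contain no noise, the fit recovers the generating coefficients up to rounding.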
Standardized regression coefficients
1) Standardize data prior to analysis
2) Apply standardization factor after estimation of coefficients
Methods:
1) z-scoring variables: (xi - xbar) / sx
2) Multiply the estimated coefficient by the ratio of the standard deviations of X & Y: b1* = b1 × (sx1 / sy), where b1 = unstandardized estimate & b1* = standardized coefficient
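A short sketch showing that the two methods give the same standardized coefficient in a one-predictor model: fitting on z-scored variables, and rescaling the unstandardized slope by sx / sy. The helper names and the data are made up for illustration.

```python
from statistics import mean, stdev


def slope(x, y):
    """Least-squares slope of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def z_scores(values):
    """Center on the mean and divide by the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Illustrative data (not from the source)
x = [2.0, 4.0, 5.0, 7.0, 9.0]
y = [10.0, 14.0, 19.0, 22.0, 30.0]

# Method 1: standardize the data, then estimate
beta_from_z = slope(z_scores(x), z_scores(y))
# Method 2: estimate, then apply the standardization factor sx / sy
beta_from_rescaling = slope(x, y) * stdev(x) / stdev(y)

print(beta_from_z, beta_from_rescaling)
```

With a single predictor, the standardized slope also equals the Pearson correlation between x and y.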
Multicollinearity
Occurs when predictors are correlated with each other (as a rough rule of thumb, pairwise correlations > .4).
Variance inflation factor (VIF): values >= 10 are generally taken to indicate problematic multicollinearity
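A minimal sketch of the variance inflation factor, VIF_j = 1 / (1 - R_j²), where R_j² comes from regressing predictor j on the other predictors. The sketch assumes only two predictors, in which case R_j² is the squared Pearson correlation between them. Function names and data are hypothetical.

```python
import math


def pearson_r(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    a_bar, b_bar = sum(a) / n, sum(b) / n
    sab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    saa = sum((ai - a_bar) ** 2 for ai in a)
    sbb = sum((bi - b_bar) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Illustrative predictors (not from the source)
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]

vif = vif_two_predictors(x1, x2)
print(pearson_r(x1, x2), vif)
```

Uncorrelated predictors give a VIF of 1; the value grows without bound as the correlation approaches ±1.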
Outliers
DFFITS: Measure of influence of each observation on its predicted value
DFBETAS: Measures influence of observation on value of regression coefficient
Cook’s D(istance): Measures influence of each observation on all fitted values
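A minimal sketch of Cook's distance for a one-predictor model, using D_i = (e_i² / (p × MSE)) × h_i / (1 - h_i)², with leverage h_i = 1/n + (x_i - xbar)² / Sxx and p = 2 estimated coefficients. The function name and the data (with one deliberately unusual last point) are made up for illustration.

```python
def cooks_distances(x, y):
    """Return (Cook's distances, leverages) for a simple linear regression."""
    n = len(x)
    p = 2  # coefficients estimated: intercept and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in residuals) / (n - p)
    leverages = [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    distances = [
        (e ** 2 / (p * mse)) * h / (1.0 - h) ** 2
        for e, h in zip(residuals, leverages)
    ]
    return distances, leverages


# Illustrative data: the last point is far from the others in x and off the trend
x = [1.0, 2.0, 3.0, 4.0, 5.0, 10.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 4.0]

distances, leverages = cooks_distances(x, y)
most_influential = max(range(len(x)), key=lambda i: distances[i])
print(most_influential, distances[most_influential])
```

Observations with a much larger distance than the rest are candidates for closer inspection, not automatic removal.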
Interactions between predictors
The effect of one predictor depends on the value of another predictor.
If the interaction is significant, can keep the model as specified (with the interaction term) or run stratified analyses.
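A small sketch linking the two options, assuming the second predictor is a 0/1 group indicator and the model is y = b0 + b1*x + b2*g + b3*(x*g). In that case the interaction coefficient b3 equals the difference between the slopes from separate (stratified) fits in each group. Function names and data are hypothetical.

```python
def slope_and_intercept(x, y):
    """Least-squares slope and intercept of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return b1, y_bar - b1 * x_bar


def stratified_slopes(x, y, group):
    """Fit a separate simple regression within each level of a 0/1 indicator."""
    slopes = {}
    for level in (0, 1):
        xs = [xi for xi, g in zip(x, group) if g == level]
        ys = [yi for yi, g in zip(y, group) if g == level]
        slopes[level], _ = slope_and_intercept(xs, ys)
    return slopes


# Illustrative noise-free data: slope 2.0 in group 0, slope 3.5 in group 1
x = [1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0]
group = [0, 0, 0, 0, 1, 1, 1, 1]
y = [1.0 + 2.0 * xi if g == 0 else 0.5 + 3.5 * xi for xi, g in zip(x, group)]

slopes = stratified_slopes(x, y, group)
# Difference in stratum slopes = interaction coefficient b3 in the pooled model
interaction_estimate = slopes[1] - slopes[0]
print(slopes, interaction_estimate)
```

The stratified fits report one slope per group, while the pooled model with the product term reports the same information as a baseline slope plus a difference.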