Stepwise Multiple Regression Flashcards
casewise diagnostics
casewise tables list the cases whose standardised residual exceeds +/-3.3
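The flagging rule above can be sketched in a few lines of Python. This is a minimal illustration, not SPSS's own routine: it assumes a standardised residual is the raw residual divided by the standard error of the estimate, and the function name `flag_cases` and the data are hypothetical.

```python
import math

def flag_cases(observed, predicted, n_predictors, cutoff=3.3):
    """Return (case_number, standardised_residual) for cases beyond +/- cutoff."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    # Residual degrees of freedom: n minus the predictors minus the intercept.
    df_resid = len(residuals) - n_predictors - 1
    # Assumption: standardise by the standard error of the estimate.
    std_error = math.sqrt(sum(r * r for r in residuals) / df_resid)
    return [
        (case, r / std_error)
        for case, r in enumerate(residuals, start=1)
        if abs(r / std_error) > cutoff
    ]

# Made-up data: 30 cases, predictions of 50, residuals of +/-1 except one case.
predicted = [50.0] * 30
observed = [50.0 + (1 if i % 2 == 0 else -1) for i in range(30)]
observed[11] = 60.0  # case 12 is given a residual of 10

flagged = flag_cases(observed, predicted, n_predictors=1)
print(flagged)
```

Only the case with the unusually large residual should appear in the list; cases are numbered from 1 to match a casewise table.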
what is stepwise MR?
adds variables while removing those that don’t improve the R2 value
SPSS selects and orders predictors
variables are added to the regression model one at a time to maximise R2 (entry criterion p<0.05)
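A rough Python sketch of the entry part of this procedure is given here. It is a simplification under stated assumptions, not what SPSS runs: it only adds predictors (the removal step is left out for brevity), and it uses a fixed F-to-enter threshold as a stand-in for the p<0.05 rule (3.84 approximates the 5% critical value of F with 1 numerator df in large samples). The helper names `solve`, `residual_ss` and `forward_select`, and the simulated data, are hypothetical.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def residual_ss(columns, y):
    """Residual sum of squares for an OLS fit of y on an intercept plus the columns."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coef = solve(xtx, xty)
    return sum((yi - sum(c * v for c, v in zip(coef, r))) ** 2 for r, yi in zip(rows, y))

def forward_select(predictors, y, f_enter=3.84):
    """Add predictors one at a time while the best candidate's F-to-enter is >= f_enter."""
    n = len(y)
    selected = []
    current_rss = residual_ss([], y)  # intercept-only model
    while len(selected) < len(predictors):
        candidates = [name for name in predictors if name not in selected]
        trial = {
            name: residual_ss([predictors[s] for s in selected + [name]], y)
            for name in candidates
        }
        best = min(trial, key=trial.get)  # candidate giving the largest R2 gain
        df_resid = n - len(selected) - 2  # n minus intercept minus predictors incl. candidate
        f_stat = (current_rss - trial[best]) / (trial[best] / df_resid)
        if f_stat < f_enter:
            break  # no remaining predictor meets the entry rule
        selected.append(best)
        current_rss = trial[best]
    return selected

# Simulated data (assumption): y depends on x1 and x2 only; x3 is pure noise.
rng = random.Random(0)
n = 200
predictors = {name: [rng.gauss(0, 1) for _ in range(n)] for name in ("x1", "x2", "x3")}
y = [3 * a + b + rng.gauss(0, 1) for a, b in zip(predictors["x1"], predictors["x2"])]

print(forward_select(predictors, y))
```

The returned list gives the order of entry, which is the same information the variables entered/removed table reports.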
aim of stepwise MR
to create the best-fitting model and achieve the highest R2
when is stepwise MR useful?
when there are lots of predictors and you want the best model
why is stepwise MR good?
Automatic variable selection is time-efficient.
Quick identification of significant predictors.
Used for exploratory data analysis.
why is stepwise MR bad?
Overfitting: the model fits that data but not new data (can’t generalise well).
Can result in models that are highly dependent on the particular sample.
Biased estimates and incorrect conclusions about relationships, as it doesn’t take into account the interdependence of variables.
steps for analysis
check assumptions are met
assess model overall
evaluate predictors
formally report results
present regression in table
check assumptions are met step
multicollinearity - tolerance>0.1, VIF<10 (tolerance = 1/VIF)
normality - histogram and normal P-P plot
linearity - normal P-P plot
homoscedasticity and outliers - scatterplot
outliers - residual stats, std. residual within +/-3.3, Cook’s D<1 (in casewise diagnostics table)
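As a small illustration of the multicollinearity check, tolerance and VIF can be computed by hand. The sketch below assumes the simplest case of exactly two predictors, where tolerance is 1 minus the squared correlation between them and VIF is its reciprocal; with more predictors, the squared correlation is replaced by the R2 from regressing each predictor on all the others. The function names and data are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def tolerance_and_vif(x1, x2):
    """Two-predictor case only: tolerance = 1 - r^2 and VIF = 1 / tolerance."""
    tolerance = 1 - pearson_r(x1, x2) ** 2
    return tolerance, 1 / tolerance

# Made-up scores on two fairly strongly correlated predictors.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]

tolerance, vif = tolerance_and_vif(x1, x2)
print(round(tolerance, 3), round(vif, 3))
```

Because the two statistics are reciprocals, they always lead to the same conclusion about a given predictor.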
assess model overall step
variables entered/removed (shows order)
excluded variables table (shows what is not in it)
model summary table (adj. R2 = % of variance in the outcome explained by the model)
significant improvement (change in R2) between models (p<0.05) - model summary table
ANOVA table (F equation, i.e. F(df1, df2) = value, p = )
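The quantities read off the model summary and ANOVA tables are linked by standard formulas, sketched here in Python. The sums of squares and R2 values in the example are made up, and the function names are hypothetical; n is the sample size and k the number of predictors in the model.

```python
def model_summary(ss_regression, ss_residual, n, k):
    """Return R2, adjusted R2 and the overall F for a model with k predictors."""
    ss_total = ss_regression + ss_residual
    r2 = ss_regression / ss_total
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    f_stat = (ss_regression / k) / (ss_residual / (n - k - 1))
    return r2, adj_r2, f_stat

def f_change(r2_old, r2_new, n, k_new, added=1):
    """F for the change in R2 when `added` predictors bring the model to k_new predictors."""
    return ((r2_new - r2_old) / added) / ((1 - r2_new) / (n - k_new - 1))

# Made-up values: 53 cases, 2 predictors in the final model.
r2, adj_r2, f_stat = model_summary(ss_regression=60.0, ss_residual=40.0, n=53, k=2)
print(r2, adj_r2, f_stat)

# Made-up step: R2 rises from 0.5 (1 predictor) to 0.6 (2 predictors).
print(f_change(r2_old=0.5, r2_new=0.6, n=53, k_new=2))
```

The overall F has (k, n - k - 1) degrees of freedom, which are the two numbers reported in brackets in the F equation.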
evaluate predictor variables step
coefficients table (std. coefficients beta and sig. columns show the contribution and significance of each predictor)
std. coefficients beta (no. of SD the outcome changes when predictor changes by 1SD)
regression equations for the final model use unstandardised coefficients
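The link between the two kinds of coefficient can be shown with a short Python sketch: a standardised beta is the unstandardised b rescaled by the ratio of the predictor's SD to the outcome's SD, while predictions come from the unstandardised equation. The data, coefficient values and function names below are made up for illustration.

```python
import statistics

def standardised_beta(b, x, y):
    """Convert an unstandardised slope b into a standardised beta."""
    return b * statistics.stdev(x) / statistics.stdev(y)

def predict(intercept, coefficients, values):
    """Apply the regression equation using unstandardised coefficients."""
    return intercept + sum(coefficients[name] * values[name] for name in coefficients)

# Made-up data for one predictor; 0.4 and 0.6 are the OLS slope and intercept for it.
x = [2, 4, 6, 8, 10]
y = [1, 3, 2, 5, 4]

beta = standardised_beta(0.4, x, y)
print(beta)

# Predicted outcome for a case scoring 6 on the predictor.
print(predict(0.6, {"x": 0.4}, {"x": 6}))
```

With a single predictor the standardised beta equals the correlation between predictor and outcome, which is a handy sanity check.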
formally reporting the results
A stepwise MR was used to identify the best predictive model of (criterion) from the predictors (name).
Preliminary analyses were conducted to ensure no violation of linearity, normality, multicollinearity, homoscedasticity and sample size. (comment on whether they were met. comment on outliers - were/weren’t omitted from final analyses)
A final model was identified where (predictor) (p<0.001) and (predictor) (p = ) explained (adj R2)% of the variance in (criterion) (adj R2 = ). (F equation) (comment on the other significant/non-significant variables). The contributions of the predictors in the final model are shown in table 1 (draw table).
Then interpret the results in words.
problems with stepwise entry
computer problems
the final model might contain no variables if none meet the entry rules
lack of researcher control