Multiple Regression Flashcards
multiple regression
allows us to assess the influence of several predictor variables (IVs, now referred to as X) on the outcome variable (DV, now referred to as Y)
multiple regression model
- instead of a line of best fit (simple regression) it's a plane of best fit
- still evaluated by goodness of fit
- outcome variable (Y) on the y-axis
multiple regression assumptions
- sufficient sample size
- linearity
- absence of outliers
- absence of multicollinearity
- normality, linearity, homoscedasticity, independence of residuals
NO NON-PARAMETRIC ALTERNATIVE
what happens if we have too few participants in multiple regression
over-optimistic results that may not be generalisable beyond the sample
sample size required for combined effect of several predictors
N > 50 + 8M
M = number of predictors
sample size required to look at separate effect of several predictors
N > 104 + M
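The two rules of thumb above can be sketched in plain Python (hypothetical example with M = 2 predictors):

```python
def n_for_overall_model(m):
    """Minimum N to test the combined effect of m predictors: N > 50 + 8M."""
    return 50 + 8 * m

def n_for_individual_predictors(m):
    """Minimum N to test each predictor's separate effect: N > 104 + M."""
    return 104 + m

# e.g. with 2 predictors (M = 2):
print(n_for_overall_model(2))          # 66  -> need N > 66
print(n_for_individual_predictors(2))  # 106 -> need N > 106
```

If you want to do both (test the overall model and each predictor), use whichever rule gives the larger N.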
multicollinearity
- ideally predictor variables will be correlated with the outcome variable but not with one another
- check correlation matrix
- highly correlated predictors (r > .9) indicate multicollinearity; consider combining these variables or removing one
note for multiple regression correlation matrix
- only report r for IVs in table
- you find this in ‘Correlations’ table
- column 1 gives each IV's correlation with the DV separately; column 2 gives the correlation between the IVs (are they correlated together? yes, but not strongly, so multicollinearity is not a concern)
SPSS output for multiple regression
in model summary table
- information on the relationship: R square is the variance explained in the sample
- Adjusted R square estimates the variance explained in the population
R^2
proportion of variance explained by the model (predictor variables combined) (in SAMPLE)
- expressed as a percentage or decimal in the write-up; taken from the Model Summary table
Evaluating the model (assessing goodness of fit)
ANOVA table
- for reporting F statistic
e.g. regression model was significant, F(2,197) = 67.91, p < .001
F-ratio and null
F-ratio tells us how much model has improved the prediction of y relative to the inaccuracy of the model
H0: regression model and simplest model are equal (value of all slopes (b) is 0)
- significant value means regression model is better and explains more variance than simplest model (at least one slope is not 0)
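The F-ratio in the ANOVA table relates to R² through a standard formula, F = (R²/k) / ((1 − R²)/(N − k − 1)), where k is the number of predictors. A sketch with hypothetical values (SPSS reports F for you; this is just to show the connection):

```python
def f_ratio(r_squared, n, k):
    """F = (R^2 / k) / ((1 - R^2) / (N - k - 1)), k = number of predictors."""
    return (r_squared / k) / ((1 - r_squared) / (n - k - 1))

# hypothetical: R^2 = .41, N = 200 participants, k = 2 predictors
print(round(f_ratio(0.41, 200, 2), 2))  # 68.45, with df = (2, 197)
```

A larger R² (more variance explained relative to the simplest model) gives a larger F, which is why a significant F means the regression model beats the intercept-only model.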
Coefficients
in coefficients table
negative sign = negative association between DV and IV (vice versa)
- standardized regression coefficients for which is the stronger predictor
what standardized coefficients mean: e.g. as age increases by 1 SD, Christmas joy increases by 0.49 SD
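One way to see where a standardised coefficient comes from: beta = b × (SD of the predictor / SD of the outcome). SPSS reports these in the Beta column; the numbers below are hypothetical, chosen to match the 0.49 example above:

```python
def standardised_beta(b, sd_x, sd_y):
    """Convert an unstandardised slope b to a standardised beta: b * (SD_x / SD_y)."""
    return b * (sd_x / sd_y)

# hypothetical: b = 0.98 joy-units per year of age, SD(age) = 10, SD(joy) = 20
print(standardised_beta(0.98, 10, 20))  # 0.49 -> +1 SD of age -> +0.49 SD of joy
```

Because betas are on a common SD scale, they can be compared across predictors to see which is stronger, which the raw b values (in different units) cannot.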
multiple regression equation
plug numbers into the regression equation to predict Y
- be careful with negative numbers
e.g. 40 + (-50) = 40 - 50 = -10
coefficients table for t-values
- how much each individual predictor, separately, improves prediction of y
H0 for the t-test: the predictor and the simplest model are equal
if significant, the predictor provides a better fit than the simplest model
- also provides CIs for the slopes (b)
write up for multiple regression
- no design
- open with results
- ‘multiple regression was used to investigate if DV could be predicted by IV1 and IV2. Descriptive statistics are presented in Table 1, alongside correlations between each study variable. Preliminary analyses revealed no violation of normality, linearity, multicollinearity or homoscedasticity assumptions.’
TABLE FOR MULTIPLE REGRESSION - practice draw
- M and SD and CIs for first two columns from descriptive stats at the top of output
- column 1 gives correlation values from the 'Correlations' table for each IV and the DV separately; column 2 is the correlation between the two IVs; CIs from HELP
results write up multiple regression general
regression model was/was not significant F(d.f.) = _, p < _. When considered together, IV1 and IV2 explained X% of variance in DV (find % by converting from decimal in model summary table - R square).
results write up multiple regression coefficients
- ONLY IF OVERALL REGRESSION MODEL WAS SIGNIFICANT
- Regression coefficients were reviewed to consider the influence of the separate predictor variables on DV. They revealed a positive/negative association between IV1 and DV, b = _ [CIs, _], t = _, p < _ (in coefficients table)(t-stat only if regression significant)(direction only if significant). Same for IV2. Standardised regression coefficients indicate that IVX is a stronger predictor than IVZ (the other IV)(same table)(direction only if significant)(only if more than one predictor was significant when considered alone).
discussion multiple regression
Our research provides some clear evidence that IV1 and IV2 influence DV. While DV appears to increase/decrease as IVX increases/decreases, a higher/lower IVX was associated with higher/lower DV.