Multiple Regression Flashcards
• demonstrate an understanding of the similarities and differences between Simple and Multiple regression
• demonstrate an understanding of the key statistical elements of Forced Entry Multiple Regression
• demonstrate an understanding of the key statistical elements of Hierarchical Multiple Regression
• complete and interpret Multiple Regression analyses on SPSS
what is the basis of multiple regression?
one outcome, multiple predictors
-> multiple variables (predictors) predict one outcome
what is R-squared
- amount of variance explained by the regression/model
- correlation coefficient squared
what is simple linear regression
- builds a model to explain the variance using an equation with one predictor
- test how well variability of the scores is explained by the model (R^2)
- significance of F: variance explained significant (not zero)
- B1: slope, B0: intercept (constant)
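As a sketch of what SPSS computes here, the least-squares slope (b1) and intercept (b0) for one predictor can be written in plain Python (the data values are made up for illustration):

```python
def simple_regression(xs, ys):
    """Least-squares fit for one predictor:
    b1 = cov(x, y) / var(x), b0 = mean(y) - b1 * mean(x)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    ss_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_xx = sum((x - mx) ** 2 for x in xs)
    b1 = ss_xy / ss_xx  # slope
    b0 = my - b1 * mx   # intercept (constant)
    return b0, b1

b0, b1 = simple_regression([1, 2, 3, 4], [2, 4, 6, 8])
print(b0, b1)  # 0.0 2.0
```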
how is multiple regression similar to simple regression?
- builds a model to explain the variance using linear equation
- test how well the variability of the scores is explained by the model
- R^2: how much of the variance is explained by our model
- significance of F: is the variance explained significant (not zero)
- the usual assumptions, including homoscedasticity and normally distributed residuals, apply
BUT what is new for multiple regression?
- using an equation with more than one predictor
- examine how much each predictor contributes to predicting the variability of the outcome measure (forced entry and hierarchical regression)
- compare different models predicting the same outcome (hierarchical regression) and see which model predicts most of the variance
R^2
tells us the estimate for our sample
-> will naturally overestimate the ‘real’ R^2 (in the population)
Adjusted R^2
estimate for the population (probably a more accurate measure -> more likely to be accurate because it takes sample size into account)
why is R^2 adjusted?
- adjusted down to allow for the overestimation of R^2
-> better reflection of the ‘real’ R^2
what does the adjustment relate to?
sample size
-> generally the bigger the sample size, the less need for adjustment
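The adjustment is the standard formula Adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - k - 1); a quick sketch (the R^2 and sample sizes are made-up values) shows how the shrinkage depends on sample size n and number of predictors k:

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1);
    shrinks the sample R^2 toward the likely population value.
    n = sample size, k = number of predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Same R^2, same predictors: the bigger sample needs less adjustment.
print(round(adjusted_r2(0.50, 30, 3), 3))   # 0.442
print(round(adjusted_r2(0.50, 300, 3), 3))  # 0.495
```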
should you report R^2 or adjusted?
report both
-> for simple regression as well
what is the F ratio?
- we can test if our model accounts for a significant amount of the variance as we did before
- it is the variance predicted by the model with all predictors
In multiple regression, a significant R squared tells us…
- our model accounts for a significant amount of variance in the outcome
-> the ratio of explained to unexplained variance is high
Unlike multiple regression, in simple regression
you know which variable predicts the outcome from a significant R-squared (there is only one predictor)
The return of the B (characterises the relationship of a predictor)
- get individual b’s for each of our predictors
- the b's relate to each other, because the other variables/predictors are taken into consideration as controls
what does B do?
- estimate of contribution while ‘controlling’ for other variables
- have an estimate of how much each variable contributes on its own with the others held constant -> similar to partial correlation
- estimate of the individual contribution of each predictor
Multiple Regression
how much variance does the overall model, with all its predictors, account for?
components in multiple regression
- b0
- more than one predictor i.e. b1(x1) [regression coefficient for predictor 1] + b2(x2) [regression coefficient for predictor 2] + bn(xn) [regression coefficient for predictor nth variable]..
what is the issue with normal b’s?
affected by the distributions and type of score
-> you can use them in an equation, but you can't compare them, especially if the predictors are measured on different scales
what is the solution to the B issue?
standardise them (turn b into beta weights) by expressing each b in standard deviation units
-> a standardised score is simply the number of standard deviations above or below the mean of the scores
-> you can compare how much each predictor is contributing to the prediction
-> by standardising b, it allows us to compare the analysis and contribution of each variable to the outcome in terms of standard deviations
what does b1 = 0.594 mean if it is a beta weight?
as the predictor increases by one SD, the outcome increases by 0.594 of a standard deviation
-> Slope we can compare across different predictors
-> Beta telling us about the contribution of each individual predictor to the model - and usually they’re quite variable
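The standardisation itself is just a rescaling; a minimal sketch, where the b and SD values are hypothetical:

```python
def beta_weight(b, sd_predictor, sd_outcome):
    """Standardised beta = unstandardised b * (SD of predictor / SD of outcome),
    i.e. the slope re-expressed in standard-deviation units so it can be
    compared across predictors."""
    return b * sd_predictor / sd_outcome

# Hypothetical values: b = 0.5, predictor SD = 3, outcome SD = 5
print(beta_weight(0.5, 3, 5))  # 0.3
```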
how can we test whether each predictor is significant from zero or not?
a T-test
what is the output of a multiple regression?
- each variable has an unstandardised (b) and standardised coefficient (beta or β)
- t value derived from b
-> the associated p-value tells you if the coefficient estimate is significantly different from zero (i.e. whether it is a significant predictor)
what does the unstandardised value allow you to do?
it can be used within any equation (predictions come out in the original units of measurement)
what does the standardised value allow us to do?
make comparisons across predictors
what does negative β indicate?
negative relationship (even if it’s not significant)
how to output your interpretations?
- Extraversion: b = 1.40, β = .594, t = 6.95, p < .001 (extraversion is a significant predictor of wellbeing)
- Agreeableness: b = -0.48, β = -.018, t = -0.222, p = .83 (agreeableness is not a significant predictor of wellbeing)
AND so on…
* telling us about individual contribution of predictors from the model
what are b and β better at:
estimates of the contributions of individual predictors
why can’t you trust correlations?
they are just estimates of a two-variable relationship without the other variables taken into consideration
-> they are uncontrolled, as it were (just an estimate of the two-variable relationship)
-> always run a regression, because correlations won’t give you the answer you need
We are looking at how IQ, Age and working memory predict reading scores. This means:
We are looking for 3 beta weights from the analysis, and the df for SSregression in the overall ANOVA is 3.
Reading Score = 22 + (.03)IQ + (.06)Age + (-0.3)WM. What is the predicted reading score for a 19-year-old with an IQ of 110 and a WM score of 30?
17.4
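Checking that arithmetic directly:

```python
# Plug the given values into the regression equation:
# Reading Score = 22 + (.03)IQ + (.06)Age + (-0.3)WM
b0, b_iq, b_age, b_wm = 22, 0.03, 0.06, -0.3
iq, age, wm = 110, 19, 30
score = b0 + b_iq * iq + b_age * age + b_wm * wm  # 22 + 3.3 + 1.14 - 9
print(round(score, 1))  # 17.4
```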
A beta of -0.58 means what?
for every SD the predictor increases, Y decreases by .58 of an SD
What is Hierarchical Regression?
When we need to control for a third/important variable (i.e. controlling for age while seeing if personality predicts wellbeing)
How do we conduct Hierarchical Regression
- add the variables into the equation in steps
1. add the control variable(s) in first -> making sure you account for any variance that might be explained by them - examine R^2 and its significance
2. add the variables we are interested in -> the control variable(s) stay in the model; run the analysis again.
Giving you 2 models
What are the two models and what are we looking for?
- Age on its own (Model 1)
- Age and Personality Traits (Model 2)
We are interested in the change in predictive power from Model 1 to Model 2 -> want to see if there’s a change from step 1 to step 2
-> We’re looking at the predictive power of model 2 to see if it is significantly better than the predictive power of model 1
How can we compare both models?
using F ratio changes
- enter the first set of variables into the analysis for the first model -> get the R^2 and F ratio telling us how much variance is accounted for by the model
- next batch of variables are added in a second model
-> R^2 and F ratio telling us how much variance is accounted for by the model (which includes both sets of predictors) - want to compare the models
* We’re going to make a call on the F ratio and whether there’s a significant change -> i.e. the F change compares the models and tells us whether there is a significant improvement in variance explained in model 2 (the model that explains significantly more variance)
* We can see if more variance is explained in the second model compared to the first model
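The R^2-change test SPSS reports can be sketched with the standard F-change formula for nested models (the R^2 values below are like those in the hierarchical example reported later, with n = 86; the function name is just illustrative):

```python
def f_change(r2_reduced, r2_full, k_reduced, k_full, n):
    """F for the change in R^2 between nested models:
    F = (delta R^2 / delta k) / ((1 - R^2_full) / (n - k_full - 1)),
    on (delta k, n - k_full - 1) degrees of freedom."""
    num = (r2_full - r2_reduced) / (k_full - k_reduced)
    den = (1 - r2_full) / (n - k_full - 1)
    return num / den

# Model 1: 1 predictor, R^2 = .073; Model 2: 6 predictors, R^2 = .526
print(round(f_change(0.073, 0.526, 1, 6, 86), 1))  # 15.1
```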
ANOVA tables for hierarchical regression
- each model has a separate ANOVA table which tells us if the variance explained is different from zero for each separate model
- does not compare the models; instead it is what you report for each individual model (whether the amount of variance accounted for differs from zero)
An example of a Hierarchical Regression
“A hierarchical regression was carried out with Age in step 1, and mean scores of the different personality scales; Extraversion, Agreeableness, Openness, Conscientiousness, and Neuroticism in step 2. A significant model was found at step 1, F(1,84) = 6.57, p <.05 and explained 7.3% of the variance. The inclusion of the five personality traits significantly increased the amount of variance explained to 52.6% (p<.001) and was a significant model, F(6,79)=14.61, p <.001…”
* Model 1 and 2 are significant
* Age predicts wellbeing
* The age and personality scales (as a group) predicts wellbeing
* Model 2 accounts for significantly more variance
* The addition of personality improves our ability to explain the variance
what coefficients should you focus on?
- if model 2 sig, focus your interpretation on this
- when a predictor predicts the outcome in model 1 but stops being significant in model 2, this is something to focus on
what are dummy variables?
- you can introduce categorical variables into regression using dummy coding (0’s and 1’s)
cases where you’d use dummy variables
- traditional gender or experimental conditions
what about the outcome numbers in dummy variables?
the outcome has to be continuous but your predictors do not have to be (they can be 0’s and 1’s, allowing us to code for 2 different categories in our analysis -> each case gets a score which helps predict the outcome)
If the b/beta value is positive
the category coded as ‘1’ is higher on the outcome variable than the category coded as ‘0’ (e.g. males score higher than females)
if the b/beta value is negative
the category coded as ‘0’ is higher on the outcome variable (e.g. females score higher than males)
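A sketch of the coding itself (the group labels are hypothetical):

```python
# Dummy-code a two-level categorical predictor as 0/1 so it can
# enter the regression equation like any numeric predictor.
groups = ["male", "female", "male", "female", "female"]
dummy = [1 if g == "male" else 0 for g in groups]
print(dummy)  # [1, 0, 1, 0, 0]
```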
what is another word for multiple regression?
forced entry regression (all predictors entered at once)
where can you find whether the F change is significant?
in the ANOVA table
Male coded as 1 and Female coded as 0 in this analysis. A positive coefficient of 0.86, what would this mean?
men are scoring higher on this measure than females
* positive coefficient means males are scoring higher on Wellbeing (but it is not significant)
A score of -.45
would mean that year one students (0) are happier than year 2 students (1)
what are the assumptions with multiple regression?
- Variable Type: Outcome must be continuous (Predictors can be continuous or discrete e.g. dummy variables).
- Non-Zero Variance: Predictors must not have zero variance.
- Independence: All values of the outcome should come from a different person or item.
- Linearity: The relationship we model is, in reality, linear
- Homoscedasticity: For each value of the predictors the variance of the error term should be constant.
- Normally-distributed Errors: The residuals must be normally distributed
What is a type of bias we need to be cautious of?
Multicollinearity
Multicollinearity
- exists when predictors are highly correlated with each other
-> look for medium-to-strong correlations between predictors
what are some issues with multicollinearity?
it can undermine your findings:
* b’s can be unstable (vary across samples)
* difficult to say which predictor is important
* artificially limits R^2 -> predictors that are all correlated together add little unique variance individually
how can multicollinearity be checked?
collinearity diagnostics
collinearity diagnostics
- VIF is a measure of each predictor’s relationship with the other predictors
- want it to be as low as possible (tells you that your predictors are reasonably independent of one another)
- anything close to 10 is an issue/problematic
how to calculate tolerance?
1 divided by the VIF
-> should be above 0.2
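Since tolerance is just the reciprocal of VIF, the two rules of thumb line up; a quick check (the VIF values are made up):

```python
def tolerance(vif):
    """Tolerance = 1 / VIF; values below 0.2 flag collinearity,
    and a VIF close to 10 is clearly problematic."""
    return 1 / vif

print(tolerance(10))    # 0.1  -> below 0.2, problematic
print(tolerance(1.25))  # 0.8  -> fine
```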
what else can affect our results?
extreme outliers
how can we check for extreme outliers
standardised residuals
issue with very high standardised residuals?
actual score is very different from predicted
how do we check for outliers?
- just look/check for high standardised residuals (looking for min and max)
- only about 5% should be over 2 SD (more than that suggests outliers)
- look in the residuals statistics table to see what the minimum and maximum residual are
how many residuals should be over 2SD
no more than about 5%
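A minimal sketch of that check in plain Python (the residual values are made up, and the function name is illustrative):

```python
import statistics

def flag_large_residuals(residuals, cutoff=2.0):
    """Standardise the residuals and return those more than `cutoff`
    SDs from the mean; much more than ~5% flagged suggests outliers."""
    mean = statistics.mean(residuals)
    sd = statistics.pstdev(residuals)  # population SD of the residuals
    return [r for r in residuals if abs((r - mean) / sd) > cutoff]

# One case's actual score is far from its predicted score:
print(flag_large_residuals([0.1, -0.2, 0.15, -0.1, 0.05, 8.0]))  # [8.0]
```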
how can we measure the influence of outliers?
cook’s distance
cook’s distance
- measures the influence each case has on the model
-> most if not all Cook’s distances should be below 1 (want Cook’s distance to be as low as possible) - look at the maximum Cook’s distance in the residuals statistics table
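Screening the Cook’s distance column can be sketched as follows (the distance values and function name are illustrative):

```python
def flag_influential_cases(cooks_distances, threshold=1.0):
    """Return indices of cases whose Cook's distance exceeds the
    conventional cutoff of 1, i.e. cases with undue influence."""
    return [i for i, d in enumerate(cooks_distances) if d > threshold]

# Case 2 would warrant a closer look:
print(flag_influential_cases([0.02, 0.10, 1.35, 0.07]))  # [2]
```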