Week 9: Multiple regression analysis Flashcards
What variables do we have when doing multiple linear regression?
More than one predictor variable
One response variable
What happens if the data is non-linear?
Trying to fit a straight line through curved data produces a poorer fit to the data - leading to an underestimate of the strength of the relationship
How do we check for non-linearity of data?
Regression - correlation matrix and look at the plots
What things may make it hard to identify trends in the data?
Attenuated range or under dispersion of scores (clustered)
What is homoscedasticity?
Means that the error variance should be the same at each level of the predictor variable
Heteroskedasticity tests?
You can ask for normality tests under assumptions
They test the null hypothesis of homoscedasticity - if a test is significant, the assumption is violated
What are the assumptions of linear regression (6)?
Use correct variables (interval data)
Independence of data
Sample size
Normality
Linearity
Homoscedasticity
How is the linear regression line calculated?
By minimizing the sum of squared differences between observed and predicted values
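A minimal sketch of this idea for a single predictor, using only the Python standard library. The data values and the function names (fit_line, sum_squared_residuals) are made up for illustration; jamovi does this calculation for you.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares around the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_squared_residuals(x, y, intercept, slope):
    """Sum of squared differences between observed and predicted values."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical example data
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(x, y)
best = sum_squared_residuals(x, y, intercept, slope)
# Any other line, e.g. one with a slightly steeper slope, has a larger sum
worse = sum_squared_residuals(x, y, intercept, slope + 0.1)
```

Comparing best with worse shows the point of the flashcard: the fitted line is the one with the smallest sum of squared residuals.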
How do you test for outliers (2 ways)?
1. Basic approach (any residual >3 SDs away from the mean)
2. Cook's distance (a measure of the influence of one case on the model as a whole) - under assumption checks
What Cook's distance value is concerning?
> 1 may be a concern
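Both checks can be sketched by hand for a single-predictor regression. This is an illustration under stated assumptions, not how jamovi is implemented: the data are made up, the function name is hypothetical, and the Cook's distance formula used is the standard one based on residuals and leverage.

```python
from statistics import mean, stdev


def simple_regression_diagnostics(x, y):
    """Return (standardised residuals, Cook's distances) for one predictor."""
    n = len(x)
    p = 2  # parameters estimated: intercept and slope
    mean_x, mean_y = mean(x), mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in residuals) / (n - p)
    # Leverage: how far each case's predictor value is from the mean
    leverage = [1 / n + (xi - mean_x) ** 2 / sxx for xi in x]
    cooks = [(e ** 2 / (p * mse)) * h / (1 - h) ** 2
             for e, h in zip(residuals, leverage)]
    # Basic approach: residuals expressed in SD units from their mean
    sd = stdev(residuals)
    mean_resid = mean(residuals)
    z = [(e - mean_resid) / sd for e in residuals]
    return z, cooks


# Hypothetical data: y = 2x except the last case, which is far off the line
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2, 4, 6, 8, 10, 12, 14, 16, 18, 40]

z, cooks = simple_regression_diagnostics(x, y)
flagged_3sd = [i for i, zi in enumerate(z) if abs(zi) > 3]    # 0-based case indices
flagged_cooks = [i for i, d in enumerate(cooks) if d > 1]     # 0-based case indices
```

In this small made-up sample the last case exceeds a Cook's distance of 1 but is not more than 3 SDs from the mean residual, so the two approaches do not always agree.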
Multiple regression techniques are more sensitive to…
violations of these assumptions than simple (single-predictor) regression
Why should we use a correlation matrix to check the data?
Correlations are important: they tell us which IVs are related to the DV, and also whether some of the IVs are related to each other
What is it called when IVs are correlated with each other?
Collinearity
What does collinearity mean in terms of data?
Means that some of the predictors provide little unique information
How do you run a multiple regression in jamovi?
Regression - linear regression
How do you set up jamovi to run a simultaneous multiple regression?
Under model builder:
Put all IVs in a single box
What are the collinearity statistics?
Tolerance
Variance inflation factor (VIF)
What tolerance values are a problem?
<0.1 are a clear problem
What variance inflation factors are a problem?
VIF is the inverse of tolerance (1/tolerance) - values >10 are a problem
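A minimal sketch of how the two statistics relate, assuming only two predictors. In that case the tolerance of each predictor is 1 minus the squared correlation between them; with more predictors it is 1 minus the R² from regressing that predictor on all the others. The data and function names are hypothetical.

```python
from math import sqrt


def pearson_r(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    sab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    saa = sum((ai - mean_a) ** 2 for ai in a)
    sbb = sum((bi - mean_b) ** 2 for bi in b)
    return sab / sqrt(saa * sbb)


def tolerance_and_vif(x1, x2):
    """Tolerance and VIF for a model with exactly two predictors."""
    r = pearson_r(x1, x2)
    tolerance = 1 - r ** 2   # share of a predictor's variance not shared with the other
    vif = 1 / tolerance      # VIF is the inverse of tolerance
    return tolerance, vif


# Hypothetical predictors that are almost perfectly correlated
x1 = [1, 2, 3, 4, 5]
x2 = [2, 4, 6, 8, 11]

tolerance, vif = tolerance_and_vif(x1, x2)
problem = tolerance < 0.1 or vif > 10   # cut-offs from the flashcards
```

Because VIF is just 1/tolerance, the two cut-offs (<0.1 and >10) flag exactly the same cases.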
Simultaneous multiple regression is…
Theory free
What kind of multiple regression is driven by theory?
Hierarchical multiple regression
E.g. you want to know what a certain predictor adds to the prediction of an outcome variable beyond what is already explained by another predictor
How do you set up jamovi for a hierarchical multiple regression?
Using model builder - put the most important predictors in block 1, then put the subsequent predictors (those whose added contribution you want to assess) in block 2
What does the output from a hierarchical multiple regression tell us?
Reports the change in R² (ΔR²), which tells us how much more variation in the outcome is explained by adding the subsequent predictor
Also gives a model comparisons box with an F test of the change (ΔF), telling you whether the newer model is a statistically significant improvement over the previous one
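A rough sketch of where ΔR² and ΔF come from, assuming one predictor in block 1 and one added in block 2. The two-predictor R² is computed from the pairwise correlations, and ΔF uses the usual F-change formula; the data and function names are made up, and the p value that jamovi reports alongside ΔF is not calculated here.

```python
from math import sqrt


def pearson_r(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    sab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    saa = sum((ai - mean_a) ** 2 for ai in a)
    sbb = sum((bi - mean_b) ** 2 for bi in b)
    return sab / sqrt(saa * sbb)


def hierarchical_two_blocks(y, x1, x2):
    """Block 1: y ~ x1. Block 2: y ~ x1 + x2. Returns R² values, ΔR² and ΔF."""
    n = len(y)
    r_y1 = pearson_r(y, x1)
    r_y2 = pearson_r(y, x2)
    r_12 = pearson_r(x1, x2)
    r2_block1 = r_y1 ** 2
    # R² for two predictors, written in terms of the pairwise correlations
    r2_block2 = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    delta_r2 = r2_block2 - r2_block1
    df_added = 1           # one predictor added in block 2
    df_resid = n - 2 - 1   # n minus predictors in the full model minus 1
    delta_f = (delta_r2 / df_added) / ((1 - r2_block2) / df_resid)
    return r2_block1, r2_block2, delta_r2, delta_f


# Hypothetical data
y = [3, 5, 4, 8, 9, 12, 11, 15]
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [2, 1, 4, 3, 6, 5, 8, 7]

r2_block1, r2_block2, delta_r2, delta_f = hierarchical_two_blocks(y, x1, x2)
```

ΔF would then be compared against an F distribution with (df_added, df_resid) degrees of freedom to decide whether block 2 adds significantly to the model.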
What type of linear regression would we use to determine the simplest possible model?
Stepwise multiple regression