Module 2 Flashcards
What is correlation?
Correlation is when a change in one variable is associated with a change in another variable
What is multiple regression?
Multiple regression is about trying to predict scores based on what we know about other (predictor) variables
What type of questions do moderation analyses attempt to answer?
What/when questions
e.g. “under what circumstances?”, “for what type of people?”, “when does the effect occur?”
What type of questions do mediation analyses attempt to answer?
How/why questions
e.g. “How does X influence Y?”, “Why does X influence Y?”
Provide 3 examples of moderation questions.
“Is the relationship between attitude towards university and where a student sits influenced by the age of the student?” - age is an inherent aspect of the individual
“Does playing violent video games for more than 6 hours a week make people more aggressive?” - 6 hours a week is a circumstance
“If employee satisfaction is high, is job turnover reduced for both male and female employees?”
Provide 3 examples of mediation questions.
“Is the relationship between attitude to university and where students sit explained by IQ?”
“Does playing violent video games that involve realistic interpersonal violence make people more aggressive?”
“If employee satisfaction is high amongst workers with high autonomy, is job turnover reduced?”
Define moderation
Moderation looks at how a third variable changes the relationship between a predictor variable and outcome variable, based on the interaction between the predictor variable and third (moderator) variable.
You can conceptualise it as a multiple regression with 3 independent variables: the original predictor variable, the moderator variable and their interaction.
Why do we need to centre variables?
To reduce multicollinearity between the predictor, the moderator and their interaction (product) term
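A rough illustration of why centring helps: the sketch below compares how strongly a predictor correlates with its interaction (product) term before and after centring. The scores and the names x, z, pearson and centre are made up for this example and are not from the module.

```python
def pearson(a, b):
    """Pearson correlation between two equally long lists of scores."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5

def centre(values):
    """Subtract the mean from every score so the new mean is 0."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

# Hypothetical raw scores for a predictor (x) and a moderator (z)
x = [10, 12, 14, 16, 18, 20]
z = [5, 7, 6, 9, 8, 10]

# Interaction term built from raw scores
raw_product = [p * q for p, q in zip(x, z)]
r_raw = pearson(x, raw_product)

# Interaction term built from centred scores
xc, zc = centre(x), centre(z)
centred_product = [p * q for p, q in zip(xc, zc)]
r_centred = pearson(xc, centred_product)

print(f"raw: r = {r_raw:.2f}, centred: r = {r_centred:.2f}")
```

With these made-up scores the correlation falls from about .97 to 0. Landing exactly on 0 is a quirk of these symmetric scores; real data will usually show a smaller drop.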
What should we do after we find we have a significant interaction (moderation) effect?
Perform a simple slopes analysis to find exactly where the interaction is occurring
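A minimal sketch of what simple slopes are, assuming a moderated regression of the form Y = b0 + b1*X + b2*Z + b3*(X*Z) on centred variables. The coefficient values and the function name simple_slope are hypothetical. This only computes the slopes; a full simple slopes analysis also tests whether each slope is significant.

```python
def simple_slope(b_predictor, b_interaction, moderator_value):
    """Slope of the predictor on the outcome at one value of the centred moderator."""
    return b_predictor + b_interaction * moderator_value

# Hypothetical coefficients from a moderated regression on centred variables
b1 = 0.40   # slope for the predictor X
b3 = 0.25   # slope for the interaction term X*Z
sd_z = 2.0  # standard deviation of the moderator Z

# Evaluate the slope of X at low, mean and high levels of the moderator
slopes = {
    "low (-1 SD)": simple_slope(b1, b3, -sd_z),
    "mean": simple_slope(b1, b3, 0.0),
    "high (+1 SD)": simple_slope(b1, b3, sd_z),
}

for level, value in slopes.items():
    print(f"{level}: slope of X = {value:.2f}")
```

Here the slope of X changes from slightly negative at low Z to strongly positive at high Z, which is the kind of pattern a significant interaction points to.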
What is the role of a mediator variable?
A mediator variable explains part or all of the relationship between two variables.
In mediation, what are the 2 antecedent variables? What are the 2 consequent variables?
Antecedent = X and M
Consequent = M and Y
Distinguish between direct effect, total effect, direct pathway and indirect pathway
Direct effect = c’ (the effect of X on Y after controlling for M)
Total effect = c (the effect of X on Y without M in the model)
Direct pathway = X > Y
Indirect pathway = X > M > Y
How will c relate to c’ if partial mediation has occurred? How will they relate if perfect mediation has occurred?
Partial mediation - c’ will be smaller than c
Perfect mediation - c’ = 0
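One way to see how c and c’ relate: with ordinary least squares, the total effect splits into the direct effect plus the indirect effect, c = c’ + a*b, where a is the X-M slope and b is the M-Y slope controlling for X. The sketch below checks this on made-up scores; the data and the helper names slope and partial_slopes are assumptions for illustration only.

```python
def slope(x, y):
    """Slope from a simple regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sxx = sum((p - mx) ** 2 for p in x)
    return sxy / sxx

def partial_slopes(x, m, y):
    """Slopes for x and m from a regression of y on both x and m."""
    n = len(x)
    mx, mm, my = sum(x) / n, sum(m) / n, sum(y) / n
    sxx = sum((p - mx) ** 2 for p in x)
    smm = sum((p - mm) ** 2 for p in m)
    sxm = sum((p - mx) * (q - mm) for p, q in zip(x, m))
    sxy = sum((p - mx) * (q - my) for p, q in zip(x, y))
    smy = sum((p - mm) * (q - my) for p, q in zip(m, y))
    det = sxx * smm - sxm ** 2
    b_x = (smm * sxy - sxm * smy) / det
    b_m = (sxx * smy - sxm * sxy) / det
    return b_x, b_m

# Hypothetical scores for predictor (x), mediator (m) and outcome (y)
x = [1, 2, 3, 4, 5, 6, 7, 8]
m = [2, 1, 4, 3, 6, 5, 8, 7]
y = [3, 2, 5, 6, 8, 7, 11, 10]

c = slope(x, y)                      # total effect: X > Y without M
a = slope(x, m)                      # a pathway: X > M
c_prime, b = partial_slopes(x, m, y) # direct effect and b pathway

print(f"c = {c:.3f}, c' = {c_prime:.3f}, a*b = {a * b:.3f}")
```

If M carries part of the effect, a*b is non-zero and c’ ends up smaller than c; if M carries all of it, c’ shrinks towards 0.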
What are the 4 requirements that need to be fulfilled to confirm mediation has occurred? Which of them is still debated? Why?
1) There is a significant relationship between X and Y
2) There is a significant relationship between X and M
3) M still predicts Y after controlling for X
4) The strength of relationship between X and Y is reduced when M is in the equation
Requirement 1 is still debated. On its own it is just a correlation, and in the same way that correlation doesn’t equal causation, a lack of correlation doesn’t mean there is no causation. On this view, a significant relationship between X and Y shouldn’t be necessary.
What are the 5 steps of data screening and assumption testing?
1) Check data entry for accuracy
2) Evaluate missing data
3) Outliers and normality
4) Linearity, homoscedasticity, and independence
5) Multicollinearity and singularity
What are the 2 different views on how big the sample size should be?
Tabachnick and Fidell: N = 50 + (8 x number of IVs)
Stevens: N = 15 x number of IVs (about 15 participants per IV)
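The two rules of thumb above can be sketched as small functions. The function names are made up for this example.

```python
def tabachnick_fidell_n(num_ivs):
    """Minimum sample size under the 50 + 8 x IV rule."""
    return 50 + 8 * num_ivs

def stevens_n(num_ivs):
    """Minimum sample size under the 15 x IV rule."""
    return 15 * num_ivs

# Example: a model with 3 independent variables
num_ivs = 3
print("Tabachnick and Fidell:", tabachnick_fidell_n(num_ivs))
print("Stevens:", stevens_n(num_ivs))
```

For 3 IVs the rules give 74 and 45 participants respectively, so the two views can disagree quite a lot for small models.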
What would our skewness be if our distribution was normal?
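0 for a perfectly normal distribution. In practice, an absolute skewness value less than 1 is treated as acceptable.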
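A minimal sketch of how skewness behaves, using the simple moment-based formula (third central moment divided by the variance to the power 1.5). This is an assumption for illustration: statistics packages such as SPSS apply a small-sample adjustment, so their values will differ slightly. The scores are made up.

```python
def skewness(values):
    """Moment-based skewness: 0 for symmetric data, positive for a right tail."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance (second moment)
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]       # hypothetical symmetric scores
right_skewed = [1, 1, 1, 2, 10]   # hypothetical scores with one high value

print(f"symmetric: {skewness(symmetric):.2f}")
print(f"right-skewed: {skewness(right_skewed):.2f}")
```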
What are the two plots we inspect for normality and outliers?
Boxplots and histograms
What are the two statistical tests of normality? How do we use them to know if our distribution is normal?
Kolmogorov-Smirnov (K-S) and Shapiro-Wilk tests. If they are non-significant, this is good and we have normality. If they are significant, this is a problem and we don’t have normality.
However, minor breaches in normality can often be overlooked if we have a large enough sample size.
In terms of the output, what are we looking for to determine if we have any univariate outliers?
‘Casewise Diagnostics’
What two statistics do we look at to determine if we have any multivariate outliers?
Mahalanobis distance and Cook’s distance. If the maximum Mahalanobis distance is less than 13.24, then we’re all good. Cook’s distance needs to be less than 1 and tells us about any overly influential cases.
Which two plots tell us about homoscedasticity?
P-P plots and scatterplots.
P-P plot: points should be hugging the diagonal line (strictly, this shows the residuals are normally distributed)
Scatterplot: points should be evenly distributed around 0, with no funnel shape
Which test tells us about independence of errors? Where is it found? What values are we looking for?
Durbin-Watson test
Found in Model Summary
Anything between 1-3 is acceptable, with 2 being ideal
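For reference, the Durbin-Watson statistic is the sum of squared differences between consecutive residuals divided by the sum of squared residuals. The sketch below computes it by hand on made-up residuals; the function names are hypothetical, and the 1 to 3 range is the rule of thumb from this card.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for a list of residuals in case order."""
    squared_diffs = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    squared_resids = sum(e ** 2 for e in residuals)
    return squared_diffs / squared_resids

def is_acceptable(d):
    """Rule of thumb from the card: between 1 and 3 is acceptable, 2 is ideal."""
    return 1 <= d <= 3

# Hypothetical residuals from a regression, in the order the cases were collected
residuals = [0.5, -0.3, 0.2, -0.4, 0.1, 0.3, -0.2, -0.2]
d = durbin_watson(residuals)
print(f"Durbin-Watson = {d:.2f}, acceptable: {is_acceptable(d)}")
```

Values well below 2 suggest neighbouring residuals are positively correlated, and values well above 2 suggest they are negatively correlated.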
What about multicollinearity and singularity? Where do we find the values we want? What stats are we looking at? What are acceptable values? What else could we do to test for multicollinearity and singularity?
Go to ‘Coefficients’ table
Tolerance and VIF
We want tolerance greater than .1 and VIF less than 10
We can also do a bivariate correlation between the IV and moderator/mediator, look at Pearson’s r and make sure it’s not greater than about .8 or .9
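Tolerance and VIF are two views of the same quantity: tolerance is 1 minus the R squared from regressing one predictor on the other predictors, and VIF is 1 divided by tolerance. A minimal sketch, assuming only two predictors so that R squared is just Pearson’s r squared; the value of r is made up.

```python
def tolerance(r_squared):
    """Share of a predictor's variance NOT explained by the other predictors."""
    return 1.0 - r_squared

def vif(r_squared):
    """Variance inflation factor: the reciprocal of tolerance."""
    return 1.0 / tolerance(r_squared)

# Hypothetical correlation between the IV and the moderator/mediator
r = 0.85
r_squared = r ** 2  # with two predictors, R squared equals r squared

print(f"tolerance = {tolerance(r_squared):.3f}")
print(f"VIF = {vif(r_squared):.3f}")
```

Because VIF is the reciprocal of tolerance, a VIF of 10 corresponds to a tolerance of .1.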
When interpreting the output of PROCESS, where do we look to find out if mediation has occurred?
“Indirect effect of X on Y”
Go across and look at the bootstrapped CI
If the CI doesn’t include 0, then you know significant mediation has occurred.
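To make the bootstrapped CI less of a black box, here is a minimal sketch of a percentile bootstrap for the indirect effect (a*b). It is an illustration under stated assumptions, not the procedure PROCESS itself runs: the data are simulated, and the helper names, the seeds and the 1000 resamples are all made up for this example.

```python
import random

def slope(x, y):
    """Slope from a simple regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((p - mx) * (q - my) for p, q in zip(x, y))
    return sxy / sum((p - mx) ** 2 for p in x)

def b_path(x, m, y):
    """Slope for m from a regression of y on both x and m."""
    n = len(x)
    mx, mm, my = sum(x) / n, sum(m) / n, sum(y) / n
    sxx = sum((p - mx) ** 2 for p in x)
    smm = sum((p - mm) ** 2 for p in m)
    sxm = sum((p - mx) * (q - mm) for p, q in zip(x, m))
    sxy = sum((p - mx) * (q - my) for p, q in zip(x, y))
    smy = sum((p - mm) * (q - my) for p, q in zip(m, y))
    return (sxx * smy - sxm * sxy) / (sxx * smm - sxm ** 2)

def indirect_effect(x, m, y):
    """Indirect effect a*b: (X > M slope) times (M > Y slope controlling for X)."""
    return slope(x, m) * b_path(x, m, y)

def bootstrap_ci(x, m, y, n_boot=1000, seed=0):
    """95% percentile bootstrap CI for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        # Resample whole cases with replacement
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(indirect_effect(
            [x[i] for i in idx], [m[i] for i in idx], [y[i] for i in idx]))
    estimates.sort()
    return estimates[int(0.025 * n_boot)], estimates[int(0.975 * n_boot) - 1]

# Simulated (hypothetical) data in which M carries the effect of X on Y
data_rng = random.Random(1)
x = [data_rng.gauss(0, 1) for _ in range(50)]
m = [0.8 * xi + data_rng.gauss(0, 0.6) for xi in x]
y = [0.7 * mi + data_rng.gauss(0, 0.3) for mi in m]

lower, upper = bootstrap_ci(x, m, y)
mediation_supported = lower > 0 or upper < 0  # CI does not include 0
print(f"95% CI for a*b: [{lower:.3f}, {upper:.3f}]")
```

The decision rule in the last lines is the same one as on the card: if the interval does not include 0, the indirect effect is treated as significant.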
What are the steps of a write-up?
1) Explain your analysis, i.e. which relationships you were testing
2) Report results for the direct pathway
3) Report results for the X-M (a) pathway
4) Report results for the M-Y (b) pathway
5) If doing moderation, talk about simple slopes and report which level of the moderator variable (e.g. high callous) had the significant effect on the outcome variable
What are the stats you need to report throughout the write-up?
b, SEb, 95% CI, t, p-value