Moderation, Logistic Regression, Mixed ANOVA Flashcards
What kind of relationship do we look for in a moderator
Is the relationship between predictor (X) and outcome (Y) affected by the moderator (M).
What are 4 different changes in relationship with a moderator
1. Becomes smaller
2. Becomes larger
3. Disappears
4. Changes direction
What is a main difference between mediation and moderation
Mediation shows how a predictor works,
Moderation shows whether or when it works.
shows us under what conditions we can expect a relationship
What is the difference between moderation, mediation, and a third variable problem
Third variable problem:
The third variable (not a mediator) predicts both the main predictor and the outcome.
What is some language showing you have a moderation
“Only when” “Sometimes” “It depends”
How does a moderator impact the relationship of the predictor to the outcome
Depending on the level of the moderator, the beta will increase or decrease in strength
Which type of variable is easier to visualize in a moderation analysis
categorical moderator (as opposed to continuous)
What extra step do you need to do to calculate a moderation analysis, as opposed to a regular 3-variable multiple regression
We need to add an interaction. The model we run includes 3 predictors instead of 2, but the third is just the multiplication of X by M
Which elements are part of the moderation model
- Predictor
- Moderator
- Interaction: Predictor x Moderator
- Outcome
What is an important first step to run a moderation analysis
All predictors need to be centered before conducting your regression.
Centering means subtracting the Mean from each observed value.
Conveniently, standardized variables are automatically centered
The interaction term must be calculated using the centered predictors.
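The centering step and the interaction term can be sketched in a few lines of Python. The scores and the names `x`, `m`, and `center` are made up for illustration; in practice the statistics package does this for you.

```python
from statistics import mean

# Hypothetical scores for a predictor (X) and a moderator (M).
x = [2.0, 4.0, 6.0, 8.0]
m = [1.0, 3.0, 2.0, 6.0]


def center(values):
    """Subtract the mean from each observed value."""
    avg = mean(values)
    return [v - avg for v in values]


x_c = center(x)
m_c = center(m)

# The interaction term is the product of the CENTERED predictors.
interaction = [xi * mi for xi, mi in zip(x_c, m_c)]

print(x_c)
print(m_c)
print(interaction)
```

After centering, each variable has a mean of zero, which is an easy way to check that the step worked.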
How can you tell if you have a significant moderator, and what should you do next
If the interaction term beta is significant, then we have found moderation.
You will get three betas: 2 predictors + Interaction
Follow it up with a simple slopes analysis
What is the simple slopes plot, and how should you interpret it
This plot allows us to visualize the effect.
It gives the estimated slope of X on Y at the average, low (-1 SD), and high (+1 SD) levels of the moderator, along with p-values. A significant p-value means the relationship between X and Y is different from zero at that level of the moderator.
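As a rough sketch of where the three estimates come from: with a centered moderator, the slope of X at a given moderator value is the X beta plus the interaction beta times that value. The coefficients and the SD below are hypothetical, and the p-values are not computed here.

```python
# Hypothetical unstandardized coefficients from a moderation model
# Y = b0 + b_x*X + b_m*M + b_int*(X*M), with X and M centered.
b_x = 0.50     # beta for the predictor
b_int = 0.20   # beta for the interaction term
sd_m = 1.5     # standard deviation of the (centered) moderator


def simple_slope(b_x, b_int, m_value):
    """Slope of X on Y when the moderator equals m_value."""
    return b_x + b_int * m_value


levels = {"low (-1 SD)": -sd_m, "average": 0.0, "high (+1 SD)": sd_m}
slopes = {label: simple_slope(b_x, b_int, value) for label, value in levels.items()}

for label, slope in slopes.items():
    print(f"{label}: slope of X = {slope:.2f}")
```

With a positive interaction beta, the slope of X grows as the moderator increases, which is what the fanning lines in the plot show.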
What are main features of logistic regression
- Outcome we’re interested in is categorical.
- Doesn’t require straight lines
- Can use categorical and continuous predictors
What is a binary logistic regression
The outcome has only two possibilities
What is the goal of a logistic regression
Instead of focusing on the amount of variability in the outcome that is explained, we’re trying to predict which category participants fall into
With binary logistic regression we make a model to predict which of the outcome variable’s two categories each participant falls into, and then check that model against the observed outcome.
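A minimal sketch of the "predict, then check" idea. The predicted probabilities and observed outcomes are invented, and the 0.5 cutoff is an assumption for the example.

```python
# Hypothetical predicted probabilities from a fitted model,
# and the observed outcome categories (coded 0 and 1).
predicted_prob = [0.10, 0.35, 0.62, 0.80, 0.45, 0.91]
observed = [0, 0, 1, 1, 1, 1]

CUTOFF = 0.5  # assumed classification cutoff

# Assign each participant to the category the model considers more likely.
predicted_category = [1 if p >= CUTOFF else 0 for p in predicted_prob]

# Check the predictions against the observed outcome.
n_correct = sum(pred == obs for pred, obs in zip(predicted_category, observed))
accuracy = n_correct / len(observed)

print(predicted_category)
print(f"{n_correct} of {len(observed)} correct ({accuracy:.1%})")
```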
What are the minimum and maximum value of logistic regression
Probability ranges between 0 and 1.
The probability could only equal exactly 0 if the bottom of the equation reached infinity, which won't happen. In practice, 0 and 1 are limits that the curve approaches but does not reach.
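The bounds can be seen from the logistic function itself. In this sketch `z` stands for the model's linear combination of predictors (a name chosen here for illustration).

```python
import math


def logistic(z):
    """Convert a linear predictor z into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


# As z gets very negative the denominator grows, so the probability
# approaches 0; as z gets very positive, it approaches 1.
for z in (-20, -2, 0, 2, 20):
    print(z, logistic(z))
```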
What statistics tell us if our logistic regression is significant
Our overall model significance test isn’t an ANOVA, but rather a χ² test (chi squared); this will tell us whether our R², or accuracy, is high enough
We also want to know how much each predictor contributed to this model accuracy (beta, or other statistics)
What should you do if a predictor is not significant in your logistic regression model
Remove it.
Just like linear regression, it’s best to use the simplest model you can get away with. That is, without removing any significant predictors.
This means we only keep predictors if they have explanatory benefit
What happens if only one of the categories of a predictor has a significant beta?
If one of the levels of that categorical predictor is significant, it justifies keeping the entire predictor. Not all levels have to be significant.
What can you use to make sure the predictor you removed was not significant
The hierarchical model teardown works well:
First fit a model with all predictors, then remove any that don’t contribute significantly
In jamovi we need to do this backwards, to get the p-value of the model change (to verify we didn’t remove too much)
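One common way to test a model change is to compare the deviances of the two models with a χ² test. The sketch below assumes the models differ by a single coefficient (1 degree of freedom) and uses invented deviance values; whether jamovi computes its model-comparison p-value exactly this way is an assumption.

```python
import math


def chi2_p_value_df1(chi2_stat):
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2_stat / 2.0))


# Hypothetical deviances (-2LL) for the two nested models.
deviance_reduced = 120.4  # model without the questionable predictor
deviance_full = 119.1     # model with it

change = deviance_reduced - deviance_full
p_change = chi2_p_value_df1(change)

print(f"chi-squared change = {change:.2f}, p = {p_change:.3f}")
if p_change >= 0.05:
    print("Removing the predictor did not significantly worsen the model.")
```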
Name the 5 assumptions of logistic regression model
Complete Predictors
No Complete Separation
No Overdispersion
Not too much Multicollinearity: continuous predictors only
No Influential Outliers: continuous predictors only
Explain complete predictor
We need data from all categories, for categorical predictors
We need the full range of responses for continuous predictors
Explain complete separation
Complete separation is when the outcome is perfectly predicted. Complete separation makes it impossible to select a single well-fitting model (it’s like not being able to calculate a line of best fit) because there is a horizontal gap between the observations
We need to see some horizontal overlap between the high and low probability observations.
How can you assess if there is complete separation
You can use a Descriptives analysis to determine whether this is a problem.
You need to examine the range (Minimum & Maximum) of scores for each predictor, separately for people in the DV = 0 and DV = 1 categories
Assuming you coded them as 0 and 1, of course
Go to Exploration > Descriptives
Add any predictors to the Variables box
Add your outcome variable to the Split by box
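The same min/max check can be sketched in Python for one continuous predictor. The data and the helper names `range_by_outcome` and `ranges_overlap` are hypothetical, and the check is only a rough screen for a single predictor.

```python
def range_by_outcome(predictor, outcome):
    """Return (minimum, maximum) of the predictor for each outcome category."""
    groups = {}
    for value, category in zip(predictor, outcome):
        groups.setdefault(category, []).append(value)
    return {category: (min(values), max(values)) for category, values in groups.items()}


def ranges_overlap(range_a, range_b):
    """True if the two (min, max) ranges share at least one value."""
    return range_a[0] <= range_b[1] and range_b[0] <= range_a[1]


# Hypothetical predictor, with two versions of a 0/1 outcome.
hours = [1, 2, 3, 4, 5, 6, 7, 8]
passed_overlap = [0, 0, 0, 1, 0, 1, 1, 1]    # ranges overlap: fine
passed_separated = [0, 0, 0, 0, 1, 1, 1, 1]  # no overlap: complete separation

ranges_ok = range_by_outcome(hours, passed_overlap)
ranges_bad = range_by_outcome(hours, passed_separated)

print(ranges_ok, ranges_overlap(ranges_ok[0], ranges_ok[1]))
print(ranges_bad, ranges_overlap(ranges_bad[0], ranges_bad[1]))
```

If the ranges for DV = 0 and DV = 1 do not overlap at all, the predictor separates the outcome completely.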
Explain overdispersion
The variance is larger than expected from the binomial distribution.
Essentially, there are too many observations in one condition. It’s only relevant when you have more than one predictor
This would be the same as a violation of the assumption of Independence of Errors in multiple regression (MR): it makes the model p-values questionable
It is still very unlikely to be a problem: with a large sample size, the p-values would rarely shift enough to change the results
Underdispersion is also possible, but even more unlikely
How can you check for multicollinearity in logistic regression
While we could just do a regular linear regression to check for multicollinearity problems, jamovi also has this option under the Assumption Checks
It does the exact same thing, and gives you VIF/Tolerance
Assuming you are using continuous predictors
How can you check for outliers in logistic regression
For influential outliers we need to do a linear regression, so that we can do Mahalanobis and Cook’s distance checks
We can’t do it 100% correctly (except for Mahalanobis – why?) but it will be approximately right
Which statistics could you report from a logistic regression
Deviance (aka -2LL): lower values mean better model fit
Model prediction %
Cox and Snell R²
Nagelkerke R²
McFadden’s R²
Odds Ratio - SPSS called this the Exp(B), just FYI
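For reference, the three pseudo-R² values can be computed from the deviance of the fitted model and the deviance of an intercept-only (null) model. This sketch uses their commonly cited definitions with invented numbers; it is not checked against jamovi's output.

```python
import math

# Hypothetical values: deviance (-2LL) of the null and fitted models, sample size.
deviance_null = 100.0
deviance_model = 80.0
n = 100

# McFadden: proportional reduction in deviance.
mcfadden = 1.0 - deviance_model / deviance_null

# Cox and Snell: based on the likelihood ratio, cannot reach 1.
cox_snell = 1.0 - math.exp((deviance_model - deviance_null) / n)

# Nagelkerke: Cox and Snell rescaled by its maximum possible value.
nagelkerke = cox_snell / (1.0 - math.exp(-deviance_null / n))

print(f"McFadden = {mcfadden:.3f}")
print(f"Cox and Snell = {cox_snell:.3f}")
print(f"Nagelkerke = {nagelkerke:.3f}")
```

A smaller model deviance relative to the null deviance gives larger values on all three.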
Explain the Odds Ratio
The Odds Ratio is the change in the odds of the higher-numbered outcome occurring given a 1 unit change in the predictor.
It is, mercifully, similar to a Beta in that it can show positive and negative relations – but…
If OR is > 1 then the relation is positive
If OR is < 1 then the relation is negative
The OR can’t go below 0, but can go as high as infinity
On the Odds Ratio scale, the range between 0 and 1 covers as much ground as the range between 1 and infinity
1 is our baseline point, as opposed to 0
What is the Odds Ratio formula
odds after 1 unit change/original odds
Original odds: P(event)/P(no event)
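A small worked example of the formula, using made-up probabilities before and after a 1 unit change in the predictor:

```python
def odds(p_event):
    """Odds = P(event) / P(no event)."""
    return p_event / (1.0 - p_event)


# Hypothetical probabilities of the event.
p_original = 0.20  # before the change
p_after = 0.50     # after a 1 unit change in the predictor

original_odds = odds(p_original)  # 0.20 / 0.80
new_odds = odds(p_after)          # 0.50 / 0.50
odds_ratio = new_odds / original_odds

print(f"original odds = {original_odds:.2f}")
print(f"odds after change = {new_odds:.2f}")
print(f"odds ratio = {odds_ratio:.2f}")
```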
How can you interpret an Odds Ratio above 1 (positive relation)
If OR = 3.42, then the odds of the outcome are 3.42 times as large, i.e. 242% higher,
after 1 unit of change in the predictor. This describes the odds, not the probability.
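How can you interpret an Odds Ratio below 1 (negative relation)
If OR = .292, then you can take the reciprocal to put it on the same footing: 1/.292 = 3.42. The odds of the outcome are 3.42 times smaller (a 70.8% decrease) after 1 unit of change in the predictor.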
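The arithmetic behind these interpretations, as a short sketch. The helper name `percent_change_in_odds` is made up for the example, and the result is a change in the odds rather than in the probability.

```python
def percent_change_in_odds(odds_ratio):
    """Percent change in the odds for a 1 unit change in the predictor."""
    return (odds_ratio - 1.0) * 100.0


or_positive = 3.42
or_negative = 0.292

# OR above 1: the odds increase.
print(f"OR = {or_positive}: odds change by {percent_change_in_odds(or_positive):+.1f}%")

# OR below 1: the odds decrease; the reciprocal shows the size of the
# effect on the same footing as an OR above 1.
print(f"OR = {or_negative}: odds change by {percent_change_in_odds(or_negative):+.1f}%")
print(f"reciprocal of {or_negative} = {1.0 / or_negative:.2f}")
```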
What is the standardized coefficient of a logistic regression and why
There is none.
Unlike mediation, that’s not something we’ll try to work around
It’s probably best for the situations where you’re likely to use logistic regression
Because we are accepting categorical variables, standardizing variables doesn’t necessarily make sense
Describe the mixed ANOVA
Includes at least two different kinds of independent variable (one between-subjects, one within-subjects) and only one dependent variable.
There’s nothing theoretically new about this ANOVA.
You will have at least 2 main effects, and at least 1 interaction
All IVs are categorical
At least one uses the same participants (repeated measures)
At least one uses different participants (independent measures)
Any DVs are continuous
What kind of effect are you expected to find in a mixed ANOVA
In a Mixed ANOVA design, you get both main effects and interactions among your between- and within-subjects IVs.
Main Effects: An F for each IV
E.g., First Experience of Task (between) & Type of Experience (within)
Interactions: An F for each possible IV combination
E.g., First Experience x Type of Experience (2-way interaction)
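To see how many F tests a design produces, the main effects and interactions can be listed from the IV names. This sketch uses the two IVs from the example above; adding a third name to the list would show the extra 2-way and 3-way interactions.

```python
from itertools import combinations

# IVs from the example above.
ivs = ["First Experience of Task (between)", "Type of Experience (within)"]

# One main effect (one F) per IV.
main_effects = list(ivs)

# One interaction (one F) per combination of two or more IVs.
interactions = [
    " x ".join(combo)
    for size in range(2, len(ivs) + 1)
    for combo in combinations(ivs, size)
]

print("Main effects:", main_effects)
print("Interactions:", interactions)
```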
What should you care the most about when interpreting the result of a mixed ANOVA
Interactions tell you there is a complex story to tell, where your outcome (degree of DV) depends on knowing where a person falls on all IVs involved. If present, they’re all you need to care about.
Once you find significant F statistics in a mixed ANOVA, what should you do next
To properly interpret a mixed ANOVA, you would follow up on your main effects and interactions just as you would for any other ANOVA – with simple effects analyses.
- Post-hoc tests are often the best method (recall: Tukey)
- Paired-samples t-tests could be required for repeated-measures IVs. To check an interaction, you may need to filter your file to perform these tests on only specific groups
- Independent-samples t-tests could be used for between-subjects IVs
- Marginal means tables are important for making figures
Why do we need to center our variables in the moderation analysis
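The interaction variable must be calculated from centered data.
If it is not, the interaction term will be too collinear with the predictors it was built from, violating our assumptions for this model.
Why does it fix the problem? When X and M have means far from zero, the product X x M is strongly correlated with X and with M; subtracting the means removes most of that overlap.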
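The effect of centering on collinearity can be shown with a small made-up data set. Here X and M are unrelated to each other but both have means of 10, and the helper `pearson_r` is written out just for this example.

```python
import math
from statistics import mean


def pearson_r(a, b):
    """Pearson correlation between two equal-length lists."""
    mean_a, mean_b = mean(a), mean(b)
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    ss_a = sum((ai - mean_a) ** 2 for ai in a)
    ss_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / math.sqrt(ss_a * ss_b)


# Hypothetical data: every combination of five X and five M values.
levels = [8.0, 9.0, 10.0, 11.0, 12.0]
x = [xv for xv in levels for _ in levels]
m = [mv for _ in levels for mv in levels]

# Interaction built from the raw scores.
raw_product = [xi * mi for xi, mi in zip(x, m)]
r_raw = pearson_r(x, raw_product)

# Interaction built from the centered scores.
mean_x, mean_m = mean(x), mean(m)
x_c = [xi - mean_x for xi in x]
m_c = [mi - mean_m for mi in m]
centered_product = [xi * mi for xi, mi in zip(x_c, m_c)]
r_centered = pearson_r(x_c, centered_product)

print(f"r(X, X*M) with raw scores:      {r_raw:.3f}")
print(f"r(X, X*M) with centered scores: {r_centered:.3f}")
```

With raw scores the predictor and the interaction term are strongly correlated; after centering, the correlation in this balanced example drops to zero.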
When should you worry about overdispersion
It’s only relevant when you have more than one predictor
How can you test for overdispersion
You can’t, but it is highly unlikely to be problematic. Especially with a large enough sample size
What type of variables for IV and DV do you need in mixed anova
All IVs are categorical
Any DVs are continuous