Lecture 9: Effect modification Flashcards
What is a modifier?
A modifier alters the relationship between the independent and dependent variable
The level of influence of the independent variable varies with different values of the modifier
E.g. the effect of exercise on weight loss - will be different for men/women
What is the interaction term?
The cross product of X1 and Z (the modifier)
This new product is added to the MLR
Y = b0 + b1x1 + b2z + b3 (x1 X z) + e
Reject the null hypothesis (b3 = 0) if b3 is significantly different from 0
- significant interaction effect or significant effect modification
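A minimal sketch of the cross-product idea, assuming made-up noise-free data and a hand-written least-squares helper (`solve` and `ols` are illustrative names, not part of any package). It only recovers the coefficients; testing whether b3 differs significantly from 0 also needs its standard error, which statistical software reports.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(XtX, Xty)


# Hypothetical data: x1 is continuous, z is a 0/1 modifier.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
z = [0, 1, 0, 1, 0, 1, 0, 1]
# Y generated without noise from b0=1, b1=2, b2=3, b3=0.5.
y = [1 + 2 * a + 3 * b + 0.5 * a * b for a, b in zip(x1, z)]

# Design matrix columns: intercept, x1, z, and the cross product x1 * z.
rows = [[1.0, a, b, a * b] for a, b in zip(x1, z)]
coefs = ols(rows, y)
print([round(c, 6) for c in coefs])
```

Because the data contain no noise, the fitted coefficients match the values used to generate Y, including the interaction coefficient b3.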
How do you calculate an interaction effect in SPSS?
To work out an interaction effect in SPSS: compute a new variable (the cross product of X1 and Z), then include it in an MLR
Check if B3 is significantly different to 0
How does the MLR interpretation vary if an interaction effect exists?
B1 and B2 - main effects
If B3 is significant there is an interaction effect - the effect of X1 on Y differs with different values of Z
How are effects estimated in the presence of an interaction?
Can no longer consider the main effects on their own
- Only need to consider p values for main effects if the interaction coefficient is not significantly different from 0
- therefore estimate the effect of the variable of interest from its own coefficient together with the interaction term (ignore the other main effect)
How do we assess interaction if there are more than two levels of the categorical variable?
Code them into dummy variables
- Transform and recode into different variables
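A minimal sketch of dummy coding a three-level variable and building one interaction column per dummy. The variable names, the levels and the choice of "low" as reference category are assumptions for illustration only.

```python
def dummy_code(values, reference):
    """Return one 0/1 column per level, leaving out the reference level."""
    levels = sorted(set(values) - {reference})
    return {lvl: [1 if v == lvl else 0 for v in values] for lvl in levels}


# Hypothetical categorical modifier with three levels and a continuous x1.
group = ["low", "mid", "high", "mid", "low"]
x1 = [2.0, 3.0, 5.0, 1.0, 4.0]

dummies = dummy_code(group, reference="low")

# One cross-product column per dummy variable: x1 * dummy.
interactions = {
    lvl: [x * d for x, d in zip(x1, col)] for lvl, col in dummies.items()
}

print(dummies)
print(interactions)
```

With k levels there are k - 1 dummy variables and therefore k - 1 interaction terms to include in the MLR.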
Can interaction terms also be calculated for two continuous independent variables?
Yes
- as before compute a new variable by creating a cross-product term and include this in your MLR
Y = B0 + B1X1 + B2X2 + B3 (X1 * X2)
If an interaction existed
To work out the effect of X1 on Y - do the calculation:
Effect of X1 on Y = B1 + B3 x X2
where the value of X2 would be provided (e.g. work out the effect of X1 on Y if X2 = 5)
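A worked example of this calculation, assuming hypothetical coefficients (B0 = 1, B1 = 2, B2 = 3, B3 = 0.5). The function names are illustrative. The check at the end confirms that B1 + B3 x X2 equals the change in predicted Y for a one-unit increase in X1 while X2 is held fixed.

```python
# Hypothetical coefficients, chosen only for illustration.
B0, B1, B2, B3 = 1.0, 2.0, 3.0, 0.5


def predict(x1, x2):
    """Predicted Y from the model with a cross-product term."""
    return B0 + B1 * x1 + B2 * x2 + B3 * (x1 * x2)


def effect_of_x1(b1, b3, x2):
    """Effect of X1 on Y at a given value of X2."""
    return b1 + b3 * x2


effect_at_5 = effect_of_x1(B1, B3, 5)
# Same quantity obtained as a difference between two predictions.
difference = predict(4, 5) - predict(3, 5)

print(effect_at_5, difference)
```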
How can Continuous x Continuous interactions be presented?
Continuous x Continuous interactions can be presented
- Tabular format using Q1, Q2, Q3
- Graphical format
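A sketch of the tabular format, assuming Q1, Q2 and Q3 refer to the quartiles of the second variable X2 and using hypothetical coefficients. Quartile conventions differ between packages; here `statistics.quantiles` is used with `method="inclusive"`, so SPSS output may differ slightly.

```python
from statistics import quantiles

# Hypothetical coefficients and X2 values, for illustration only.
b1, b3 = 2.0, 0.5
x2_values = [1, 2, 3, 4, 5, 6, 7, 8, 9]

q1, q2, q3 = quantiles(x2_values, n=4, method="inclusive")

# Effect of X1 on Y evaluated at each quartile of X2.
table = [
    (label, q, b1 + b3 * q)
    for label, q in (("Q1", q1), ("Q2", q2), ("Q3", q3))
]

for label, q, effect in table:
    print(f"{label}: X2 = {q:g}, effect of X1 on Y = {effect:g}")
```

The same three rows could be drawn as three lines of Y against X1 for the graphical format.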
What is an outlier?
An outlier is an observation that lies an abnormal distance from other values in a random sample from the population
Outliers can be identified by sorting cases, running descriptives or graphing data (boxplot, scatterplot or a regression variable plot, i.e. a scatterplot with histograms on top)
Are all outliers harmful?
No, some are more influential than others
Describe Tukey’s method for identifying outliers?
Uses the IQR (middle 50% of the dataset)
- Four fences: lower outer, lower inner, upper inner & upper outer
- Extreme outliers lie more than 3 x IQR below Q1 or above Q3
- Mild outliers lie between 1.5 and 3 x IQR below Q1 or above Q3
Lower Outer = Q1 - 3 x IQR
Lower inner = Q1 - 1.5 x IQR
Upper inner = Q3 + 1.5 x IQR
Upper outer = Q3 + 3 x IQR
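A minimal sketch of the four fences on made-up data. The function names are illustrative, and the quartiles come from `statistics.quantiles` with `method="inclusive"`; other software may compute Q1 and Q3 slightly differently, which shifts the fences.

```python
from statistics import quantiles


def tukey_fences(data):
    """Inner and outer fences based on Q1, Q3 and the IQR."""
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return {
        "lower_outer": q1 - 3 * iqr,
        "lower_inner": q1 - 1.5 * iqr,
        "upper_inner": q3 + 1.5 * iqr,
        "upper_outer": q3 + 3 * iqr,
    }


def classify(value, fences):
    """Label a value as extreme, mild or not an outlier."""
    if value < fences["lower_outer"] or value > fences["upper_outer"]:
        return "extreme"
    if value < fences["lower_inner"] or value > fences["upper_inner"]:
        return "mild"
    return "not an outlier"


# Hypothetical sample with two suspicious values at the top end.
data = [10, 12, 13, 14, 15, 16, 17, 18, 19, 28, 40]
fences = tukey_fences(data)
mild = [v for v in data if classify(v, fences) == "mild"]
extreme = [v for v in data if classify(v, fences) == "extreme"]

print(fences)
print("mild:", mild, "extreme:", extreme)
```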
How can outliers be identified using standard deviation?
- 68.27% of data lies within -1/+1 SD from mean
- 95.45% of data lies within -2/+2 SD from mean
- 99.73% of data lies within -3/+3 SD from mean
- -> Data would need to be normally distributed for this to happen
- -> Values more than 2-3 standard deviations above or below the mean would be considered outliers
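A short sketch of the SD rule. The first part computes the share of a normal distribution within 1, 2 and 3 SD of the mean; the second flags values in a made-up sample using a cut-off of 2 SD, which is an assumption taken from the lower end of the 2-3 range. In a small sample no value can be very many SDs from the mean, so a cut-off of 3 may never be reached.

```python
from statistics import NormalDist, mean, stdev


def coverage(k):
    """Proportion of a normal distribution within k SD of the mean."""
    std_normal = NormalDist()
    return std_normal.cdf(k) - std_normal.cdf(-k)


percent_within = {k: round(coverage(k) * 100, 2) for k in (1, 2, 3)}
print(percent_within)

# Hypothetical sample; 40 sits far above the rest.
data = [10, 12, 13, 14, 15, 16, 17, 18, 19, 28, 40]
m, sd = mean(data), stdev(data)

# z-score: how many SDs each value lies from the sample mean.
z_scores = [(v - m) / sd for v in data]
cutoff = 2  # assumed cut-off
flagged = [v for v, z in zip(data, z_scores) if abs(z) > cutoff]
print("flagged:", flagged)
```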
What value of standardised residual would need to be considered an outlier?
If standardised residual is larger than 3 (+/-)
What are DFBeta and DFFIT?
DFBeta measures the change in the estimated coefficient Bj due to deleting that observation
Standardised DFBeta = DFBeta / SE (est Bj)
DFFIT is the change in the predicted value for case i due to deleting that observation
Standardised DFFIT = DFFIT / SE (predicted value i)
Observations are considered outliers if the absolute value of their standardised DFFIT or DFBeta > 1
How is DFBeta calculated?
Run two MLRs, with and without the outlier
Work out the difference between the two Betas and divide by the SE (of the coefficient with the outlier excluded)
Can also be calculated using SPSS - click SAVE
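A sketch of the by-hand calculation on made-up data, simplified to a single predictor so the regression can be written in a few lines (the source describes an MLR; `fit_line` is an illustrative helper). The sign convention, coefficient with all cases minus coefficient with the case excluded, is an assumption.

```python
from math import sqrt


def fit_line(x, y):
    """Simple linear regression: return the slope and its standard error."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se_slope = sqrt(sse / (n - 2) / sxx)
    return slope, se_slope


# Hypothetical data; the last case is the suspected outlier.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0, 13.9, 30.0]

# Regression 1: all cases. Regression 2: suspected outlier excluded.
b_full, _ = fit_line(x, y)
b_without, se_without = fit_line(x[:-1], y[:-1])

dfbeta = b_full - b_without
standardised_dfbeta = dfbeta / se_without

print(round(b_full, 3), round(b_without, 3), round(standardised_dfbeta, 1))
```

Here the absolute standardised DFBeta is far above 1, so the last case would be treated as an outlier under the rule above.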