Lecture 9: Categorical Predictors in MLR Flashcards
What does it mean that a variable has “Interval-level properties”?
Values of the variable are directly comparable.
Differences are meaningful: 3 - 2 = 1 is the same amount as 2 - 1 = 1.
A larger # = more of the attribute (etc.)
What is a Nominal variable?
variables where numbers only indicate category or group membership (RACE: Hispanic = 1, Black = 2, etc.)
In MLR, why do we need a coding strategy for nominal predictor variables?
MLR treats all IVs in the equation as having interval-level properties (a regression coefficient tells the amount Y is expected to change when X changes by 1 unit), so arbitrary category numbers would be misread as amounts.
What are three different coding strategies?
Dummy coding (most frequent)
Effect coding
Orthogonal Contrast coding
With a nominal variable, when do you NOT need to use a coding strategy?
When the variable has only two categories: simply code them as 0 and 1. (Note: using 0 is useful because the intercept then equals the mean of the group coded 0.)
With a nominal variable (with 2 categories), what does the intercept from the regression table represent?
The mean of the outcome variable for the group coded 0.
With a nominal variable (with 2 categories), what does the slope from the regression table represent?
The difference in group means (i.e., the change in the outcome when the predictor changes by 1, from the group coded 0 to the group coded 1).
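The two cards above can be checked with a small sketch. The scores below are invented for illustration, and the variable names (`x`, `y`, `mean0`, `mean1`) are assumptions rather than anything from the lecture. The slope and intercept come from the usual simple-regression formulas.

```python
from statistics import mean

# Hypothetical data: x is the 0/1 group code, y is the outcome.
x = [0, 0, 0, 1, 1, 1]
y = [4.0, 5.0, 6.0, 7.0, 9.0, 11.0]

# Simple regression by the textbook formulas:
# slope = Sxy / Sxx, intercept = mean(y) - slope * mean(x).
mx, my = mean(x), mean(y)
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
sxx = sum((xi - mx) ** 2 for xi in x)
slope = sxy / sxx
intercept = my - slope * mx

# Group means computed directly, for comparison.
mean0 = mean([yi for xi, yi in zip(x, y) if xi == 0])
mean1 = mean([yi for xi, yi in zip(x, y) if xi == 1])

print("intercept:", intercept, "| mean of group coded 0:", mean0)
print("slope:", slope, "| difference in group means:", mean1 - mean0)
```

With 0/1 coding, the intercept matches the mean of the group coded 0 and the slope matches the difference between the two group means.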
Using a coding strategy for categorical variables in MLR will give the same results as what other type of statistical analysis?
Analysis of Variance (ANOVA) on group means
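As a rough check of this equivalence, the sketch below computes a one-way ANOVA F by hand and then the overall F from a regression on two dummy variables. The scores are invented, and `ols` is a small hypothetical helper written for this example (it solves the normal equations), not a library function.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X)b = X'y.
    Each row of X should start with 1.0 for the intercept."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]

# Invented scores for three groups.
groups = {"White": [10.0, 12.0, 14.0],
          "Hispanic": [15.0, 17.0, 19.0],
          "Black": [8.0, 9.0, 10.0]}
y = [v for vals in groups.values() for v in vals]
n, g = len(y), len(groups)
grand = sum(y) / n

# One-way ANOVA: F = MS_between / MS_within.
ss_between = sum(len(v) * (sum(v) / len(v) - grand) ** 2 for v in groups.values())
ss_within = sum((s - sum(v) / len(v)) ** 2 for v in groups.values() for s in v)
f_anova = (ss_between / (g - 1)) / (ss_within / (n - g))

# Regression on two dummy variables (White as the reference group).
X = [[1.0, float(name == "Hispanic"), float(name == "Black")]
     for name, vals in groups.items() for _ in vals]
b = ols(X, y)
fitted = [sum(bj * xj for bj, xj in zip(b, row)) for row in X]
ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
r2 = 1 - ss_res / sum((yi - grand) ** 2 for yi in y)
f_reg = (r2 / (g - 1)) / ((1 - r2) / (n - g))

print("ANOVA F:", f_anova, "| regression F:", f_reg)
```

The two F values agree up to floating-point error because the dummy-coded regression's fitted values are the group means.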
Because multiple linear regression can include both a categorical IV (using a coding strategy) and a continuous IV, it can also be used to conduct what type of statistical analysis?
Analysis of Covariance (ANCOVA)
How many dummy variables are needed to represent a categorical variable with (g) groups?
# of new variables = (g - 1)
In the language of ANOVA:
the # of new variables = df (of the factor)
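A minimal sketch of the (g - 1) rule, assuming a hypothetical helper named `dummy_code` and made-up group labels: each non-reference category gets its own 0/1 variable, and the reference group is the one coded 0 on all of them.

```python
def dummy_code(labels, reference):
    """Return (names, rows): one 0/1 column per non-reference category."""
    # Keep categories in order of first appearance, skipping the reference.
    levels = [lv for lv in dict.fromkeys(labels) if lv != reference]
    rows = [[1 if lab == lv else 0 for lv in levels] for lab in labels]
    return levels, rows

# Hypothetical labels for five cases.
labels = ["White", "Hispanic", "Black", "Hispanic", "White"]
names, rows = dummy_code(labels, reference="White")

print("dummy variables:", names)
for lab, row in zip(labels, rows):
    print(lab, row)
```

Three groups produce two dummy variables, and every case in the reference group has 0 on both.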
How do you interpret the regression coefficients (a, b₁, and b₂) of a MLR using a categorical predictor with 3 groups?
The intercept (Constant) is the expected value of DV when all predictors [D₁ and D₂] = 0, (i.e., for the reference group, Whites.)
b₁ is the amount DV is expected to change when D₁ changes from 0 to 1, (i.e., the difference between Hispanics and Whites).
b₂ is the amount DV is expected to change when D₂ changes from 0 to 1, (i.e., the difference between Blacks and Whites).
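The interpretation above can be illustrated with a short sketch. The scores are invented (group means 12, 17, and 9), and `ols` is a hypothetical helper defined here that solves the normal equations; a statistics package would normally do this step.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X)b = X'y.
    Each row of X should start with 1.0 for the intercept."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]

# Invented DV scores; White is the reference group (D1 = D2 = 0).
white = [10.0, 12.0, 14.0]      # mean 12
hispanic = [15.0, 17.0, 19.0]   # mean 17, coded D1 = 1
black = [8.0, 9.0, 10.0]        # mean 9,  coded D2 = 1

y = white + hispanic + black
X = ([[1.0, 0.0, 0.0]] * len(white)
     + [[1.0, 1.0, 0.0]] * len(hispanic)
     + [[1.0, 0.0, 1.0]] * len(black))

a, b1, b2 = ols(X, y)
print("a  (reference-group mean):", round(a, 6))
print("b1 (Hispanic minus White):", round(b1, 6))
print("b2 (Black minus White):   ", round(b2, 6))
```

The intercept reproduces the reference-group mean, and each b reproduces that group's mean minus the reference-group mean.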
What is the advantage of “mean-centering” a continuous variable?
a value of 0 becomes meaningful as the average of the scores
What does an ANCOVA analysis compare?
ANCOVA compares differences in group means after adjusting for differences on a continuous covariate
What regression approach is used for analyzing a Categorical Predictor and a Continuous Predictor (ANCOVA)?
Sequential Approach required to get F-test and ΔR² for the Race variable (as represented by two dummy variables).
• Step 1: enter Age (mean-centered)
• Step 2: enter D₁ & D₂ (representing Race)
• Examine the F-test for ΔR² at Step 2 (numerator df = 2, one per dummy variable).
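The two steps can be sketched as follows. All data are invented, `pd_scores` stands in for the PD outcome, and `ols` and `r_squared` are hypothetical helpers written for this example. The F formula used is the standard one for an R² change: F = (ΔR² / df_change) / ((1 - R²_full) / (n - k_full - 1)).

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X)b = X'y.
    Each row of X should start with 1.0 for the intercept."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]

def r_squared(X, y):
    """R-squared = 1 - SS_residual / SS_total for the fitted model."""
    b = ols(X, y)
    fitted = [sum(bj * xj for bj, xj in zip(b, row)) for row in X]
    ybar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return 1 - ss_res / sum((yi - ybar) ** 2 for yi in y)

# Invented data: three cases per group.
age = [25, 35, 45, 30, 40, 50, 20, 30, 40]
race = ["White"] * 3 + ["Hispanic"] * 3 + ["Black"] * 3
pd_scores = [10.0, 12.5, 13.0, 15.0, 16.0, 19.5, 9.0, 9.5, 12.0]

n = len(pd_scores)
mean_age = sum(age) / n
age_c = [a - mean_age for a in age]            # mean-centered Age
d1 = [float(r == "Hispanic") for r in race]    # D1
d2 = [float(r == "Black") for r in race]       # D2

# Step 1: Age only. Step 2: Age plus D1 and D2.
r2_step1 = r_squared([[1.0, a] for a in age_c], pd_scores)
r2_step2 = r_squared([[1.0, a, u, v] for a, u, v in zip(age_c, d1, d2)], pd_scores)

delta_r2 = r2_step2 - r2_step1
df_change = 2                # two dummy variables entered at Step 2
df_resid = n - 3 - 1         # n minus 3 predictors minus 1
f_change = (delta_r2 / df_change) / ((1 - r2_step2) / df_resid)

print("R2 step 1:", r2_step1, "| R2 step 2:", r2_step2)
print("Delta R2:", delta_r2, "| F change:", f_change, "on", df_change, "and", df_resid, "df")
```

Entering D₁ and D₂ together as one block is what makes the ΔR² test a single test of Race rather than two separate tests.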
What happens to the SD and Variance of a mean-centered variable?
they are unchanged
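A quick sketch with made-up ages shows both properties of mean-centering: the centered variable has a mean of 0, and its spread is the same as the original's. The variable names are assumptions for illustration.

```python
from statistics import mean, stdev, variance

# Hypothetical ages.
age = [20, 30, 40, 50, 60]

# Mean-centering: subtract the mean from every score.
age_mean = mean(age)
age_c = [a - age_mean for a in age]

print("centered scores:", age_c)
print("mean before:", age_mean, "| mean after:", mean(age_c))
print("SD before:", stdev(age), "| SD after:", stdev(age_c))
print("variance before:", variance(age), "| variance after:", variance(age_c))
```

Subtracting a constant shifts every score by the same amount, so distances from the mean, and therefore the SD and variance, do not change.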
How do you interpret the regression coefficients when analyzing a Categorical Predictor and a Continuous Predictor (ANCOVA)?
In Model 2 the coefficients may be interpreted as before, but are now adjusted for the covariate, AGEc.
The intercept is now the expected value of PD for Whites of average age.
b₁ (the coefficient for D₁) is the amount PD is expected to change when D₁ changes from 0 to 1, (i.e., the difference between Hispanics and Whites, both of average age).
b₂ (the coefficient for D₂) is the amount PD is expected to change when D₂ changes from 0 to 1, (i.e., the difference between Blacks and Whites, both of average age).
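These interpretations follow from substituting values into the Model 2 equation. The worked substitution below writes the coefficients for D₁ and D₂ as b₁ and b₂, as in the three-group card; the symbol b_AGE for the age slope is an assumption, since the cards do not name it. Average age corresponds to AGEc = 0.

```latex
\begin{aligned}
\widehat{PD} &= a + b_{AGE}\,AGEc + b_1 D_1 + b_2 D_2 \\[4pt]
\text{Whites, average age:}\quad \widehat{PD} &= a + b_{AGE}(0) + b_1(0) + b_2(0) = a \\
\text{Hispanics, average age:}\quad \widehat{PD} &= a + b_{AGE}(0) + b_1(1) + b_2(0) = a + b_1 \\
\text{Blacks, average age:}\quad \widehat{PD} &= a + b_{AGE}(0) + b_1(0) + b_2(1) = a + b_2 \\[4pt]
(a + b_1) - a &= b_1, \qquad (a + b_2) - a = b_2
\end{aligned}
```

Because this model has no interaction terms, the same differences b₁ and b₂ hold at any single age, not only at the average.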
Compare and contrast using MLR for ANOVA and ANCOVA analyses.
Dummy variables are entered as a block to obtain the F-test and ΔR² value for the original categorical variable.
• The resulting regression coefficients for the dummy variables represent differences in the group means relative to the reference group (ANOVA).
• When other continuous variables are included in the model, these coefficients represent differences in adjusted group means (ANCOVA).