lecture 5 - regression with categorical independent variables Flashcards
what is a dummy variable?
a binary (1 or 0) variable that stands in for one category of a categorical variable in a regression model: 1 if the case belongs to that category, 0 if it does not
what is the reference or base category?
the category left over once the binary variables have been created - the one group that has no dummy of its own, and against which the other groups are compared
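A minimal sketch of dummy coding with a reference category, assuming a made-up "region" variable. The function name `dummy_code` and the data are hypothetical and only there to illustrate the idea.

```python
def dummy_code(values, reference):
    """Return one 0/1 column per category, skipping the reference category."""
    categories = sorted(set(values) - {reference})
    return {cat: [1 if v == cat else 0 for v in values] for cat in categories}

# Hypothetical data: the region of five respondents
region = ["north", "south", "west", "south", "north"]

# "north" is chosen as the reference category, so it gets no dummy column
dummies = dummy_code(region, reference="north")

for name, column in dummies.items():
    print(name, column)
```

Three categories give two dummies. A respondent from the reference category is the case where every dummy equals 0.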
what does the regression equation look like with multiple Xs and bs?
Y = b0 + b1X1 + b2X2 + u
phrases used when a model has multiple Xs and bs, meaning each b shows the effect of its own X with the other Xs taken into account
“controlling for”
“being held at their mean”
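One way to read these phrases, sketched with made-up numbers: hold X2 at its mean, move X1 up by one unit, and the predicted Y changes by exactly b1. The coefficient values, the intercept b0 and the X2 data below are assumptions for illustration, not estimates from the lecture.

```python
from statistics import mean

def predict(x1, x2, b0=1.0, b1=2.0, b2=-0.5):
    """Predicted Y from two Xs; the coefficients are made up for illustration."""
    return b0 + b1 * x1 + b2 * x2

# Hypothetical observed values of the second predictor
x2_values = [4.0, 6.0, 8.0]
x2_bar = mean(x2_values)  # X2 "held at its mean"

# Raise X1 by one unit while X2 stays fixed at its mean
effect_of_x1 = predict(3.0, x2_bar) - predict(2.0, x2_bar)
print(effect_of_x1)  # equals b1
```

The intercept and the X2 term cancel in the difference, which is why b1 can be read as the effect of X1 "controlling for" X2.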
why are categorical variables problematic to use in regression models? and what is the solution?
their categories have no meaningful order, so their codes cannot be treated as numbers in the model - therefore dummy variables are created
what is an underlying assumption when using dummy variables?
the relationship between the other Xs and the dependent variable stays the same across all groups - the groups' regression lines are parallel (same slope)
dummy variables allow what to vary between groups?
the intercept
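A short sketch of parallel slopes with different intercepts, assuming one continuous X and one 0/1 dummy D. The coefficient values are made up for illustration.

```python
def predict(x, d, b0=2.0, b1=0.5, b2=3.0):
    """Predicted Y for a continuous X and a 0/1 dummy D: b0 + b1*X + b2*D."""
    return b0 + b1 * x + b2 * d

# Intercepts: predicted Y at X = 0 for each group
intercept_ref = predict(0, 0)    # reference group: b0
intercept_dummy = predict(0, 1)  # dummy group: b0 + b2

# Slopes: change in predicted Y for a one-unit increase in X
slope_ref = predict(1, 0) - predict(0, 0)
slope_dummy = predict(1, 1) - predict(0, 1)

print(intercept_ref, intercept_dummy)
print(slope_ref, slope_dummy)
```

The dummy's coefficient b2 moves the whole line up or down for that group, while b1 is shared by both groups, so the two lines never cross.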