Topic 8: Binary Variables Flashcards
Will having a binary dependent variable affect the error term’s distribution?
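Yes. With a binary dependent variable (a linear probability model), the error term can take only two values for any given set of explanatory variables, so it is not normally distributed and is heteroskedastic. It also affects the interpretation of the coefficients.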
If your model has a binary explanatory variable which is equal to zero, what is the dependent variable equal to?
The intercept plus all the other explanatory variables times their coefficients. The binary variable that is 0 drops out, along with its coefficient.
How do you get the difference in the expected value of yi between a model with a binary variable equal to 0 and the same model with the binary equal to 1?
Drop every term whose binary variable is equal to 0 (the variable and its coefficient), then subtract that model from the one where the binary is equal to 1. What remains is the difference in the expected value of yi.
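A minimal sketch of this subtraction, assuming a hypothetical model wage = b0 + b1*female + b2*educ with made-up coefficient values (the variable names and numbers are illustrative, not from the cards). Holding educ at the same value in both cases, every term except the binary's coefficient cancels.

```python
# Hypothetical coefficients for: wage = B0 + B1*female + B2*educ
B0, B1, B2 = 1.5, -0.8, 0.6

def expected_wage(female, educ):
    """Expected wage for a given value of the binary (0 or 1) and educ."""
    return B0 + B1 * female + B2 * educ

educ = 12
# Same educ in both cases, so the intercept and the educ term cancel.
difference = expected_wage(1, educ) - expected_wage(0, educ)
print(difference)  # approximately B1, up to floating-point rounding
```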
How do you interpret the coefficient of an explanatory variable when the dependent variable is binary?
For a one-unit increase in the explanatory variable, the probability that the dependent variable equals 1 increases/decreases by the estimated coefficient amount, holding the other variables fixed.
How do you evaluate the change in the dependent variable with respect to a single explanatory variable when other forms of it are included in the model (like exper and expersq)?
Take the derivative WRT the explanatory variable. If your model includes a squared version of that variable, the derivative still contains the variable itself, so the effect depends on its level. Example: wage = 0.5 + 0.2educ + 0.3educ^2, the change in wage WRT educ is “0.2 + 0.6educ”
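As a small sketch of the derivative rule, the function below evaluates b1 + 2*b2*x for a model with a linear and a squared term. The function name is hypothetical; the coefficients are the ones from the card's wage example, and the educ values are arbitrary.

```python
def marginal_effect(b_linear, b_squared, x):
    """Derivative of b_linear*x + b_squared*x**2 with respect to x."""
    return b_linear + 2 * b_squared * x

# Coefficients from the card: wage = 0.5 + 0.2*educ + 0.3*educ^2
for educ in (0, 10, 16):
    print(educ, marginal_effect(0.2, 0.3, educ))
```

The intercept drops out of the derivative, and the effect of one more unit of educ grows with the level of educ because the squared coefficient is positive.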
What is a potential problem with using cardinal variables?
Cardinal variables force the difference between each unit to be the same amount, which may or may not be an accurate representation of the rate of change.
What is a potential problem with using ordinal variables?
Having several binary ordinal variables can become unwieldy in a regression model with other control variables.
A regression of y on a constant with no explanatory variables will have an R-Squared value of?
Zero. R-Squared measures how much of the variation in the dependent variable is explained by the independent variables. Since there are no independent variables, there is zero explanation.
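A quick check of this result, using made-up data and the usual definition R-squared = 1 - SSR/SST. With only a constant, OLS sets the intercept to the sample mean, so the fitted values all equal the mean and SSR equals SST.

```python
def r_squared(y, fitted):
    """R-squared computed as 1 - SSR/SST."""
    mean_y = sum(y) / len(y)
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ssr / sst

y = [3.0, 5.0, 4.0, 8.0]  # made-up data
y_bar = sum(y) / len(y)
# OLS on a constant alone estimates the intercept as the sample mean.
fitted = [y_bar] * len(y)
r2 = r_squared(y, fitted)
print(r2)
```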
What is an interaction term?
An independent variable in a regression that is the product of two explanatory variables.
Why would one want to use an interaction term between two binary variables?
To let the effect of one binary variable differ depending on the value of the other, so that each combination of the two has its own expected value. Be careful: this changes the interpretation of the coefficients on the two binary variables themselves.
How would you interpret the effect of a squared term coefficient on the dependent variable?
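A sketch of how the interaction changes interpretation, assuming a hypothetical model y = b0 + b1*female + b2*married + b3*female*married with made-up coefficients (none of these names or values come from the cards).

```python
# Hypothetical coefficients for:
# y = b0 + b1*female + b2*married + b3*female*married
b0, b1, b2, b3 = 5.0, -1.0, 2.0, -1.5

def expected_y(female, married):
    """Expected y for one combination of the two binary variables."""
    return b0 + b1 * female + b2 * married + b3 * female * married

# One expected value per combination of the two binaries.
for female in (0, 1):
    for married in (0, 1):
        print(female, married, expected_y(female, married))

# With the interaction included, b2 is the effect of married only
# when female = 0; when female = 1 the effect is b2 + b3.
married_effect_female0 = expected_y(0, 1) - expected_y(0, 0)
married_effect_female1 = expected_y(1, 1) - expected_y(1, 0)
```

Here the coefficient on married no longer describes everyone: it applies to the group with female = 0, and the interaction coefficient is the additional difference for the group with female = 1.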
If the squared term coefficient is positive, there are increasing marginal returns; if negative, there are diminishing marginal returns.
When testing the significance of a term which has a square in the regression, what is the null hypothesis?
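The null hypothesis is that the coefficients on the term and on its square are both equal to 0 (a joint hypothesis). The alternative is that at least one of them is not equal to 0.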
How would you test in Stata the statistical significance of a term in a regression that includes that term’s square?
In Stata, run the regression then run “test var varsq” for the term of interest.
After running an F test on a term and its square, how would one analyze the outcome?
A small p value indicates joint statistical significance: reject the null hypothesis that both coefficients are equal to 0. A large p value indicates no joint statistical significance: fail to reject the null hypothesis that both are equal to 0.
How would one determine which of a set of binary terms would have the largest effect on the dependent variable of a regression?
Include all the binary variables in the regression except for one, which will serve as the base group, along with the other control variables. Each coefficient is measured relative to the base group. If all of the included binary coefficients are negative, the base group is the highest; if any are positive, the group with the largest positive coefficient has the largest effect.
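The comparison can be sketched as below, treating the base group as having an implicit coefficient of 0. The region names and coefficient values are made up for illustration, and the function name is hypothetical.

```python
# Hypothetical coefficients on region dummies; "north" is the omitted base group.
coefficients = {"south": -0.4, "east": 0.3, "west": 0.7}

def highest_group(coefs, base_name):
    """Return the group with the highest expected y, other controls held fixed."""
    # Every coefficient is measured relative to the base group,
    # so the base group's implicit coefficient is 0.
    relative = dict(coefs)
    relative[base_name] = 0.0
    return max(relative, key=relative.get)

print(highest_group(coefficients, "north"))                      # west
print(highest_group({"south": -0.4, "east": -0.1}, "north"))     # north
```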