L24 - Dummy Variables and Limited Dependent Variables Flashcards
What is a Dummy Variable?
Dummy variables are variables which can only take on a limited range of values.
The most common type of dummy variable is a zero-one dummy.
Dummy variables can be used to:
- Allow for qualitative effects in data e.g. male/female.
- Allow for structural breaks in time-series data, e.g. a change in the mean value of a series.
- Allow for regular patterns in time-series data e.g. seasonal effects.
How can a dummy variable be used for an intercept shift?
- Add an intercept dummy variable to the regression.
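As a sketch of what an intercept dummy looks like in the notation of the simple regression model: the symbols D_i (the zero-one dummy) and δ (its coefficient) are assumed names for illustration, not taken from the original card.

```latex
\[
Y_i = \alpha + \delta D_i + \beta X_i + u_i ,
\qquad
D_i =
\begin{cases}
1 & \text{if observation } i \text{ belongs to the group or period of interest} \\
0 & \text{otherwise}
\end{cases}
\]
% Intercept is \alpha when D_i = 0 and \alpha + \delta when D_i = 1;
% the slope \beta is the same in both cases.
```

Under this form, δ measures the size of the shift in the intercept.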
How do you use a dummy variable for changes in slopes?
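One common way to allow the slope to change is to interact the dummy with the regressor. The sketch below assumes a zero-one dummy D_i and a coefficient γ; both names are illustrative rather than taken from the original card.

```latex
\[
Y_i = \alpha + \beta X_i + \gamma \,(D_i X_i) + u_i
\]
% Slope is \beta when D_i = 0 and \beta + \gamma when D_i = 1;
% the intercept \alpha is unchanged.
```

Here γ measures the change in the slope between the two groups or periods.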
How else can you write a model with a structural break?
- This form is useful because it lets us test for parameter change (parameter stability).
How do you perform the Chow Test for Parameter Instability?
- The Chow test is useful if we have some outside knowledge of where the structural break actually takes place. Without that knowledge we do not know where to split the sample, which is a problem.
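A minimal sketch of the standard Chow F statistic for a break at a known position, using only the Python standard library. The data, the function names `ols_rss` and `chow_f`, and the break position are made up for illustration. The statistic compares the residual sum of squares (RSS) of one pooled regression with the combined RSS of separate regressions on each sub-sample, and would be compared with an F(k, n - 2k) critical value.

```python
def ols_rss(x, y):
    """Fit y = a + b*x by OLS and return the residual sum of squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))


def chow_f(x, y, break_index, k=2):
    """Chow F statistic for a break at a known position.

    k is the number of parameters per regime (intercept and slope here).
    """
    n = len(x)
    rss_pooled = ols_rss(x, y)                       # restricted: one regime
    rss_split = (ols_rss(x[:break_index], y[:break_index])
                 + ols_rss(x[break_index:], y[break_index:]))  # unrestricted
    return ((rss_pooled - rss_split) / k) / (rss_split / (n - 2 * k))


# Illustrative data (assumed): 12 observations with small fixed "noise".
x = [float(t) for t in range(1, 13)]
noise = [0.1, -0.2, 0.15, -0.05, 0.2, -0.1, 0.05, -0.15, 0.1, 0.2, -0.1, -0.05]

# Series with a break after observation 6 (intercept and slope both change).
y_break = [(1 + 0.5 * xi if i < 6 else 4 + 1.5 * xi) + e
           for i, (xi, e) in enumerate(zip(x, noise))]
# Series with no break.
y_stable = [1 + 0.5 * xi + e for xi, e in zip(x, noise)]

f_break = chow_f(x, y_break, break_index=6)
f_stable = chow_f(x, y_stable, break_index=6)
print("F with break:", f_break)
print("F without break:", f_stable)
```

A large F statistic is evidence against parameter stability; in this made-up example the series with a break gives a much larger value than the stable one.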
How would you perform a regression with seasonal effects?
Why do we only include three quarterly dummy variables in a seasonal effect regression?
If we include four quarterly dummy variables plus a constant then there will be a perfectly collinear relationship between a subset of the RHS variables.
C (intercept included in regression) = Q1 + Q2 + Q3 + Q4
This means that the X’X matrix will not be of full rank, so it has no inverse and cannot be inverted to construct the OLS estimator.
We can either include four quarterly dummy variables and no constant, or a constant plus three quarterly dummy variables. However, most regression statistics assume a constant in the equation.
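The rank problem can be checked directly. The sketch below (standard library only; the helper names `gram`, `rank` and `dummies` and the three years of quarterly data are assumptions for illustration) builds X’X for each specification and computes its rank with exact arithmetic.

```python
from fractions import Fraction


def gram(X):
    """Return X'X for a matrix stored as a list of rows."""
    k = len(X[0])
    return [[sum(Fraction(row[i]) * row[j] for row in X) for j in range(k)]
            for i in range(k)]


def rank(M):
    """Matrix rank by Gaussian elimination with exact Fraction arithmetic."""
    M = [row[:] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r


def dummies(quarter, which):
    """Zero-one dummies for the listed quarters."""
    return [1 if quarter == j else 0 for j in which]


quarters = [1, 2, 3, 4] * 3  # three years of quarterly observations

X_trap = [[1] + dummies(q, [1, 2, 3, 4]) for q in quarters]   # constant + Q1..Q4
X_three = [[1] + dummies(q, [1, 2, 3]) for q in quarters]     # constant + Q1..Q3
X_no_const = [dummies(q, [1, 2, 3, 4]) for q in quarters]     # Q1..Q4, no constant

for name, X in [("constant + 4 dummies", X_trap),
                ("constant + 3 dummies", X_three),
                ("4 dummies, no constant", X_no_const)]:
    print(name, "columns:", len(X[0]), "rank of X'X:", rank(gram(X)))
```

Only the first specification has more columns than the rank of X’X, which is why it cannot be estimated by OLS.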
What is a model with limited dependent variables?
Suppose we wish to estimate the effect of changes in inflation on interest rate policy. We start with a model of the form:
Yi = α + βXi + ui
where Y is 0 if the interest rate is held constant and 1 if the interest rate is increased, and X is the annual inflation rate.
The Y variable is, therefore, a dummy variable, i.e. it can only take on the values 0 or 1.
This creates a number of problems for interpretation of the standard regression model.
Why do limited dependent variable models cause a problem?
- In a scatter plot of the data with the fitted regression equation, the line does not fit the data at all.
- The regression can actually predict negative probabilities for low inflation rates.
- The model leads to heteroscedasticity in the errors.
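The out-of-range predictions are easy to reproduce. The sketch below fits the linear model by OLS to a small made-up sample (the `inflation` and `raised` values are invented for illustration, not real policy data) and lists the fitted "probabilities" that fall outside the interval from 0 to 1.

```python
def ols_line(x, y):
    """Return (intercept, slope) of the OLS fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
             / sum((xi - mean_x) ** 2 for xi in x))
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up sample: annual inflation rate and whether the rate was raised (1) or held (0).
inflation = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
raised = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

alpha_hat, beta_hat = ols_line(inflation, raised)
fitted = [alpha_hat + beta_hat * xi for xi in inflation]

# Fitted values that cannot be interpreted as probabilities.
outside = [(xi, p) for xi, p in zip(inflation, fitted) if p < 0 or p > 1]
print("intercept:", alpha_hat, "slope:", beta_hat)
print("fitted values outside [0, 1]:", outside)
```

In this sample the lowest inflation rate gets a negative fitted value and the highest gets a value above 1.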
How do limited dependent variable models cause heteroscedasticity in the errors?
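A short derivation, sketched under the assumption that the linear model is read as a probability, so that P(Y_i = 1) = p_i = α + βX_i (the symbol p_i is introduced here for illustration):

```latex
\[
u_i = Y_i - \alpha - \beta X_i =
\begin{cases}
1 - p_i & \text{with probability } p_i \quad (Y_i = 1) \\
-\,p_i  & \text{with probability } 1 - p_i \quad (Y_i = 0)
\end{cases}
\]
\[
E(u_i) = p_i (1 - p_i) + (1 - p_i)(-p_i) = 0
\]
\[
\operatorname{Var}(u_i) = p_i (1 - p_i)^2 + (1 - p_i)\, p_i^2 = p_i (1 - p_i)
\]
```

Because p_i depends on X_i, the error variance changes from observation to observation, which is heteroscedasticity.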
How can you deal with the problems caused by limited dependent variable models?
- Use a function that, unlike our normal linear function, gives a sigmoid (S-shaped) curve, which becomes more pronounced as β increases.
- To estimate such a model we take a maximum likelihood approach.
- Maximum likelihood estimation is an alternative to least squares as a method of constructing estimates of unknown parameters.
- It starts with the assumption of a given distribution for the data/errors.
- We then choose the values of the model parameters that maximise the likelihood function i.e. the probability of observing the given sample of the data expressed as a function of the unknown parameters.
- Maximum likelihood estimation often requires the use of numerical methods to maximise the likelihood function.
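A minimal sketch of maximum likelihood estimation by a numerical method, using only the standard library. It assumes the logistic function as the sigmoid (the cards do not say which S-shaped function is used), uses Newton's method as the numerical optimiser, and runs on a small made-up sample; all function and variable names are illustrative.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(alpha, beta, x, y):
    """Log of the probability of the observed 0/1 sample given alpha and beta."""
    total = 0.0
    for xi, yi in zip(x, y):
        p = sigmoid(alpha + beta * xi)
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total


def fit_logit(x, y, iterations=25):
    """Maximise the log-likelihood over (alpha, beta) with Newton's method."""
    alpha, beta = 0.0, 0.0
    for _ in range(iterations):
        g_a = g_b = 0.0            # gradient of the log-likelihood
        h_aa = h_ab = h_bb = 0.0   # negative Hessian
        for xi, yi in zip(x, y):
            p = sigmoid(alpha + beta * xi)
            w = p * (1.0 - p)
            g_a += yi - p
            g_b += (yi - p) * xi
            h_aa += w
            h_ab += w * xi
            h_bb += w * xi * xi
        det = h_aa * h_bb - h_ab * h_ab
        # Newton step: add (negative Hessian)^(-1) times the gradient.
        alpha += (h_bb * g_a - h_ab * g_b) / det
        beta += (h_aa * g_b - h_ab * g_a) / det
    return alpha, beta


# Made-up sample: annual inflation rate and whether the rate was raised (1) or held (0).
inflation = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
raised = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

alpha_hat, beta_hat = fit_logit(inflation, raised)
probabilities = [sigmoid(alpha_hat + beta_hat * xi) for xi in inflation]
print("alpha:", alpha_hat, "beta:", beta_hat)
print("log-likelihood:", log_likelihood(alpha_hat, beta_hat, inflation, raised))
```

Unlike the linear model, every fitted probability here lies strictly between 0 and 1. The sample is deliberately not perfectly separable, since with perfect separation the likelihood has no finite maximum.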
What are the steps to solve the limited dependent variable model?
- choose the value of β that maximises the likelihood function
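For reference, a sketch of the likelihood function being maximised, assuming independent observations and a logistic form for the probability (both are assumptions made here for illustration; the symbols p_i and F are not from the original card):

```latex
\[
p_i = P(Y_i = 1 \mid X_i) = F(\alpha + \beta X_i),
\qquad
F(z) = \frac{1}{1 + e^{-z}}
\]
\[
L(\alpha, \beta) = \prod_{i=1}^{n} p_i^{\,Y_i} \, (1 - p_i)^{\,1 - Y_i}
\]
```

Each observation contributes p_i when Y_i = 1 and 1 - p_i when Y_i = 0.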
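How can we simplify the likelihood method to solve the limited dependent variable model?
- Take the logarithm of the likelihood function. This changes it from the product of all the probabilities to the sum of the logs of all the probabilities. Because the logarithm is an increasing function, the value of β that maximises the log-likelihood also maximises the likelihood.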
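Written out, with p_i denoting the model's probability that Y_i = 1 (a symbol assumed here for illustration), the simplification is:

```latex
\[
\ln L(\alpha, \beta)
= \ln \prod_{i=1}^{n} p_i^{\,Y_i} (1 - p_i)^{\,1 - Y_i}
= \sum_{i=1}^{n} \Big[ Y_i \ln p_i + (1 - Y_i) \ln (1 - p_i) \Big]
\]
```

A sum is easier to differentiate and is numerically more stable than a product of many numbers between 0 and 1.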