Final Exam Flashcards
The classical assumptions must be
met in order for OLS estimators to be the best available
Classical Assumption #1
The regression model is linear, is correctly specified, and has an additive error term
Classical Assumption #2
The error term has a zero population mean
Classical Assumption #3
All explanatory variables are uncorrelated with the error term
Classical Assumption #4
Observations of the error term are uncorrelated with each other (no serial correlation)
Classical Assumption #5
The error term has a constant variance (no heteroskedasticity)
Classical Assumption #6
No explanatory variable is a perfect linear function of any other explanatory variable(s) (no perfect multicollinearity)
Classical Assumption #7
The error term is normally distributed
Omitted Variable Bias (Conditions)
The omitted variable is relevant (β2 ≠ 0) AND X1 and X2 are correlated
Expected bias
Expected bias in β̂1 has two components: the sign of β2 and the sign of Corr(X1, X2)
Limited Dependent Variables
We have discussed dummy variables (indicator variables, binary variables) as a tool for measuring qualitative/categorical independent variables (gender, race, etc.)
linear probability model
simply running OLS for a regression where the dependent variable is a dummy (i.e. binary) variable:
Di = β0 + β1X1i + β2X2i + ... + εi
where Di is a dummy variable, and the Xs, βs, and ε are typical independent variables, regression coefficients, and an error term, respectively
the term, linear probability model
comes from the fact that the right side of the equation is linear while the expected value of the left side measures the probability that Di = 1
Some issues with LPM
D̂i ≤ 0 or D̂i ≥ 1 is a more fundamental problem with the linear probability model: nothing in the model requires D̂i to be between 0 and 1!
- If D̂i is not between 0 and 1, how do we interpret it as a probability?
- A related limitation is that the marginal effect of a 1-unit increase in any X is forced to be constant, which cannot possibly be true for all values of X
- E.g., if increasing X by 1 always increases D̂i by a particular amount, D̂i must exceed 1 when X is sufficiently large
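A minimal numpy sketch (simulated, hypothetical data) showing that LPM fitted values escape the [0, 1] range:

```python
import numpy as np

# Hypothetical data: D is binary, X is continuous
rng = np.random.default_rng(0)
n = 200
x = rng.normal(0, 1, n)
d = (x + rng.normal(0, 1, n) > 0).astype(float)

# Fit the LPM by OLS: D_i = b0 + b1*X_i + e_i
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, d, rcond=None)
fitted = X @ beta

# Nothing constrains the fitted "probabilities" to [0, 1]
print("min fitted:", fitted.min())
print("max fitted:", fitted.max())
print("out of bounds:", ((fitted < 0) | (fitted > 1)).sum())
```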
The Binomial Logit Model
The binomial logit is an estimation technique for equations with dummy dependent variables that avoids the unboundedness problem of the linear probability model
Logits cannot be estimated using OLS
but are instead estimated by maximum likelihood (ML), an iterative estimation technique that is especially useful for equations that are nonlinear in the coefficients. Unlike the LPM, the logit model's predicted probability is bounded between 0 and 1
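Because the logit is nonlinear in the coefficients, it is fit iteratively. A numpy sketch (hypothetical simulated data) of ML estimation via Newton's iterations, checking that fitted probabilities stay strictly inside (0, 1):

```python
import numpy as np

# Hypothetical simulated data with known coefficients
rng = np.random.default_rng(1)
n = 500
x = rng.normal(0, 1, n)
X = np.column_stack([np.ones(n), x])
true_beta = np.array([-0.5, 1.0])
p = 1 / (1 + np.exp(-X @ true_beta))
d = (rng.uniform(size=n) < p).astype(float)

# Maximize the logit log-likelihood with Newton's method
beta = np.zeros(2)
for _ in range(25):
    p_hat = 1 / (1 + np.exp(-X @ beta))  # fitted probabilities
    grad = X.T @ (d - p_hat)             # score (gradient of log-likelihood)
    W = p_hat * (1 - p_hat)
    hess = X.T @ (X * W[:, None])        # information matrix
    beta = beta + np.linalg.solve(hess, grad)

p_hat = 1 / (1 + np.exp(-X @ beta))
print("ML estimates:", beta.round(2))
print("bounded in (0, 1):", p_hat.min() > 0 and p_hat.max() < 1)
```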
Basic Procedure for Random Assignment Experiments
- Recruit sample of subjects
- Randomly assign some to treatment group and some to control group (random assignment makes treatment uncorrelated with individual characteristics)
- Measure average difference in outcomes between treatment and control groups
natural experiments (or quasi-experiments)
attempt to utilize the βtreatment-controlβ framework in the absence of actual random assignment to treatment and control groups
Difference-in-difference estimator:
Policy impact = (Tpost − Tpre) − (Cpost − Cpre)
- T: treatment group outcome, C: control group outcome
- The DD estimate is the amount by which the change for the treatment group exceeded the change for the control group
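With hypothetical group means, the DD arithmetic is:

```python
# Hypothetical outcome means for treatment (T) and control (C) groups
T_pre, T_post = 10.0, 16.0
C_pre, C_post = 9.0, 11.0

# DD estimate: treatment change minus control change
dd = (T_post - T_pre) - (C_post - C_pre)
print("DD estimate:", dd)  # 6 - 2 = 4
```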
Panel data:
repeated observations of multiple units over time (combination of cross-sectional and time-series)
Main advantages of panel data
- Increased sample size
- Ability to answer types of questions that cross-sectional and time-series data cannot accommodate
- Enables use of additional methods to eliminate omitted variables bias
Panel Data Notation
- i subscript indexes the cross-sectional unit (individual, county, state, etc.)
- t subscript indexes the time period in which the unit is observed
First-differenced estimator
ΔYi = α0 + β1ΔXi + Δεi
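A numpy sketch (hypothetical two-period panel) showing that first-differencing removes the unit effect and recovers the true slope even when the effect is correlated with X:

```python
import numpy as np

# Hypothetical panel: unit effect a is correlated with the regressor,
# so pooled OLS would be biased
rng = np.random.default_rng(2)
n = 300
a = rng.normal(0, 5, n)            # unobserved unit effect
x1 = a + rng.normal(0, 1, n)       # period 1 regressor
x2 = a + rng.normal(0, 1, n) + 1   # period 2 regressor
beta1 = 2.0
y1 = 3 + beta1 * x1 + a + rng.normal(0, 1, n)
y2 = 3 + beta1 * x2 + a + rng.normal(0, 1, n)

# First differences: Delta y = alpha0 + beta1 * Delta x + Delta eps
# (a cancels out of the difference)
dy, dx = y2 - y1, x2 - x1
Z = np.column_stack([np.ones(n), dx])
est, *_ = np.linalg.lstsq(Z, dy, rcond=None)
print("first-differenced beta1 estimate:", round(est[1], 2))  # near 2.0
```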
Advantages of random effects estimator (if the assumption about ai is correct):
- Allows time-invariant regressors to be included
- More degrees of freedom (only estimates the parameters of the distribution from which ai is assumed to be drawn; the fixed effects estimator uses one degree of freedom per fixed effect)
Disadvantages of random effects estimator:
- Biased if the assumption that ai is uncorrelated with the regressors is incorrect (while the FE estimator allows arbitrary correlation between ai and the regressors)
- The fixed effects estimator is widely preferred when the regressors of interest are time-varying
- It rarely seems likely that ai is uncorrelated with the regressors; the fixed effects model is generally far more convincing
Hausman test
Fixed and random effects estimators can be compared with a Hausman test (previously seen in the instrumental variables context as a test for endogeneity)
Fixed vs. random effects - Concept
- Under the random effects hypothesis, both RE and FE estimators are consistent (should give similar results); under the alternative hypothesis, FE is consistent but RE is not
- Therefore, if the estimates are significantly different, we can reject the null hypothesis of random effects
Fixed vs. random effects - General Advice
Use the fixed effects estimator if it's feasible
T-test
Divide the coefficient by the standard error to get the t-value
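With hypothetical numbers:

```python
# Hypothetical coefficient estimate and standard error
coef, se = 0.84, 0.21
t_value = coef / se
print("t =", round(t_value, 2))  # 4.0, comfortably above the ~2 rule of thumb
```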
Omitted Variables β Bias Assessment
Sign(β2) * Sign(Corr(X1, X2)) = Sign of bias in β̂1
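A quick simulation (hypothetical data) of the sign rule: with β2 > 0 and Corr(X1, X2) > 0, the short regression that omits X2 overstates β1 (positive times positive gives positive bias):

```python
import numpy as np

# Hypothetical data: true model y = 1 + 2*x1 + 3*x2 + eps
rng = np.random.default_rng(3)
n = 5000
x1 = rng.normal(0, 1, n)
x2 = 0.6 * x1 + rng.normal(0, 1, n)                # Corr(x1, x2) > 0
y = 1 + 2.0 * x1 + 3.0 * x2 + rng.normal(0, 1, n)  # beta2 = 3 > 0

# Short (misspecified) regression omits x2
X = np.column_stack([np.ones(n), x1])
b_short, *_ = np.linalg.lstsq(X, y, rcond=None)
print("beta1 estimate omitting x2:", round(b_short[1], 2))  # biased above 2.0
```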
Irrelevant Variables - Inclusion Criteria
Theory: is there sound justification for including the variable?
Bias: do the coefficients for other variables change noticeably when the variable is included?
T-Test: is the variable's estimated coefficient statistically significant?
R-square: has the R-square (adjusted R-square) improved?
Serial Correlation
First-order serial correlation occurs when the value of the error term in one period is a function of its value in the previous period; the current error term is correlated with the previous error term.
DW Test
compare DW(d) to the critical values (dL, dU)
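The DW statistic itself is d = Σ(et − et−1)² / Σet², computed from the OLS residuals. A numpy sketch with simulated AR(1) errors (hypothetical, ρ = 0.8), where d should fall well below 2:

```python
import numpy as np

# Hypothetical data with AR(1) errors: eps_t = 0.8 * eps_{t-1} + u_t
rng = np.random.default_rng(4)
n = 200
x = rng.normal(0, 1, n)
eps = np.zeros(n)
for t in range(1, n):
    eps[t] = 0.8 * eps[t - 1] + rng.normal(0, 1)
y = 1 + 2 * x + eps

# OLS residuals, then the Durbin-Watson statistic
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta
d = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)
print("DW statistic:", round(d, 2))  # close to 2*(1 - rho) = 0.4
```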
Pure Heteroskedasticity
occurs in correctly specified equations
Impure Heteroskedasticity
arises due to model misspecification
Multicollinearity
Multicollinearity exists in every equation & the severity can change from sample to sample.
There are no generally accepted true statistical tests for multicollinearity.
VIF > 5 as a rule of thumb
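VIFj = 1 / (1 − Rj²), where Rj² comes from the auxiliary regression of Xj on the other regressors. A numpy sketch with deliberately collinear (hypothetical) data:

```python
import numpy as np

# Hypothetical nearly collinear regressors
rng = np.random.default_rng(5)
n = 400
x2 = rng.normal(0, 1, n)
x1 = 0.95 * x2 + rng.normal(0, 0.3, n)

# Auxiliary regression x1 ~ x2, then VIF = 1 / (1 - R^2)
Z = np.column_stack([np.ones(n), x2])
g, *_ = np.linalg.lstsq(Z, x1, rcond=None)
resid = x1 - Z @ g
r2 = 1 - resid.var() / x1.var()
vif = 1 / (1 - r2)
print("VIF for x1:", round(vif, 1))  # well above the rule-of-thumb 5
```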
Binary Dependent Variable Models
Linear Probability Model (LPM)
&
Logit / Probit Model
Linear Probability Model (LPM)
- Similar to OLS regression
- R-squared is no longer an accurate goodness-of-fit measure
- Interpretation: probability that Y=1 on a percentage point scale
Logit / Probit Model
- Restricted between 0 and 1
- Automatically corrects for heteroskedasticity
- Marginal effect of X is not constant
- Not linear in the coefficients
LPM Interpretation (example)
On average, a 1-unit increase in DISTANCE is associated with a 7.2 percentage point decrease in the probability of choosing Cedars Sinai, holding all else constant
LPM Limitation #1
Unboundedness
The linear probability model produces nonsensical forecasts (>1 and <0)
LPM Limitation #2
Adj-R^2 is no longer accurate measure of overall fit
LPM Limitation #3
Marginal Effect (slope) of a 1-unit increase in X is forced to be constant
LPM Limitation #4
Error term is neither homoskedastic nor normally distributed
Logit Model Interpretation -
Coefficient Interpretation:
The sign and significance can be interpreted just as in linear models. B1 is the effect of a 1-unit increase in X1 on the log-odds ratio
Calculating Marginal Effects
Stata: margins, dydx(X) atmeans
LPM Interpretations
B*100 percentage point change in the probability that Y=1
Logit Interpretations
B change in the log-odds ratio of Y=1
Marginal Effects Interpretations
(dy/dx)*100 percentage point change in the probability that Y=1
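For the logit, the marginal effect of X at the mean is β1·p·(1 − p). A small sketch with hypothetical coefficient values:

```python
import numpy as np

# Hypothetical logit estimates and mean of X
beta0, beta1 = -1.0, 0.5
x_mean = 2.0

# Predicted probability at the mean, then the marginal effect
p = 1 / (1 + np.exp(-(beta0 + beta1 * x_mean)))
marginal = beta1 * p * (1 - p)
print("p at mean:", round(p, 3))                       # 0.5
print("marginal effect:", round(marginal, 3))          # 0.125
print("percentage points:", round(marginal * 100, 1))  # 12.5
```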
Experimental Methods -
Selection Problems
Treatment is not necessarily randomly assigned because of other systematic differences in the error term (endogeneity), which would bias the estimated treatment effect. We need a valid counterfactual to truly understand the effect of the intervention/treatment.
Valid Counterfactual
a control group that is exactly the same as the treatment group except it does not receive the treatment
Solution to Selection Problems
Randomization
-Researcher randomly assigns subjects to either a treatment or control group to estimate the treatment effect
Natural / Quasi-Experiments
Randomized experiments are hard to do in the social sciences, so researchers often rely upon natural experiments where an exogenous event mimics the treatment and control group framework in the absence of actual random assignment
Counterfactual Challenge
Hard to find an untreated group that really is otherwise identical to the treated group
Panel Data - definitions expanded
Formed when cross-sectional and time-series data sets are combined to create a single data set. Main reason for working with panel data (beyond increasing sample size) is to provide insight into analytical questions that can't be answered by using time-series or cross-sectional data alone
Panel Data Advantages
Increased sample size so more degrees of freedom & sample variability
Able to answer new research questions
Can eliminate omitted variable bias with fixed effects (controlling for unobserved heterogeneity)
Panel Data Concerns
Heteroskedasticity & Serial Correlation
Panel Data - Fixed Effects Model
Does a good job of estimating panel data equations, and it also helps avoid omitted variable bias due to unobserved heterogeneity.
Fixed Effects Model Assumptions
Each cross-sectional unit has its own intercept. A fixed effects analysis will allow arbitrary correlation between all time-varying explanatory variables and ai
Fixed Effects Model Drawback
measurement error, autocorrelation, heteroskedasticity
Fixed Effects Model
The omitted variable bias arising from unobserved heterogeneity can be mitigated with panel data and the fixed effects model.
How Fixed Effect Model address Omitted Variable Bias
How? Estimates panel data by including enough dummy variables to allow each cross-sectional unit i (and, optionally, each time period t) to have a different intercept. These dummy variables absorb the time-invariant, individual-specific omitted factors that would otherwise sit in the error term
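A numpy sketch of the dummy-variable (LSDV) idea on a hypothetical simulated panel: unit dummies absorb ai, so the slope is recovered even though ai is correlated with X:

```python
import numpy as np

# Hypothetical panel: 50 units observed for 4 periods each
rng = np.random.default_rng(6)
n_units, n_periods = 50, 4
a = np.repeat(rng.normal(0, 5, n_units), n_periods)  # unit effects
x = a + rng.normal(0, 1, n_units * n_periods)        # correlated with a
y = 2.0 * x + a + rng.normal(0, 1, n_units * n_periods)

# One intercept dummy per unit absorbs the time-invariant effect a_i
dummies = np.kron(np.eye(n_units), np.ones((n_periods, 1)))
X = np.column_stack([x, dummies])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print("FE (LSDV) estimate of beta:", round(b[0], 2))  # near the true 2.0
```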
Panel Data - Random Effects Model
When to use
When the explanatory variable of interest is time-invariant
Panel Data - Random Effects Model
Assumption
ai and the regressors (Xit) are uncorrelated
Panel Data - Random Effects Model
Advantages
- Can handle time-invariant variables
- Uses fewer degrees of freedom than FE because of the lack of subject dummies
Hausman Test
Compares fixed and random effect estimators to see if their difference is statistically significant.
If different → fixed effects model preferred (reject the null hypothesis of random effects)
If not different → random effects model to conserve degrees of freedom
(or provide estimates of both the fixed effects and random effects models)
If the two models produce very different estimates, it suggests the RE model suffers from omitted variable bias and endogeneity, making FE the more credible choice
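A sketch of the Hausman statistic with made-up (hypothetical) estimates and covariance matrices; the statistic is compared to a chi-square critical value with degrees of freedom equal to the number of coefficients compared:

```python
import numpy as np

# Hypothetical FE and RE estimates and their covariance matrices
b_fe = np.array([1.8, -0.5])
b_re = np.array([1.2, -0.3])
V_fe = np.diag([0.04, 0.02])
V_re = np.diag([0.02, 0.01])

# H = (b_FE - b_RE)' [Var(b_FE) - Var(b_RE)]^{-1} (b_FE - b_RE)
diff = b_fe - b_re
H = diff @ np.linalg.inv(V_fe - V_re) @ diff
print("Hausman statistic:", round(H, 1))  # 22.0 > chi2(2) 5% critical value 5.99
```

Here H exceeds the critical value, so the null of random effects would be rejected in favor of fixed effects.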
What is the purpose of determining Cook's D? What is it used to detect?
Cook's D is used to detect influential outliers.
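A numpy sketch (hypothetical data with one planted outlier) computing Cook's D from residuals and leverages, Di = (ei² / (p·s²)) · (hi / (1 − hi)²):

```python
import numpy as np

# Hypothetical data with an influential outlier planted at position 0
rng = np.random.default_rng(7)
n = 50
x = rng.normal(0, 1, n)
y = 1 + 2 * x + rng.normal(0, 1, n)
x[0], y[0] = 5.0, -20.0

# OLS fit, leverages from the hat matrix, then Cook's D
X = np.column_stack([np.ones(n), x])
H = X @ np.linalg.inv(X.T @ X) @ X.T
h = np.diag(H)                        # leverages
beta = np.linalg.inv(X.T @ X) @ X.T @ y
e = y - X @ beta                      # residuals
p = X.shape[1]
s2 = e @ e / (n - p)
cooks_d = (e ** 2 / (p * s2)) * (h / (1 - h) ** 2)
print("most influential observation:", int(np.argmax(cooks_d)))  # 0
```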
Linear Probability Model Example
ĤSi = 0.42 + 0.028educi + 0.002educi² + 0.06wagei
Problems with Linear Probability Model
1) ĤSi could be ≤ 0 or ≥ 1. The linear probability model is difficult to interpret as a probability because ĤSi is not bounded by 0 and 1. The linear probability model produces nonsensical forecasts (greater than 1 and less than 0).
2) The marginal effect of a 1-unit increase in any X is forced to be constant, which cannot possibly be true for all values of X.
3) R̄² is no longer an accurate goodness-of-fit measure. The predicted values of Y are forced to change linearly with X, so you could obtain a low R̄² for an accurate model.
What is the difference between a random and natural experiment?
Random experiments involve researchers randomly assigning subjects to either a treatment or control group to estimate the treatment effect. Natural experiments, or quasi-experiments, attempt to utilize the "treatment-control" framework in the absence of actual random assignment to treatment and control groups. Instead of the researcher randomly assigning treatment, they rely on some exogenous event to create treatment and control groups. When the event or policy is truly exogenous, treatment is as good as randomly assigned.
Problems in random experiments.
1) Random experiments are often very costly or cannot be carried out due to being unethical.
2) Non-random samples. They often lack generalizability since the sample may not be randomly
drawn from the entire population of interest.
3) Attrition bias because treatment or control units non-randomly drop out of the experiment.
4) Hawthorne effects (people behave differently when observed, may respond to treatment/control
status).
5) Randomization failure (can only control for observed treatment-control differences; bias may
result if unobservable characteristics not perfectly balanced).
Explain briefly the difference-in-differences estimator
This method estimates the impact of a treatment by comparing the outcomes of a treatment
group and a control group before and after the treatment is received.
Main underlying assumption of difference-in-differences estimator
The main underlying
assumption: in the absence of the treatment, the difference between the outcomes of the two
groups would not have changed (i.e., they would have followed a common trend). The change
in outcomes of the control group is viewed as the counterfactual for the change in outcomes
of the treatment group.
What is panel data?
Panel data are repeated observations of multiple units over time. It is a combination of cross-sectional
and time-series.
Advantages of Panel Data
1) More degrees of freedom and more sample variability than cross-sectional data alone or time-series data alone, which allow for more accurate inference of the model parameters and hence increase the efficiency of estimates.
2) Eliminate omitted variables bias. It is often argued that the real reason someone finds an effect is that they ignored specific variables that are correlated with the explanatory variables when specifying the model. Panel data allow us to control for missing or unobserved variables.
3) Ability to answer types of questions that cross-sectional and time-series data cannot accommodate, for example transitions from employment to unemployment or from employment to retirement, changes in health status, or any other variables that change through time.
Differences between the fixed effects and random effects panel data models
The fixed effects model allows ai to be correlated with the regressors, while the random effects estimator assumes ai is not correlated with the regressors.
Advantages of the random effects model
Advantages of the random effects model are that it allows time-invariant regressors to be included and it leaves more degrees of freedom.
Disadvantages of the random effects model
The main disadvantage of the random effects estimator is that it is biased if the assumption that ai is uncorrelated with the regressors is incorrect.
Advantage of the fixed effects model
An advantage of the fixed effects model is that it allows arbitrary correlation between ai and any regressors.
Disadvantage of the fixed effects model
One of the main disadvantages is that it drops time-invariant regressors.
Preference between fixed effects model and random effects model
Unless we wish to estimate the effect of a time-invariant variable, fixed effects are generally preferred over random effects due to having less restrictive assumptions.