Lecture 5 Flashcards
5 Reasons why your regression analysis might not make sense
- Multicollinearity
- Heteroscedasticity
- Outliers
- Reverse causality
- Omitted variable bias
Outlier
Data point that does not follow the general trend of the data, an extreme value
How can you solve outliers?
- What is the reason for the outlier? Can/should you do something?
- Check how sensitive your results are to the presence of the outlier
- Remove the outlier from the dataset only when there is a valid reason, e.g. mismeasurement, an error in the observation, or a data entry error.
Not because it is convenient to do so!!!
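The sensitivity check above can be sketched as fitting the same simple regression with and without the suspect point. This is a minimal illustration only: the data values and the helper name `ols_fit` are hypothetical, and the last observation is assumed to be the outlier.

```python
def ols_fit(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Hypothetical data: y is roughly 2*x, except the last point (an extreme value).
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 30.0]

a_all, b_all = ols_fit(x, y)             # all observations
a_trim, b_trim = ols_fit(x[:-1], y[:-1])  # suspect point left out

print(f"slope with outlier:    {b_all:.2f}")
print(f"slope without outlier: {b_trim:.2f}")
```

A large gap between the two slopes shows that the result depends heavily on one observation. It does not by itself justify removing that observation.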
Reverse causality
When y (also) causes x, rather than only x causing y
How to check for reverse causality
- Timing of measurement (Sometimes, x is measured later than y)
- Statistical tests, to check whether changes in x precede changes in y
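One informal way to look at whether x precedes y is to compare lagged correlations in both directions. The sketch below is not a formal statistical test (a Granger causality test would be one formal option); the series and the helper name `pearson` are hypothetical, and y is constructed to follow x by one period.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation between two equally long sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    sab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    saa = sum((ai - mean_a) ** 2 for ai in a)
    sbb = sum((bi - mean_b) ** 2 for bi in b)
    return sab / sqrt(saa * sbb)

# Hypothetical series: y at time t equals x at time t-1.
x = [1, 3, 2, 5, 4, 6, 5, 8]
y = [0, 1, 3, 2, 5, 4, 6, 5]

# Does earlier x line up with later y, or earlier y with later x?
x_leads_y = pearson(x[:-1], y[1:])
y_leads_x = pearson(y[:-1], x[1:])

print(f"corr(x at t, y at t+1): {x_leads_y:.2f}")
print(f"corr(y at t, x at t+1): {y_leads_x:.2f}")
```

A clearly stronger correlation when x leads is consistent with x preceding y, but it does not rule out reverse causality.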
Omitted variable bias
An excluded variable has some effect on your DV and it is correlated with at least one of your IVs
How to solve the omitted variable problem?
- Avoid simple regression models (with only one IV)
- Include the variables that are theoretically most likely to be important in explaining the DV
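A small simulation can show why a simple regression is risky here. It is a sketch under stated assumptions: the data are simulated, the true effect of x on y is set to 1, and a hypothetical variable z affects y (effect 2) and is correlated with x. Leaving z out pushes the estimated effect of x away from 1.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

n = 5000
z = [random.gauss(0, 1) for _ in range(n)]           # variable that may be omitted
x = [zi + random.gauss(0, 1) for zi in z]            # IV, correlated with z
y = [1.0 * xi + 2.0 * zi + random.gauss(0, 1)        # DV: true effect of x is 1
     for xi, zi in zip(x, z)]

def demean(v):
    m = sum(v) / len(v)
    return [vi - m for vi in v]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

xd, zd, yd = demean(x), demean(z), demean(y)

# Simple regression of y on x only (z omitted).
b_short = dot(xd, yd) / dot(xd, xd)

# Regression of y on x and z: solve the 2x2 normal equations for the x coefficient.
det = dot(xd, xd) * dot(zd, zd) - dot(xd, zd) ** 2
b_long = (dot(xd, yd) * dot(zd, zd) - dot(zd, yd) * dot(xd, zd)) / det

print(f"effect of x, z omitted:  {b_short:.2f}")
print(f"effect of x, z included: {b_long:.2f}")
```

In this setup the simple regression attributes part of the effect of z to x, so its estimate lands near 2 instead of 1.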
Three different types of data
- Cross-sectional data (Observations at a given point in time)
- Time series data (Observations of one unit over a period of time)
- Panel data (Observations of multiple units over a period of time)
Robustness / sensitivity analysis
Determine how sensitive your results are to changes in the model. E.g. combinations of control variables, datasets, time frames.
Do the results remain? Then the results are robust.
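As an illustration of varying the time frame, the sketch below re-estimates a simple regression slope on the full sample and on two sub-periods. The records, the window boundaries, and the "same sign in every window" criterion are all hypothetical choices for this example, not a standard rule.

```python
def slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx

# Hypothetical records: (year, x, y)
records = [
    (2000, 1.0, 0.6), (2001, 2.0, 0.9), (2002, 3.0, 1.6), (2003, 4.0, 2.1),
    (2004, 5.0, 2.4), (2005, 6.0, 3.1), (2006, 7.0, 3.4), (2007, 8.0, 4.1),
]

# Time frames to compare (inclusive year ranges).
windows = {"full": (2000, 2007), "early": (2000, 2003), "late": (2004, 2007)}

slopes = {}
for name, (start, end) in windows.items():
    subset = [(x, y) for year, x, y in records if start <= year <= end]
    xs = [x for x, _ in subset]
    ys = [y for _, y in subset]
    slopes[name] = slope(xs, ys)
    print(f"{name:>5}: slope = {slopes[name]:.2f}")

# Illustrative criterion only: the effect keeps the same sign in every window.
same_sign = all(b > 0 for b in slopes.values()) or all(b < 0 for b in slopes.values())
print("same sign across time frames:", same_sign)
```

The same loop could run over different sets of control variables or different datasets instead of time frames.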
Requirements for causation (4)
- Time lag
- Correlation
- Previous research
- Test model for other explanations (robustness)
Possible problem when all variables are measured in one year
- Reverse causality
- Less of an argument for causality