Assessing studies based on multiple regression Flashcards
Threats to external validity
(2)
- Differences in populations - when the population studied differs from the population of interest, so the results do not carry over
- Differences in settings/contexts - when results are assumed to generalise across settings that in fact differ, e.g. the impact of changing class sizes at UniMelb may not generalise to ANU because the contexts are different
Threats to internal validity
- Unbiasedness and consistency of estimators - threatened by a violation of any of the least squares assumptions
- Hypothesis tests and confidence intervals should be correct - threatened when standard errors are inaccurate, e.g. in small samples or under heteroskedasticity
Omitted variable bias
Threat to internal validity with multiple regression analysis
Emerges when a variable left out of the regression is a determinant of the dependent variable and is correlated with one of the included independent variables. The omitted variable ends up in the error term, so X is correlated with u and the conditional mean zero assumption fails
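A minimal simulation sketch of omitted variable bias, assuming a made-up data-generating process (the coefficients 1, 2, 3 and the names w, x, ols are hypothetical, chosen only for illustration). It compares the slope on X when the correlated variable W is left out with the slope when W is included.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Assumed data-generating process: Y = 1 + 2*X + 3*W + u,
# where W is correlated with X and will be omitted from one regression.
w = rng.normal(size=n)
x = 0.5 * w + rng.normal(size=n)
u = rng.normal(size=n)
y = 1 + 2 * x + 3 * w + u


def ols(y, *regressors):
    """Return OLS coefficients, intercept first."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


slope_short = ols(y, x)[1]     # W omitted: slope on X absorbs part of W's effect
slope_long = ols(y, x, w)[1]   # W included: slope on X is close to the true value 2

print(f"slope with W omitted:  {slope_short:.2f}")
print(f"slope with W included: {slope_long:.2f}")
```

Under these assumed parameters the slope with W omitted should settle near 3.2 rather than 2, because the bias equals the coefficient on W times Cov(X, W) / Var(X) = 3 x 0.4.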
Misspecification of the regression function
Threat to internal validity with multiple regression analysis
Functional form misspecification of the regression function (e.g. the wrong polynomial order) yields biased OLS estimates. How to avoid:
* Plot/visualise the data using scatter plots to see if nonlinear relationships potentially exist
* Try linear and non-linear specifications and test whether the quadratic, cubic etc. terms are statistically significant
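The second check above can be sketched as follows, assuming simulated data with a genuinely quadratic relationship (all numbers are hypothetical) and, for simplicity, homoskedasticity-only standard errors. The t-statistic on the squared term is compared with the usual 1.96 critical value.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2_000

# Assumed (hypothetical) nonlinear relationship between X and Y.
x = rng.uniform(0, 10, size=n)
y = 5 + 2 * x - 0.15 * x**2 + rng.normal(size=n)

# Quadratic specification: Y on a constant, X and X^2.
X = np.column_stack([np.ones(n), x, x**2])
k = X.shape[1]
beta = np.linalg.solve(X.T @ X, X.T @ y)

# Homoskedasticity-only covariance matrix of the OLS coefficients.
resid = y - X @ beta
sigma2 = resid @ resid / (n - k)
cov = sigma2 * np.linalg.inv(X.T @ X)

# t-statistic for H0: coefficient on X^2 equals zero.
t_quad = beta[2] / np.sqrt(cov[2, 2])
reject_linear = abs(t_quad) > 1.96

print(f"t-statistic on X^2: {t_quad:.1f}, reject linear form: {reject_linear}")
```

A rejection suggests the linear specification is too restrictive; the same idea extends to a cubic term by adding x**3 as a further column.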
Measurement error and errors-in-variables bias
Threat to internal validity with multiple regression analysis
Errors in the measurement of the data. Errors-in-variables bias arises when an independent variable is measured imprecisely, e.g. using economics test scores when the variable of interest is maths scores, which although likely correlated are not the same
Classical measurement error
Threat to internal validity with multiple regression analysis
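The measured variable equals the true X plus a purely random error w that has mean zero and is uncorrelated with both the true X and u. The OLS slope is then biased towards zero (attenuation bias), and the bias grows with the variance of w relative to the variance of X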
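A minimal simulation sketch of classical measurement error, assuming the regressor is observed with added noise that is independent of both the true X and the regression error (the true slope of 2 and the equal variances of X and the noise are hypothetical choices).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
beta = 2.0                                   # assumed true slope

x_true = rng.normal(size=n)                  # true regressor, variance 1
w = rng.normal(size=n)                       # measurement error, variance 1
x_obs = x_true + w                           # what the researcher observes
y = 1 + beta * x_true + rng.normal(size=n)


def slope(x, y):
    """OLS slope from a regression of y on a constant and x."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


slope_true = slope(x_true, y)    # uses the correctly measured regressor
slope_noisy = slope(x_obs, y)    # uses the mismeasured regressor

print(f"slope with true X:     {slope_true:.2f}")
print(f"slope with measured X: {slope_noisy:.2f}")
```

With equal variances for X and the noise, the slope on the measured regressor should shrink to roughly half the true value, i.e. towards zero.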
Missing data and sample selection bias
Threat to internal validity with multiple regression analysis
If data is missing as a function of the dependent variable (i.e. whether an observation is in the sample depends on Y), then X becomes correlated with u in the selected sample and the OLS estimator is biased - this is known as sample selection bias
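A minimal simulation sketch of sample selection bias, assuming a hypothetical rule under which only observations with Y above a cutoff are recorded (the true slope of 2 and the cutoff of 1 are made up for illustration).

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000

# Assumed population model: Y = 1 + 2*X + u.
x = rng.normal(size=n)
u = rng.normal(size=n)
y = 1 + 2 * x + u


def slope(x, y):
    """OLS slope from a regression of y on a constant and x."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


# Hypothetical selection rule: an observation is only recorded when Y > 1,
# so whether data is missing depends on the dependent variable.
keep = y > 1

slope_full = slope(x, y)
slope_selected = slope(x[keep], y[keep])

print(f"slope, full sample:     {slope_full:.2f}")
print(f"slope, selected sample: {slope_selected:.2f}")
```

In the selected sample, low values of X are only observed alongside large positive u, so X and u are negatively correlated and the estimated slope falls noticeably below 2.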
Simultaneous causality
Threat to internal validity with multiple regression analysis
Occurs when Y also causes X (reverse causality), so that causality runs in both directions. Because u affects Y and Y affects X, X is correlated with u - another way, alongside omitted variable bias, in which the OLS estimator becomes biased and inconsistent.
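A minimal simulation sketch of simultaneous causality, assuming a hypothetical two-equation system in which X affects Y with slope -1 and Y feeds back into X with slope 0.5 (both values are made up). The data are generated from the solved system and Y is then regressed on X.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100_000

beta = -1.0    # assumed causal effect of X on Y:  Y = beta * X + u
gamma = 0.5    # assumed feedback from Y to X:     X = gamma * Y + v

u = rng.normal(size=n)
v = rng.normal(size=n)

# Solve the two equations jointly for X, then compute Y.
x = (gamma * u + v) / (1 - gamma * beta)
y = beta * x + u

# OLS slope of Y on X; X contains u, so X is correlated with the error.
slope_ols = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

print(f"true effect of X on Y: {beta:.2f}")
print(f"OLS slope:             {slope_ols:.2f}")
```

Under these assumed parameters the OLS slope should settle near -0.4 rather than -1, and a larger sample does not remove the gap, which is what inconsistency means here.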
Sources of inconsistency in OLS standard errors
Threat to internal validity with multiple regression analysis
- Heteroskedasticity - when the variance of u depends on X and this is not accounted for, the computed standard errors are incorrect, and hence so are the t-statistics and confidence intervals
- Correlation of the error term across observations - if the error term is correlated across time or space, you can also end up with incorrect standard errors, t-statistics and confidence intervals.
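The heteroskedasticity case above can be sketched as follows, assuming simulated data in which the spread of u grows with |X| (a hypothetical choice). The sketch computes homoskedasticity-only standard errors and heteroskedasticity-robust ones (the HC1 sandwich form) by hand with NumPy and compares them for the slope.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5_000

# Assumed model with heteroskedastic errors: Var(u | X) = X^2.
x = rng.normal(size=n)
u = np.abs(x) * rng.normal(size=n)
y = 1 + 2 * x + u

X = np.column_stack([np.ones(n), x])
k = X.shape[1]
XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y
resid = y - X @ beta

# Homoskedasticity-only covariance: s^2 * (X'X)^-1.
cov_classical = (resid @ resid / (n - k)) * XtX_inv

# Heteroskedasticity-robust (HC1) covariance:
# (X'X)^-1 * X' diag(e_i^2) X * (X'X)^-1, scaled by n / (n - k).
meat = (X * resid[:, None] ** 2).T @ X
cov_robust = n / (n - k) * XtX_inv @ meat @ XtX_inv

se_classical = np.sqrt(cov_classical[1, 1])
se_robust = np.sqrt(cov_robust[1, 1])

print(f"homoskedasticity-only SE of slope: {se_classical:.4f}")
print(f"robust SE of slope:                {se_robust:.4f}")
```

The slope estimate is the same in both cases; only the standard error changes. In this setup the homoskedasticity-only standard error is too small, so t-statistics built on it would be overstated. Correlation across observations needs a different correction (e.g. clustered or HAC standard errors), which this sketch does not cover.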
Forecasting
We can generate a forecast of Y using predicted values from the regression with forecast error = Y - Y(hat)
- This raises issues of external validity, as the population the sample was drawn from must be similar enough to the situations/contexts being forecast for the results to generalise
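A minimal sketch of forecasting from a regression, assuming simulated data and a hypothetical split into an estimation sample and a forecast sample drawn from the same population. The forecast error is Y - Y(hat), summarised here by the root mean squared forecast error.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 10_000

# Assumed population model: Y = 1 + 2*X + u, with u having standard deviation 1.
x = rng.normal(size=n)
y = 1 + 2 * x + rng.normal(size=n)

# Estimate on the first half, forecast the second half.
x_est, y_est = x[: n // 2], y[: n // 2]
x_new, y_new = x[n // 2 :], y[n // 2 :]

X_est = np.column_stack([np.ones(len(x_est)), x_est])
coef, *_ = np.linalg.lstsq(X_est, y_est, rcond=None)

# Forecast = predicted value from the estimated regression.
y_hat = coef[0] + coef[1] * x_new
forecast_error = y_new - y_hat               # Y - Y(hat)
rmsfe = np.sqrt(np.mean(forecast_error**2))  # root mean squared forecast error

print(f"RMSFE: {rmsfe:.2f}")
```

Because both halves come from the same assumed population, the forecast errors are about as large as u itself. If the forecast sample came from a different population or setting, the errors could be much larger, which is the external validity concern.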