Biases Flashcards
what is Heteroscedasticy
- error term doesn’t have a constant variance.
What is Multicollinearity
- When independent variables highly correlate.
- This can affect the value of B1 and B2 as we dont get their real val
- LEADS TO UNBIASED STANDARD ERRORS
- And Affected Coefficient values
What is heteroscedasticity and homoscedasticity?
The error term u is homoscedastic if the variance of the conditional distribution of u given X is constant and does not depend on X. Otherwise, the error therm is heteroskedastic.
Homosced: The error has a constant variance
Heterosced: The error has not a constant variance.
The distribution of the errors u is for various values of X. imagine a plot where the variance is large and one where it is small and compact.
What are the problems of working with heteroscedastic data?
Parameters will be unbiased, but variance estimator will be inconsistent. One solution is to use White’s robust variance estimator. Using White’s estimator on homoscedastic data will however give worse finite sample properties and increases likelihood of size distortions. Another solution to heteroscedastcity is to use GLS
.
.
What is a type 1 error
Rejecting a true null
What is meant by unbiasedness of an estimator?
- estimator whose expected value is equal to the population value
What is multicollinearity, and how can we test for it?
Perfect multicorr uccurs if two or more regressors are perfectly correlated. In reality, we will not often see two regressors that are perfectly correlated. That is why it most often occurs from the dummy trap or by including the same regressor twice. Can use Volatility inflation factor to test if there is multicollinearity. A rule of thumb is that there is multicollinearity if VIF > 10. The solution to this problem is simply just to drop the variable
What are the problems and solutions with heteroscedasticy?
PROBLEM:
Coefficients are unbiased and consistent
Standard errors are biased
OLS t statistic does not follow a t distribution
(Fail to) reject H0 too often or not often enough
SOLUTION
Use heteroskedasticity robust standard errors
Prudent to assume errors are heteroskedasticity unless there is acompelling reason
Implementation see Lab example
Whats the difference between biases and heterosced + multicorr?
They lead to violation of LS.1, hence the coefficient is biased. In contrast, heteroscedasticy and multicollinearity lead to biased standard errors, not biased errors.
What is omitted variable biases?
- occurs when a statistical model leaves out one or more relevant variables
- Not included in model, but are affecting dependent variable
- Zero mean assumption Violated
.
Simultanely bias
- one or more of the independent variables are jointly determined with the dependent variable.
- X causes Y but Y causes also X
- two variables on either side influence each other
- dont give the real causal effect
- Violate mean of zero assumption
Supply/demand a good example. Quantity and price Investments and Productivity Sales and advertisement This leads to violation of LS.1, hence our coefficient is biased.
Sample Selection bias
A type of bias that arises by choosing non-random data for statistical analysis. For example when people volunteer for a study. Those who volunteer might share the same characteristics.
For example, you want to study the context between veganism and undergraduate students. You send out a survey to the students in class of art and culture. Because this is not a random draw sample, it is not representative for the target population. These students might be more liberal etc.
Measurement error in independent variable
- There are often error in the data
Feks:
- Reporting error
- Coding error
- Estimation error
2 good examples of omitted variable bias in wage education
Education of individual’s parents,
Ability