Biases Flashcards

1
Q

What is heteroscedasticity?

A
  • The error term does not have a constant variance.
2
Q

What is multicollinearity?

A
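  • When independent variables are highly correlated with each other.
  • This affects the estimates of B1 and B2: their separate effects are hard to tell apart, so the estimates are imprecise.
  • Leads to large (inflated) standard errors.
  • Coefficient estimates become unstable and sensitive to small changes in the data.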
3
Q

What is heteroscedasticity and homoscedasticity?

A

The error term u is homoscedastic if the variance of the conditional distribution of u given X is constant and does not depend on X. Otherwise, the error term is heteroscedastic.

Homoscedastic: the error has a constant variance.
Heteroscedastic: the error does not have a constant variance.

Look at the distribution of the errors u at various values of X. Imagine one plot where the spread of the errors is large at some values of X and small at others (heteroscedastic), and one where the spread is the same at every X (homoscedastic).

4
Q

What are the problems of working with heteroscedastic data?

A

The parameter estimates will be unbiased, but the usual variance estimator will be inconsistent. One solution is to use White's robust variance estimator. Using White's estimator on homoscedastic data will, however, give worse finite-sample properties and increase the likelihood of size distortions. Another solution to heteroscedasticity is to use GLS.
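
A minimal sketch of the point above, for a simple regression with one regressor. The data are simulated and hypothetical (the error standard deviation is set to |x|), and the robust formula shown is the basic White (HC0) version without small-sample corrections. It compares the classical standard error of the slope with the White one.

```python
import random

def ols_simple(x, y):
    """Fit y = a + b*x by OLS; return (a, b, residuals)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    return a, b, resid

def slope_standard_errors(x, resid):
    """Return (classical SE, White HC0 SE) of the slope."""
    n = len(x)
    x_bar = sum(x) / n
    dev = [xi - x_bar for xi in x]
    sxx = sum(d ** 2 for d in dev)
    # Classical formula: assumes one common error variance s^2.
    s2 = sum(e ** 2 for e in resid) / (n - 2)
    classical = (s2 / sxx) ** 0.5
    # White (HC0): each squared residual is weighted by its own (x - x_bar)^2.
    white = sum(d ** 2 * e ** 2 for d, e in zip(dev, resid)) ** 0.5 / sxx
    return classical, white

random.seed(0)
n = 5000
x = [random.gauss(0.0, 1.0) for _ in range(n)]
# Hypothetical heteroscedastic errors: standard deviation equals |x|.
y = [1.0 + 2.0 * xi + abs(xi) * random.gauss(0.0, 1.0) for xi in x]

a, b, resid = ols_simple(x, y)
classical_se, white_se = slope_standard_errors(x, resid)
print(f"slope={b:.3f}  classical SE={classical_se:.4f}  White SE={white_se:.4f}")
```

In this setup the slope estimate stays close to the true value of 2, while the classical standard error understates the uncertainty relative to the White one.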

5
Q

.

A

.

6
Q

What is a Type I error?

A

Rejecting a true null hypothesis.

7
Q

What is meant by unbiasedness of an estimator?

A
  • An estimator whose expected value is equal to the true population value, i.e. E[βhat] = β
8
Q

What is multicollinearity, and how can we test for it?

A

Perfect multicollinearity occurs if one regressor is an exact linear function of one or more other regressors. In real data we will rarely see two regressors that are perfectly correlated, which is why it most often arises from the dummy variable trap or from including the same regressor twice. For imperfect multicollinearity, the variance inflation factor (VIF) can be used as a diagnostic. A rule of thumb is that multicollinearity is a problem if VIF > 10. A common solution is to drop one of the collinear variables.
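
A small sketch of the VIF rule of thumb, assuming only two regressors. In general VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing regressor j on the other regressors; with two regressors R_j^2 is just their squared correlation. The simulated data and function names are hypothetical.

```python
import random

def correlation(a, b):
    """Sample correlation between two equal-length lists."""
    n = len(a)
    a_bar, b_bar = sum(a) / n, sum(b) / n
    cov = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    var_a = sum((ai - a_bar) ** 2 for ai in a)
    var_b = sum((bi - b_bar) ** 2 for bi in b)
    return cov / (var_a * var_b) ** 0.5

def vif_two_regressors(x1, x2):
    """VIF = 1 / (1 - R^2); with two regressors R^2 is the squared correlation."""
    r = correlation(x1, x2)
    return 1.0 / (1.0 - r ** 2)

random.seed(1)
n = 2000
x1 = [random.gauss(0.0, 1.0) for _ in range(n)]
# x2_collinear is almost a copy of x1; x2_independent is unrelated to x1.
x2_collinear = [xi + random.gauss(0.0, 0.1) for xi in x1]
x2_independent = [random.gauss(0.0, 1.0) for _ in range(n)]

vif_high = vif_two_regressors(x1, x2_collinear)
vif_low = vif_two_regressors(x1, x2_independent)
print(f"VIF (collinear)={vif_high:.1f}  VIF (independent)={vif_low:.3f}")
```

The nearly duplicated regressor gives a VIF far above 10, while the unrelated regressor gives a VIF close to 1.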

9
Q

What are the problems and solutions with heteroscedasticity?

A

PROBLEM:
Coefficients remain unbiased and consistent
The usual standard errors are biased
The OLS t statistic does not follow a t distribution
We reject (or fail to reject) H0 too often or not often enough

SOLUTION:
Use heteroskedasticity-robust standard errors
It is prudent to assume the errors are heteroskedastic unless there is a compelling reason to think otherwise
For implementation, see the Lab example

10
Q

What's the difference between the biases and heteroscedasticity + multicollinearity?

A

The biases lead to a violation of LS.1, hence the coefficient is biased. In contrast, heteroscedasticity and multicollinearity affect the standard errors (biased under heteroscedasticity, inflated under multicollinearity), not the coefficients, which stay unbiased.

11
Q

What is omitted variable bias?

A
  • Occurs when a statistical model leaves out one or more relevant variables
  • The omitted variables are not in the model, but they affect the dependent variable and are correlated with an included regressor
  • The zero conditional mean assumption is violated

12
Q

Simultaneity bias

A
  • One or more of the independent variables are jointly determined with the dependent variable.
  • X causes Y, but Y also causes X.
  • The two variables on either side of the equation influence each other.
  • OLS does not give the real causal effect.
  • Violates the zero conditional mean assumption.
Supply and demand is a good example.
Quantity and price
Investments and productivity
Sales and advertising
This leads to a violation of LS.1, hence our coefficient is biased.
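
A sketch of the supply/demand example, using a hypothetical market with demand q = 10 - p + u_d and supply q = 2 + p + u_s (all numbers are assumptions chosen for illustration). Price and quantity are determined jointly, so regressing q on p recovers neither the demand slope (-1) nor the supply slope (+1).

```python
import random

def slope(x, y):
    """OLS slope from regressing y on x (with intercept)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx

random.seed(4)
n = 20000
prices, quantities = [], []
for _ in range(n):
    u_d = random.gauss(0.0, 1.0)  # demand shock
    u_s = random.gauss(0.0, 1.0)  # supply shock
    # Equilibrium: 10 - p + u_d = 2 + p + u_s  =>  p = (8 + u_d - u_s) / 2
    p = (8.0 + u_d - u_s) / 2.0
    q = 2.0 + p + u_s
    prices.append(p)
    quantities.append(q)

ols_slope = slope(prices, quantities)
print(f"OLS slope of q on p = {ols_slope:.3f} (demand slope -1, supply slope +1)")
```

With equal shock variances on both sides, the OLS slope lands near 0, a mix of the two curves rather than the causal effect of price on either.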
13
Q

Sample Selection bias

A

A type of bias that arises from choosing non-random data for statistical analysis, for example when people volunteer for a study. Those who volunteer might share the same characteristics.

For example, you want to study veganism among undergraduate students. You send out a survey to the students in an art and culture class. Because this is not a random sample, it is not representative of the target population. These students might be more liberal, etc.

14
Q

Measurement error in independent variable

A
  • There are often errors in the data

For example:

  • Reporting error
  • Coding error
  • Estimation error
15
Q

Two good examples of omitted variables in a regression of wage on education

A

Education of the individual's parents

Ability
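
A sketch of the ability example with simulated, hypothetical data: ability raises both education and wages, the true effect of education on wage is set to 0.5, and ability is then left out of the regression. The long regression is computed by partialling out ability (Frisch-Waugh), only to avoid matrix code.

```python
import random

def slope(x, y):
    """OLS slope from regressing y on x (with intercept)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx

def residualize(y, x):
    """Residuals from regressing y on x (with intercept)."""
    n = len(x)
    b = slope(x, y)
    a = sum(y) / n - b * sum(x) / n
    return [yi - a - b * xi for xi, yi in zip(x, y)]

random.seed(2)
n = 20000
ability = [random.gauss(0.0, 1.0) for _ in range(n)]
educ = [12.0 + 2.0 * ab + random.gauss(0.0, 1.0) for ab in ability]
wage = [1.0 + 0.5 * ed + 1.0 * ab + random.gauss(0.0, 1.0)
        for ed, ab in zip(educ, ability)]

# Short regression: wage on education only (ability omitted).
short_slope = slope(educ, wage)
# Long regression: education coefficient after controlling for ability.
long_slope = slope(residualize(educ, ability), residualize(wage, ability))
print(f"ability omitted: {short_slope:.3f}   ability included: {long_slope:.3f}")
```

Under these assumptions the short regression gives roughly 0.9 instead of 0.5: the bias equals the effect of ability on wage (1.0) times cov(educ, ability) / var(educ) = 2/5.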

16
Q

How is βhat distributed if it is unbiased?

A

The sampling distribution of βhat is centred around β.
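
A Monte Carlo sketch of what "centred around β" means, assuming a hypothetical model y = 1 + 2x + u with well-behaved errors. Each small sample gives a different βhat, but the average over many samples is close to the true β = 2.

```python
import random

def slope(x, y):
    """OLS slope from regressing y on x (with intercept)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx

random.seed(3)
beta = 2.0
estimates = []
for _ in range(2000):            # 2000 independent samples
    x = [random.gauss(0.0, 1.0) for _ in range(50)]   # 50 observations each
    y = [1.0 + beta * xi + random.gauss(0.0, 1.0) for xi in x]
    estimates.append(slope(x, y))

mean_estimate = sum(estimates) / len(estimates)
print(f"mean of the estimates = {mean_estimate:.3f}")
print(f"smallest = {min(estimates):.3f}, largest = {max(estimates):.3f}")
```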

17
Q

What is stationarity?

A
  • No trends or seasonality
  • Its statistical properties do not change over time
  • Constant mean and variance
18
Q

What is a Type I Error?

What is a Type II Error?

A

Type I: you reject H0 when it is actually true

Type II: you fail to reject H0 when it is actually false

19
Q

What is perfect multicollinearity?

A

A phenomenon in which one predictor variable in a multiple regression model can be predicted exactly as a linear combination of the others. If the prediction is only highly accurate rather than exact, the multicollinearity is imperfect; a typical symptom is few significant t-ratios but a high R^2.

20
Q

What are the consequences of high, but non-perfect multicollinearity?

A
  • Large variances and covariances
  • Wider confidence intervals
  • R^2 can be very high even though few t-ratios are significant
  • Reduces the precision of the estimated coefficients, which weakens your model
  • You might not trust your model

OLS is still BLUE, but: large variances and covariances make precise estimation difficult, confidence intervals are wider, t-ratios tend to be statistically insignificant, R^2 can still be very high, and the OLS estimators and their standard errors can be sensitive to small changes in the data.

21
Q

What does heteroscedasticity lead to?

A
  • The coefficient estimates stay unbiased
  • But it leads to biased Standard Errors (SER)
  • Biased SER make hypothesis tests, t-tests, p-values etc. unreliable
  • OLS is no longer BLUE (the Gauss-Markov assumptions do not hold)
22
Q

What can you do about heteroscedasticity?

A
  • You can use heteroskedasticity-robust standard errors (clustered standard errors are also robust to heteroscedasticity)
23
Q

How does perfect Multicollinearity occur?

A
  1. Dummy variable trap
  2. Including the same variable twice

24
Q

What does multicollinearity lead to, both perfect and imperfect?

A
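  • Large SER
  • Unstable, imprecise coefficient estimates

If it is perfect, it violates the no-perfect-multicollinearity assumption of multiple regression, and the coefficients cannot be estimated.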

25
Q

What are clustered standard errors?

A
  • Allow the regression errors to be correlated within a cluster (entity),
  • but assume that the regression errors are uncorrelated across clusters
  • Clustered SER allow for heteroscedasticity and for autocorrelation within a cluster
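
A rough sketch of the idea for a simple regression, under assumptions chosen for illustration: 200 hypothetical clusters of 20 observations, a regressor that is constant within each cluster, and an error with a shared cluster component. The clustered formula sums (x - x_bar) * residual within each cluster before squaring; no small-sample correction is applied here.

```python
import random

random.seed(6)
n_clusters, cluster_size = 200, 20
x, y, cluster = [], [], []
for g in range(n_clusters):
    x_g = random.gauss(0.0, 1.0)   # regressor shared by the whole cluster
    a_g = random.gauss(0.0, 1.0)   # error component shared by the whole cluster
    for _ in range(cluster_size):
        x.append(x_g)
        y.append(1.0 + 2.0 * x_g + a_g + random.gauss(0.0, 1.0))
        cluster.append(g)

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
dev = [xi - x_bar for xi in x]
sxx = sum(d ** 2 for d in dev)
b = sum(d * (yi - y_bar) for d, yi in zip(dev, y)) / sxx
a = y_bar - b * x_bar
resid = [yi - a - b * xi for xi, yi in zip(x, y)]

# Classical SE: treats all n errors as independent with equal variance.
s2 = sum(e ** 2 for e in resid) / (n - 2)
classical_se = (s2 / sxx) ** 0.5

# Clustered SE: add up (x - x_bar) * residual inside each cluster first,
# so that correlation within a cluster is not ignored.
scores = {}
for g, d, e in zip(cluster, dev, resid):
    scores[g] = scores.get(g, 0.0) + d * e
clustered_se = sum(s ** 2 for s in scores.values()) ** 0.5 / sxx

print(f"slope={b:.3f}  classical SE={classical_se:.4f}  clustered SE={clustered_se:.4f}")
```

In this setup the classical standard error is much too small, because it counts 4000 observations as independent when there are effectively only 200 independent clusters.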
26
Q

Pooled OLS: what to do about the SER?

A

Can use robust (for example clustered) standard errors to fix the SER

27
Q

What does BLUE stand for, for B1(hat)?

A

Best
Linear
Unbiased
Estimator

28
Q

What are the assumptions when the sample size is small (the homoskedastic normal regression assumptions)?

A
  • If the 3 assumptions of simple linear regression hold,
  • and the errors are homoskedastic and normally distributed,
  • use the t-statistic with the t-distribution
29
Q

What type of biases do we have?

A
  1. Sample selection bias
  2. Omitted variable bias
  3. Simultaneity bias
  4. Measurement error in the independent variable
  • All of these lead to a violation of LS.1
  • and therefore to biased coefficients
30
Q

What is measurement error in the independent variable?

A

Data are often measured with error

  • Coding error
  • Reporting error
  • Estimation error
  • Violates LS.1
  • Measurement error in the dependent variable is not a problem for bias (it only makes the estimates less precise)
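
A simulation sketch of the last two points, assuming classical (random, independent) measurement error and a hypothetical true slope of 2. Noise in the regressor pulls the estimated slope toward zero, while noise in the dependent variable leaves it centred on the true value.

```python
import random

def slope(x, y):
    """OLS slope from regressing y on x (with intercept)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx

random.seed(5)
n = 20000
x_true = [random.gauss(0.0, 1.0) for _ in range(n)]
y_true = [1.0 + 2.0 * xi + random.gauss(0.0, 1.0) for xi in x_true]

# Measurement error with the same variance as the true regressor.
x_noisy = [xi + random.gauss(0.0, 1.0) for xi in x_true]
y_noisy = [yi + random.gauss(0.0, 1.0) for yi in y_true]

slope_clean = slope(x_true, y_true)     # no measurement error
slope_x_error = slope(x_noisy, y_true)  # error in the independent variable
slope_y_error = slope(x_true, y_noisy)  # error in the dependent variable
print(f"clean={slope_clean:.3f}  x with error={slope_x_error:.3f}  "
      f"y with error={slope_y_error:.3f}")
```

With these variances the slope from the noisy regressor is about half the true value, since it is scaled by var(x) / (var(x) + var(error)) = 1/2.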
31
Q

Why do we want time-series to be stationary?

A
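  • A nonstationary series has no constant mean, and its variance can grow without bound over time. This makes the usual estimates and tests unreliable (for example, spurious regressions).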
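
A sketch of the variance point, comparing a random walk (nonstationary) with a hypothetical stationary AR(1) process with coefficient 0.5. Many paths are simulated and the variance across paths is measured at two dates.

```python
import random

def variance(values):
    """Sample variance of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)

def simulate_paths(phi, n_paths, n_steps, rng):
    """Simulate y_t = phi * y_{t-1} + e_t, starting from y_0 = 0."""
    paths = []
    for _ in range(n_paths):
        y, path = 0.0, []
        for _ in range(n_steps):
            y = phi * y + rng.gauss(0.0, 1.0)
            path.append(y)
        paths.append(path)
    return paths

rng = random.Random(7)
walks = simulate_paths(1.0, 3000, 100, rng)  # phi = 1: random walk
ar1 = simulate_paths(0.5, 3000, 100, rng)    # phi = 0.5: stationary AR(1)

rw_var_10 = variance([p[9] for p in walks])
rw_var_100 = variance([p[99] for p in walks])
ar_var_10 = variance([p[9] for p in ar1])
ar_var_100 = variance([p[99] for p in ar1])

print(f"random walk: var at t=10 is {rw_var_10:.1f}, at t=100 is {rw_var_100:.1f}")
print(f"AR(1):       var at t=10 is {ar_var_10:.2f}, at t=100 is {ar_var_100:.2f}")
```

The random walk's variance keeps growing with t, while the AR(1) variance settles at a constant level.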