Biases Flashcards
What is heteroscedasticity?
- The error term does not have a constant variance.
What is multicollinearity?
- When independent variables are highly correlated.
- This makes it hard to separate the individual effects of B1 and B2, so we do not recover their true values precisely.
- LEADS TO INFLATED STANDARD ERRORS (coefficients remain unbiased)
- And imprecise, unstable coefficient estimates
What is heteroscedasticity and homoscedasticity?
The error term u is homoskedastic if the variance of the conditional distribution of u given X is constant and does not depend on X. Otherwise, the error term is heteroskedastic.
Homoskedastic: the error has a constant variance.
Heteroskedastic: the error does not have a constant variance.
Picture the distribution of the errors u for various values of X: in one plot the spread of the errors is large and varies with X; in the other it is small and compact for every X.
What are the problems of working with heteroscedastic data?
Parameters will be unbiased, but the usual variance estimator will be inconsistent. One solution is to use White's robust variance estimator. Using White's estimator on homoskedastic data will, however, give worse finite-sample properties and increase the likelihood of size distortions. Another solution to heteroskedasticity is to use GLS.
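A minimal numpy sketch of White's (HC0) robust variance estimator next to the classical one, on simulated data whose error variance grows with X; the data-generating process, sample size, and coefficients are illustrative assumptions, not from the course material:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.uniform(1, 5, n)
# Heteroskedastic errors: the error's standard deviation grows with x
u = rng.normal(0, 1, n) * x
y = 2.0 + 3.0 * x + u

X = np.column_stack([np.ones(n), x])
XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y            # OLS coefficients (still unbiased)
resid = y - X @ beta

# Classical variance estimate: assumes a single constant error variance
sigma2 = resid @ resid / (n - 2)
V_classical = sigma2 * XtX_inv

# White's (HC0) estimator: "meat" = sum of u_i^2 * x_i x_i'
meat = (X * resid[:, None] ** 2).T @ X
V_white = XtX_inv @ meat @ XtX_inv

se_classical = np.sqrt(np.diag(V_classical))
se_white = np.sqrt(np.diag(V_white))
```

With variance increasing in x, the classical and robust standard errors disagree; the robust ones are the consistent choice here.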
What is a type 1 error
Rejecting a true null
What is meant by unbiasedness of an estimator?
- estimator whose expected value is equal to the population value
What is multicollinearity, and how can we test for it?
Perfect multicollinearity occurs if two or more regressors are perfectly correlated. In practice we rarely see two regressors that are perfectly correlated; it most often arises from the dummy trap or from including the same regressor twice. We can use the Variance Inflation Factor (VIF) to test for multicollinearity. A rule of thumb is that there is multicollinearity if VIF > 10. The solution to this problem is simply to drop one of the offending variables.
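The VIF check can be sketched with only numpy: regress each regressor on the others and compute 1 / (1 - R^2). The simulated regressors and the `vif` helper below are hypothetical, for illustration only:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300
x1 = rng.normal(size=n)
x2 = 0.95 * x1 + 0.1 * rng.normal(size=n)   # nearly a copy of x1
x3 = rng.normal(size=n)                      # unrelated regressor
X = np.column_stack([x1, x2, x3])

def vif(X, j):
    """VIF of column j: 1 / (1 - R^2) from the auxiliary regression
    of x_j on an intercept and the remaining columns."""
    m = X.shape[0]
    others = np.column_stack([np.ones(m), np.delete(X, j, axis=1)])
    target = X[:, j]
    coef, *_ = np.linalg.lstsq(others, target, rcond=None)
    resid = target - others @ coef
    tss = (target - target.mean()) @ (target - target.mean())
    r2 = 1 - (resid @ resid) / tss
    return 1.0 / (1.0 - r2)
```

Here `vif(X, 0)` and `vif(X, 1)` come out far above the rule-of-thumb cutoff of 10, while `vif(X, 2)` stays near 1.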
What are the problems and solutions with heteroscedasticity?
PROBLEM:
Coefficients are unbiased and consistent
Standard errors are biased
OLS t statistic does not follow a t distribution
(Fail to) reject H0 too often or not often enough
SOLUTION
Use heteroskedasticity robust standard errors
Prudent to assume errors are heteroskedastic unless there is a compelling reason not to
Implementation: see Lab example
What's the difference between the biases and heteroscedasticity + multicollinearity?
The biases lead to a violation of LS.1, hence the coefficients are biased. In contrast, heteroscedasticity and multicollinearity lead to biased standard errors, not biased coefficients.
What is omitted variable bias?
- Occurs when a statistical model leaves out one or more relevant variables
- Not included in the model, but affecting the dependent variable
- Zero conditional mean assumption violated
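The mechanics show up in a small simulation (the variable names and coefficients below are made up for illustration): when the omitted variable drives both the regressor and the outcome, the short regression's slope is biased:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000
ability = rng.normal(size=n)                  # the omitted variable
educ = 1.0 * ability + rng.normal(size=n)     # education correlates with ability
wage = 2.0 * educ + 3.0 * ability + rng.normal(size=n)

# Short regression: wage on educ only (ability omitted)
X_short = np.column_stack([np.ones(n), educ])
b_short = np.linalg.lstsq(X_short, wage, rcond=None)[0]

# Long regression: including ability recovers the true effect of 2.0
X_long = np.column_stack([np.ones(n), educ, ability])
b_long = np.linalg.lstsq(X_long, wage, rcond=None)[0]
```

The short regression's slope lands near 3.5 (true effect 2.0 plus the bias term 3.0 * cov(educ, ability) / var(educ)), while the long regression recovers 2.0.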
Simultaneity bias
- One or more of the independent variables are jointly determined with the dependent variable.
- X causes Y, but Y also causes X
- The two variables influence each other
- OLS does not give the real causal effect
- Violates the zero conditional mean assumption
Supply/demand is a good example:
- Quantity and price
- Investment and productivity
- Sales and advertising
This leads to a violation of LS.1, hence our coefficient is biased.
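The supply/demand case can be simulated directly (the structural slopes and shock variances below are made-up assumptions): OLS on equilibrium price and quantity recovers neither the demand slope nor the supply slope, because price and quantity are jointly determined:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5000
u_d = rng.normal(0, 2, n)   # demand shocks
u_s = rng.normal(0, 1, n)   # supply shocks

# Structural model: demand q = -1*p + u_d, supply q = +1*p + u_s.
# Solving the two equations gives the equilibrium values:
p = (u_d - u_s) / 2.0
q = (u_d + u_s) / 2.0

# OLS of q on p mixes the two curves
X = np.column_stack([np.ones(n), p])
b = np.linalg.lstsq(X, q, rcond=None)[0]
```

The fitted slope is roughly 0.6, far from both the demand slope (-1) and the supply slope (+1); this is the simultaneity bias in action.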
Sample Selection bias
A type of bias that arises by choosing non-random data for statistical analysis. For example when people volunteer for a study. Those who volunteer might share the same characteristics.
For example, you want to study veganism among undergraduate students. You send out a survey to the students in an art and culture class. Because this is not a random sample, it is not representative of the target population; these students might be more liberal, etc.
Measurement error in an independent variable
- There are often errors in the data
E.g.:
- Reporting error
- Coding error
- Estimation error
Two good examples of omitted variables in a wage-education regression:
Education of individual’s parents,
Ability
How is B(hat) distributed if it is unbiased?
the sampling distribution of βhat is centred around β
What is stationarity?
- No trends or seasonality
- Its statistical properties do not change over time
- constant mean and variance
What is a Type I Error?
What is a Type II Error?
1 - Rejecting the null hypothesis when it is true (rejecting a true null)
2 - Failing to reject the null hypothesis when it is false
What is perfect multicollinearity?
A phenomenon in which one predictor variable in a multiple regression model is an exact linear combination of the others. A warning sign of (near-)multicollinearity: few significant t-ratios, but a high R^2.
What are the consequences of high, but non-perfect multicollinearity?
- Large variances and covariances
- Wider confidence intervals
- R^2 tends to be very high
- Reduces precision of the estimated coefficients, which weakens your model
- You might not trust your model
OLS is still BLUE, but: large variances and covariances make precise estimation difficult; confidence intervals are wider; t-ratios tend to be statistically insignificant; R^2 tends to be very high; and OLS estimators and standard errors can be sensitive to small changes in the data.
What does heteroscedasticity lead to?
- Coefficients don't change (still unbiased)
- But it leads to biased standard errors (SER)
- Biased SER make hypothesis testing, t-tests, p-values, etc. unreliable
- OLS is no longer BLUE (Gauss-Markov no longer holds)
What can you do about heteroscedasticity?
- Use heteroskedasticity-robust (White) standard errors
- You can use clustered standard errors if errors also correlate within groups
How does perfect Multicollinearity occur?
1. Dummy trap
2. Include a variable twice
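A quick numpy illustration of the dummy trap (the dummy values are made up): with an intercept plus both gender dummies, the columns are linearly dependent, so the design matrix loses full rank and X'X cannot be inverted:

```python
import numpy as np

n = 6
female = np.array([1, 0, 1, 1, 0, 0])
male = 1 - female                       # the two dummies sum to the constant

# Dummy trap: intercept + both dummies => perfect multicollinearity
X = np.column_stack([np.ones(n), female, male])
print(np.linalg.matrix_rank(X))         # 2, not 3: X'X is singular

# Fix: drop one dummy; "male" becomes the reference category
X_ok = np.column_stack([np.ones(n), female])
```

Dropping one dummy restores full column rank, which is exactly the "just drop the variable" solution from the earlier card.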
What does multicollinearity lead to? both perfect and not-perfect
- Large SER
- Imprecise coefficient estimates
If it is PERFECT, it violates the no-perfect-collinearity assumption for multiple regression
What is Clustered Standard Errors?
- Allow the regression errors to correlate within a cluster (entity),
- but assume that the errors are uncorrelated across clusters
- Clustered SER allow for heteroskedasticity and autocorrelation
Pooled OLS: what to do with SER
Use robust (clustered) standard errors to fix the SER
BLUE for B1(hat) stands for?
Best
Linear
Unbiased
Estimator
Assumptions when sample size is small / homoskedastic normal regression assumptions?
- If the three assumptions from the simple linear model hold:
- and the errors are homoskedastic and normally distributed,
- use the t-statistic with the Student t-distribution
What type of biases do we have?
- Sample selection bias
- Omitted variable bias
- Simultaneity bias
- Measurement error in an independent variable
- ALL OF THESE LEAD TO A VIOLATION OF LS.1
- biased coefficients
What is measurement error in an independent variable?
Data is often measured with error:
- coding error
- reporting error
- estimation error
- Violates LS.1
- Measurement error in the dependent variable is not a problem (for unbiasedness)
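A small simulation of classical measurement error in a regressor (the true coefficients and noise variances are made-up assumptions): the OLS slope is attenuated toward zero by the factor var(x) / (var(x) + var(measurement error)):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5000
x_true = rng.normal(0, 1, n)
y = 1.0 + 2.0 * x_true + rng.normal(0, 1, n)

# We only observe x with classical (independent, zero-mean) measurement error
x_obs = x_true + rng.normal(0, 1, n)

X = np.column_stack([np.ones(n), x_obs])
b = np.linalg.lstsq(X, y, rcond=None)[0]
# Attenuation factor here: 1 / (1 + 1) = 0.5, so the slope lands near 1.0,
# half the true value of 2.0
```

This is why the card flags measurement error in an independent variable as a bias, while classical error in the dependent variable only inflates the error variance.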
Why do we want time-series to be stationary?
- A nonstationary series can have a time-varying mean and a variance that grows without bound, so standard estimates and tests become unreliable and badly biased (e.g. spurious regression).