Chapter 2 Flashcards
Formula: dependent/explained/response variable y
y = β0 + β1·x + u
(y is the dependent/explained/response variable, x the explanatory variable, u the error term)
Assumption 1 u
E(u) = 0
we normalize unobserved factors to have on average a value of 0 in the population
How should x be related to u (Assumption 1)
- correlation coefficient: if x and u are uncorrelated, then they are not linearly related
- U is mean independent of x
Assumption 2
- conditional mean independence assumption
- E(u∣X)=0
To what can violations of the conditional mean independence assumption lead?
to biased parameter estimates and inefficient hypothesis tests in regression analysis
What does the explanatory variable not contain (Assumption 2)?
information about the mean of the unobserved factors
Population regression function: Formula/calculation
E(y|x) = β0 + β1·x
How can the average value of the dependent variable be expressed (PRF)?
as a linear function of the independent variable
How does a one-unit increase in x change the average value of y (PRF)?
by beta 1
Describe OLS
- fits a linear line onto the data
- estimates the parameters in such a way that the sum of the squared values of the residuals is minimized
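The mechanics can be sketched in a few lines of Python: a hand-rolled OLS fit via the closed-form estimates that minimize the sum of squared residuals (the data here are made up for illustration).

```python
# Hand-rolled OLS for simple regression: the closed-form estimates that
# minimize the sum of squared residuals (illustrative, made-up data).

def ols_fit(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    # slope = sample covariance of (x, y) / sample variance of x
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
         sum((xi - xbar) ** 2 for xi in x)
    b0 = ybar - b1 * xbar          # intercept: line passes through (xbar, ybar)
    return b0, b1

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.1, 9.9]
b0, b1 = ols_fit(x, y)
print(b0, b1)  # intercept ~0.10, slope ~1.98
```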
Definition: Residual
the actual value of y minus the predicted value of y, where the predicted value is based on the model parameters
Formula: Residual
ûi = yi − ŷi
Formula: RSS
RSS = Σ ûi² = Σ (yi − ŷi)²
Formula: fitted or predicted values
ŷi = β̂0 + β̂1·xi
Formula: Deviations from regression line (=residuals)
ûi = yi − ŷi = yi − β̂0 − β̂1·xi
What does the average of the residuals/deviations from regression equal to?
zero
What does the covariance between residuals and regressors equal, and what does it imply?
- zero
- it implies the residuals contain no linear information about x; this holds by construction (a first-order condition of OLS), not as evidence that the model is correct
OLS estimates are chosen to make the residuals add up to what, for any data set?
zero
Property 1 of OLS means:
the average of residuals is zero
the sample average of the fitted values is the same as the sample average of the yi
Formula: First algebraic property of OLS regression
Σ ûi = 0
Formula: second algebraic property of OLS regression
Σ xi·ûi = 0
Formula: third algebraic property of OLS regression
ȳ = β̂0 + β̂1·x̄ (the point of sample averages (x̄, ȳ) lies on the regression line)
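The three algebraic properties above can be checked numerically. The sketch below fits OLS by the closed-form estimates on made-up data and verifies each one; they hold by construction (first-order conditions of the minimization), up to floating-point rounding.

```python
# Verify the three algebraic properties of OLS on made-up data.
x = [1, 2, 3, 4, 5]
y = [2.0, 4.1, 5.9, 8.2, 9.8]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar
resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

print(abs(sum(resid)) < 1e-9)                                # True: residuals sum to zero
print(abs(sum(xi * ri for xi, ri in zip(x, resid))) < 1e-9)  # True: zero covariance with x
print(abs(ybar - (b0 + b1 * xbar)) < 1e-9)                   # True: (xbar, ybar) on the line
```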
What is SST?
a measure of the total sample variation in the yi, i.e. how spread out the yi are in the sample
What does dividing SST by n-1 give us?
the sample variance of y
Formula: total sum of squares
SST = Σ (yi − ȳ)²
Formula: Explained sum of squares
SSE = Σ (ŷi − ȳ)²
Formula: Residual sum of squares
SSR = Σ ûi²
Decomposition of total variation
SST = SSE + SSR
Goodness-of-fit measure (R-squared)
R² = SSE/SST = 1 − SSR/SST
Value of Goodness-of-fit measure (R-squared)
between 0 and 1
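The decomposition and the bounds on R-squared can be confirmed on any fitted sample; a minimal sketch on made-up data:

```python
# R-squared from the SST = SSE + SSR decomposition (made-up data).
x = [1, 2, 3, 4, 5, 6]
y = [1.2, 2.3, 2.9, 4.4, 4.8, 6.1]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * xi for xi in x]

sst = sum((yi - ybar) ** 2 for yi in y)               # total sum of squares
sse = sum((yh - ybar) ** 2 for yh in yhat)            # explained sum of squares
ssr = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))  # residual sum of squares

r2 = sse / sst
print(abs(sst - (sse + ssr)) < 1e-9)  # True: the decomposition holds
print(0 <= r2 <= 1)                   # True: R-squared is bounded
```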
OLS: What happens if data points all lie on the same line?
OLS perfect fit
R squared = 1
What happens if R squared is close to zero?
- poor fit of OLS line
- very little of the variation in the yi is captured by the variation in the ŷi, which all lie on the OLS regression line
True or False: High R squared means that regression has causal interpretation
False
Why are estimated regression coefficients random variables?
because they are calculated from a random sample
What are the assumptions for SLR?
- Linear in parameters
- Random sampling
- Sample variation in the explanatory variable
- Zero conditional mean
- Homoskedasticity
Describe Assumption 1 of SLR
- linear in parameters: in the population model, y = β0 + β1·x + u
Describe Assumption 2 of SLR
- random sampling: the data {(xi, yi): i = 1, …, n} are a random sample from the population model
Describe Assumption 3 of SLR
- sample variation in the explanatory variable: the xi are not all the same value
Describe Assumption 4 of SLR
- zero conditional mean: E(u|x) = 0
What is a very weak SLR Assumption and why?
Assumption 3
- there is variation in xi
What is a very strong SLR Assumption and why?
Assumption 4
- conditional on xi, E(ui) = 0, i.e. E(ui|xi) = 0
SLR 2 + SLR 4:
E(ui | x1, …, xn) = 0, i.e. each error has zero mean conditional on all sampled x values
Fixed in repeated samples: why are they not always very realistic?
- one does not choose values of education and then searches for individuals with those values
How can we treat xi if we assume SLR 2 and SLR 4?
as nonrandom
True or False: under SLR1-SLR4, OLS estimators are unbiased
True
Interpretation of unbiasedness of OLS: what is crucial?
Assumptions SLR1-SLR4
Interpretation of unbiasedness of OLS: how can the estimated coefficients be?
- smaller or larger than the true values, depending on the particular sample that results from the random draw
- but on average equal to the values that characterize the true relationship between y and x in the population
What does “on average” mean in the context of SLR unbiasedness?
= if sampling was repeated
ie if drawing the random sample and doing the estimation was repeated many times
True or False: in a given sample, estimates may differ considerably from true values (SLR unbiasedness)
True
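"On average over repeated samples" can be made concrete with a small Monte Carlo sketch: simulate many random samples from an assumed population model (the true β values below are made up) and average the slope estimates.

```python
import random

# Monte Carlo illustration of unbiasedness: repeat "draw a random sample,
# estimate beta1" many times and average the estimates (simulated data;
# the true parameters are assumptions of this sketch).
random.seed(1)
beta0, beta1 = 1.0, 0.5
x = [float(i) for i in range(1, 21)]

def estimate_slope():
    # population model y = beta0 + beta1*x + u, with u ~ N(0, 1), so E(u|x) = 0
    y = [beta0 + beta1 * xi + random.gauss(0, 1) for xi in x]
    xbar, ybar = sum(x) / len(x), sum(y) / len(y)
    return sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
           sum((xi - xbar) ** 2 for xi in x)

estimates = [estimate_slope() for _ in range(5000)]
avg = sum(estimates) / len(estimates)
print(abs(avg - beta1) < 0.01)  # True: the average estimate is close to beta1
```

Individual estimates scatter noticeably around 0.5, which is exactly the point of the "in a given sample, estimates may differ considerably from true values" card.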
Formula: SLR5
Var(u|x) = σ² (homoskedasticity)
What role does SLR5 play in showing the unbiasedness?
- it plays no role
- it simplifies the variance calculations
What is sigma squared (SLR5)?
- the unconditional variance of u
- the error variance
Formula: sigma squared (SLR5)
σ² = Var(u|x) = E(u²|x) = E(u²) = Var(u) (since E(u) = 0)
Formula: Summarizing SLR4 and SLR5
E(y|x) = β0 + β1·x and Var(y|x) = σ²
What does Homoskedasticity mean?
the variance of the errors is constant across all levels of the independent variable(s)
What happens when Homoskedasticity is satisfied?
- OLS estimators are unbiased and efficient
- hypothesis tests and confidence intervals are valid
What does Heteroskedasticity mean?
the variance of the errors is not constant across all levels of the independent variable(s)
How are the OLS estimators in the presence of Heteroskedasticity?
- still unbiased
- no longer efficient
- the usual standard errors are invalid, which distorts hypothesis tests and confidence intervals
What are the methods of testing homoskedasticity in Stata?
- Visual Inspection
- Breusch-Pagan test
- White test
Describe Visual inspection (method of testing for homoskedasticity in Stata)
- predict the residuals (predict r, res)
- plot your residuals against your independent variables (scatter r x)
- in case of a multivariate regression, predict the fitted values (predict yhat, xb) and plot the residuals against the fitted values
Describe the Breusch-Pagan test (method of testing for homoskedasticity in Stata)
- we test whether the estimated variance of the residuals depends on the values of the independent variables
- run the regression and type “estat hettest” directly after the reg command
Describe the White test (method of testing for homoskedasticity in Stata)
- similar to the BP test
- allows the independent variables to have a nonlinear effect on the error variance
- run the regression and type “imtest, white” directly after the reg command
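Outside Stata, the idea behind the Breusch-Pagan test can be sketched by hand for one regressor: regress the squared residuals on x and compute the LM statistic n·R² of that auxiliary regression. The data below are simulated for illustration; in practice `estat hettest` does this for you.

```python
import random

def ols(x, y):
    xbar, ybar = sum(x) / len(x), sum(y) / len(y)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
         sum((xi - xbar) ** 2 for xi in x)
    return ybar - b1 * xbar, b1

def breusch_pagan_lm(x, y):
    # Step 1: regress y on x, keep the squared residuals.
    b0, b1 = ols(x, y)
    u2 = [(yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)]
    # Step 2: regress u^2 on x; LM = n * R^2 of this auxiliary regression,
    # asymptotically chi2(1) under homoskedasticity (5% cutoff ~3.84).
    c0, c1 = ols(x, u2)
    u2bar = sum(u2) / len(u2)
    sse = sum((c0 + c1 * xi - u2bar) ** 2 for xi in x)
    sst = sum((v - u2bar) ** 2 for v in u2)
    return len(x) * sse / sst

random.seed(2)
x = [i / 10 for i in range(1, 201)]
y_hom = [1 + 2 * xi + random.gauss(0, 1) for xi in x]       # constant error variance
y_het = [1 + 2 * xi + xi * random.gauss(0, 1) for xi in x]  # variance grows with x
print(breusch_pagan_lm(x, y_hom))  # typically below 3.84: no evidence against H0
print(breusch_pagan_lm(x, y_het))  # above 3.84: reject homoskedasticity
```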
Under SLR1-SLR5 we obtain a variance of OLS estimators (formula and explanations)
- Var(β̂1) = σ² / Σ(xi − x̄)²
- Var(β̂0) = σ² · (n⁻¹ Σ xi²) / Σ(xi − x̄)²
- the larger the error variance σ², the larger the sampling variance; the more sample variation in x, the smaller the sampling variance
Problem: the error variance is unknown. Why?
- we do not know the error variance σ² because we do not observe the errors, ui
- what we observe are the residuals
- luckily we can use these residuals to form an estimate of the error variance
Theorem 2.3 (Unbiasedness of error variance): Formula & explanation
σ̂² = SSR / (n − 2) = (1/(n − 2)) Σ ûi²
under SLR1-SLR5, E(σ̂²) = σ²; dividing by n − 2 rather than n corrects for the two degrees of freedom used to estimate β̂0 and β̂1
Compare SE for beta and mean: Formula & Explanation
se(β̂1) = σ̂ / √(Σ(xi − x̄)²), analogous to the standard error of a sample mean, se(ȳ) = σ̂y/√n; both shrink as the sample grows, but se(β̂1) also falls with more sample variation in x
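Putting the last two cards together, σ̂² and se(β̂1) can be computed directly from the residuals; a sketch on made-up data:

```python
# Estimate the error variance from the residuals and form se(b1):
# sigma2_hat = RSS / (n - 2), se(b1) = sigma_hat / sqrt(SST_x). Made-up data.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.2, 2.9, 4.1, 4.8, 6.2, 6.8, 8.1, 8.9]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sst_x = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sst_x
b0 = ybar - b1 * xbar
rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))

sigma2_hat = rss / (n - 2)           # two df lost estimating b0 and b1
se_b1 = (sigma2_hat / sst_x) ** 0.5  # shrinks with more variation in x
print(sigma2_hat, se_b1)
```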
True or False: another OLS assumption is that the error terms or even the dependent or independent variables are normally distributed
False
- OLS only requires errors to be i.i.d., but normality is required neither for unbiased and efficient OLS estimates nor for the calculation of standard errors
What is necessary for convenient hypothesis testing?
a normal distribution
When the errors are normally distributed… ?
- the test statistic follows a t-distribution
- we can use familiar cut-off values
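A t test of H0: β1 = 0 using such a familiar cutoff can be sketched as follows (made-up data; 2.306 is the two-sided 5% critical value of the t distribution with n − 2 = 8 degrees of freedom):

```python
# t test for H0: beta1 = 0 in a simple regression (made-up data).
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [1.8, 4.3, 5.9, 8.4, 9.7, 12.1, 14.2, 16.3, 17.8, 20.4]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sst_x = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sst_x
b0 = ybar - b1 * xbar
rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
se_b1 = (rss / (n - 2) / sst_x) ** 0.5

t = b1 / se_b1  # under normal errors, t follows a t(n - 2) distribution
print(abs(t) > 2.306)  # True: exceeds the 5% two-sided cutoff for t(8)
```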