Final COPY Flashcards

Question 1

Q

What is the stochastic error term?

Answer

A

A term added to the regression equation to account for any variation in Y that is not explained by X

Question 2

Q

What does TSS equal?

Question 3

Q

What is the SE?

Answer

A

SE(β₁ hat) = Sqrt of SSR/[(n-k-1) ∑(X_i-Xbar)²], where SSR (also written RSS) is the sum of squared residuals

Question 4

Q

What are the four assumptions of the error term?

Answer

A

1. Its variance does not change as X changes
2. Its distribution is normal
3. Conditional mean = 0
4. Independent for any two observations (i, j)

Question 5

Q

What can cause stochastic error?

Answer

A

1. Omitted variable bias (OVB)
2. Measurement error
3. A misspecified function
4. Random occurrences

Question 6

Q

When are dummy variables useful?

Answer

A

When we want to quantify something that is inherently qualitative (race, gender, etc.)

Question 7

Q

What is Ŷ_i?

Answer

A

An estimated value of Y calculated from the regression at the i-th observation

Question 8

Q

What is a residual?

Answer

A

The difference between the estimated and actual values of the dependent variable

Question 9

Q

What changes between observations?

Answer

A

The values of Y, Xs, and error terms (but not the coefficients)

Question 10

Q

While adding a variable may not change TSS…

Answer

A

It will reduce SSR (or at worst leave it unchanged) and thus raise R-squared (or leave it unchanged)

Question 11

Q

OLS seeks to minimize….

Answer

A

The sum of squared residuals (SSR; called SSE in some textbooks)

Question 12

Q

K = ?

Answer

A

The # of independent variables

Question 13

Q

Why is a high degree of freedom desired?

Answer

A

With more degrees of freedom, random errors are more likely to balance out, so the estimates are more reliable

Question 14

Q

Y_i - Ŷ_i is….

Answer

A

The residual (prediction mistake)

Question 15

Q

What are the three properties of estimators?

Answer

A

1. Unbiasedness: The estimator is correct (on avg.)
2. Consistency: As observations increase, so does the probability that the estimator is close to the pop. parameter
3. Efficiency: Estimator has smaller relative variance (converges to the pop. parameter more quickly)

Question 16

Q

How do we adjust for degrees of freedom?

Answer

A

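Divide by (n-1) instead of n (as in the sample variance and covariance); in a regression with k independent variables, divide by (n-k-1)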

Question 17

Q

What does β₀ equal? (Univariate)

Answer

A

Ybar - β₁(Xbar)

Question 18

Q

What does β₁ equal? (Univariate)

Answer

A

∑(Y_i-Ybar)(X_i-Xbar)/∑(X_i-Xbar)²
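The two univariate formulas above can be checked by hand on a tiny data set. This is a minimal sketch in plain Python; the function name `ols_univariate` and the five data points are made up for illustration.

```python
def ols_univariate(x, y):
    """Return (b0, b1) for the fitted line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: sum of cross-deviations over sum of squared X deviations
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    # Intercept: Ybar - b1 * Xbar
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Illustrative data (not from the cards)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
b0, b1 = ols_univariate(x, y)

# Residual = actual value minus fitted value, one per observation
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
```

With an intercept in the model, the residuals sum to zero, which is a quick sanity check on the arithmetic.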

Question 19

Q

What is the formula for sample variance?

Answer

A

1/(n-1) ∑(X_i-Xbar)²

Question 20

Q

What is the formula for sample covariance?

Answer

A

1/(n-1) ∑(X_i-Xbar)(Y_i-Ybar)
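A short sketch of the sample variance and covariance formulas, assuming plain Python lists as input. The helper names are hypothetical; `statistics.variance` from the standard library is used only as a cross-check.

```python
import statistics


def sample_variance(x):
    n = len(x)
    x_bar = sum(x) / n
    return sum((xi - x_bar) ** 2 for xi in x) / (n - 1)


def sample_covariance(x, y):
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)


# Illustrative data (not from the cards)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
var_x = sample_variance(x)
cov_xy = sample_covariance(x, y)

# The (n-1) terms cancel, so cov/var reproduces the univariate OLS slope
slope = cov_xy / var_x
```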

Question 21

Q

What are the 7 OLS assumptions?

Answer

A

1. The population regression function (DGP) is linear in parameters
2. Observations are randomly drawn from the population and i.i.d.
3. X[vector] is fixed in repeated samples (no measurement error)
4. The error term has a conditional mean of 0
5. Homoskedasticity
6. Errors are independent (for every i, j)
7. Outliers are unlikely

Question 22

Q

If you specify a dummy variable for each possible outcome…..

Answer

A

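You will induce perfect multicollinearity (the dummy variable trap): with an intercept, the dummies add up to the constant, so there is no base category to compare against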
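A minimal sketch of why a full set of dummies collides with the intercept. The `make_dummies` helper and the region labels are made up for illustration.

```python
def make_dummies(values, base):
    """One 0/1 column per category, leaving out the base category."""
    categories = sorted(set(values) - {base})
    return {c: [1 if v == c else 0 for v in values] for c in categories}


regions = ["north", "south", "west", "south", "north"]

# Safe version: "north" is the omitted base category
dummies = make_dummies(regions, base="north")

# Trap version: one dummy for every category
all_dummies = {c: [1 if v == c else 0 for v in regions] for c in set(regions)}
# Every row sums to 1, which duplicates the intercept column exactly
row_sums = [sum(col[i] for col in all_dummies.values()) for i in range(len(regions))]
```

Each remaining dummy coefficient is then read as a difference relative to the omitted category.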

Question 23

Q

If the omitted variable is correlated with a regressor and it has an effect on the dependent variable….

Answer

A

We have OVB

Question 24

Q

If the effect of the OV on Y and the correlation between the OV and the regressor have the same sign…

Answer

A

Your estimate is biased upward (too big)
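The direction of the bias can be seen in a noise-free toy example. This sketch assumes a true model Y = 1 + 2·X1 + 3·X2 and then leaves X2 out; all numbers are hypothetical. In that setting the short-regression slope equals the true effect plus (effect of the OV) × cov(X1, X2)/var(X1).

```python
def slope(x, y):
    """Univariate OLS slope of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]  # omitted variable, positively correlated with x1
y = [1.0 + 2.0 * a + 3.0 * b for a, b in zip(x1, x2)]  # true effect of x1 is 2

short_slope = slope(x1, y)    # regression that omits x2
bias = 3.0 * slope(x1, x2)    # (effect of OV on Y) * cov(x1, x2) / var(x1)
```

Both pieces of the bias are positive here, so the estimate on X1 overshoots the true value of 2.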

Question 25

Q

How do you know if you have OVB?

Answer

A

1. Use economic theory/knowledge of the subject
2. Run a robustness check

Question 26

Q

What is ESS?

Answer

A

Sum of the squared differences between predicted values and the mean

Question 27

Q

What is TSS?

Answer

A

Sum of the squared differences between actual values and the mean

Question 28

Q

What is R²?

Answer

A

The % of variation in Y explained by the model

Question 29

Q

What are the formulas for R²?

Answer

A

1. ESS/TSS
2. 1 - SSR/TSS
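Both R² formulas give the same number for an OLS fit with an intercept, because TSS = ESS + SSR in that case. A minimal sketch on made-up data (the function name is hypothetical):

```python
def r_squared_two_ways(x, y):
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    y_hat = [b0 + b1 * xi for xi in x]

    tss = sum((yi - y_bar) ** 2 for yi in y)               # actual vs. mean
    ess = sum((yh - y_bar) ** 2 for yh in y_hat)           # predicted vs. mean
    ssr = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # actual vs. predicted
    return ess / tss, 1 - ssr / tss


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
r2_from_ess, r2_from_ssr = r_squared_two_ways(x, y)
```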

Question 30

Q

What is the formula for the adj. R²?

Answer

A

1 - [(n-1)/(n-k-1)] x SSR/TSS

Question 31

Q

What is RMSE?

Answer

A

1. Another goodness of fit measurement
2. Sqrt of SSR/(n-k-1)
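The adjusted R² and RMSE both correct for degrees of freedom through (n-k-1). A small sketch, assuming SSR and TSS have already been computed; the function names and input numbers are illustrative only.

```python
import math


def adjusted_r_squared(ssr, tss, n, k):
    """1 - [(n-1)/(n-k-1)] * SSR/TSS, with k independent variables."""
    return 1 - ((n - 1) / (n - k - 1)) * (ssr / tss)


def rmse(ssr, n, k):
    """Square root of SSR/(n-k-1)."""
    return math.sqrt(ssr / (n - k - 1))


# Hypothetical values: 5 observations, 1 regressor
adj_r2 = adjusted_r_squared(ssr=2.4, tss=6.0, n=5, k=1)
fit_rmse = rmse(ssr=2.4, n=5, k=1)
```

With these inputs the plain R² would be 0.6, so the adjustment visibly penalizes the small sample.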

Question 32

Q

What happens if we have perfect multicollinearity?

Answer

A

OLS cannot be computed: the coefficients are not uniquely determined

Question 33

Q

Why is perfect (or high) multicollinearity an issue?

Answer

A

A high degree of multicollinearity may be problematic because it inflates the variance of the estimator

Question 34

Q

What can we use to quantify the severity of multicollinearity in our model?

Answer

A

We use the variance inflation factor (VIF)

Question 35

Q

What is VIF(β₁ hat)?

Answer

A

1/(1-R₁²), where R₁² is the R² from regressing X₁ on the other regressors
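A tiny sketch of the VIF formula, assuming the R² from the auxiliary regression (X₁ on the other regressors) is already known. The function name is hypothetical.

```python
def variance_inflation_factor(r2_aux):
    """VIF = 1 / (1 - R^2), using the R^2 of the auxiliary regression."""
    if not 0 <= r2_aux < 1:
        # r2_aux == 1 would mean perfect multicollinearity
        raise ValueError("auxiliary R^2 must be in [0, 1)")
    return 1 / (1 - r2_aux)


vif_moderate = variance_inflation_factor(0.5)
vif_severe = variance_inflation_factor(0.9)
```

The factor grows without bound as the auxiliary R² approaches 1, which is the perfect-multicollinearity case.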

Question 36

Q

Formula for t-test?

Answer

A

t = [(β₁ hat) - (β₁ under H₀)] / SE(β₁ hat)
p = 2 × cdf(-|t|)
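A sketch of the t-statistic and two-sided p-value. It assumes a large sample, so the standard normal cdf from Python's `statistics.NormalDist` stands in for the Student's t cdf; the function names and inputs are illustrative.

```python
from statistics import NormalDist


def t_statistic(beta_hat, beta_null, se):
    """(estimate - hypothesized value) / standard error."""
    return (beta_hat - beta_null) / se


def two_sided_p_value(t):
    # Large-sample approximation: standard normal instead of Student's t
    return 2 * NormalDist().cdf(-abs(t))


t = t_statistic(beta_hat=0.6, beta_null=0.0, se=0.3)
p = two_sided_p_value(t)
```

In small samples the t distribution with n-k-1 degrees of freedom gives somewhat larger p-values than this approximation.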

Question 37

Q

What is the extensive formula for R²?

Answer

A

(β₁ hat) ∑(X_i-Xbar)(Y_i-Ybar)/∑(Y_i-Ybar)²

Question 38

Q

What is the extensive formula for ESS?

Answer

A

(β₁ hat)² ∑(X_i-Xbar)²

Question 39

Q

How do you standardize a normal distribution?

Answer

A

Subtract the mean and divide by sigma
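A minimal sketch of standardizing, with made-up numbers (mean 100, sigma 15). After standardizing, probabilities can be read from the standard normal distribution.

```python
from statistics import NormalDist


def standardize(x, mu, sigma):
    """z = (x - mean) / sigma."""
    return (x - mu) / sigma


z = standardize(130.0, mu=100.0, sigma=15.0)

# Probability of a value below 130, computed two equivalent ways
prob_from_z = NormalDist().cdf(z)
prob_direct = NormalDist(mu=100.0, sigma=15.0).cdf(130.0)
```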

Question 40

Q

For a hypothesis test, what is the significance level and confidence level?

Answer

A

Significance level = α (the cutoff the p-value is compared with)
Confidence level = 1-α

Question 41

Q

What must “t” be greater than or equal to (in absolute value) for: 1. 90% confidence 2. 95% confidence 3. 99% confidence

Answer

A

1) 1.645
2) 1.96
3) 2.58
(For two-tailed tests)
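These cutoffs are the large-sample (standard normal) critical values, and they can be recovered from the inverse cdf. A short sketch using `statistics.NormalDist`; the function name is hypothetical.

```python
from statistics import NormalDist


def two_tailed_critical_value(confidence):
    """Normal critical value that leaves (1 - confidence)/2 in each tail."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


cv_90 = two_tailed_critical_value(0.90)
cv_95 = two_tailed_critical_value(0.95)
cv_99 = two_tailed_critical_value(0.99)
```

For a one-tailed test the whole α sits in one tail, so the cutoffs are smaller.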

Question 42

Q

If our causal effect depends on the level of the independent variable itself, what do we do?

Answer

A

Take the natural log of the variable

Question 43

Q

If our causal effect depends on another variable (rather than on the variable's own level), what do we do?

Answer

A

Use an interaction term

Question 44

Q

What assumption do we make about the causal effect if we use a natural log?

Answer

A

That it never changes sign (it is always positive or always negative)

Question 45

Q

What are the advantages of using a non-linear specification other than a log?

Answer

A

No assumptions about direction, allows for inflection points, and increasing/decreasing rates

Question 46

Q

What property of our OLS estimator is violated if we have heteroskedasticity?

Answer

A

Efficiency

Question 47

Q

What type of GLS do we use when the form of heteroskedasticity is unknown?

Answer

A

Feasible GLS

Question 48

Q

How do we run feasible GLS?

Answer

A

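1. Estimate w/ OLS and calculate residuals
2. Run OLS w/ the squared residuals as the dependent variable on the variables thought to drive the variance
3. Use predicted values from that to create weights (1/sq.rt(predicted values)) and re-estimate by weighted least squares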
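The three steps can be sketched for a one-regressor model in plain Python. Everything here is an assumption for illustration: the helper names, the toy data, and the choice of x² as the variable driving the variance. Many applied versions model log(residual²) instead, which keeps the predicted variances positive; this sketch follows the steps as written on the card and simply stops if a predicted variance is not positive.

```python
import math


def fit_line(x, y, w=None):
    """Least-squares (intercept, slope); w holds optional per-observation weights."""
    if w is None:
        w = [1.0] * len(x)
    # Multiplying each observation by w_i weights its squared error by w_i**2
    omega = [wi ** 2 for wi in w]
    total = sum(omega)
    x_bar = sum(o * xi for o, xi in zip(omega, x)) / total
    y_bar = sum(o * yi for o, yi in zip(omega, y)) / total
    s_xy = sum(o * (xi - x_bar) * (yi - y_bar) for o, xi, yi in zip(omega, x, y))
    s_xx = sum(o * (xi - x_bar) ** 2 for o, xi in zip(omega, x))
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1


def feasible_gls(x, y, z):
    # Step 1: OLS, then squared residuals
    b0, b1 = fit_line(x, y)
    resid_sq = [(yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)]
    # Step 2: regress squared residuals on the assumed variance driver z
    a0, a1 = fit_line(z, resid_sq)
    var_hat = [a0 + a1 * zi for zi in z]
    if min(var_hat) <= 0:
        raise ValueError("predicted variances must be positive to form weights")
    # Step 3: weights = 1 / sqrt(predicted variance), then weighted least squares
    w = [1.0 / math.sqrt(v) for v in var_hat]
    return fit_line(x, y, w)


# Toy data: true line is 2 + 3x, with noise that grows in proportion to x
x = [float(i) for i in range(1, 9)]
noise = [0.5 * xi * (-1) ** i for i, xi in enumerate(x, start=1)]
y = [2.0 + 3.0 * xi + e for xi, e in zip(x, noise)]
z = [xi ** 2 for xi in x]  # assumed variance driver

b0_fgls, b1_fgls = feasible_gls(x, y, z)
```

Repeating steps 1 to 3 with residuals from the weighted fit, until the weights stop changing, gives the iterated version.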

Question 49

Q

What is iteratively re-weighted least squares?

Answer

A

Feasible GLS repeated until weights converge to a value

Question 50

Q

Which hypothesis test do we use for the following situations?: 1. Single parameter (one restriction) 2. Multiple parameter (linear combination) 3. Multiple parameter (non-linear combination) 4. Multiple parameter (multiple restrictions)

Answer

A

1. t-test (of one variable)
2. lincom (or t-test comparing two variables)
3. t-test/Taylor approximation (if one restriction)
4. F-test

Question 51

Q

What is the formula for an F-test?

Answer

A

[(SSR_r - SSR_u)/r] ÷ [SSR_u/(n-k-1)], where r = # of restrictions
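A small sketch of the F-statistic from the restricted and unrestricted sums of squared residuals. The function name and the input numbers are made up; k is taken to be the number of independent variables in the unrestricted model.

```python
def f_statistic(ssr_restricted, ssr_unrestricted, r, n, k):
    """[(SSR_r - SSR_u) / r] / [SSR_u / (n - k - 1)]."""
    numerator = (ssr_restricted - ssr_unrestricted) / r
    denominator = ssr_unrestricted / (n - k - 1)
    return numerator / denominator


# Hypothetical values: 2 restrictions, 53 observations, 2 regressors
f_stat = f_statistic(ssr_restricted=120.0, ssr_unrestricted=100.0, r=2, n=53, k=2)
```

A large value means the restrictions raise SSR by a lot relative to the unrestricted model's residual variance.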

Question 52

Q

What do you need to look at after running an F-test to determine if it is statistically significant?

Answer

A

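The F critical values with (r, n-k-1) degrees of freedom; in large samples, r × F can be compared with the Chi-square critical values (r degrees of freedom)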

Question 53

Q

What is the advantage of BIC over AIC?

Answer

A

BIC is consistent: as n grows, it selects the true model (if it is among the candidates) with probability approaching 1

Question 54

Q

What should you do if you are asked about an effect? A change?

Answer

A

Effect = derivative
Change = difference

Question 55

Q

What do you do if asked which level of x has a max effect on y?

Answer

A

Take the derivative and solve for x

Question 56

Q

What do you do if asked about the effect of x on y for a person w/ z years of x?

Answer

A

Plug z into x and solve **(Final answer * Ɛ)

Question 57

Q

What do you do if asked about the difference in y due to a difference in x?

Answer

A

Plug in given values and take the difference

Question 58

Q

What do you do if asked about the difference in the effect of x on y? (Someone w/ 10 years vs. someone w/ 20)

Answer

A

Take the derivative, plug in the numbers and take the difference
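The effect/change recipes on the last few cards can be sketched for a quadratic specification, y = b0 + b1·x + b2·x². The quadratic form, the function names, and the coefficient values are all assumptions for illustration.

```python
def predict(b0, b1, b2, x):
    return b0 + b1 * x + b2 * x ** 2


def marginal_effect(b1, b2, x):
    """Derivative of y with respect to x, evaluated at x."""
    return b1 + 2 * b2 * x


def turning_point(b1, b2):
    """The x at which the derivative equals zero."""
    return -b1 / (2 * b2)


b0, b1, b2 = 1.0, 0.8, -0.02  # hypothetical coefficients

# Effect of x on y for someone with 10 years of x: derivative at x = 10
effect_at_10 = marginal_effect(b1, b2, 10)

# Difference in y due to a difference in x: plug in both values and subtract
change_10_to_20 = predict(b0, b1, b2, 20) - predict(b0, b1, b2, 10)

# Difference in the effect: derivative at each value, then subtract
diff_in_effect = marginal_effect(b1, b2, 20) - marginal_effect(b1, b2, 10)

# Set the derivative to zero and solve for x
x_star = turning_point(b1, b2)
```

Because b2 is negative here, y peaks at `x_star` and the effect of x shrinks as x rises.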

Question 59

Q