Tutorial 1: Introduction & Linear Regression Model Flashcards

1
Q

What does an OLS model (for all i = 1, …, N observations) look like?

A

y = Xβ + ε where…

  • y = column vector = (y₁ … yₙ)’ = N × 1
  • X = matrix of observations = N × K
  • β = column vector = (β₁ … βₖ)’ = K × 1
  • ε = column vector = (ε₁ … εₙ)’ = N × 1
2
Q

What are the OLS residuals?

A

e = ε̂ = y − ŷ = y − Xβ̂

3
Q

Sum of squared residuals?

A

Σeᵢ² = e’e = (y - Xβ̂)’(y - Xβ̂)

4
Q

What’s the OLS estimator β̂ₒₗₛ?

A

arg min [β̂] (y - Xβ̂)’(y - Xβ̂) = (X’X)⁻¹X’y
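A minimal NumPy sketch of this closed form (the simulated data-generating values here are illustrative assumptions, not from the tutorial):

```python
import numpy as np

# Simulate y = X beta + eps with N = 200 observations and K = 2 parameters.
rng = np.random.default_rng(0)
N = 200
X = np.column_stack([np.ones(N), rng.normal(size=N)])  # constant + one regressor
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + rng.normal(size=N)

# Closed-form OLS estimator: beta_hat = (X'X)^{-1} X'y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Cross-check against NumPy's least-squares solver.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
assert np.allclose(beta_hat, beta_lstsq)
```

Using `np.linalg.solve` on the normal equations avoids forming the explicit inverse (X’X)⁻¹, which is numerically preferable.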

5
Q

What is the OLS Estimator (conceptually)?

A

OLS estimator minimizes the sum of squared residuals

6
Q

What is K?

A

# of parameters = # of rows in β = # of columns in X

7
Q

What is N?

A

sample size = # rows of X = # rows of y = # rows of ε

8
Q

What is the relationship between K and N in order to be able to estimate β? What if that is not the case?

A

K ≤ N

Otherwise, other estimators like lasso or ridge must be used.

9
Q

How can you interpret a coefficient?

A

The coefficient gives how the linear prediction of y changes if we increase variable xₖ by one unit, holding the other variables fixed.

10
Q

What is the variance?

A

Variance (σ²):

a measure of the spread between numbers in a data set: it measures how far each number in the set is from the mean, and therefore from every other number in the set

11
Q

How does the variance of β̂ₒₗₛ look like (in matrix form)?

A

K × K variance-covariance matrix (K = number of parameters in the model):

Var(β̂ₒₗₛ | X) = E[(β̂ₒₗₛ − β)(β̂ₒₗₛ − β)’ | X]

12
Q

How can β̂ₒₗₛ be rewritten in terms of β?

A

β̂ₒₗₛ = (X’X)⁻¹X’y = (X’X)⁻¹X’(Xβ + ε) = β + (X’X)⁻¹X’ε
13
Q

How can the variance of β̂ₒₗₛ be simplified?

A

Using β̂ₒₗₛ = β + (X’X)⁻¹X’ε:

Var(β̂ₒₗₛ | X) = (X’X)⁻¹ X’ Var(ε | X) X (X’X)⁻¹
14
Q

What is the variance of the error term in a heteroskedastic OLS model?

A

Variance of the error term is NOT constant across i, it depends on xᵢ:

Var(εᵢ | xᵢ) = σᵢ²

15
Q

What is the variance of the error term in a homoskedastic OLS model?

A

Variance of the error term is constant for all i, it does not depend on xᵢ:

Var(εᵢ | xᵢ) = σ²

16
Q

What condition do we assume to hold for both homoskedastic and heteroskedastic models?

A

We always assume that the error terms of two observations are not correlated:

Cov(εᵢ, εⱼ | X) = 0 for i ≠ j

17
Q

What are the consequences of heteroskedasticity?

A

Consequences of heteroskedasticity:

  1. if the other OLS assumptions are fulfilled, β̂ is still unbiased + consistent
  2. β̂ is not efficient (not BLUE any more); others like GLS will be better (= more efficient)
  3. the non-robust estimator V̂ar(β̂|x) is no longer a consistent estimator of Var(β̂|x)
18
Q

How can the variance of ˆβ be simplified with homoskedasticity?

A

With Var(ε | X) = σ²I, the variance simplifies to:

Var(β̂ₒₗₛ | X) = σ²(X’X)⁻¹
19
Q

What is a consistent estimator of the variance of ˆβ under homoskedasticity?

A

V̂ar(β̂ₒₗₛ | X) = σ̂²(X’X)⁻¹ with σ̂² = e’e / (N − K)
20
Q

What are solutions to heteroskedasticity?

A
  1. Heteroskedasticity-robust standard errors: β̂ itself will then still not be the most efficient estimator
  2. GLS (= OLS on transformed data)
21
Q

How does heteroskedasticity-robust variance look like?

A

Var(β̂ₒₗₛ | X) = (X’X)⁻¹ (Σᵢ σᵢ² xᵢxᵢ’) (X’X)⁻¹

It cannot be simplified further. The non-robust standard errors will be inconsistent!

22
Q

What is the sandwich estimator?

A
  • proposed by White (1980)
  • based on squared OLS residuals eᵢ²
  • consistent under all forms of heteroskedasticity (also homoskedasticity!)
  • = “heteroskedasticity-robust standard errors”
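A sketch of the White sandwich estimator in NumPy, on simulated heteroskedastic data (the data-generating process is an illustrative assumption):

```python
import numpy as np

# Simulate a model whose error variance depends on x_i (heteroskedasticity).
rng = np.random.default_rng(1)
N = 500
X = np.column_stack([np.ones(N), rng.normal(size=N)])
eps = rng.normal(size=N) * (0.5 + np.abs(X[:, 1]))
y = X @ np.array([1.0, 2.0]) + eps

# OLS fit and residuals.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ beta_hat

# Sandwich: bread * meat * bread, with meat = sum_i e_i^2 x_i x_i'.
bread = np.linalg.inv(X.T @ X)
meat = (X * e[:, None] ** 2).T @ X
V_robust = bread @ meat @ bread
robust_se = np.sqrt(np.diag(V_robust))
```

This is the HC0 variant; common software defaults apply small-sample corrections on top of it.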
23
Q

What does it mean for an estimator to be consistent?

A

An estimator is consistent if, as the sample size increases, the estimates (produced by the estimator) “converge” to the true value of the parameter being estimated. To be slightly more precise: consistency means that, as the sample size increases, the sampling distribution of the estimator becomes increasingly concentrated at the true parameter value.

24
Q

What does it mean for an estimator to be unbiased?

A

An estimator is unbiased if, on average, it hits the true parameter value. That is, the mean of the sampling distribution of the estimator is equal to the true parameter value
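The definition can be illustrated with a small Monte Carlo sketch (the data-generating process and parameter values are illustrative assumptions):

```python
import numpy as np

# Draw many samples from the same DGP and average the OLS estimates:
# unbiasedness means the average sits at the true parameter vector.
rng = np.random.default_rng(5)
N, reps = 100, 2000
beta_true = np.array([1.0, 2.0])

estimates = np.empty((reps, 2))
for s in range(reps):
    X = np.column_stack([np.ones(N), rng.normal(size=N)])
    y = X @ beta_true + rng.normal(size=N)
    estimates[s] = np.linalg.solve(X.T @ X, X.T @ y)

beta_bar = estimates.mean(axis=0)  # close to beta_true = [1.0, 2.0]
```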

25
Q

How can you conduct graphical analysis of heteroskedasticity?

A

Graphical analysis:

  • Plot standardized/studentized residuals against explanatory variables
  • In a multivariate model, also plot the standardized / studentized residuals against fitted values
26
Q

How can you detect heteroskedasticity?

A

Graphical analysis or White test

27
Q

What does the White test test?

A

H₀: homoskedasticity, i.e. Var(εᵢ | xᵢ) = σ² for all i, against the alternative of heteroskedasticity of unknown form.
28
Q

What does a hypothesis test for a single hypothesis test?

A

A hypothesis about a single parameter, H₀: βₖ = c against H₁: βₖ ≠ c (often c = 0).
29
Q

How do you conduct the White test?

A
  1. Take the squared residuals from a usual OLS regression eᵢ².
  2. Regress eᵢ² on all linear, squared and interaction terms of all covariates
  3. Calculate the test statistic = N * R² taking the R² from the auxiliary regression
  4. Under H₀, the test statistic is asymptotically Chi-square distributed with q degrees of freedom. q= number of regressors in the auxiliary regression minus 1
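The four steps can be sketched as follows in NumPy/SciPy on simulated data (the DGP is an illustrative assumption; with a single covariate the auxiliary regression has no interaction terms):

```python
import numpy as np
from scipy import stats

# Simulate heteroskedastic data: error s.d. depends on x.
rng = np.random.default_rng(2)
N = 500
x = rng.normal(size=N)
X = np.column_stack([np.ones(N), x])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=N) * (0.5 + np.abs(x))

# Step 1: squared residuals from the usual OLS regression.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
e2 = (y - X @ beta_hat) ** 2

# Step 2: auxiliary regression of e_i^2 on levels and squares.
Z = np.column_stack([np.ones(N), x, x ** 2])
gamma = np.linalg.solve(Z.T @ Z, Z.T @ e2)
u = e2 - Z @ gamma
r2 = 1 - (u @ u) / ((e2 - e2.mean()) @ (e2 - e2.mean()))

# Steps 3-4: N * R^2 is asymptotically chi-square with q = 2 d.o.f.
lm_stat = N * r2
p_value = stats.chi2.sf(lm_stat, df=Z.shape[1] - 1)
```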
30
Q

What is the t-statistic for a hypothesis test for a single hypothesis and how is it distributed?

A

t = (β̂ₖ − c) / se(β̂ₖ) for H₀: βₖ = c

The t-statistic is asymptotically t-distributed with N − K degrees of freedom.

(N= number of observations, K= number of model parameters incl. constant)
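A sketch of this t-test under homoskedasticity (simulated data; the tested value c = 0 and the DGP are illustrative assumptions):

```python
import numpy as np
from scipy import stats

# Simulate y = X beta + eps with homoskedastic errors.
rng = np.random.default_rng(3)
N, K = 200, 2
X = np.column_stack([np.ones(N), rng.normal(size=N)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=N)

# OLS fit and the non-robust variance estimator s^2 (X'X)^{-1}.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ beta_hat
sigma2_hat = e @ e / (N - K)                 # s^2 = e'e / (N - K)
V_hat = sigma2_hat * np.linalg.inv(X.T @ X)
se = np.sqrt(np.diag(V_hat))

# t-statistic for H0: beta_1 = 0, two-sided p-value with N - K d.o.f.
t_stat = beta_hat[1] / se[1]
p_value = 2 * stats.t.sf(abs(t_stat), df=N - K)
```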

31
Q

What does a hypothesis test for a single hypothesis on multiple parameters test?

A

A single linear restriction on several parameters, H₀: Rβ = r with J = 1 (R a 1 × K vector), e.g. H₀: β₁ + β₂ = 1.
32
Q

What is the t-statistic for a hypothesis test for a single hypothesis with multiple parameters and how is it distributed?

A

t = (Rβ̂ − r) / √(R V̂ar(β̂) R’), where R is a 1 × K vector of restriction weights; asymptotically t-distributed with N − K degrees of freedom.
33
Q

What does a hypothesis test for multiple hypotheses test?

A

H₀: Rβ = r, where R is a J × K matrix of restrictions, β the K × 1 vector of parameters, and r a J × 1 vector.

34
Q

What is the test statistic for a hypothesis test for multiple hypotheses? (where R is a J × K matrix of restrictions, β the K × 1 vector of parameters, and r a J × 1 vector)

A

F = (Rβ̂ − r)’ [R V̂ar(β̂) R’]⁻¹ (Rβ̂ − r) / J

Asymptotically F-distributed with J and N − K degrees of freedom. V̂ar(β̂) can be the robust or non-robust estimator of the variance-covariance matrix of β̂.