Tutorial 1: Introduction & Linear Regression Model Flashcards
What does an OLS model (for all i = 1,…,N observations) look like?
y = X β + ε where….
- ….y = column vector = (y₁ … yₙ)’ = N x 1
- ….X = matrix of observations = N x K
- ….β = column vector = (β₁ … βₖ)’ = K x 1
- ….ε = column vector = (ε₁ … εₙ)’ = N x 1
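A minimal NumPy sketch of these dimensions (the values of N, K and the coefficients are arbitrary choices for this illustration, not part of the tutorial):

import numpy as np

# Dimension check for y = X beta + eps with N observations and K parameters.
rng = np.random.default_rng(0)
N, K = 100, 3
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])  # N x K regressor matrix
beta = np.array([1.0, 0.5, -2.0])                               # length-K coefficient vector
eps = rng.normal(size=N)                                        # length-N error vector
y = X @ beta + eps                                              # length-N outcome vector
print(X.shape, beta.shape, eps.shape, y.shape)                  # (100, 3) (3,) (100,) (100,)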
What are the OLS residuals?
e = ε̂ = y − ŷ = y − Xβ̂
Sum of squared residuals?
Σeᵢ² = e’e = (y - Xβ̂)’(y - Xβ̂)
What’s the OLS estimator β̂ₒₗₛ?
β̂ₒₗₛ = arg minᵦ (y − Xβ)’(y − Xβ) = (X’X)⁻¹X’y
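A minimal Python/NumPy sketch of these formulas on simulated data (the simulation values are arbitrary assumptions for the illustration; np.linalg.lstsq is used only as a cross-check):

import numpy as np

# Illustrative data, then the OLS formulas above: beta_hat = (X'X)^{-1}X'y,
# residuals e = y - X beta_hat, and the sum of squared residuals e'e.
rng = np.random.default_rng(0)
N, K = 100, 3
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])
y = X @ np.array([1.0, 0.5, -2.0]) + rng.normal(size=N)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)  # (X'X)^{-1} X'y, via solve instead of an explicit inverse
e = y - X @ beta_hat                          # OLS residuals
ssr = e @ e                                   # sum of squared residuals

# Cross-check against NumPy's least-squares routine.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
assert np.allclose(beta_hat, beta_lstsq)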
What is the OLS Estimator (conceptually)?
OLS estimator minimizes the sum of squared residuals
What is K?
# of parameters = # rows of β = # columns of X
What is N?
sample size = # rows of X = # rows of y = # rows of ε
What is the relationship between K and N in order to be able to estimate β? What if that is not the case?
K ≤ N, i.e. the number of parameters must not exceed the sample size (otherwise X’X is not invertible and β cannot be estimated by OLS).
If K > N, other estimators such as lasso or ridge regression must be used.
How can you interpret a coefficient?
How the linear prediction of y changes if we increase variable xₖ by one unit, holding the other variables fixed.
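A small numerical check of this interpretation, with hypothetical coefficient values and one hypothetical observation (none of these numbers come from the tutorial):

import numpy as np

# Raising the second regressor by one unit changes the linear prediction
# x'beta_hat by exactly the corresponding coefficient, 0.5 here.
beta_hat = np.array([1.0, 0.5, -2.0])   # hypothetical estimated coefficients (constant first)
x_row = np.array([1.0, 3.0, 4.0])       # one observation's regressors, including the constant
x_plus = x_row.copy()
x_plus[1] += 1.0                        # increase the second regressor by one unit
print(x_plus @ beta_hat - x_row @ beta_hat)   # 0.5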
What is the variance?
Variance (σ²):
A measure of the spread of the values in a data set: the average squared deviation of each value from the mean, so it captures how far each value lies from the mean and therefore from the other values.
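A tiny numerical illustration of this definition (the data values are made up for the example):

import numpy as np

# Variance as the average squared deviation from the mean; np.var computes the
# same quantity (population version, ddof=0) by default.
data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(np.mean((data - data.mean()) ** 2))  # 4.0
print(np.var(data))                        # 4.0 as well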
What does the variance of β̂ₒₗₛ look like (in matrix form)?
K × K variance-covariance matrix (K = number of parameters in the model):
Var(β̂ₒₗₛ | X) = E[(β̂ₒₗₛ − β)(β̂ₒₗₛ − β)’ | X] = (X’X)⁻¹X’ E[εε’ | X] X(X’X)⁻¹
How can β̂ₒₗₛ be rewritten in terms of β?
β̂ₒₗₛ = (X’X)⁻¹X’y = (X’X)⁻¹X’(Xβ + ε) = β + (X’X)⁻¹X’ε
How can the variance of β̂ₒₗₛ be simplified?
Under homoskedasticity, E[εε’ | X] = σ²Iₙ, so Var(β̂ₒₗₛ | X) = σ²(X’X)⁻¹.
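A sketch of how this simplified variance is estimated in practice, on the same kind of illustrative simulated data as above (all simulation values are assumptions for the example):

import numpy as np

# Estimating the simplified (homoskedastic) variance: sigma^2 is estimated by
# e'e / (N - K) and plugged into sigma^2 (X'X)^{-1}.
rng = np.random.default_rng(0)
N, K = 100, 3
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])
y = X @ np.array([1.0, 0.5, -2.0]) + rng.normal(size=N)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ beta_hat
sigma2_hat = (e @ e) / (N - K)                      # degrees-of-freedom-corrected estimate of sigma^2
var_beta_hat = sigma2_hat * np.linalg.inv(X.T @ X)  # K x K variance-covariance matrix
se_beta_hat = np.sqrt(np.diag(var_beta_hat))        # standard errors of the coefficients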
What is the variance of the error term in a heteroskedastic OLS model?
Variance of the error term is NOT constant across i; it depends on xᵢ:
Var(εᵢ | xᵢ) = σᵢ²
What is the variance of the error term in a homoskedastic OLS model?
Variance of the error term is constant for all i and does not depend on xᵢ:
Var(εᵢ | xᵢ) = σ² for all i
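A short simulation contrasting the two cases (distributions, sample size and seed are arbitrary choices for this sketch):

import numpy as np

# Contrast of the two error structures: the homoskedastic errors share one
# variance, while the heteroskedastic error variances grow with x_i.
rng = np.random.default_rng(1)
x = rng.uniform(1.0, 5.0, size=1000)
eps_homo = rng.normal(scale=2.0, size=1000)  # Var(eps_i) = 4 for every i
eps_hetero = rng.normal(scale=x)             # Var(eps_i | x_i) = x_i**2, depends on x_i
print(eps_homo.var(), eps_hetero.var())      # roughly 4 vs. roughly the mean of x_i**2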