Bivariate Linear Regression Flashcards
regression model for consumption
Y = β₀ + β₁X + ε
β₀ is the intercept
β₁ is the MPC (marginal propensity to consume)
ε is a random variable representing the other factors influencing consumption.
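A minimal simulation of this model (all the numbers are assumed purely for illustration, including the β₀ = 50 intercept and β₁ = 0.8 MPC):

```python
import numpy as np

rng = np.random.default_rng(0)

beta0, beta1 = 50.0, 0.8                      # assumed intercept and MPC
income = rng.uniform(100, 1000, size=100)     # X: disposable income
eps = rng.normal(0, 10, size=100)             # other factors, mean 0
consumption = beta0 + beta1 * income + eps    # Y = beta0 + beta1*X + eps
```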
When is a model linear
If it is linear in the parameters: each β appears to the first power and is not multiplied or divided by another parameter. The X variable itself may be transformed (e.g. logged).
Line of best fit formula
OLS (ordinary least squares)
Use the estimated regression
Yi = β̂₀ + β̂₁Xi + ε̂i
Rearrange this to make the residual the subject, then square and sum:
S = Σε̂i² = Σ(Yi − β̂₀ − β̂₁Xi)²
I.e. square all the deviations/residuals, then add them up (e.g. if n = 28, we sum 28 squared residuals).
Then, how to find minimum S
What values do we get for estimated β₀ and β₁
Differentiate S = Σ(Yi − β̂₀ − β̂₁Xi)² with respect to β̂₀ and β̂₁ individually and set each derivative = 0, as we want to find the minimum.
To get…
β̂₀ = Ȳ − β̂₁X̄
β̂₁ = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)²
So now we know how to estimate β₁ and β₀,
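The two estimator formulas can be checked numerically. A sketch with made-up data, cross-checked against numpy's own least-squares fit:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(0, 10, size=50)                # made-up data
Y = 2.0 + 0.5 * X + rng.normal(0, 1, size=50)

# OLS estimates from minimising S = sum of squared residuals
b1_hat = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
b0_hat = Y.mean() - b1_hat * X.mean()

# Cross-check against numpy's own least-squares line (slope first, then intercept)
b1_np, b0_np = np.polyfit(X, Y, 1)
```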
Why do we need to know properties of OLS estimators
To see if the estimates we get for β₁ and β₀ are accurate
E.g. we found β̂₁ (the MPC) is 0.7765, so for every £1 increase in income, consumption increases by about 78p. Is this accurate? We have to check the properties.
Properties REQUIRED of OLS estimators (2)
Unbiasedness - the estimator's distribution is centred on the true parameter value
Efficiency - the estimator has a small (ideally the smallest possible) variance
How do we know OLS has these properties?
If certain conditions, called the classical linear regression assumptions (CLRA), are satisfied.
SO WE USE THE CLRA TO ENSURE OLS IS UNBIASED AND EFFICIENT, SO WE CAN ESTIMATE β₀ AND β₁ ACCURATELY.
Classical linear regression assumptions
- The model is written as Y = β₀ + β₁X + ε (and β₀, β₁ are unknown parameters/constants)
- The explanatory variable X is fixed/non-stochastic (we can choose values of X in order to observe effects on Y)
- X is not a constant: there is variation in X, which researchers adjust to observe values of Y
- The error ε has an expected value (mean) of 0
- No two errors are correlated: cov(εi, εj) = 0 (this assumption is mainly at risk in time-series models)
- Each error has the same variance σ² (HOMOSCEDASTIC; mainly at risk in cross-sectional models)
- The error is normally distributed: ε ~ N(0, σ²), which allows us to do hypothesis testing
Theoretical result 1
Under classical linear regression assumptions 1-4 holding,
the OLS estimators are unbiased.
Theoretical result 2
Under CLRA 1-6 holding,
OLS estimator is the best linear unbiased estimator (BLUE).
(MINIMUM VARIANCE, SO MOST EFFICIENT)
Theoretical result 3
Under CLRA 1-7 holding
OLS estimator is the minimum variance unbiased estimator of linear AND non-linear estimators. (TR2 is just best LINEAR only)
Proof for TR1: the estimator of the slope parameter, β̂₁, is unbiased
Start with this
E(β̂₁) = E(β₁ + Σwiεi)
Simplify using the technical appendix to get
= β₁ + Σwi E(εi)
E(β₁) = β₁, since β₁ is a constant
CLRA2 lets Σwi come outside the expectations operator (since X is fixed)
CLRA4: ε has mean 0, so E(εi) = 0; this leaves E(β̂₁) = β₁, therefore unbiased
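A Monte Carlo sketch of this result: with fixed X and zero-mean errors, the average of many estimated slopes should land on the true β₁ (the parameter values below are assumed for illustration, reusing the 0.7765 MPC from the earlier card):

```python
import numpy as np

rng = np.random.default_rng(2)
X = np.linspace(1, 10, 30)            # fixed, non-stochastic X (CLRA2)
beta0, beta1, sigma = 1.0, 0.7765, 2.0
Sxx = np.sum((X - X.mean()) ** 2)

slopes = []
for _ in range(5000):
    eps = rng.normal(0, sigma, size=X.size)   # E(eps) = 0 (CLRA4)
    Y = beta0 + beta1 * X + eps
    slopes.append(np.sum((X - X.mean()) * (Y - Y.mean())) / Sxx)

mean_slope = np.mean(slopes)   # close to the true beta1: E(b1_hat) = beta1
```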
Proof that the estimator of the intercept parameter, β̂₀, is unbiased
Proof for lowest variance (BLUE)
CLRA 5 and 6 are needed.
Learn the 6* and 7* lowest-variance formulas; we should end up with that answer in the proof, which proves TR2 and BLUE (lowest variance).
For estimated B₁,
Start with var(β̂₁) = var(β₁ + Σwiεi)
This simplifies to var(Σwiεi), since β₁ is a constant and so has variance 0
CLRA5 removes the covariance terms (as the correlations = 0), to leave
Σwi² var(εi)
CLRA6: all εi have the same variance σ²
Becomes
σ² Σwi²
Then finally replace wi with the original function of X to get 7*: var(β̂₁) = σ² / Σ(Xi − X̄)² (the lowest possible variance result)
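The 7* variance result can likewise be checked by simulation; a sketch under assumed parameter values, comparing the spread of many estimated slopes with σ² / Σ(Xi − X̄)²:

```python
import numpy as np

rng = np.random.default_rng(3)
X = np.linspace(1, 10, 30)            # fixed X again
beta0, beta1, sigma = 1.0, 0.5, 2.0
Sxx = np.sum((X - X.mean()) ** 2)

theoretical_var = sigma ** 2 / Sxx    # the 7* result: var(b1_hat) = sigma^2 / Sxx

slopes = []
for _ in range(20000):
    Y = beta0 + beta1 * X + rng.normal(0, sigma, size=X.size)
    slopes.append(np.sum((X - X.mean()) * (Y - Y.mean())) / Sxx)

empirical_var = np.var(slopes)        # should sit close to theoretical_var
```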
Random Regressor
Rarely do we actually have a fixed X (CLRA2), so we have to consider that X will be a random regressor.
Relationship between x and ε
We look at the value of ε given values of X (as random now)
We assume ε does not depend on value of X
How is this expressed?
E (εi | Xi) = E(εi)
The "| Xi" means conditional on X; since we assume ε does not depend on X, the conditional and unconditional means are the same.
What does this expression also show
Shows X and ε are mean independent (X is strictly exogenous to ε)
What happens when we apply CLRA4 to the X and ε relationship
E(εi | Xi) = 0
What does the zero conditional mean look like as an example
Assume ε represents innate ability and we are looking at wages.
So the average innate ability is the same regardless of given X (schooling years)
I.e whether it be 1 year or 20 years in education, ability stays the same.
However, if we think innate ability increases with schooling years, zero conditional mean does not hold.
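A sketch of this failure case with invented numbers: if unobserved ability (part of ε) raises both schooling and wages, OLS credits schooling with part of the ability effect and the slope is biased upward (the true schooling coefficient below is 1.5):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5000
ability = rng.normal(0, 1, size=n)                  # unobserved; lives inside the error
schooling = 12 + 2 * ability + rng.normal(0, 1, n)  # ability raises schooling: E(eps|X) != 0
wage = 10 + 1.5 * schooling + 3 * ability + rng.normal(0, 1, n)

Sxx = np.sum((schooling - schooling.mean()) ** 2)
b1 = np.sum((schooling - schooling.mean()) * (wage - wage.mean())) / Sxx
# b1 lands well above the true 1.5, because the zero conditional mean fails
```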
So now we need to adjust the classical assumptions to account for the random X and the zero conditional mean assumption.
CLRA 1-6 accounting for random X
(No longer 7 assumption)
CLRA1 - the same, except εi is random
CLRA2 - there IS variation in the X variable (same)
CLRA3 - the error satisfies the zero conditional mean assumption: E(εi | Xi) = 0
CLRA4 - disturbances are conditionally uncorrelated, cov(εi, εj | X) = 0, where i and j are time periods, because this mainly affects time-series models (same, but with | X added)
CLRA5 - each ε has finite CONDITIONAL VARIANCE: var(εi | X) = σ²
CLRA6 - ε, conditional on X, is normally distributed:
εi | X ~ N(0, σ²)
Coefficient of determination - goodness of fit
R squared.
Shows how much of the total variation of Y is attributable to the regression line.
When we consider the coefficient of determination , how does the regression change?
Yi = β̂₀ + β̂₁Xi + ε̂i = Ŷi + ε̂i
Actual Y = Ŷi + error (the estimated value of Y plus the deviation above or below the line)
Total sum of squares
Σ(Yi -Ybar)²
Explained sum of squares
Σ(Y^i-Ybar)²
Residual sum of squares
Σε^i²
Same thing that OLS is trying to minimise! (Square all residuals and then sum up)
R² formulas (2)
ESS/TSS
Or
1 -RSS/TSS
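A quick numerical check, with made-up data, that the two R² formulas agree:

```python
import numpy as np

rng = np.random.default_rng(5)
X = rng.uniform(0, 10, 100)
Y = 1.0 + 0.5 * X + rng.normal(0, 1, 100)

# OLS fit
b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
b0 = Y.mean() - b1 * X.mean()
Y_hat = b0 + b1 * X
resid = Y - Y_hat

TSS = np.sum((Y - Y.mean()) ** 2)       # total sum of squares
ESS = np.sum((Y_hat - Y.mean()) ** 2)   # explained sum of squares
RSS = np.sum(resid ** 2)                # residual sum of squares

r_squared = ESS / TSS                   # equals 1 - RSS/TSS
```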
What is the t statistic for testing a hypothesis about β₁ (e.g. H₀: β₁ = δ)?
t = (β̂₁ − δ) / √(σ̂² / Σ(Xi − X̄)²)
(the denominator is the SE of β̂₁)
~ t(n−2), since 2 unknown parameters (β₀ and β₁) are estimated
But we also need to know the formula for σ̂²:
σ̂² = (sum of squared residuals) / (n − 2)
Error variance σ² formula
RSS / (n − 2)
Confidence bands
β̂₁ ± t_cv × SE(β̂₁)
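Putting the last few cards together in one sketch (made-up data; assumes scipy is available for the t critical value, and uses the n = 28 from the earlier example):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
n = 28
X = rng.uniform(0, 10, n)
Y = 1.0 + 0.5 * X + rng.normal(0, 1, n)

# OLS fit
Sxx = np.sum((X - X.mean()) ** 2)
b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / Sxx
b0 = Y.mean() - b1 * X.mean()
resid = Y - (b0 + b1 * X)

sigma2_hat = np.sum(resid ** 2) / (n - 2)   # RSS / (n - 2)
se_b1 = np.sqrt(sigma2_hat / Sxx)           # SE of b1

t_stat = (b1 - 0) / se_b1                   # test H0: beta1 = 0
t_cv = stats.t.ppf(0.975, df=n - 2)         # 5% two-sided critical value, t(n-2)

ci = (b1 - t_cv * se_b1, b1 + t_cv * se_b1) # 95% confidence band for beta1
```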