07 Linear Regression Flashcards
σ
Population standard deviation
s
Sample standard deviation:
An estimator of the population standard deviation
s_y
s_y is the estimate of the population standard deviation for the random variable Y of the population from which the sample was drawn
SE()
Standard error of an estimator:
An estimator of the standard deviation of the estimator
SE(Ȳ) = σ̂_Ȳ = s_y / √n
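A minimal sketch of the standard error of the sample average, using only the Python standard library. The data values and the function name `standard_error_of_mean` are made up for illustration.

```python
import math
import statistics


def standard_error_of_mean(sample):
    """Return s_y / sqrt(n) for a sample with at least two observations."""
    n = len(sample)
    # statistics.stdev is the sample standard deviation (divides by n - 1).
    s_y = statistics.stdev(sample)
    return s_y / math.sqrt(n)


# Hypothetical sample, for illustration only.
y = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
se = standard_error_of_mean(y)
print(f"SE of the sample average: {se:.4f}")
```

The standard error shrinks at rate 1/√n, so quadrupling the sample size roughly halves it when s_y stays about the same.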
μ
Population mean
u
All factors other than X that affect Y
synonyms for “dependent variable” and “independent variable”
dependent variable vs. independent variable
explained variable vs. explanatory variable
predicted variable vs. predictor variable
response variable vs. control variable
regressand vs. regressor
A normally distributed variable (X) can be made standard normal by:
Z = (X - μ) / σ
For the sample average: Z = (avg(X) - μ) / (σ / root(n))
The sample average is normally distributed whenever:
- X_i is normally distributed (exactly normal)
- n is large (approximately normal, by the CLT)
T variable
T = (avg(X) - μ) / (s_x / root(n))
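As a rough illustration, the t-statistic for a sample average can be computed as below. The sample and the hypothesized mean `mu_0` are invented for the example, and `t_statistic` is a hypothetical helper name.

```python
import math
import statistics


def t_statistic(sample, mu_0):
    """Return (sample mean - mu_0) / (s_x / sqrt(n))."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s_x = statistics.stdev(sample)  # sample standard deviation (n - 1)
    return (x_bar - mu_0) / (s_x / math.sqrt(n))


# Hypothetical sample and hypothesized population mean.
x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
t = t_statistic(x, mu_0=4.0)
print(f"t = {t:.4f}")
```

The only difference from the Z variable is that the unknown σ is replaced by its estimate s_x.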
SLRM
Simple Linear Regression Model
The sum of squared prediction mistakes over all n observations
sum[(Y_i - β̂_0 - β̂_1 X_i)^2]
β̂_0
β̂_0 = avg(Y) - β̂_1 * avg(X)
Given by setting the derivative of sum[(Y_i - β̂_0 - β̂_1 X_i)^2] with respect to β̂_0 equal to zero
β̂_1
β̂_1 = sum[(X_i - avg(X)) (Y_i - avg(Y))] / sum[(X_i - avg(X))^2]
Given by setting the derivative of sum[(Y_i - β̂_0 - β̂_1 X_i)^2] with respect to β̂_1 equal to zero and substituting the expression for β̂_0
Equivalently: β̂_1 = r_{XY} * s_Y / s_X
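The OLS intercept and slope estimates can be computed directly from these formulas. The sketch below uses made-up data and hypothetical names (`ols_simple`, `slope_from_correlation`, `b0`, `b1`), and also checks that the correlation form of the slope gives the same number.

```python
import math


def ols_simple(x, y):
    """Return (b0, b1), the OLS intercept and slope for one regressor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def slope_from_correlation(x, y):
    """Return r_XY * s_Y / s_X, an equivalent expression for the slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    r_xy = s_xy / math.sqrt(s_xx * s_yy)
    # The (n - 1) factors in s_Y and s_X cancel in the ratio.
    return r_xy * math.sqrt(s_yy / s_xx)


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
b0, b1 = ols_simple(x, y)
print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
print(f"slope via correlation = {slope_from_correlation(x, y):.3f}")
```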
If û_i is positive, the line ____ Y_i
If û_i is positive, the line underpredicts Y_i
By the definition of û and the first OLS first-order condition, the sum of the prediction errors is …
By the definition of û and the first OLS first-order condition, the sum of the prediction errors is zero
Sum(û_i) = 0
The sample covariance between the independent variable and the OLS residuals is …
The sample covariance between the independent variable and the OLS residuals is zero.
The point … is always on the regression line (OLS)
The point (X̄, Ȳ) is always on the regression line (OLS)
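The algebraic properties of OLS residuals on the last few cards can be checked numerically. This is a small sketch on invented data; `fit_ols` and the variable names are hypothetical.

```python
def fit_ols(x, y):
    """Return (b0, b1), the OLS intercept and slope for one regressor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

b0, b1 = fit_ols(x, y)
fitted = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# Property 1: the residuals sum to zero (up to floating-point error).
sum_residuals = sum(residuals)

# Property 2: the sample covariance between X and the residuals is zero.
u_bar = sum_residuals / n
cov_x_resid = sum(
    (xi - x_bar) * (ui - u_bar) for xi, ui in zip(x, residuals)
) / (n - 1)

# Property 3: the fitted line passes through (mean of X, mean of Y).
line_at_x_bar = b0 + b1 * x_bar

# A positive residual means the line underpredicts that observation.
underpredicted = [ui > 0 for ui in residuals]

print(sum_residuals, cov_x_resid, line_at_x_bar - y_bar, underpredicted)
```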
different goals of regression
Among others:
- Describe data set
- Predictions and forecasts
- Estimate causal effect
Causality
A causal effect is the effect measured in an ideal randomized controlled experiment
The OLS estimator is unbiased, consistent, and has an asymptotically normal sampling distribution if:
The OLS estimator is unbiased, consistent, and has an asymptotically normal sampling distribution if:
- Random sampling
- Large outliers are unlikely
- The conditional mean of u_i given X_i is 0:
E(u | X) = 0
Example: if u is ability (abil) and X is education (educ), this requires E(abil | educ = 8) = E(abil | educ = 16).
The OLS estimator is ___, ____ and has _____ if:
- Random sampling
- Large outliers are unlikely
- The conditional mean of u_i given X_i is 0
The OLS estimator is unbiased, consistent, and has an asymptotically normal sampling distribution if:
- Random sampling
- Large outliers are unlikely
- The conditional mean of u_i given X_i is 0
(OLS) When dealing with outliers one may want ______
When dealing with outliers one may want to report the OLS regression both with and without the outliers
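One way to report both regressions is sketched below. The data are invented, and the outlier is flagged by hand through a hypothetical `outlier_index` rather than by any detection rule.

```python
def fit_ols(x, y):
    """Return (b0, b1), the OLS intercept and slope for one regressor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1


# Hypothetical data; the last observation is treated as a large outlier.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 20.0]
outlier_index = 5  # chosen by hand for this illustration

# Regression on the full sample.
b0_with, slope_with = fit_ols(x, y)

# Regression on the sample with the flagged observation dropped.
x_trim = [xi for i, xi in enumerate(x) if i != outlier_index]
y_trim = [yi for i, yi in enumerate(y) if i != outlier_index]
b0_without, slope_without = fit_ols(x_trim, y_trim)

print(f"with outlier:    intercept = {b0_with:.3f}, slope = {slope_with:.3f}")
print(f"without outlier: intercept = {b0_without:.3f}, slope = {slope_without:.3f}")
```

A single extreme observation can move the slope a lot because OLS minimizes squared mistakes, which is why showing both sets of estimates is informative.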
OLS is the most efficient (the one with the lowest variance) among all linear unbiased estimators whenever:
OLS is the most efficient (the one with the lowest variance) among all linear unbiased estimators whenever:
- The 3 OLS assumptions hold
- The error is homoskedastic