1. Simple Linear Regression Model Flashcards
What can stats methods be used for?
- testing economic theories
- estimating the magnitude of relationships
- forecasting
- policy evaluation
7 steps for econometric analysis
- Formulate the question
- Develop economic model
- Specify the model
- Set out hypotheses
- Estimate the economic model
- Conduct hypothesis test
- Interpret results and draw conclusions
Types of data
- cross sectional data
- time series data
- pooled cross sectional data
- panel (longitudinal) data
What does the simple linear regression model look like?
y = B0 + B1x + u
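The model can be simulated directly, which makes the roles of B0, B1 and u concrete. A minimal sketch in Python; the parameter values (B0 = 2, B1 = 0.5) and sample size are arbitrary choices for illustration:

```python
import random

# Hypothetical population parameters, chosen purely for illustration
B0, B1 = 2.0, 0.5

random.seed(0)
x = [random.uniform(0, 10) for _ in range(5)]    # explanatory variable
u = [random.gauss(0, 1) for _ in range(5)]       # disturbance: zero mean, positive or negative per observation
y = [B0 + B1 * xi + ui for xi, ui in zip(x, u)]  # y = B0 + B1*x + u
```

Each y differs from the population regression line B0 + B1x by exactly its disturbance u.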
What does u represent?
Factors other than x which affect y
Assumptions about u in simple linear regression model
- On average the disturbance term is zero. E(u)=0. For any single observation it will be positive or negative
- The disturbances are unrelated to the explanatory variable: E(u|x) = E(u)
Why is the zero conditional mean assumption important?
If E(u|x) = 0 then E(y|x) = B0 + B1x
What is the conditional mean assumption?
That E(u|x)=0
Population regression function
A linear function of x where a one unit increase in x changes y by B1. For any value of x, the distribution of y is centred about E(y|x)
What does PRF tell us?
How the expected value of y changes with x
What doesn’t the PRF tell us?
y=B0 + B1x for every observation. y is not always equal to E(y|x)
OLS
Ordinary Least Squares
OLS advantages
Works well
Simple
What does OLS do?
Finds values of B0 and B1 that minimise the sum of the squared vertical distances between the points and the line
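This minimisation has a closed-form solution: the slope estimate is the sample covariance of x and y divided by the sample variance of x, and the intercept follows from the sample means. A minimal sketch (the function name `ols` is just an illustrative choice):

```python
def ols(x, y):
    """OLS estimates: slope from Cov(x, y) / Var(x), intercept from the means."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b1 = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
          / sum((xi - xbar) ** 2 for xi in x))
    b0 = ybar - b1 * xbar
    return b0, b1
```

On data lying exactly on a line, e.g. `ols([1, 2, 3], [2, 4, 6])`, all residuals are zero and the estimates recover the line exactly (b0 = 0, b1 = 2).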
What is another name for the OLS regression line?
SRF (sample regression function)
How does the SRF relate to the PRF?
The SRF is an estimate of the PRF
Properties of OLS estimators
- unbiased
- efficient
What are the two parts of the estimator of B1?
- a non random (deterministic) part that captures the true underlying relationship
- a random (stochastic) part responsible for variations around the population parameter
What is the sampling distribution of the estimator of B1?
If the estimator of B1 is unbiased then it is normally distributed around B1 as the expected value
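Unbiasedness can be checked by simulation: draw many samples from a population that satisfies the assumptions and see where the slope estimates centre. A small Monte Carlo sketch; the true values (B1 = 2), sample size and number of replications are arbitrary illustrative choices:

```python
import random
import statistics

def ols_slope(x, y):
    # OLS slope: sample covariance of x and y over sample variance of x
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    num = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    den = sum((xi - xbar) ** 2 for xi in x)
    return num / den

random.seed(1)
B0, B1 = 1.0, 2.0  # hypothetical true parameters
slopes = []
for _ in range(2000):
    x = [random.uniform(0, 10) for _ in range(50)]
    y = [B0 + B1 * xi + random.gauss(0, 1) for xi in x]  # u satisfies E(u|x) = 0
    slopes.append(ols_slope(x, y))

mean_slope = statistics.mean(slopes)  # the estimates centre on the true B1
```

Individual slope estimates scatter above and below 2, but their mean across samples sits at the true parameter.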
Assumptions for unbiasedness of OLS parameters
- Linear in parameters
- Random sampling
- Sample variation in explanatory variable
- Zero conditional mean
What happens to the estimate of B1 if there is a positive covariance of x and u?
The expected value of the estimate of B1 is greater than B1 so the estimate is biased upwards
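The upward bias can be seen by simulation. In the sketch below, u is deliberately constructed to covary positively with x (the 0.3 coefficient is an arbitrary illustrative choice), so the slope estimates centre above the true B1 = 2:

```python
import random
import statistics

def ols_slope(x, y):
    # OLS slope: sample covariance of x and y over sample variance of x
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    num = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    den = sum((xi - xbar) ** 2 for xi in x)
    return num / den

random.seed(2)
B0, B1 = 1.0, 2.0  # hypothetical true parameters
slopes = []
for _ in range(2000):
    x = [random.uniform(0, 10) for _ in range(50)]
    u = [0.3 * xi + random.gauss(0, 1) for xi in x]  # Cov(x, u) > 0: violates zero conditional mean
    y = [B0 + B1 * xi + ui for xi, ui in zip(x, u)]
    slopes.append(ols_slope(x, y))

mean_slope = statistics.mean(slopes)  # centres near B1 + 0.3, i.e. biased upwards
```

OLS attributes the part of u that moves with x to x itself, so the estimates settle around 2.3 rather than 2.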
How realistic is the assumption of linear in parameters?
- It is a good first approximation.
- Non linear relationships are quite common but they can be accommodated for
How realistic is the assumption of random sampling?
- It isn’t always valid
- important to understand how data was generated
- we will assume random sampling
How realistic is the assumption of sample variation in explanatory variable?
- Almost always valid
- It is possible x won’t vary much, this will affect the accuracy of our estimates
How realistic is the zero conditional mean assumption?
- It often isn't valid, usually because factors that affect y and are correlated with x have been omitted from the model
What does OLS estimators being efficient mean?
They are the best linear unbiased estimators, they have the smallest dispersion and the minimum variance
Homoskedasticity
The assumption that the error u has the same variance given any value of the explanatory variable: Var(u|x) = σ^2
Heteroskedastic
When Var(u|x) changes as x varies, so it is not equal to a constant
How realistic is the assumption of homoskedasticity?
Not very realistic. Typically the greater x, the greater the variance
What factors determine our estimate of the variance of B1
- As n (sample size) increases, variance decreases
- if the variance of u increases then so will the variance of our estimate of B1
- more variation in x will decrease the variance
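The three factors above are captured by the formula Var(b1) = σ^2 / Σ(xi − x̄)^2 under homoskedasticity. A small sketch evaluating it for two illustrative x designs (the sample values are arbitrary choices):

```python
def var_b1(x, sigma2):
    """Variance of the OLS slope estimator under homoskedasticity."""
    xbar = sum(x) / len(x)
    return sigma2 / sum((xi - xbar) ** 2 for xi in x)

narrow = [4.5, 5.0, 5.5] * 10  # little variation in x
wide = [0.0, 5.0, 10.0] * 10   # much more variation in x

v_narrow = var_b1(narrow, 1.0)      # larger variance: x barely varies
v_wide = var_b1(wide, 1.0)          # smaller variance: x spreads out
v_bigger_n = var_b1(narrow * 2, 1.0)  # doubling n halves the variance here
```

More spread in x and a larger sample both shrink the variance, and a larger error variance σ^2 scales it up proportionally, matching the list above.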
What are the standard errors?
They are the estimated standard deviations of our estimates of B0 and B1
Properties of standard errors
- unbiased
- random variables that take a different value for each sample of data
Standard error of the regression (root mean square)
The standard deviation of u. This is a measure of the accuracy of the model as a whole
How do OLS estimators minimise the standard error of the regression?
OLS estimators minimise the sum of the squared residuals, which enters the formula for the standard error of the regression, so minimising it also minimises the standard error of the regression
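In the simple regression model the standard error of the regression is the square root of the sum of squared residuals divided by the degrees of freedom, n − 2. A minimal sketch (the function name `ser` is an illustrative choice):

```python
import math

def ser(x, y, b0, b1):
    """Standard error of the regression: sqrt(SSR / (n - 2))."""
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    ssr = sum(e ** 2 for e in residuals)  # sum of squared residuals, minimised by OLS
    return math.sqrt(ssr / (len(x) - 2))
```

A line that fits the data perfectly gives zero residuals and hence a standard error of the regression of zero; any lack of fit raises it.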