2: Introduction to Simple Regression Flashcards
y
dependent variable
x
independent variable
simple regression
$y_i = \hat{\alpha} + \hat{\beta} x_i + \hat{u}_i$
describes the linear relationship between y and x
OLS is used to find the fitted line that represents this linear relationship
Ordinary least squares objective
to minimise the aggregate size of the residuals (the estimated error terms)
the best-fitting line is the one with the smallest sum of squared residuals
how to measure aggregate errors
we use the sum of squared residuals.
- we do not use the raw residuals because points above the line give positive residuals and points below it give negative ones, so they cancel each other out when summed
- we do not use absolute values because the absolute value function is not differentiable at zero, which makes calculus-based optimisation awkward
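The cancellation problem can be seen in a small sketch. The data and the two candidate lines below are made up for illustration, and `residuals` is a hypothetical helper, not part of any library.

```python
def residuals(xs, ys, intercept, slope):
    """Residual for each point: actual y minus the y predicted by the line."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Illustrative data lying exactly on the line y = x
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]

perfect = residuals(xs, ys, intercept=0.0, slope=1.0)  # the line y = x
flat = residuals(xs, ys, intercept=2.0, slope=0.0)     # horizontal line y = 2

# Raw residuals sum to zero for BOTH lines, so the raw sum cannot rank them
sum_perfect = sum(perfect)
sum_flat = sum(flat)

# Squared residuals cannot cancel, so they separate the good fit from the bad
ssr_perfect = sum(r ** 2 for r in perfect)
ssr_flat = sum(r ** 2 for r in flat)

print(sum_perfect, sum_flat)   # both raw sums are zero
print(ssr_perfect, ssr_flat)   # the flat line has the larger squared sum
```

The flat line misses two of the three points, yet its raw residuals still sum to zero; only the squared sum shows it is the worse fit.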
why do we use the sample values instead of the population?
- obtaining the population values is time-consuming and expensive
- we use sample values to obtain sample estimates and infer the true (population) values from them
- we therefore need standard errors, which measure the uncertainty of the sample estimators
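As a rough sketch of what such a standard error looks like, the snippet below computes the usual textbook standard error of the slope, sqrt(s^2 / Sxx) with s^2 = SSR / (n - 2). This formula assumes the classical conditions (in particular, errors with constant variance), which the cards above do not state; the data are made up for illustration.

```python
import math

# Illustrative sample (assumed data, not from the source)
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 8.0]
n = len(xs)

x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Sums of squares and cross-products around the sample means
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

# OLS estimates of slope and intercept
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Sum of squared residuals and the estimated error variance (n - 2 degrees of freedom)
ssr = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
sigma2_hat = ssr / (n - 2)

# Standard error of the slope under the constant-variance assumption
se_slope = math.sqrt(sigma2_hat / s_xx)
print(slope, se_slope)
```

A small standard error relative to the slope suggests the sample pins the slope down fairly precisely; a large one means a different sample could give a very different estimate.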
how to calculate ordinary least squares
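square each vertical distance between an observation and the line (each residual), add them up, and choose the intercept and slope that minimise this residual sum of squares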
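For simple regression the minimisation has a well-known closed-form solution: slope = Sxy / Sxx and intercept = mean(y) - slope * mean(x). A minimal sketch follows; the function names `ols_fit` and `ssr` and the data are illustrative assumptions.

```python
def ols_fit(xs, ys):
    """Return (intercept, slope) minimising the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def ssr(xs, ys, intercept, slope):
    """Residual sum of squares for a given line."""
    return sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))

# Illustrative sample (assumed data, not from the source)
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 8.0]

intercept, slope = ols_fit(xs, ys)
best = ssr(xs, ys, intercept, slope)

# Nudging the line in any direction should not lower the sum of squares
nudged = [
    ssr(xs, ys, intercept + 0.1, slope),
    ssr(xs, ys, intercept - 0.1, slope),
    ssr(xs, ys, intercept, slope + 0.1),
    ssr(xs, ys, intercept, slope - 0.1),
]
print(intercept, slope, best, min(nudged))
```

Comparing `best` with the nudged lines is a quick sanity check that the fitted line really is the minimiser.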
what does the error term contain
- omitted variables (other independent variables that affect y but are not included in the model)
- measurement errors (the difference between the measured variable and the actual value)
- incorrect functional form (wrong model used, e.g. a straight line fitted to a non-linear relationship)
- random component (human behaviour is random)
population regression function
description of the model that is thought to generate the actual data; it represents the true relationship between the variables
- unlike the SRF, it gives the true relationship, but it cannot be observed directly and must be estimated
sample regression function
relationship that has been estimated using sample observations
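The difference between the two functions can be illustrated with a simulation in which the population parameters are known by construction. Everything below is an assumption made for the sketch: the "true" values `TRUE_ALPHA` and `TRUE_BETA`, the sample size, the range of x and the normally distributed error.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Population regression function (known here only because we invent it)
TRUE_ALPHA = 2.0
TRUE_BETA = 0.5

# Draw a sample: y = alpha + beta * x + u, with u a random error term
n = 200
xs = [random.uniform(0.0, 10.0) for _ in range(n)]
ys = [TRUE_ALPHA + TRUE_BETA * x + random.gauss(0.0, 1.0) for x in xs]

# Sample regression function: OLS estimates computed from the sample alone
x_bar = sum(xs) / n
y_bar = sum(ys) / n
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
beta_hat = s_xy / s_xx
alpha_hat = y_bar - beta_hat * x_bar

print("PRF:", TRUE_ALPHA, TRUE_BETA)
print("SRF:", alpha_hat, beta_hat)
```

The estimates should land near the true values without matching them exactly, and a different seed (a different sample) would give a slightly different SRF.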