OLS Flashcards
What is OLS? Describe a simple regression model
OLS
- Chooses the regression coefficients so that the fitted line is as close as possible to the observed data
- If you imagine a scatterplot of the data, OLS draws the regression line that gives the smallest sum of squared residuals
β0 + β1X = the population regression line, the relationship that holds between Y and X on average
β0 and β1 = the coefficients of the population regression line
β1 measures the marginal effect on Y of a one-unit change in X
u = the error term, the difference between Y and the population regression line (the part of Y the regression does not explain)
How do you estimate the coefficients in OLS?
Finding the OLS estimates means finding the values of the coefficients that minimize the total squared estimation mistakes; the rule that produces them is called an estimator. An estimator is a function of a sample of data drawn randomly from a population. Given estimates β̂0 and β̂1 of β0 and β1, we can predict Y with Ŷ = β̂0 + β̂1X.
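A minimal sketch of that minimization in Python (the numbers are made up for illustration): the closed-form OLS formulas give the coefficients that minimize the sum of squared estimation mistakes, and numpy's built-in least-squares fit agrees.

import numpy as np

# Hypothetical sample (x could be class size, y a test score)
x = np.array([23.0, 19.0, 30.0, 22.0, 26.0, 18.0, 27.0, 21.0])
y = np.array([640.0, 662.0, 611.0, 645.0, 630.0, 668.0, 622.0, 651.0])

# OLS estimators: slope = sum((x - xbar)(y - ybar)) / sum((x - xbar)^2),
#                 intercept = ybar - slope * xbar
beta1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
beta0_hat = y.mean() - beta1_hat * x.mean()

y_hat = beta0_hat + beta1_hat * x      # predicted values
residuals = y - y_hat                  # estimation mistakes
print(beta0_hat, beta1_hat, np.sum(residuals ** 2))
print(np.polyfit(x, y, deg=1))         # same slope and intercept from numpy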
What is a linear model?
A linear model means that the effect on Y of a one-unit change in X is constant: it does not depend on the level of X.
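A tiny illustrative check (made-up coefficients, not from the source): with a linear model the predicted change in Y from a one-unit increase in X is the same whether X is small or large.

import numpy as np

beta0, beta1 = 2.0, 0.5                       # hypothetical coefficients
x = np.array([1.0, 2.0, 10.0, 11.0, 100.0, 101.0])
y = beta0 + beta1 * x

# One-unit increases starting at x = 1, 10 and 100 all change y by beta1 = 0.5
print(np.diff(y)[::2])                        # [0.5 0.5 0.5]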
What are the Least Squares Assumptions?
Assumption 1: The Error Term has Conditional Mean of Zero
- The error term must not show any systematic pattern
- There must be no omitted variable bias
Assumption 2: (Xi, Yi), i = 1, ..., n are Independently and Identically Distributed
- Independently: the observations are independent of each other and carry no information about one another. If you roll two dice, the value of the first die does not affect the value of the second.
- Identically distributed: every observation has the same probability distribution. With a full deck of cards, the probability of drawing the king of diamonds is 1 in 52, and every participant drawing from a full deck has the same 1 in 52 chance.
Main idea: if you flip a coin 100 times, each flip is 50/50 regardless of the earlier flips (the coin has no memory), so the flips are "independent"; the probability is the same on every flip, so they are "identically distributed".
Assumption 3: Large Outliers Are Unlikely
X and Y have finite kurtosis (finite fourth moments); a few large outliers can give badly distorted estimates.
What is a type 2 error?
Failure to reject the null when the alternative is true
What is meant by asymptotic normality?
The sampling distribution of a properly normalized estimator converges to the standard normal distribution.
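A hedged simulation sketch of the idea (my own illustration, not from the source): sample means from a skewed population, once properly normalized, behave approximately like a standard normal variable when n is large.

import numpy as np

rng = np.random.default_rng(0)
n, reps = 500, 10_000
pop_mean, pop_sd = 1.0, 1.0                        # exponential(1): skewed, mean 1, sd 1

samples = rng.exponential(scale=1.0, size=(reps, n))
z = (samples.mean(axis=1) - pop_mean) / (pop_sd / np.sqrt(n))   # normalized estimator

print(z.mean(), z.std())                           # roughly 0 and 1
print(np.mean(np.abs(z) > 1.96))                   # roughly 0.05, as for N(0, 1)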
What is meant by asymptotic efficiency?
Among consistent estimators with asymptotically normal distributions, the asymptotically efficient one is the estimator with the smallest asymptotic variance.
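A small simulation sketch (illustrative assumption: normally distributed data): both the sample mean and the sample median are consistent, asymptotically normal estimators of the centre of a normal distribution, but the mean has the smaller asymptotic variance and is therefore the more efficient of the two.

import numpy as np

rng = np.random.default_rng(1)
n, reps = 200, 10_000
draws = rng.normal(loc=5.0, scale=2.0, size=(reps, n))

means = draws.mean(axis=1)
medians = np.median(draws, axis=1)

# Both are centred on the true value 5, but the mean varies less;
# asymptotically the median's variance is about pi/2 times larger.
print(means.mean(), medians.mean())
print(means.var(), medians.var())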
Underlying assumptions of regression analysis
Key assumptions:
- Consistency: as the sample size grows, the estimates converge to the true parameter value (see the separate card below)
- Unbiasedness: the sampling distribution of the estimator is centred on the true parameter value (see below)
- Efficiency: among comparable estimators, it is the one with the smallest variance; when SR/MR 1-5 hold, OLS is BLUE by the Gauss-Markov theorem (see below)
- Linearity: the model is linear in the parameters
Normality
A normality test is used to determine whether a data set is well modelled by a normal distribution, and to compute how likely it is that the random variable underlying the data is normally distributed.
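As a hedged illustration (assuming scipy is available; the data are simulated), the Shapiro-Wilk test is one such normality test:

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
normal_data = rng.normal(size=200)
skewed_data = rng.exponential(size=200)

# Shapiro-Wilk: H0 = the data come from a normal distribution.
# A small p-value is evidence against normality.
print(stats.shapiro(normal_data).pvalue)   # typically large -> do not reject normality
print(stats.shapiro(skewed_data).pvalue)   # typically tiny  -> reject normality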
Consistency
As the sample size increases, the estimates produced by the estimator converge to the true value of the parameter being estimated. Increasing the sample size helps because it brings n closer and closer to the size of the true population.
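A minimal simulation sketch of consistency (illustrative numbers only): as n grows, the estimate of the mean settles down around the true value.

import numpy as np

rng = np.random.default_rng(3)
true_mean = 10.0

for n in [10, 100, 10_000, 1_000_000]:
    sample = rng.normal(loc=true_mean, scale=5.0, size=n)
    print(n, sample.mean())                # the estimates close in on 10 as n grows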
Unbiasedness
A statement about the expected value of the sampling distribution of the estimator: on average, across repeated samples, the estimator equals the true parameter (it is correct in expectation). Unbiasedness does not depend on the sample size. It is only guaranteed when assumptions SR/MR 1-5 are satisfied, in which case the Gauss-Markov theorem makes OLS the BLUE estimator.
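A small sketch of the same statement (hypothetical data-generating process): across many repeated samples the average of the OLS slope estimates is essentially the true slope, even though each individual sample is small.

import numpy as np

rng = np.random.default_rng(4)
beta0, beta1, n, reps = 1.0, 2.0, 20, 20_000   # deliberately small n

slopes = np.empty(reps)
for r in range(reps):
    x = rng.normal(size=n)
    u = rng.normal(size=n)                     # error with conditional mean zero
    y = beta0 + beta1 * x + u
    slopes[r] = np.polyfit(x, y, deg=1)[0]

# Individual estimates scatter around 2, but their average (expected value) is 2:
print(slopes.mean(), slopes.std())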
Efficiency
An estimator is efficient if it tends to be closer to the true parameter than other estimators, i.e. it has lower variance around the true parameter (BLUE). If an estimator is BLUE, no other linear unbiased estimator estimates the true population parameter with smaller variance.
What are the Least squares assumptions?
Assumption 1: The Error Term has Conditional Mean of Zero
No matter which value we choose for X, the error term u must not show any systematic pattern and must have a mean of zero. In other words, the errors average out to zero: OLS may still over- or underestimate Y for individual observations, but the estimates fluctuate around Y's actual value.
In American football, the score is given by: Score = 6·Touchdowns + 1·Extra points + 3·Field goals + 2·Safeties.
If you ran the regression Score = b1·Touchdowns + b2·Field goals + e, b1 would come out larger than 6: extra points are omitted and end up in the error term, and because extra points are correlated with touchdowns the error term no longer has conditional mean zero, so b1 is biased (a small simulation sketch follows below).
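A hedged simulation of the football example (the scoring rule is as above; the distributions are invented for illustration): because touchdowns and extra points are highly correlated, leaving extra points out of the regression pushes the touchdown coefficient above 6.

import numpy as np

rng = np.random.default_rng(5)
n = 5_000

touchdowns = rng.poisson(3.0, size=n)
extra_points = rng.binomial(touchdowns, 0.95)   # almost every touchdown is converted
field_goals = rng.poisson(2.0, size=n)
safeties = rng.poisson(0.1, size=n)

score = 6 * touchdowns + 1 * extra_points + 3 * field_goals + 2 * safeties

# Regress score on touchdowns and field goals only (extra points omitted):
X = np.column_stack([np.ones(n), touchdowns, field_goals])
coef, *_ = np.linalg.lstsq(X, score, rcond=None)
print(coef)   # touchdown coefficient comes out near 6.95 rather than 6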
Assumption 2: (Xi, Yi), i = 1, ..., n are Independently and Identically Distributed
This is a statement about how the sample is drawn.
All observations need to be independently distributed. This means that the outcome of one value in the sample cannot affect another. The draws are random, and no value carries information about any other – each observation is independent.
Identically distributed means that the probability of any specific outcome is the same for every draw. For example, if you flip a coin 100 times, the probability of heads is always 50/50 and does not change throughout the experiment.
If the sampling is random, the sample is representative of the population. For example, you would not sample only Texas if you wanted to estimate the average American income.
Assumption 3: Large Outliers Are Unlikely
X and Y have finite kurtosis (finite fourth moments). Large outliers will mess up our distribution, distort the estimates and make OLS misleading.
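A brief sketch of what a single large outlier can do (fabricated numbers): the fitted slope moves far away from the true value once one extreme point, e.g. a data-entry error, is added.

import numpy as np

rng = np.random.default_rng(6)
x = rng.uniform(0, 10, size=50)
y = 1.0 + 2.0 * x + rng.normal(size=50)        # true slope is 2

print(np.polyfit(x, y, deg=1))                 # slope close to 2

# Add one extreme outlier:
x_out = np.append(x, 10.0)
y_out = np.append(y, 500.0)
print(np.polyfit(x_out, y_out, deg=1))         # slope pulled far away from 2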
What are the assumptions in multiple regression?
1: Error Term has a conditional mean of zero
2: I.I.D
3: Large outliers unlikely
4: No perfect multicollinearity
Perfect multicollinearity is when one of the regressors is an exact linear function of the other regressors. This includes including the same variable twice in the regression, or falling into the dummy variable trap. Perfect multicollinearity occurs if two or more regressors are perfectly correlated. In practice we will rarely see two regressors that are perfectly correlated, so the problem most often comes from the dummy trap or from including the same regressor twice. The Variance Inflation Factor (VIF) can be used to test for multicollinearity; a rule of thumb is that there is a multicollinearity problem if VIF > 10. The solution is simply to drop one of the offending variables.
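A hedged sketch of the VIF check (assuming statsmodels is installed; the variables are simulated, with x2 built to be nearly collinear with x1):

import numpy as np
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(7)
n = 500
x1 = rng.normal(size=n)
x2 = 0.98 * x1 + 0.02 * rng.normal(size=n)     # almost a copy of x1
x3 = rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2, x3])  # include a constant column

# Rule of thumb: VIF > 10 signals a multicollinearity problem.
for i, name in zip([1, 2, 3], ["x1", "x2", "x3"]):
    print(name, variance_inflation_factor(X, i))   # huge for x1 and x2, near 1 for x3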
What are the Gauss-Markov assumptions?
- The model is linear in the parameters
- The observations are IID
- No perfect multicollinearity
- The error term has zero conditional mean
- Homoskedasticity: the error term has constant variance
- No autocorrelation in the error term
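As a hedged illustration of checking the last two assumptions (assuming statsmodels; the data are simulated with deliberately non-constant error variance): the Breusch-Pagan test looks for heteroskedasticity and the Durbin-Watson statistic for autocorrelation in the residuals.

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(8)
n = 300
x = rng.uniform(1, 10, size=n)
u = rng.normal(scale=0.5 * x, size=n)          # error variance grows with x: heteroskedastic
y = 3.0 + 1.5 * x + u

X = sm.add_constant(x)
res = sm.OLS(y, X).fit()

lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(res.resid, X)
print(lm_pvalue)                               # small p-value -> evidence of heteroskedasticity
print(durbin_watson(res.resid))                # close to 2 -> no sign of autocorrelation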