Lecture 3 Flashcards
Ordinary Least Squares (OLS)
A mathematical technique that estimates the line best representing the linear relation between X and Y by minimizing the sum of squared residuals.
Bivariate regression
A regression with only two variables involved: one independent variable (IV) and one dependent variable (DV).
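A minimal sketch of estimating a bivariate OLS regression in Python, assuming the statsmodels library and made-up data (x and y are hypothetical):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.normal(size=100)                    # hypothetical IV
y = 2.0 + 1.5 * x + rng.normal(size=100)    # hypothetical DV with noise

X = sm.add_constant(x)        # add the intercept column
model = sm.OLS(y, X).fit()    # OLS minimizes the sum of squared residuals
print(model.params)           # estimated intercept and slope
print(model.rsquared, model.rsquared_adj)   # R2 and adjusted R2
```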
Ceteris paribus
When the IV changes while all other variables remain constant ("all else being equal").
The coefficient of each IV
Indicates the expected change in the DV for a one-unit increase in that IV, holding the other IVs constant.
R2
The proportion of variance in the DV explained by the regression; determines the goodness of fit of the model.
Ranges between 0 (no prediction) and 1 (perfect prediction); the higher, the better.
R2 itself does not have to be significant in order to accept the hypothesis.
Adjusted R2
R2 adjusted for the number of IVs (it penalizes adding variables); a better way to compare models than R2.
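As a reference, the standard formulas (assuming n observations and k independent variables):

```latex
R^2 = 1 - \frac{SS_{\text{res}}}{SS_{\text{tot}}},
\qquad
\bar{R}^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - k - 1}
```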
Control variable
Something that is held constant and unchanged in an experiment in order to assess or clarify the relationship between two variables.
Dummy variable
Takes on the value 0 or 1.
With an intercept in the model, use m-1 dummies for a categorical variable with m categories; the omitted category is the base/reference category.
Dummy variable trap
If you use two dummies instead of one (i.e. m dummies instead of m-1), you will have a perfect linear relationship with the intercept (perfect multicollinearity).
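A minimal sketch of creating m-1 dummies with pandas to avoid the trap; the column name "city" and its categories are hypothetical:

```python
import pandas as pd

df = pd.DataFrame({"city": ["Amsterdam", "Rotterdam", "Utrecht", "Amsterdam"]})

# drop_first=True omits one category (the base/reference category),
# leaving m-1 dummy columns and avoiding perfect collinearity with the intercept
dummies = pd.get_dummies(df["city"], drop_first=True)
print(dummies)
```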
F-test
Tests the overall significance of the model: whether all slope parameters are jointly 0.
Does your model make sense overall? If the F-statistic is not significant, you should not interpret your model.
If the F-statistic is significant, the coefficients are not all 0, so you can interpret them.
Does my main IV have a significant influence on the DV?
- Formulate the H0 and H1
- Choose the significance level (e.g. 0.05)
- Check the respective p-value and compare it with the significance level
- P-value < significance level: H0 should be rejected (see the sketch after this list)
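A minimal sketch of these steps with a fitted statsmodels result (reusing the hypothetical `model` from the OLS sketch above):

```python
alpha = 0.05                              # chosen significance level

print(model.fvalue, model.f_pvalue)       # overall F-test of the model
if model.f_pvalue < alpha:
    # the coefficients are not jointly 0, so the model can be interpreted
    print(model.pvalues)                  # p-value per coefficient
    # a p-value below alpha for the main IV means H0 is rejected for it
```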
Winsorizing variables
The transformation of statistics by limiting extreme values in the data to reduce the effect of possible spurious outliers.
It replaces the smallest and largest values with the values closest to them; afterwards you can calculate the winsorized mean.
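A minimal sketch of winsorizing with scipy; the data and the 10% limits are hypothetical choices:

```python
import numpy as np
from scipy.stats.mstats import winsorize

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])   # 100 is an outlier

# replace the lowest 10% and highest 10% of values with the nearest
# remaining values, then compute the winsorized mean
w = winsorize(data, limits=[0.1, 0.1])
print(w.mean())
```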