Quantitative Methods Flashcards
When should you use Logistic regression models?
If the dependent Y variable is discrete
If our dependent Y variable is qualitative (categorical), for example a yes/no outcome
When should you use Multiple regression models?
When the dependent variable is continuous (not discrete) and there is more than one explanatory variable (more than one independent variable).
When multiple independent variables determine the outcome of a single dependent variable.
- Dependent Y Variable is continuous
- We have more than 1 Independent X variable
Assumptions of Regression models
L.I.I.N.H.
Linearity: The relationship between the dependent Y variable and each independent X variable is linear.
Independence of Errors: Regression residuals are uncorrelated across observations.
Independence: Independent X variables are not random, and there is no exact linear relationship between 2 or more independent variables.
Normality: Regression residuals are normally distributed.
Homoscedasticity: The variance of the regression residuals is constant across observations.
How to determine if a variable is significant?
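|T-Stat| > critical t-value (roughly 2 at the 5% significance level with a large sample), or equivalently p-value < significance level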
Degrees of freedom for SSR
k (the number of independent variables), where SSR is the regression (explained) sum of squares
Degrees of freedom for SST
N-1
Degrees of freedom for SSE
N-k-1, i.e. N-(k+1), where SSE is the error (unexplained) sum of squares
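The degrees of freedom can be checked against each other, since the regression and error degrees of freedom add up to the total. A minimal sketch, assuming k counts the slope coefficients only (the intercept is excluded) and using a hypothetical helper name:

```python
def anova_degrees_of_freedom(n, k):
    """Degrees of freedom for a regression with n observations
    and k independent variables (intercept not counted in k)."""
    if n <= k + 1:
        raise ValueError("need more observations than estimated parameters")
    return {
        "SSR": k,          # regression (explained) sum of squares
        "SSE": n - k - 1,  # error (unexplained) sum of squares
        "SST": n - 1,      # total sum of squares
    }

# Illustrative inputs: 60 observations, 3 independent variables
df = anova_degrees_of_freedom(n=60, k=3)
print(df)
```

The SSR and SSE entries always sum to the SST entry, which is a quick way to catch a misremembered formula.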
What will happen to adjusted R-Square if we add insignificant variables?
Adjusted R-Square decreases
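Adjusted R-Square falls because each extra variable uses up a degree of freedom, and an insignificant variable raises R-Square too little to offset that penalty. A small sketch using the standard adjustment formula; the function name and the numbers are illustrative assumptions, not values from a real regression:

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R-Square for n observations and k independent variables."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# Hypothetical case: a 4th variable lifts R-Square only from 0.600 to 0.601
before = adjusted_r_squared(0.600, n=50, k=3)
after = adjusted_r_squared(0.601, n=50, k=4)
print(round(before, 4), round(after, 4))
```

Here the tiny gain in R-Square is outweighed by the loss of a degree of freedom, so the adjusted figure drops.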
R-Square formula
SSR/SST = Explained variation / Total variation
= 1 - (SSE/SST) = 1 - (Unexplained variation / Total variation)
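Both forms of R-Square give the same number when the fitted values come from an OLS regression with an intercept. A minimal sketch on a made-up five-point data set (the fitted values are the OLS fit of y on x = 1..5, slope 0.6 and intercept 2.2):

```python
def sums_of_squares(y, y_hat):
    """Return (SSR, SSE, SST) for actual values y and fitted values y_hat."""
    y_bar = sum(y) / len(y)
    ssr = sum((fi - y_bar) ** 2 for fi in y_hat)            # explained
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))   # unexplained
    sst = sum((yi - y_bar) ** 2 for yi in y)                # total
    return ssr, sse, sst

y = [2.0, 4.0, 5.0, 4.0, 5.0]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]  # OLS fitted values for x = 1..5

ssr, sse, sst = sums_of_squares(y, y_hat)
r2_from_explained = ssr / sst
r2_from_unexplained = 1 - sse / sst
print(round(r2_from_explained, 4), round(r2_from_unexplained, 4))
```

For this data SSR is 3.6, SSE is 2.4 and SST is 6.0, so both routes give an R-Square of 0.6.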
What kind of test is this?
H0: bi = Bi
Ha: bi ≠ Bi
Two tail test
What kind of test is this?
H0: bi <= Bi
Ha: bi > Bi
Right tail test
The > in Ha points toward the right tail, where the rejection region lies.
What kind of test is this?
H0: bi >= Bi
Ha: bi < Bi
Left tail test
The < in Ha points toward the left tail, where the rejection region lies.
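The three tail tests share one test statistic and differ only in where the rejection region sits. A hedged sketch of that decision rule; the function names are hypothetical, and the caller is assumed to supply a positive critical value that already matches the chosen tail and significance level:

```python
def t_statistic(b_hat, b_hypothesized, std_error):
    """t = (estimated coefficient - hypothesized value) / standard error."""
    return (b_hat - b_hypothesized) / std_error

def reject_null(t_stat, critical_value, tail):
    """Decide whether to reject H0 for a 'two', 'right' or 'left' tail test."""
    if tail == "two":      # Ha: bi != Bi
        return abs(t_stat) > critical_value
    if tail == "right":    # Ha: bi > Bi
        return t_stat > critical_value
    if tail == "left":     # Ha: bi < Bi
        return t_stat < -critical_value
    raise ValueError("tail must be 'two', 'right' or 'left'")

# Illustrative numbers only
t = t_statistic(b_hat=0.75, b_hypothesized=0.0, std_error=0.25)
print(t, reject_null(t, critical_value=2.0, tail="two"))
```

Note that a positive t-statistic can never reject in a left tail test, however large it is.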
Model Misspecification - Omitted variable
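If we omit a significant variable from our model, the error term will capture the effect of the missing variable. If the omitted variable is correlated with the included variables, the estimated coefficients are biased.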
Model Misspecification - Inappropriate form of variable
Failing to account for non-linearity
Causes: Conditional heteroscedasticity
To fix it, we can transform the variable, for example by taking its natural log, so that the relationship becomes linear.
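To see why a log transform can restore linearity, consider a series that grows at a constant percentage rate. A small sketch with made-up numbers (10% growth per period from a starting value of 100): the level changes keep growing, while the changes in the natural log are constant.

```python
import math

periods = list(range(6))
y = [100 * 1.1 ** t for t in periods]  # hypothetical 10% growth per period

# Changes in levels grow each period, so y is not linear in t
level_changes = [y[i + 1] - y[i] for i in range(len(y) - 1)]

# Changes in ln(y) are constant, so ln(y) is linear in t
log_changes = [math.log(y[i + 1]) - math.log(y[i]) for i in range(len(y) - 1)]

print([round(c, 2) for c in level_changes])
print([round(c, 4) for c in log_changes])
```

Each log change equals ln(1.1), so a regression of ln(y) on t fits a straight line where a regression of y on t would not.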
Model Misspecification - Inappropriate Scaling
Causes Conditional heteroscedasticity and multicollinearity
Model Misspecification - Inappropriate Pooling of Data
Causes Conditional heteroscedasticity and Serial correlation
What is Unconditional heteroscedasticity?
Var(error) is not correlated with the independent X variables.
No issue with inference.
What is Conditional heteroscedasticity?
Var(error) is correlated with the independent X variables.
The F-test is unreliable since MSE is a biased estimator of the true population variance.
In time series, the variance at one time step has a positive relationship with the variance at one or more previous time steps. This implies that periods of high variability tend to follow periods of high variability, and periods of low variability tend to follow periods of low variability.
What does the Breusch-Pagan (BP) test do?
Tests for conditional heteroskedasticity
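The formula for the BP test statistic
n * R-Square, where R-Square comes from regressing the squared residuals on the independent variables. The statistic is chi-square distributed with k degrees of freedom (k = number of independent variables).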
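The n * R-Square statistic can be reproduced by hand for a one-variable regression: fit the model, square the residuals, regress them on X, and scale the resulting R-Square by n. The sketch below is a simplified illustration with hypothetical helper names and made-up data whose error size grows with x; it is not a substitute for a statistics package.

```python
def simple_ols(x, y):
    """Fit y = a + b*x by OLS; return (fitted values, R-Square)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    fitted = [intercept + slope * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return fitted, 1 - sse / sst

def bp_statistic(x, y):
    """BP statistic: n * R-Square of squared residuals regressed on x."""
    fitted, _ = simple_ols(x, y)
    squared_residuals = [(yi - fi) ** 2 for yi, fi in zip(y, fitted)]
    _, aux_r_squared = simple_ols(x, squared_residuals)
    return len(x) * aux_r_squared

# Made-up data: the error alternates in sign and grows in size with x
x = [float(i) for i in range(1, 21)]
y = [2 + 0.5 * xi + (-1) ** i * 0.1 * xi for i, xi in enumerate(x, start=1)]

CHI2_CRITICAL_5PCT_1DF = 3.841  # chi-square critical value, 1 df, 5% level
bp = bp_statistic(x, y)
print(round(bp, 3), bp > CHI2_CRITICAL_5PCT_1DF)
```

Because the error variance was built to rise with x, the statistic exceeds the critical value and the null of no heteroskedasticity is rejected for this data.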
BP test
Test statistic > Critical value
Reject the null.
There is heteroskedasticity
- H0: No heteroskedasticity (homoskedasticity is present, constant variance)
- Ha: Heteroskedasticity
BP test
Test statistic < Critical value
Fail to reject the null
No evidence of heteroskedasticity (homoskedasticity, constant variance)
H0: No heteroskedasticity
Ha: Heteroskedasticity
What is serial correlation?
Errors are correlated across observations
Positive Serial Correlation
A positive residual is most likely followed by a positive residual
A negative residual is most likely followed by a negative residual
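One simple way to see this pattern in numbers is the lag-1 autocorrelation of the residuals: a positive value means residuals tend to keep their sign from one observation to the next. A rough sketch with a hypothetical helper and made-up residual series, offered as an illustration rather than a formal test:

```python
def lag1_autocorrelation(residuals):
    """Correlation between each residual and the one before it."""
    n = len(residuals)
    mean = sum(residuals) / n
    numerator = sum(
        (residuals[t] - mean) * (residuals[t - 1] - mean) for t in range(1, n)
    )
    denominator = sum((e - mean) ** 2 for e in residuals)
    return numerator / denominator

# Made-up residuals: a run of positives followed by a run of negatives
persistent = [1.0, 1.2, 0.8, 0.9, -0.7, -1.1, -0.9, -1.2]
# Made-up residuals that flip sign every observation
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

print(lag1_autocorrelation(persistent) > 0)   # positive serial correlation
print(lag1_autocorrelation(alternating) < 0)  # negative serial correlation
```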