Quantitative methods Flashcards
Multiple regression model assumptions
- linearity
- homoskedasticity –> variance of residuals is constant
- independence of errors –> residuals are not serially correlated
- normality –> error term is normally distributed; evaluated with a Q-Q plot
- independence of independent variables –> no linear relationships between independent variables
MSR
MSR = RSS/k
MSE
MSE = SSE/(n−k−1)
SST
RSS+SSE
R2
RSS/SST
or
(SST-SSE)/SST
or
(total variation – unexplained variation)/total variation
indicates how much of the variation the independent variables can explain
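The equivalent R2 formulas above can be checked with a minimal sketch using assumed sums of squares (all numbers hypothetical):

```python
# Hypothetical sums of squares from a regression output (assumed numbers).
SST = 100.0          # total variation
RSS = 75.0           # explained (regression) sum of squares
SSE = SST - RSS      # unexplained (residual) sum of squares

r2_a = RSS / SST             # first formula
r2_b = (SST - SSE) / SST     # second formula, same value
print(r2_a, r2_b)            # 0.75 0.75
```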
Breusch pagan
n*R^2 (R^2 from regressing the squared residuals on the independent variables); chi-square distributed with k df
Adjusted R2
1-((n-1)/(n-k-1))*(1-R^2)
- measure of goodness of fit that adjusts for the number of independent variables
- adj R2 < R2
- decreases when an added independent variable adds little value to the regression model
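A quick worked example of the adjusted R2 formula, with assumed n, k, and R2:

```python
# Hypothetical inputs: R^2 = 0.75, n = 50 observations, k = 3 regressors.
n, k, r2 = 50, 3, 0.75
adj_r2 = 1 - ((n - 1) / (n - k - 1)) * (1 - r2)
print(round(adj_r2, 4))   # 0.7337, slightly below R^2 as expected
```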
Cook’s D
If Cook's D for an observation > √(k/n) –> influential point
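The rule of thumb above, sketched with an assumed k, n, and a hypothetical Cook's distance value:

```python
import math

# Assumed k = 3 regressors, n = 50 observations (hypothetical numbers).
k, n = 3, 50
threshold = math.sqrt(k / n)   # sqrt(3/50) ~ 0.245
D_i = 0.30                     # hypothetical Cook's D for one observation
print(D_i > threshold)         # True -> flag as influential
```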
Odds
Prob given odds
Odds= e^coefficient
Prob with odds = odds/(1+odds)
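The odds-to-probability conversion above, as a minimal sketch (the coefficient value is assumed):

```python
import math

# Hypothetical logistic-regression log-odds value (assumed).
coefficient = 0.0            # log-odds of 0 -> odds of 1
odds = math.exp(coefficient)
p = odds / (1 + odds)
print(odds, p)               # 1.0 0.5
```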
F statistic
((SSEr-SSEu)/q) / (SSEu/(n-k-1))
= MSR/MSE with k and n−k−1 df
H0: all slope coefficients are zero
reject H0 if F (test statistic) > Fc (critical value)
tests whether at least one slope coefficient is significant
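Both F-statistic formulas reduce to MSR/MSE in the unrestricted case; a sketch with assumed ANOVA numbers:

```python
# Hypothetical ANOVA table values: RSS = 75, SSE = 25, k = 3, n = 50 (assumed).
RSS, SSE, k, n = 75.0, 25.0, 3, 50
MSR = RSS / k                 # mean square regression
MSE = SSE / (n - k - 1)       # mean square error
F = MSR / MSE                 # compare to Fc with k and n-k-1 df
print(round(F, 2))            # 46.0
```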
Conditional Heteroskedasticity
Residual variance is related to level of independent variables
- Coefficients consistent.
- St. errors underestimated
- Type I errors
DETECTION
* Breusch–Pagan chi-square test
* p-value < 5% –> reject H0 –> heteroskedasticity present
* p-value > 5% –> no evidence of heteroskedasticity
CORRECTION
robust or White-corrected standard errors
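A hand-rolled sketch of the Breusch-Pagan n*R^2 statistic on made-up heteroskedastic data (all values assumed; compare the result to a chi-square critical value with k df):

```python
import numpy as np

# Simulated data where residual variance rises with x (assumed setup).
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
scale = 1 + 0.5 * (x - x.min())          # variance increases with x
resid = rng.normal(size=n) * scale

# Regress squared residuals on x (with intercept) and take that R^2.
X = np.column_stack([np.ones(n), x])
y = resid ** 2
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta
r2 = 1 - np.sum((y - fitted) ** 2) / np.sum((y - y.mean()) ** 2)

bp = n * r2   # chi-square with k = 1 df; small p-value -> heteroskedasticity
print(bp)
```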
Serial Correlation
Residuals are correlated with each other
- Coefficients consistent
- St errors underestimated
- Type I errors (positive correlation)
DETECTION
* Breusch–Godfrey (BG) F-test
* Durbin Watson (DW)
* DW < 2 –> positive serial correlation
CORRECTION
Use robust or Newey–West corrected standard errors
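The Durbin-Watson statistic can be computed directly from residuals; a sketch on a short hypothetical residual series that drifts slowly (so DW comes out well below 2):

```python
import numpy as np

# Hypothetical residuals with little movement step to step (assumed values).
resid = np.array([0.5, 0.6, 0.4, 0.7, 0.5, 0.6])

# DW = sum of squared successive differences / sum of squared residuals.
dw = np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)
print(round(dw, 3))   # ~0.102, far below 2 -> positive serial correlation
```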
Multicollinearity
Two or more independent variables are highly correlated
- Coefficients are consistent (but unreliable).
- St errors are overestimated
- Type II errors
DETECTION
* Conflicting t and F-statistics
* variance inflation factors (VIF)
* VIF > 5 (or 10) signals a problem
CORRECTION
* Drop 1 of the correl. variables
* use a different proxy for an included independent variable
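A VIF sketch: regress each regressor on the others and take VIF = 1/(1−R^2). The data here are made up, with x2 built to be nearly collinear with x1:

```python
import numpy as np

# Simulated regressors (assumed setup): x2 is almost a copy of x1.
rng = np.random.default_rng(1)
n = 500
x1 = rng.normal(size=n)
x2 = 0.95 * x1 + 0.1 * rng.normal(size=n)

# Regress x1 on x2 (with intercept); VIF for x1 = 1/(1 - R^2).
X = np.column_stack([np.ones(n), x2])
beta, *_ = np.linalg.lstsq(X, x1, rcond=None)
r2 = 1 - np.sum((x1 - X @ beta) ** 2) / np.sum((x1 - x1.mean()) ** 2)
vif = 1 / (1 - r2)
print(vif)   # far above the 5-10 rule-of-thumb threshold
```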
MISSPECIFICATIONS
Omission of important independent variable(s)–>May lead to serial correlation or heteroskedasticity in the residuals
Inappropriate transformation / variable form–> May lead to heteroskedasticity in the residuals
Inappropriate scaling–>May lead to heteroskedasticity in the residuals or multicollinearity
Data improperly pooled –> May lead to heteroskedasticity or serial correlation in the residuals; fix by running the regression separately for each period
Autoregressive (AR) Model
- dependent variable is regressed against previous values of itself (an AR(1) uses 1 lag)
- no distinction between the dependent and independent variables (i.e., x is the only variable).
- USE t-test to determine whether any of the correlations between residuals at any lag are statistically significant; if so, add one lag at a time
- if not covariance stationary, correct with first differencing
- Ex: forecasting a currency's pattern using its historical prices
- multiperiod forecasts use chain rule forecasting
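Chain rule forecasting with an AR(1) means feeding the one-step forecast back into the model for the two-step forecast; a sketch with assumed coefficients:

```python
# Assumed AR(1) coefficients (hypothetical): x_t = b0 + b1*x_{t-1}
b0, b1 = 1.0, 0.6
x_t = 2.0                     # last observed value (assumed)

x_t1 = b0 + b1 * x_t          # one-step-ahead forecast
x_t2 = b0 + b1 * x_t1         # chain rule: plug the forecast back in
mean_rev = b0 / (1 - b1)      # mean-reverting level b0/(1 - b1)
print(x_t1, x_t2, mean_rev)   # 2.2 2.32 2.5
```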
Covariance Stationary
- Constant and finite mean: E(xt) = E(xt−1). NOTE: the mean has no growth rate/trend
- Constant and finite variance
- Constant and finite covariance
- Determined with the Dickey–Fuller test: a significant test statistic (reject unit root) –> covariance stationary
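First differencing turns a random walk (not covariance stationary) back into its stationary shocks; a toy sketch with simulated data:

```python
import numpy as np

# Simulated white-noise shocks and the random walk they generate (assumed setup).
rng = np.random.default_rng(2)
eps = rng.normal(size=1000)
walk = np.cumsum(eps)         # random walk: mean/variance not constant over time

# First differences recover the stationary shock series.
diff = np.diff(walk)
print(np.allclose(diff, eps[1:]))   # True
```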