3. Quantitative Methods Flashcards
Sum of Squared Errors (Definition)
Sum of squared differences between Yi and Ŷi (observation vs. estimate): SSE = Σ(Yi − Ŷi)²
Sum of Squared Regression (Definition)
Sum of squared differences between Ŷi and the mean of Y (regression estimate vs. best descriptive estimator): SSR = Σ(Ŷi − Ȳ)²
SSR (#Degrees of Freedom)
k (# of slope coefficients / independent variables estimated in the regression)
SSE (#Degrees of Freedom)
n-k-1 (n observations − k slope estimates − 1 intercept)
SST (#Degrees of Freedom)
(n-1)
Mean Square Regression (Formula)
MSR = SSR/k
Mean Square Error (Formula)
MSE = SSE/(n-k-1)
Standard Error of Estimate (SEE Formula)
SEE = √MSE
The lower the SEE, the more accurate the model.
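A minimal numeric sketch of the cards above, assuming NumPy; the data and the names (x, y, see, etc.) are hypothetical, and the later sketches reuse them.

```python
import numpy as np

# Hypothetical data for a one-IV regression (k = 1)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([3.1, 4.0, 5.2, 6.1, 6.8, 8.2])
n, k = len(y), 1

b1, b0 = np.polyfit(x, y, 1)   # OLS slope and intercept
y_hat = b0 + b1 * x            # fitted values

sse = np.sum((y - y_hat) ** 2)         # unexplained: observation vs. estimate
ssr = np.sum((y_hat - y.mean()) ** 2)  # explained: regression vs. mean of Y
sst = np.sum((y - y.mean()) ** 2)      # total: SST = SSR + SSE for OLS

msr = ssr / k             # MSR, df = k
mse = sse / (n - k - 1)   # MSE, df = n - k - 1
see = np.sqrt(mse)        # standard error of estimate: lower = better
```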
F Test
F = MSR / MSE (tests the significance of the regression as a whole against the error)
DF @ k numerator (horizontal in the F-table)
DF @ n-k-1 denominator (vertical in the F-table)
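Continuing the sketch above, and assuming SciPy, the F-test is one line plus a table lookup:

```python
from scipy.stats import f

F = msr / mse                     # regression variance vs. error variance
p_value = f.sf(F, k, n - k - 1)   # one-tailed; df1 = k, df2 = n - k - 1
```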
Regression Assumptions
- Linearity
- Homoscedasticity (variance of ε is the same across observations; no stretches of unusually high or low volatility)
- Pairs X and Y are independent (if not, there is serial correlation)
- a. Residuals are independently distributed
- b. Residuals’ distribution is Normal
b1 (Slope Coefficient Test)
t-test = (b1 estimate − b1 hypothesized) / Sb1
One-tailed or two-tailed @ df = n-k-1, since the error enters the denominator
Sb1 = SEE / √[Σ(Obs X − Mean X)²]
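A sketch of the slope test, reusing the variables above and assuming SciPy; a hypothesized slope of 0 is the assumed default here:

```python
from scipy.stats import t

s_b1 = see / np.sqrt(np.sum((x - x.mean()) ** 2))  # std error of the slope
t_stat = (b1 - 0.0) / s_b1                         # H0: slope = 0
p_two_tail = 2 * t.sf(abs(t_stat), n - k - 1)      # df = n - k - 1
```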
Dummy Variable
Y = b0 + b1*Dummy
Dummy = 0 or 1
If Dummy = 0, then Y = b0 (mean of Y for the base category)
If Dummy = 1, then Y = b0 + b1 (mean of Y for the flagged category)
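A small sketch with made-up recession/return data, showing that a dummy regression reproduces the two group means:

```python
import numpy as np

dummy = np.array([0, 0, 1, 1, 0, 1])               # hypothetical: 1 = recession
ret = np.array([1.2, 1.0, -0.5, -0.8, 1.1, -0.3])  # hypothetical returns
b1_d, b0_d = np.polyfit(dummy, ret, 1)
# b0_d equals the mean return when dummy = 0 (here 1.1);
# b0_d + b1_d equals the mean return when dummy = 1 (here about -0.53)
```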
Confidence Interval (Formula)
Interval = Ŷ ± T-Critical * Sf
Ŷ = forecast from the regression; Sf = standard error of the forecast:
Sf = SEE × √[1 + 1/n + (X − Mean X)² / ((n-1) × Sx²)]
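A sketch of the forecast interval at a hypothetical new X, reusing the variables from the first sketch and assuming SciPy:

```python
from scipy.stats import t

x_new = 4.5                       # hypothetical forecast point
y_fore = b0 + b1 * x_new          # point forecast from the regression
s_x2 = np.var(x, ddof=1)          # sample variance of X
s_f = see * np.sqrt(1 + 1/n + (x_new - x.mean()) ** 2 / ((n - 1) * s_x2))
t_crit = t.ppf(0.975, n - k - 1)  # 95% two-tailed critical value
low, high = y_fore - t_crit * s_f, y_fore + t_crit * s_f
```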
R² (Formula)
R² = SSR/SST = Measure of Fit
Regression Types
- Log-Lin: lnY = b0 + b1X1
- Log-Log: lnY = b0 + b1(lnX1)
- Lin-Log: Y = b0 + b1(lnX1)
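Each form is just OLS on transformed data; a sketch reusing the hypothetical x and y from the first sketch (both must be positive to take logs):

```python
log_lin = np.polyfit(x, np.log(y), 1)          # lnY = b0 + b1*X1
log_log = np.polyfit(np.log(x), np.log(y), 1)  # lnY = b0 + b1*lnX1
lin_log = np.polyfit(np.log(x), y, 1)          # Y   = b0 + b1*lnX1
```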
Multiple Regression Assumptions
- X and Y are linearly related
- IVs (X) are not random
- E(ε | X1, X2, …, Xk) = 0
- E(ε²) = variance is constant and the same for all observations
- E(εi εj) = 0; errors are uncorrelated across observations
- ε is distributed ~N
F-statistic for Multiple (Hypothesis)
H0: B1 = B2 = … = Bk = 0
H1: At least one ≠ 0
One-Tailed Test @
DF Numerator = K = Horizontal
DF Denominator = (N-K-1) = Vertical
R² Adjusted (Formula)
Adj. R² = 1 - [(n-1)/(n-k-1)] * [1-R²]
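Both fit measures follow directly from the sums of squares in the first sketch:

```python
r2 = ssr / sst                                 # share of variation explained
adj_r2 = 1 - (n - 1) / (n - k - 1) * (1 - r2)  # penalizes each extra IV
```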
Multicollinearity (Definition)
B1 and B2 t-tests are not significant, but the F-test is
Reason: Two IVs are highly correlated
Detection: ↑ R² and significant F-test, but insignificant individual t-tests
Correction: Omit one variable
Consequence: ↑ SE of coefficients = ↓ t-statistics
Heteroskedasticity (Definition)
Var of ε changes across observations
Unconditional: Var (ε) NOT correlated w/ IVs
Conditional: Var (ε) IS correlated w/ IVs
Correction:
- Robust Std Errors
- Generalized Least Squares
Heteroskedasticity (Test)
Breusch-Pagan Test (OH NO)
H0: NO conditional heteroskedasticity
H1: Conditional heteroskedasticity
Test = n × R² @ Chi-Squared table (df = k, one-tailed)
Regress the squared residuals on the IVs; R² comes from that regression
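A minimal Breusch-Pagan sketch under the same hypothetical data, assuming SciPy; the auxiliary regression uses the squared residuals:

```python
from scipy.stats import chi2

resid2 = (y - y_hat) ** 2          # squared residuals
g1, g0 = np.polyfit(x, resid2, 1)  # auxiliary regression on the IVs
aux_fit = g0 + g1 * x
r2_aux = 1 - np.sum((resid2 - aux_fit) ** 2) / np.sum((resid2 - resid2.mean()) ** 2)
bp = n * r2_aux                    # BP statistic
p_val = chi2.sf(bp, df=k)          # one-tailed, chi-square with k df
```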
Hansen Method (Definition)
Preferred if (i) serial correlation or (ii) serial correlation + heteroskedasticity
Serial Correlation (Definition)
- Errors share a common cause across observations
- ↓ SEE = ↑ F-test (standard errors are understated)
- Violates independence of the (X, Y) pairs
- If the previous error is positive, the next error is in fact more likely to be positive
- If an IV is a lagged value of Y, the coefficient estimates will not be valid
Test for Serial Correlation
Durbin Watson (Deutsche Welle)
H0: DW = 2 (No Correl)
H1: DW ≠ 2 (Correl)
Test ≈ 2(1-r), r = correlation between consecutive residuals; critical values depend on k and n
Correction: (i) Modified SEs,
(ii) Modify Regression Equation
(iii) Include seasonal term
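The statistic itself is mechanical; a sketch on the residuals from the first sketch:

```python
e = y - y_hat                                  # residuals
dw = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)  # Durbin-Watson statistic
# dw near 2: no serial correlation; well below 2: positive; well above 2: negative
```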
Hansen or White Method (Criteria)
If only Hetero: White SEs
If only SC: Hansen
If both: Hansen is preferred
Standard Error of Residual Autocorrelations (Formula)
SE of each residual autocorrelation = 1/√T, where T = # observations
Misspecifications of Model (List)
- Data mining
- Wrong functional form (linear vs. log, pooling different samples)
- Keep IVs parsimonious
- Examine assumption violations before accepting the model
- Test the model out of sample
Logit Regressions
Ln(Odds) = B0 + B1X1 + … + BnXn + ε
Estimated by maximum likelihood: chooses the coefficients that maximize the probability of observing the sample
Slope = change in the log odds of the event happening per unit change in the IV
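A minimal logit sketch, assuming statsmodels and made-up default data; sm.Logit fits by maximum likelihood, matching the card above:

```python
import numpy as np
import statsmodels.api as sm

x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])  # hypothetical IV
event = np.array([0, 0, 1, 0, 1, 1])           # hypothetical 0/1 outcomes
model = sm.Logit(event, sm.add_constant(x1)).fit(disp=0)  # MLE
# The slope on x1 is the change in ln(odds) per unit of x1;
# np.exp(slope) is the multiplicative change in the odds themselves.
```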