Quantitative Methods Flashcards
Covariance Stationary
Mean, variance, and covariance with lagged values don’t change over time
Determined by:
- Plotting the data
- Running an AR model and testing the residual autocorrelations
- Performing the Dickey-Fuller test
Note: Most economic and financial time-series are not stationary
T-Test Significance and Z-Test
90%, 95%, 99%
90%: 1.645
95%: 1.96
99%: 2.576 (two-tailed; 2.326 is the one-tailed value)
When can we reject the null hypothesis
If the absolute value of the t-stat exceeds the critical value
What is covariance
How 2 variables move together
- Very sensitive to the scale of the 2 variables
- Can range from negative infinity to positive infinity
Increasing Adjusted R² means _______
The added variables are worth keeping
T-Stat Formula
Coefficient / Standard Error
Model Misspecification
Types:
- Time-series: Using a lagged dependent variable as an X when errors are serially correlated, or forecasting the past
- Functional: Omitting a variable or pooling data improperly
Correlation Squared Purpose
Share of the variability of Y explained by X
Correlation
cov / (std of X * std of Y)
OR
Σ(X - Xbar)(Y - Ybar) / √[Σ(X - Xbar)² * Σ(Y - Ybar)²]
MSE
SSE / (n - k - 1)
ANOVA Table
Source                    DOF          SOS    Mean SOS
Regression (explained)    k            RSS    MSR = RSS / k
Error (unexplained)       n - k - 1    SSE    MSE = SSE / (n - k - 1)
Total                     n - 1        SST
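A minimal sketch of how the ANOVA quantities fit together. The values of n, k, RSS, and SSE are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical inputs, not taken from any real regression.
n = 50        # number of observations
k = 2         # number of independent variables
rss = 120.0   # regression (explained) sum of squares
sse = 80.0    # error (unexplained) sum of squares

sst = rss + sse              # total sum of squares
msr = rss / k                # mean square regression
mse = sse / (n - k - 1)      # mean square error
f_stat = msr / mse           # F-statistic
r_squared = rss / sst        # coefficient of determination
see = math.sqrt(mse)         # standard error of estimate

print(f"SST={sst:.1f}  MSR={msr:.1f}  MSE={mse:.4f}")
print(f"F={f_stat:.2f}  R2={r_squared:.2f}  SEE={see:.4f}")
```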
R² from ANOVA
Name: Coefficient of Determination
Formulas: RSS / SST
(SST - SSE) / SST
Correlation² (simple regression only)
Purpose: This is the % of variability of Y explained by X’s
Analysis: The higher, the better the fit
Problem: Never decreases as variables are added, even irrelevant ones
Random Walk
Unit root: lag coefficient (b1) = 1
This means the Dickey-Fuller null (g = b1 - 1 = 0) cannot be rejected
Does not have a mean reverting level
Not stationary
Correct by first differencing
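First differencing replaces each level with its change from the prior period. A small sketch, using hypothetical price levels:

```python
# Hypothetical levels of a series that wanders without a mean-reverting level.
levels = [100.0, 101.5, 100.8, 102.3, 103.0]

# First difference: y_t = x_t - x_(t-1). One observation is lost.
first_diff = [curr - prev for prev, curr in zip(levels, levels[1:])]

print([round(d, 2) for d in first_diff])
```

The differenced series, rather than the levels, is then what gets modeled.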
MSR
RSS / k
RMSE
Purpose: to compare the accuracy of AR models for out-of-sample forecasts
Formula: √Average squared error
Analysis: lower the better
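A short sketch of the RMSE comparison. The actual values and the two sets of forecasts are hypothetical; `rmse` is an illustrative helper, not a library function.

```python
import math

def rmse(actual, predicted):
    """Square root of the average squared forecast error."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical out-of-sample data and forecasts from two candidate models.
actual = [2.0, 3.0, 5.0, 4.0]
model_a = [2.5, 2.5, 4.0, 4.0]
model_b = [1.0, 3.0, 5.0, 6.0]

rmse_a = rmse(actual, model_a)
rmse_b = rmse(actual, model_b)

# The model with the lower RMSE forecast these points more accurately.
better = "A" if rmse_a < rmse_b else "B"
print(round(rmse_a, 4), round(rmse_b, 4), better)
```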
Multiple Regression Analysis Steps
- Is there model misspecification?
- Is the t-test significant? If no, use another model
- Is the F-Stat significant? If no, use another model
- Check for heteroskedasticity, serial correlation, and multicollinearity
When using Dummy Variables
They are either on or off (1 or 0)
With n categories, always use n-1 dummies or the model will suffer from multicollinearity
Example: If using quarters per year, use 3
Log-Linear Trend Model
Purpose: Used when there is exponential growth, or when the residuals of a linear trend model are serially correlated
Formula: y = e^(b0 + b1(t)), i.e. ln(y) = b0 + b1(t)
Use the time/observations for t
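A sketch of a log-linear trend forecast, reading the formula as y = e^(b0 + b1·t), which is the same as ln(y) = b0 + b1·t. The coefficients are hypothetical stand-ins for estimates from regressing ln(y) on t.

```python
import math

# Hypothetical coefficient estimates.
b0 = 2.0
b1 = 0.05

def trend_forecast(t):
    """Forecast y at time index t from the log-linear trend."""
    return math.exp(b0 + b1 * t)

y_10 = trend_forecast(10)
y_11 = trend_forecast(11)

# The period-over-period growth factor is constant and equals e^b1.
growth_factor = y_11 / y_10
print(round(y_10, 4), round(y_11, 4), round(growth_factor, 4))
```

The constant growth factor is why this form suits exponentially growing data.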
Sample Correlation Coefficient Other Formulas
Covariance / (Std X * Std Y)
OR
√R² (simple regression only; takes the sign of the slope)
OR
Covariance / (√Var X * √Var Y)
F-Stat
Use: To see if any X’s explain a significant portion of Y
Formula: MSR / MSE (one-tailed test with k and n-k-1 DOF)
Other formula: (RSS / k) / (SSE / (n-k-1))
SST
RSS + SSE
Sample Covariance Formula
Σ(X-Xbar)(Y-Ybar) / (n - 1)
Sample Covariance Steps
- Create a table
Period              1      2      3      4      Sum
X-Xbar             -0.1    0.2    0.8   -0.9
Y-Ybar             -0.7    0.5    1.1   -0.9
(X-Xbar)(Y-Ybar)    0.07   0.10   0.88   0.81   1.86
- Then take sum / (n - 1)
1.86 / 3 = 0.62
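The same steps as a short sketch, using the deviations from the table above. Variable names are illustrative.

```python
# Deviations from the mean for the four periods.
x_dev = [-0.1, 0.2, 0.8, -0.9]   # X - Xbar
y_dev = [-0.7, 0.5, 1.1, -0.9]   # Y - Ybar

# Step 1: multiply the deviations period by period.
products = [dx * dy for dx, dy in zip(x_dev, y_dev)]

# Step 2: sum the products and divide by n - 1.
n = len(products)
sample_cov = sum(products) / (n - 1)

print([round(p, 2) for p in products])
print(round(sample_cov, 2))
```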
Multicollinearity
Purpose: High correlation among X’s (rule of thumb: higher than 0.7)
Detect: T-tests indicate no individual coefficient is different from 0, even though the F-test is significant
Correct: Drop a variable
Effects: F-Test is significant
R² is high
All t-stats are below 2 because standard errors are inflated
Sample Correlation Coefficient (R)
covariance / [(sample std X) (sample std Y)]
(sample std X) = √[sum of (X-Xbar)² / (n - 1)]
(sample std Y) = √[sum of (Y-Ybar)² / (n - 1)]
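A sketch that computes the sample correlation from raw data, taking the sample standard deviation as the square root of the sum of squared deviations over n - 1. The five paired observations are hypothetical.

```python
import math

# Hypothetical paired observations.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sample covariance: sum of cross-deviations over n - 1.
cov_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (n - 1)

# Sample standard deviations.
std_x = math.sqrt(sum((a - x_bar) ** 2 for a in x) / (n - 1))
std_y = math.sqrt(sum((b - y_bar) ** 2 for b in y) / (n - 1))

r = cov_xy / (std_x * std_y)
print(round(cov_xy, 4), round(r, 4))
```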
AR Models
Use previous values to get the next one. They build upon each other.
Correctly specified if the autocorrelations of the residuals are not significant.
Mean Reversion
b0 / (1 - b1)
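A sketch of the mean-reverting level for an AR(1) model x_t = b0 + b1·x_(t-1). The coefficients and starting value are hypothetical, and `mean_reverting_level` is an illustrative helper.

```python
def mean_reverting_level(b0, b1):
    """Level at which the AR(1) forecast equals the current value."""
    if b1 == 1:
        # A unit root has no mean-reverting level.
        raise ValueError("b1 = 1: no mean-reverting level")
    return b0 / (1 - b1)

# Hypothetical AR(1) coefficients.
b0, b1 = 1.2, 0.4
level = mean_reverting_level(b0, b1)

# Repeated one-step forecasts drift toward the level when |b1| < 1.
x = 10.0
for _ in range(50):
    x = b0 + b1 * x

print(round(level, 4), round(x, 4))
```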
Adding additional variables is best evaluated by using…
Adjusted R²
Seasonality
Purpose: Model will be misspecified unless the AR model incorporates the effects of seasonality
Detect: statistically significant residual autocorrelation at the seasonal lag
Correct: Add a seasonal lag term (e.g. the same quarter of last year)
Cointegration
Purpose: Two time-series are economically linked
Test: Regress one variable against the other, then run the Dickey-Fuller test on the residuals
Analysis: If the unit-root null is rejected, the residuals are covariance stationary and the series are cointegrated
What does the T-Test Mean
Gives more confidence that the coefficient is different from zero
Will be high if: correlation is high
Sample size is large
Assumptions of Regression: Simple and Multiple
Simple
- Linear relationship between X and Y
- Expected value of error term = 0
- Variance of error term is constant (violation: heteroskedasticity)
- Errors not serially correlated (violation: autocorrelation)
- Error term normally distributed
Multiple
All the above plus:
No exact linear relationship among X’s (violation: multicollinearity)
SEE
Name: Standard Error of Estimate (standard deviation of the residuals)
Purpose: Gauges the fit of the regression line. Smaller the better
Formula 1: √MSE
Formula 2: √[SSE / (n - k - 1)]
Covariance Formula
√R² * Std X * Std Y (√R² is the correlation)
Confidence Interval
coefficient +/- (critical t value * standard error)
Side note: Standard error here is the standard error of the coefficient, not the SEE
When is the slope coefficient significant?
When zero is not included in the confidence interval
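A sketch of the interval and the zero check from the two cards above. The coefficient, its standard error, and the critical t-value are all hypothetical; in practice the critical value comes from a t-table for the chosen confidence level and degrees of freedom.

```python
# Hypothetical regression output.
coefficient = 0.76
std_error = 0.20     # standard error of the coefficient
t_critical = 2.01    # assumed two-tailed critical t-value

lower = coefficient - t_critical * std_error
upper = coefficient + t_critical * std_error

# The slope is significant when the interval excludes zero.
significant = not (lower <= 0 <= upper)

print(round(lower, 3), round(upper, 3), significant)
```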
Smallest Alpha to reject the null hypothesis? Under what value?
Answer: p-value
Under: 0.05 (at the 5% significance level)
In an ARCH test, a p-value under .001 on the lagged squared residual means ARCH exists
In/Out of Sample Forecasts
In-sample: forecasting within the data range used to estimate the model
Out-of-sample: forecasting outside that range
Out-of-sample is important b/c it tests whether the model describes the time-series
T-Test Formula for Hypothesis
(Estimate - Hypothesized value) / Standard error of the coefficient
Limitations of Regression Analysis
- Parameter Instability
- Outliers may affect the estimated regression line
- Spurious Correlation (appearance of a linear relationship where none exists)
Degrees of Freedom
Simple Regression: n - 2
Multiple Regression: n - k - 1
Heteroskedasticity
Purpose: Variance of the error term is not constant across observations
Types: Unconditional and conditional (only conditional has issues)
Effects: Non constant error variance
F-Test is unreliable
Biased T-Statistics
Detect: Breusch-Pagan -> Regresses the squared residuals on the X’s; if n * R² of that regression exceeds the chi-square critical value (k DOF), it exists
Correct: White-corrected Standard errors (AKA Robust)
T-Test Steps
- Use the formula to calculate the t-stat
- Look up the critical value in the t-table
- Reject if t > Tcritical or t < -Tcritical
Autocorrelation
AKA: serial correlation
Purpose: correlation among error terms
Detect: Durbin Watson (does not work with AR models)
DW ≈ 2 * (1 - correlation between consecutive residuals)
If DW is close to 2: No serial correlation
If DW is well below 2: Positively correlated
If DW is well above 2: Negatively correlated
Correct: Hansen method (adjusted standard errors)
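A sketch of the Durbin-Watson statistic computed directly from residuals, using its standard definition: the sum of squared changes in the residuals over the sum of squared residuals. The residuals are hypothetical.

```python
# Hypothetical regression residuals, in time order.
residuals = [0.5, 0.3, -0.2, -0.4, 0.1, 0.6, -0.3]
n = len(residuals)

sum_sq = sum(e ** 2 for e in residuals)

# Durbin-Watson: squared changes between consecutive residuals over sum_sq.
dw = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, n)) / sum_sq

# Lag-1 correlation of the residuals, then the 2 * (1 - r) approximation.
r1 = sum(residuals[t] * residuals[t - 1] for t in range(1, n)) / sum_sq
dw_approx = 2 * (1 - r1)

print(round(dw, 4), round(r1, 4), round(dw_approx, 4))
```

With only seven residuals the two numbers differ noticeably, because the approximation ignores the first and last squared residuals. The gap shrinks as the sample grows.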
Adjusted R² Formula
Purpose: Penalizes R² for adding variables that don’t improve the fit
Formula: 1 - [((n-1) / (n-k-1)) * (1-R²)]
Note: Will never be higher than R², and can be negative
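A sketch of the adjusted R² calculation, showing a case where adding a variable raises R² slightly but lowers adjusted R². The sample size and R² values are hypothetical, and `adjusted_r_squared` is an illustrative helper.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R-squared for n observations and k independent variables."""
    return 1 - ((n - 1) / (n - k - 1)) * (1 - r2)

n = 50  # hypothetical sample size

# Hypothetical fits: a third variable nudges R2 from 0.600 to 0.605.
adj_two_vars = adjusted_r_squared(0.600, n, k=2)
adj_three_vars = adjusted_r_squared(0.605, n, k=3)

# Adjusted R2 falls, so the added variable is not worth keeping here.
keep_third = adj_three_vars > adj_two_vars
print(round(adj_two_vars, 4), round(adj_three_vars, 4), keep_third)
```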
Slope Coefficient and Intercept Term
Slope Coefficient: covariance of X and Y / variance of X (or std of X squared)
Explain: How much Y will move for every 1-unit change in X
Intercept Term: Ybar - b1(Xbar)
Explain: Value of Y when X is zero
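A sketch of the slope and intercept for a simple regression, taking the slope as the covariance of X and Y over the variance of X and evaluating the intercept at the sample means. The data are hypothetical.

```python
# Hypothetical paired observations.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

cov_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (n - 1)
var_x = sum((a - x_bar) ** 2 for a in x) / (n - 1)

slope = cov_xy / var_x              # change in Y per 1-unit change in X
intercept = y_bar - slope * x_bar   # predicted Y when X is zero

print(round(slope, 4), round(intercept, 4))
```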
ARCH
Purpose: Tests whether the variance of the residuals depends on the prior period’s variance; based on a regression of the squared residuals on their lagged values
Effects of Model Misspecification
- Coefficients are biased and inconsistent
- Lack of confidence in hypothesis tests