Quantitative Methods Flashcards

1
Q

T-Stat

F-Stat

A

T-Stat = slope coefficient / standard error of the coefficient, with n-k-1 degrees of freedom

F-Stat = MSR/MSE, with k and n-k-1 df, one-tailed test, reject if F-stat > F-crit; measures how much of the output your regression explains versus how much the error does

Two-tailed critical z-values:
90% confidence = 1.645
95% confidence = 1.96
99% confidence = 2.58
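As a quick sketch with made-up numbers (the slope, standard error, and sums of squares below are hypothetical), both statistics are one-line computations:

```python
# Hypothetical regression output
slope, std_err = 1.5, 0.5
n, k = 32, 2                      # observations, independent variables

t_stat = slope / std_err          # tested with n - k - 1 = 29 df

# Hypothetical sums of squares for the F-test
rss, sse = 80.0, 40.0
msr = rss / k                     # mean square regression
mse = sse / (n - k - 1)           # mean square error
f_stat = msr / mse                # reject H0 if f_stat > critical F
```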

2
Q

ANOVA Table (RSS, SSE, SST, MSR, MSE, R^2, SEE)

A

Regression (RSS): k df, MSR = RSS/k
Error (SSE): n-k-1 df, MSE = SSE/(n-k-1)
RSS + SSE = SST
R^2 = RSS/SST, the share of the total sum of squares the regression explains
Standard error of estimate (SEE) = sqrt(MSE), low if the relationship between X and Y is strong
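Using hypothetical sums of squares, the ANOVA identities can be checked directly:

```python
import math

# Hypothetical ANOVA inputs
rss, sse = 80.0, 40.0
n, k = 32, 2

sst = rss + sse                   # total sum of squares
msr = rss / k                     # mean square regression
mse = sse / (n - k - 1)           # mean square error
r_squared = rss / sst             # share of total variation explained
see = math.sqrt(mse)              # standard error of estimate
```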

3
Q

Linear

Log Linear

Auto Regressive (AR)

ARCH

A

Linear: y = mx + b

Log-linear: y = e^(mx+b), i.e. ln(y) = mx + b

AR(1): x_t = m*x_(t-1) + b

ARCH(1): ε_t^2 = m*ε_(t-1)^2 + b (squared residuals regressed on their own lag, not the series itself)
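A small illustration of the AR(1) form (the coefficients are hypothetical): iterating x_t = m*x_(t-1) + b converges to the mean-reverting level b/(1 − m) when |m| < 1:

```python
b, m = 0.5, 0.8                   # hypothetical AR(1) coefficients, |m| < 1

x = 0.0
for _ in range(200):
    x = b + m * x                 # x_t = m * x_(t-1) + b

mean_reverting_level = b / (1 - m)   # long-run level the series reverts to
```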

4
Q

Steps for Time Series

A

1) Determine whether the model is linear, log-linear, or AR
2) Check for autocorrelation: t-stats on residual autocorrelations for AR models, Durbin-Watson for others
3) First-difference if a unit root is present: replace x with y = x_t - x_(t-1), changing the model from levels to changes in value; this removes a unit root and removes a trend in the data
4) Correct for seasonality by adding a seasonal lag variable when seasonality is present
5) Test the variance for ARCH; if the error variance depends on prior periods' variance, use an ARCH model and correct for heteroskedasticity
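The first-differencing step is a one-liner on a hypothetical price series:

```python
# Hypothetical level series
x = [100.0, 102.0, 101.0, 105.0, 104.0]

# First difference: y_t = x_t - x_(t-1), modeling changes instead of levels
y = [x[t] - x[t - 1] for t in range(1, len(x))]
```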

5
Q

Supervised Machine Learning (6)

A

1) Penalized regression: penalty for each added variable removes weak ones, reduces overfitting
2) Support vector machine: puts data into one of two classes
3) K-nearest neighbor: classifies data by its nearest neighbors
4) Classification tree: tree of categorical splits
5) Ensemble learning: combines predictions from multiple models
6) Random forest: many classification trees built from the same dataset
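As a toy illustration of K-nearest neighbor (data and labels are invented), a 1-NN classifier just picks the label of the closest training point:

```python
# Hypothetical 1-D training data: (feature, label)
train = [(1.0, "A"), (2.0, "A"), (8.0, "B"), (9.0, "B")]

def nearest_neighbor(x):
    # Classify by the label of the single closest training point
    return min(train, key=lambda point: abs(point[0] - x))[1]
```

For example, nearest_neighbor(1.5) lands in the "A" group and nearest_neighbor(8.5) in the "B" group.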

6
Q

Unsupervised Machine Learning (3)

A

1) Principal components analysis: Large correlated data -> Small uncorrelated data
2) K-Means clustering: Data divided into non-overlapping K clusters
3) Hierarchical clustering: Data put in hierarchy, no predefined clusters
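A minimal sketch of K-means with k = 2 on invented 1-D data:

```python
# Hypothetical 1-D data with two obvious groups
data = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]

centroids = [data[0], data[-1]]           # crude initialization
for _ in range(10):
    clusters = [[], []]
    for x in data:
        # Assign each point to its nearest centroid (non-overlapping clusters)
        nearest = 0 if abs(x - centroids[0]) <= abs(x - centroids[1]) else 1
        clusters[nearest].append(x)
    # Update each centroid to its cluster's mean
    centroids = [sum(c) / len(c) for c in clusters]
```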

7
Q

Neural Networks

A

Neural networks: input layer / hidden layers / output layer; layers contain neurons that perform summation (weighted sum) and activation (non-linear transformation)
Deep learning: neural networks with many layers, for more complex tasks such as image recognition
Reinforcement learning: learns from trial and error to maximize a defined reward
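A single neuron sketched in Python (the weights are hypothetical): a summation step followed by a non-linear activation:

```python
import math

def neuron(inputs, weights, bias):
    z = sum(w * x for w, x in zip(weights, inputs)) + bias   # summation
    return 1.0 / (1.0 + math.exp(-z))                        # sigmoid activation

# With these hypothetical weights the weighted sum is 0, so sigmoid gives 0.5
output = neuron([1.0, 2.0], [0.5, -0.25], 0.0)
```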

8
Q

Precision, Recall, Accuracy, F-Score

A

Precision (P) = True Positives/(False Positives + True Positives), how many were actually right out of the amount you said were right

Recall (R) = True Positives/(True Positives + False Negatives), how many of the positives did you actually find

Accuracy = (True Positives + True Negatives)/Total Data Points, the share of observations you classified correctly

F1 Score = (2 * Precision * Recall) / (Precision + Recall)
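With a hypothetical confusion matrix, all four measures follow directly:

```python
# Hypothetical confusion-matrix counts
tp, fp, tn, fn = 40, 10, 45, 5

precision = tp / (tp + fp)                    # right among those called positive
recall = tp / (tp + fn)                       # positives actually found
accuracy = (tp + tn) / (tp + fp + tn + fn)    # correct among all points
f1 = 2 * precision * recall / (precision + recall)
```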

9
Q

Heteroskedasticity

A

Heteroskedasticity is when the variance of the errors is not constant; it is only a problem when the heteroskedasticity is conditional (related to the independent variables).

Causes Type I errors (null rejected too often)

Detect with the Breusch-Pagan test: BP = n x R^2 from regressing the squared residuals on the independent variables, a chi-square statistic with k df

Correct with White-corrected (robust) standard errors
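The Breusch-Pagan statistic itself is trivial once the auxiliary R^2 is in hand (numbers here are hypothetical):

```python
n = 40                            # observations
k = 2                             # independent variables (chi-square df)
aux_r_squared = 0.15              # hypothetical R^2 from regressing squared
                                  # residuals on the independent variables

bp_stat = n * aux_r_squared       # compare with chi-square critical value, k df
```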

10
Q

Serial Correlation/Autocorrelation

A

Error terms are not independent; they trend in some direction

Causes Type I errors (null rejected too often)

Detect with Durbin-Watson: DW runs from 0 (perfectly positive) through 2 (none) to 4 (perfectly negative serial correlation), Ho: no serial correlation

Use the Hansen method to adjust the standard errors for serial correlation
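The Durbin-Watson statistic on hypothetical residuals (sum of squared changes over sum of squared residuals):

```python
# Hypothetical, persistently positive regression residuals
e = [0.5, 0.4, 0.6, 0.5, 0.3, 0.4]

num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
den = sum(r ** 2 for r in e)
dw = num / den   # ~0 positive correlation, ~2 none, ~4 negative correlation
```

Because the residuals barely change period to period, DW lands far below 2, signaling positive serial correlation.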

11
Q

Multicollinearity

A

Two independent variables are highly correlated, so they are not independent of each other

Causes Type II errors (fail to reject often enough)

Fix by removing one of the correlated variables from the equation
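A quick pairwise-correlation check on two invented regressors (x2 is roughly 2 × x1, so the pair is almost perfectly correlated):

```python
def correlation(xs, ys):
    # Pearson correlation from deviations about the means
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [2.1, 4.0, 6.2, 8.1]    # roughly 2 * x1
r = correlation(x1, x2)       # near 1, a multicollinearity red flag
```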

12
Q

Unit Root

A

Detect a unit root with the Dickey-Fuller test; use Engle-Granger to test whether two series are cointegrated

Unit roots are bad if only one variable has one, or if both variables have one but are not cointegrated

Regression is OK if neither variable has a unit root, or if both do and they are cointegrated
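A sketch of why first differencing fixes a unit root (the shocks are simulated): a random walk x_t = x_(t-1) + e_t has a unit root because the coefficient on the lag is exactly 1, and differencing it recovers the stationary shocks:

```python
import random

random.seed(42)
shocks = [random.gauss(0, 1) for _ in range(50)]

# Random walk: coefficient on the lag equals 1 (the unit root)
walk, level = [], 0.0
for e in shocks:
    level = level + e
    walk.append(level)

# First differences recover the original stationary shocks
diffs = [walk[0]] + [walk[t] - walk[t - 1] for t in range(1, len(walk))]
```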
