Quantitative Methods Flashcards
T-Stat
F-Stat
T-Stat = Slope/Std error with n-k-1 degrees of freedom
F-Stat = MSR/MSE with k and n-k-1 degrees of freedom, one-tailed test, reject H0 if F-stat > F-crit; measures how much of the output the regression explains versus how much the error explains
90% confidence (10% significance, two-tailed) = 1.645
95% confidence (5% significance, two-tailed) = 1.96
99% confidence (1% significance, two-tailed) = 2.58
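A minimal sketch of the t-test on this card, using made-up numbers (the slope, standard error, and sample size below are illustrative, not from the cards):

```python
# Hedged sketch: t-test on a regression slope with hypothetical inputs.
# t = estimated slope / standard error of the slope, with n - k - 1 df.

def slope_t_stat(slope: float, std_error: float) -> float:
    """t-statistic for H0: slope = 0."""
    return slope / std_error

# Assumed example: slope 0.48, standard error 0.20, n = 32 observations, k = 1.
t = slope_t_stat(0.48, 0.20)     # 2.4
df = 32 - 1 - 1                  # n - k - 1 = 30

# With ~30 df the exact two-tailed 5% t critical value is about 2.04; the
# card's 1.96 is the large-sample (z) value.
reject_at_95 = abs(t) > 1.96     # True here
print(t, df, reject_at_95)
```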
ANOVA Table (RSS, SSE, SST, MSR, MSE, R^2, SEE)
Regression (RSS), k df, MSR = RSS/k
Error (SSE), n-k-1 df, MSE = SSE/(n-k-1)
RSS + SSE = SST
R^2 = RSS/SST, the fraction of total variation in Y that the regression explains
Standard error of estimate (SEE) = sqrt(MSE), low if relationship is strong between X and Y
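The ANOVA identities on this card can be checked numerically. The data below are made up; y_hat is the OLS fit of y on x = 1..5, so RSS + SSE = SST holds:

```python
# Hedged sketch of the ANOVA table quantities with hypothetical data.
y     = [2.0, 4.0, 5.0, 4.0, 5.0]   # observed values (made up)
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]   # OLS fitted values for x = 1..5
n, k  = len(y), 1                   # one independent variable

y_bar = sum(y) / n
rss = sum((f - y_bar) ** 2 for f in y_hat)         # regression sum of squares
sse = sum((a - f) ** 2 for a, f in zip(y, y_hat))  # error sum of squares
sst = sum((a - y_bar) ** 2 for a in y)             # total: RSS + SSE = SST

msr = rss / k            # mean square regression, k df
mse = sse / (n - k - 1)  # mean square error, n - k - 1 df
r2  = rss / sst          # R^2 = RSS / SST
see = mse ** 0.5         # standard error of estimate = sqrt(MSE)
f_stat = msr / mse       # F = MSR / MSE
```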
Linear
Log Linear
Auto Regressive (AR)
ARCH
Linear: y=mx+b
Log Linear: ln(y) = mx + b, equivalently y = e^(mx+b)
AR: X_t = m·X_(t-1) + b
ARCH: e_t^2 = m·e_(t-1)^2 + b, where e is the regression residual (error variance depends on the prior period's error variance)
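A quick sketch of the AR(1) card in its own notation, X_t = m·X_(t-1) + b. The coefficients m = 0.8 and b = 1.0 are illustrative assumptions, not from the cards:

```python
# Hedged sketch: one-step-ahead AR(1) forecast with made-up coefficients.

def ar1_forecast(x_prev: float, m: float, b: float) -> float:
    """Forecast X_t = m * X_(t-1) + b."""
    return m * x_prev + b

m, b = 0.8, 1.0
forecast = ar1_forecast(10.0, m, b)   # 0.8 * 10 + 1 = 9.0

# A covariance-stationary AR(1) mean-reverts to b / (1 - m), about 5.0 here.
mean_reverting_level = b / (1 - m)
print(forecast, mean_reverting_level)
```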
Steps for Time Series
1) Determine Linear/Log/AR
2) Check for autocorrelation via t-stat for AR, Durbin Watson for others
3) First difference if a unit root or trend is present: replace x with y = x_t - x_(t-1), changing the model to the change in the variable rather than its level; this removes a unit root and removes a trend in the data
4) Correct for seasonality via adding lag variable when there is seasonality
5) Test the residual variance for ARCH; if the variance depends on prior-period variance, then use an ARCH model and correct for heteroskedasticity
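Step 3 above can be sketched directly; the level series below is hypothetical:

```python
# Hedged sketch of first-differencing: y_t = x_t - x_(t-1), so the model
# is fit on changes rather than levels (removes a trend / unit root).

def first_difference(x):
    return [x[t] - x[t - 1] for t in range(1, len(x))]

levels = [100, 103, 105, 110, 114]   # trending series (made up)
print(first_difference(levels))      # [3, 2, 5, 4]
```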
Supervised Machine Learning (6)
1) Penalized Regression: Penalty for overfit, remove bad variables, reduces overfitting
2) Support Vector Machine: Separates data into one of two groups with the widest possible margin
3) K-Nearest Neighbor: Data classified by nearest neighbor
4) Classification Tree: Categorical Tree
5) Ensemble Learning: Combines predictions from multiple models to improve accuracy
6) Random Forest: Ensemble of many classification trees, each trained on a random subset of the same data
Unsupervised Machine Learning (3)
1) Principal components analysis: Large correlated data -> Small uncorrelated data
2) K-Means clustering: Data divided into non-overlapping K clusters
3) Hierarchical clustering: Data put in hierarchy, no predefined clusters
Neural Networks
Neural Networks: Input/Layers/Output, layers have neurons which are either summation (average) or activation (non-linear)
Deep Learning: Neural networks with many hidden layers, used for complex tasks such as image recognition
Reinforcement Learning: Learn from error to maximize defined reward
Precision, Recall, Accuracy, F-Score
Precision (P) = True Positives/(False Positives + True Positives), of the items you labeled positive, how many actually were
Recall (R) = True Positives/(True Positives + False Negatives), how many of the positives did you actually find
Accuracy = (True Positives + True Negatives)/Total Data Points, the share of all data points you identified correctly
F1 Score = (2 * Precision * Recall) / (Precision + Recall)
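The four metrics above, computed from a made-up confusion matrix (the counts are illustrative only):

```python
# Hedged sketch: precision, recall, accuracy, and F1 from hypothetical counts.
tp, fp, fn, tn = 40, 10, 20, 30   # assumed confusion-matrix counts

precision = tp / (tp + fp)                   # 40/50 = 0.8
recall    = tp / (tp + fn)                   # 40/60 ≈ 0.667
accuracy  = (tp + tn) / (tp + fp + fn + tn)  # 70/100 = 0.7
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean ≈ 0.727
print(precision, recall, accuracy, f1)
```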
Heteroskedasticity
Variance of the errors is not constant; only a problem when it is conditional heteroskedasticity (error variance related to the independent variables).
Causes Type I errors (rejected too often)
Use the Breusch-Pagan test to detect: BP = n × R^2 from a regression of squared residuals on the independent variables, distributed chi-square with k df
Use White-corrected standard errors to correct
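The Breusch-Pagan mechanics can be sketched with assumed numbers (the sample size and auxiliary R^2 below are hypothetical):

```python
# Hedged sketch of the Breusch-Pagan test statistic: regress squared residuals
# on the independent variable(s); BP = n * R^2 of that auxiliary regression,
# chi-square distributed with k df.

n = 40           # observations (assumed)
k = 1            # independent variables in the auxiliary regression
r2_resid = 0.15  # hypothetical R^2 from regressing squared residuals on X

bp_stat = n * r2_resid     # 6.0
chi2_crit_5pct_1df = 3.84  # 5% chi-square critical value, 1 df
reject_homoskedasticity = bp_stat > chi2_crit_5pct_1df  # True here
print(bp_stat, reject_homoskedasticity)
```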
Serial Correlation/Autocorrelation
Terms are not independent, they trend in some direction
Causes Type I errors (rejected too often)
Use Durbin-Watson to detect; DW ranges from 0 (perfect positive correlation) to 4 (perfect negative correlation), with DW ≈ 2 meaning none; Ho: No serial correlation
Use Hansen Method to adjust for serial correlation
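The Durbin-Watson statistic is easy to compute directly; the residual series below is made up to show negative serial correlation:

```python
# Hedged sketch: DW = sum((e_t - e_(t-1))^2) / sum(e_t^2).
# Roughly DW ≈ 2(1 - r): near 2 means no serial correlation,
# near 0 positive, near 4 negative.

def durbin_watson(e):
    num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    den = sum(v ** 2 for v in e)
    return num / den

residuals = [0.5, -0.6, 0.4, -0.5, 0.6, -0.4]  # alternating signs (made up)
dw = durbin_watson(residuals)  # well above 2: negative serial correlation
print(dw)
```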
Multicollinearity
Two variables are highly correlated so they are not independent
Solve this by removing one variable from the equation
Causes Type II errors (do not reject enough)
Unit Root
Detected by the Dickey-Fuller test; the Engle-Granger test checks whether two series are cointegrated
Unit roots are bad if only one variable has one, or if both variables have one and they are not cointegrated
Ok if no unit root or the variables are cointegrated