Quantitative Methods Flashcards

1
Q

MSE Formula

A

SSE / (n-k-1)

2
Q

AIC

A

n x ln(SSE/n) + 2(k+1)
Preferred when the goal is a better forecast

3
Q

BIC

A

n x ln(SSE/n) + ln(n) x (k+1)
Preferred when the goal is a better goodness of fit

4
Q

F-statistic

A

MSR/MSE
Null hypothesis: all slope coefficients are simultaneously 0
Reject if F > F(critical)

5
Q

F-statistic joint hypothesis

A

((SSE_R - SSE_U)/q) / (SSE_U/(n-k-1))
q is number of excluded variables in restricted model
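Taken together, the formulas on the last few cards can be checked with a quick sketch (the sums of squares and sample sizes below are made-up numbers, not from any card):

```python
import math

n, k = 30, 2              # observations and slope coefficients (assumed values)
sse, rss = 120.0, 80.0    # unexplained / explained sums of squares (assumed)

mse = sse / (n - k - 1)   # mean squared error
msr = rss / k             # mean squared regression
f_stat = msr / mse        # tests H0: all slopes are simultaneously zero

aic = n * math.log(sse / n) + 2 * (k + 1)
bic = n * math.log(sse / n) + math.log(n) * (k + 1)

print(mse, f_stat)  # BIC penalizes extra variables more heavily once ln(n) > 2
```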

6
Q

Dickey-Fuller

A

Test for unit root to test whether data is covariance non-stationary

7
Q

Durbin-Watson

A

Test for serial correlation in residuals of trend models but cannot be used in AR models

8
Q

T-test for AR models

A

Used to test for autocorrelation in the residuals of AR models

9
Q

Breusch-Pagan

A

Used to detect conditional heteroskedasticity

10
Q

Breusch-Godfrey

A

Used to detect serial correlation in the residuals

11
Q

White-corrected standard errors

A

Used to correct for conditional heteroskedasticity

12
Q

Impact of conditional heteroskedasticity (overestimate)

A

No effect on coefficient estimate
Std Err of coefficient overestimated
More Type II errors

13
Q

Impact of conditional heteroskedasticity (underestimate)

A

No effect on coefficient estimate
Std Err of coefficient underestimated
More Type I errors

14
Q

Newey-West Standard Errors

A

Used to correct for positive serial correlation

15
Q

Impact of serial correlation

A

No impact on coefficient estimate
Std Err underestimated
More Type I errors

16
Q

Dummy variable misspecification

A

To distinguish among n classes, use n-1 dummy variables; using one per class (n dummies plus an intercept) causes perfect multicollinearity

17
Q

What happens to the coefficients of correlated independent variables when a new correlated variable is added to the model?

A

Adding the new variable will change the coefficient for the other correlated variables.

18
Q

What does the intercept term (b0) represent in a multiple linear regression model?

A

It shows the value of the dependent variable when all independent variables are 0.

19
Q

What do the slope coefficients (bi) represent in a multiple linear regression model?

A

They are the estimated changes in the dependent variable for a one-unit change in the corresponding independent variable, holding all other independent variables constant. Also called partial slope coefficients.

20
Q

What are the assumptions underlying a multiple linear regression model?

A
1. Linearity between dependent and independent variables.
2. No significant multicollinearity.
3. Expected error is 0.
4. Homoscedasticity (constant error variance).
5. No serial correlation (errors are independent).
6. Errors are normally distributed.
21
Q

How is the Total Sum of Squares (SST) calculated?

A
1. Subtract the mean from each individual observation.
2. Square each result.
3. Sum the squared results.
Degrees of freedom = n-1.
22
Q

How is the Regression Sum of Squares (RSS) calculated?

A
1. Subtract the mean from each predicted observation.
2. Square each result.
3. Sum the squared results.
Degrees of freedom = k.
23
Q

How is the Sum of Squared Errors (SSE) calculated?

A
1. Subtract the predicted value from each observed value.
2. Square each result.
3. Sum the squared results.
Degrees of freedom = n-k-1.
24
Q

What is the formula for R2 (Coefficient of Determination)?

A

R2 = RSS/SST (Explained Variation / Total Variation).

25
How does Adjusted R2 differ from R2?
Adjusted R2 penalizes the addition of unnecessary independent variables, preventing overfitting. It's always less than or equal to R2. Formula: 1−[(1−R2)×(n−1)/(n−k−1)].
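A minimal numeric check of the two formulas above (SST, RSS, n, and k are assumed toy values):

```python
n, k = 30, 2                  # observations and slope coefficients (assumed)
sst, rss = 200.0, 80.0        # total and explained variation (assumed)

r2 = rss / sst                                   # 0.4
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)    # always <= r2

print(r2, adj_r2)
```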
26
What do AIC and BIC represent, and how are they used?
Lower AIC (n×ln(SSE/n)+2(k+1)) or BIC (n×ln(SSE/n)+ln(n)(k+1)) indicates a better fitting model. They penalize adding variables, BIC more heavily. AIC is better for prediction, BIC for goodness of fit.
27
How is the F-statistic calculated and interpreted in ANOVA?
F-statistic = MSR/MSE. It compares explained variance to unexplained variance. A large F-statistic suggests the regression explains a significant share of the variation (at least one slope coefficient is nonzero).
28
What is Conditional Heteroskedasticity and how does it affect statistical inference?
Error variance changes systematically with the independent variable. It can lead to underestimated standard errors (Type I errors) or overestimated standard errors (Type II errors). Detected using residual plots or Breusch-Pagan test. Corrected using White-corrected standard errors.
29
What is Serial Correlation and how does it affect statistical inference?
Error terms are not independently distributed. Positive serial correlation leads to underestimated standard errors and overestimated T/F statistics (Type I errors). Detected using residual plots, Durbin-Watson test, or Breusch-Godfrey test. Corrected using Newey-West standard errors.
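The Durbin-Watson statistic mentioned above is simple enough to compute directly; this sketch uses made-up residual series:

```python
def durbin_watson(resid):
    # DW ~ 2: no serial correlation; near 0: positive; near 4: negative
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e * e for e in resid)
    return num / den

print(durbin_watson([1.0, 1.0, 1.0, 1.0]))    # persistent residuals -> 0.0
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))  # alternating residuals -> 3.0
```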
30
What is Multicollinearity and how does it affect regression analysis?
Significant correlation exists between two or more independent variables. Causes unreliable coefficient estimates and inflated standard errors, leading to insignificant t-statistics (Type II errors). Detected by high pairwise correlations, insignificant t-tests with a significant F-test, or VIF > 10. Corrected by excluding problematic variables.
31
What are the principles of good model specification?
Grounded in economic reasoning, appropriate functional form, essential variables only, no violation of assumptions, tested out of sample.
32
How are qualitative independent variables incorporated into regression models?
Using dummy variables (binary 0 or 1 variables). To distinguish between n classes, use n-1 dummy variables.
33
What is Logistic Regression used for?
Used for qualitative dependent variables, modeling the probability of an event happening (between 0-1). The dependent variable is the log odds: ln(P/(1−P)).
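The log-odds transform and its inverse (the logistic function) can be verified directly; p = 0.8 is an arbitrary example probability:

```python
import math

p = 0.8
log_odds = math.log(p / (1 - p))          # the regression's dependent variable
p_back = 1 / (1 + math.exp(-log_odds))    # the logistic (sigmoid) inverts the transform

print(round(log_odds, 4), p_back)
```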
34
How do you test if a time series is covariance stationary?
Check if: 1) Expected value is constant and finite. 2) Variance is constant and finite. 3) Covariance with itself for a fixed lag is constant and finite. Use the Dickey-Fuller test.
35
What is an Autoregressive (AR) model?
A time-series model that uses lagged values of itself to predict future values. Example AR(1): Yt = b0 + b1 Yt−1 + ϵt.
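An AR(1) process can be simulated and its slope recovered with a simple OLS fit of the series on its own lag (the parameters and seed below are arbitrary):

```python
import random

random.seed(7)
b0, b1 = 2.0, 0.6                 # assumed AR(1) parameters
y = [b0 / (1 - b1)]               # start at the mean-reverting level (= 5.0)
for _ in range(5000):
    y.append(b0 + b1 * y[-1] + random.gauss(0.0, 1.0))

# OLS slope of y_t on y_{t-1} recovers b1 approximately
x, z = y[:-1], y[1:]
mx, mz = sum(x) / len(x), sum(z) / len(z)
b1_hat = sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z)) / sum((xi - mx) ** 2 for xi in x)
print(round(b1_hat, 2))
```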
36
What is mean reversion in a time series?
A tendency for the series to move back towards its long-term average. Occurs in AR models when |b1| < 1; the series reverts toward the level b0/(1−b1).
37
What is a random walk process?
A time series where the predicted value in one period is equal to the value in the previous period (b1 = 1). Not covariance stationary. If b0 ≠ 0, it's a random walk with drift.
38
What is a unit root, and how is it tested?
Occurs in an AR model when b1 = 1, leading to a non-stationary (random walk) process. Tested using the Dickey-Fuller test, which tests if (b1 −1)=0.
39
How can a time series with a unit root be transformed for analysis?
By first differencing: modeling the change in the variable (Yt − Yt−1) instead of the level.
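First differencing is a one-liner; the toy levels below are arbitrary:

```python
y = [10, 12, 15, 14, 18]                          # levels (assumed toy series)
dy = [y[t] - y[t - 1] for t in range(1, len(y))]  # model the changes, not the levels
print(dy)
```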
40
How is seasonality detected and corrected in time-series models?
Detected by testing for significant autocorrelation at seasonal lags. Corrected by adding a seasonal lag (another independent variable) to the AR model.
41
What is Autoregressive Conditional Heteroskedasticity (ARCH)?
When the variance of the error term in one period depends on the variance in a previous period. Tested by regressing squared residuals on lagged squared residuals: ϵ̂²t = a0 + a1 ϵ̂²t−1 + μt.
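The ARCH test amounts to an OLS regression of squared residuals on their own lag; this sketch uses made-up residuals, and a significantly positive slope would indicate ARCH effects:

```python
def ols_slope(x, z):
    # simple-regression slope of z on x
    mx, mz = sum(x) / len(x), sum(z) / len(z)
    return sum((a - mx) * (b - mz) for a, b in zip(x, z)) / sum((a - mx) ** 2 for a in x)

resid = [1.0, -2.0, 1.5, -0.5, 2.0, -1.0]   # assumed regression residuals
sq = [e * e for e in resid]                  # squared residuals
a1_hat = ols_slope(sq[:-1], sq[1:])          # slope on the lagged squared residual
print(round(a1_hat, 4))
```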
68
What are the main types of machine learning?
Supervised learning (labeled data), Unsupervised learning (unlabeled data), Deep learning (neural networks with many layers), Reinforcement learning (agent learns through rewards/penalties).
69
What is overfitting in machine learning?
When a model learns the training data too well, including noise, resulting in poor performance on new, unseen data (high variance error). It often occurs with complex models or insufficient data.
70
How can overfitting be addressed?
Use validation samples, cross-validation (like K-fold), penalized regression (like LASSO), reducing model complexity, or using ensemble methods (like Random Forests).
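K-fold cross-validation just partitions the sample; a minimal index-splitting sketch (in practice a library such as scikit-learn would be used):

```python
def kfold_indices(n, k):
    # split range(n) into k contiguous validation folds of near-equal size
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

print(kfold_indices(10, 3))   # [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
```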
71
Name some supervised learning algorithms and their uses.
Penalized Regression (Regression, reduces overfitting), Support Vector Machine (SVM) (Classification), K-Nearest Neighbor (KNN) (Classification), CART (Classification/Regression), Random Forest (Classification/Regression, ensemble).
72
Name some unsupervised learning algorithms and their uses.
Principal Components Analysis (PCA) (Dimension Reduction), K-Means Clustering (Clustering), Hierarchical Clustering (Clustering).
73
What is the purpose of Principal Component Analysis (PCA)?
To reduce dimensionality by summarizing correlated features into a smaller set of uncorrelated factors called principal components (eigenvectors).
74
How does K-Means clustering work?
Partitions data into 'k' clusters by iteratively assigning observations to the nearest centroid and recalculating centroids until assignments stabilize.
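The assign-then-recompute loop can be sketched for one dimension and two clusters (the data and starting centroids are arbitrary; a real implementation would handle empty clusters and test for convergence):

```python
def kmeans_1d(points, c1, c2, iters=10):
    for _ in range(iters):
        g1 = [p for p in points if abs(p - c1) <= abs(p - c2)]  # assign to nearest centroid
        g2 = [p for p in points if abs(p - c1) > abs(p - c2)]
        c1, c2 = sum(g1) / len(g1), sum(g2) / len(g2)           # recompute centroids
    return c1, c2

print(kmeans_1d([1, 2, 3, 10, 11, 12], 0.0, 5.0))   # (2.0, 11.0)
```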
75
How does Hierarchical clustering work?
Builds a hierarchy of clusters. Agglomerative (bottom-up) starts with individual points and merges clusters; Divisive (top-down) starts with one cluster and splits them.
76
What are the key steps in a data analysis project?
Conceptualize model, Collect data, Prepare & Wrangle data (cleanse, transform, scale), Explore data (EDA, feature selection/engineering), Train model, Evaluate model.
77
What are common text wrangling techniques?
Tokenization, removing stop words, lowercasing, stemming, lemmatization, creating Bag-of-Words or N-grams.
78
How is model performance evaluated in classification?
Using metrics like Accuracy, Precision, Recall, F1-score, ROC curve, and AUC derived from a confusion matrix (TP, FP, TN, FN)
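The metrics follow directly from the four confusion-matrix counts (the counts here are made up):

```python
tp, fp, fn, tn = 40, 10, 20, 30     # assumed confusion-matrix counts

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)          # of predicted positives, how many were right
recall = tp / (tp + fn)             # of actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)

print(accuracy, precision, round(recall, 3), round(f1, 3))
```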
79
What are the main types of machine learning?
Supervised learning (labeled data), Unsupervised learning (unlabeled data), Deep learning (neural networks with many layers), Reinforcement learning (agent learns through rewards/penalties).
80
What is overfitting in machine learning?
When a model learns the training data too well, including noise, resulting in poor performance on new, unseen data (high variance error). It often occurs with complex models or insufficient data.
81
How can overfitting be addressed?
Use validation samples, cross-validation (like K-fold), penalized regression (like LASSO), reducing model complexity, or using ensemble methods (like Random Forests).
82
Name some supervised learning algorithms and their uses.
Penalized Regression (Regression, reduces overfitting), Support Vector Machine (SVM) (Classification), K-Nearest Neighbor (KNN) (Classification), CART (Classification/Regression), Random Forest (Classification/Regression, ensemble).
83
Name some unsupervised learning algorithms and their uses.
Principal Components Analysis (PCA) (Dimension Reduction), K-Means Clustering (Clustering), Hierarchical Clustering (Clustering).
84
What is the purpose of Principal Component Analysis (PCA)?
To reduce dimensionality by summarizing correlated features into a smaller set of uncorrelated factors called principal components (eigenvectors).
85
How does K-Means clustering work?
Partitions data into 'k' clusters by iteratively assigning observations to the nearest centroid and recalculating centroids until assignments stabilize.
86
How does Hierarchical clustering work?
Builds a hierarchy of clusters. Agglomerative (bottom-up) starts with individual points and merges clusters; Divisive (top-down) starts with one cluster and splits them.
87
What are the key steps in a data analysis project?
Conceptualize model, Collect data, Prepare & Wrangle data (cleanse, transform, scale), Explore data (EDA, feature selection/engineering), Train model, Evaluate model.
88
What are common text wrangling techniques?
Tokenization, removing stop words, lowercasing, stemming, lemmatization, creating Bag-of-Words or N-grams.
89
How is model performance evaluated in classification?
Using metrics like Accuracy, Precision, Recall, F1-score, ROC curve, and AUC derived from a confusion matrix (TP, FP, TN, FN).