Quantitative Methods Flashcards
What is the sample correlation coefficient (r) for 2 variables?
r = Cov(X,Y) / (SD X * SD Y)
What is the t-test formula for the significance of the correlation coefficient?
t = (r * √(n-2)) / √(1 - r^2), with n-2 degrees of freedom
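A minimal sketch of the two formulas above, computing r from sample covariance and sample standard deviations and then the t-statistic. The function names and the five data points are made up for illustration.

```python
import math

def sample_correlation(x, y):
    """r = Cov(X,Y) / (SD X * SD Y), using n-1 in every denominator."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov_xy / (sd_x * sd_y)

def correlation_t_stat(r, n):
    """t = r * sqrt(n-2) / sqrt(1 - r^2), with n-2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Illustrative data only (not from the flashcards).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

r = sample_correlation(x, y)
t = correlation_t_stat(r, len(x))
print(round(r, 4), round(t, 4))  # 0.7746 2.1213
```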
What is the formula for the slope coefficient (b)?
b = Cov(X,Y) / Var(X)
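A short sketch of the slope formula. The intercept line (mean of Y minus slope times mean of X) is the standard least-squares result and is added here as an assumption, since the card only covers the slope. Names and data are illustrative.

```python
def slope_and_intercept(x, y):
    """Slope b = Cov(X,Y) / Var(X); intercept = mean(Y) - b * mean(X)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    var_x = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x  # assumption: standard OLS intercept
    return slope, intercept

# Illustrative data only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

b1, b0 = slope_and_intercept(x, y)
print(round(b1, 4), round(b0, 4))  # 0.6 2.2
```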
How to interpret the t-test value?
If the absolute value of the calculated test statistic exceeds the critical value, reject the null hypothesis: the result is statistically significant.
What are the six assumptions of the classic normal linear regression model?
1) Linear relation exists between dependent and independent variables
2) Independent variable is not random
3) Expected value of error term is 0
4) Variance of error term is the same for all observations (homoskedasticity)
5) Error term is uncorrelated across observations
6) Error term is normally distributed
What does the standard error of estimate (SEE) measure?
How well the regression model fits the data. If SEE is small, the model fits well.
What is the formula for SEE?
SEE = (Unexplained variation / (n - 2))^0.5
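What is the coefficient of determination and what is its formula?
The fraction of the total variation in the dependent variable that is explained by the regression.
R^2 = (Total variation - Unexplained variation) / Total variation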
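The SEE and coefficient of determination formulas can be sketched together, since both start from the unexplained variation. The fitted values below are assumed to come from the line 2.2 + 0.6x on made-up data, and the n-2 denominator applies to a regression with one independent variable.

```python
import math

def see_and_r_squared(y, y_hat):
    """Return (SEE, R^2) for a one-variable regression."""
    n = len(y)
    mean_y = sum(y) / n
    total_variation = sum((v - mean_y) ** 2 for v in y)
    unexplained_variation = sum((v - f) ** 2 for v, f in zip(y, y_hat))
    see = math.sqrt(unexplained_variation / (n - 2))
    r_squared = (total_variation - unexplained_variation) / total_variation
    return see, r_squared

# Illustrative observed values and fitted values (assumed line: 2.2 + 0.6x).
y = [2.0, 4.0, 5.0, 4.0, 5.0]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]

see, r_squared = see_and_r_squared(y, y_hat)
print(round(see, 4), round(r_squared, 4))  # 0.8944 0.6
```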
What is the formula for F-statistic?
F-statistic = Regression MSS / Residual MSS
What is the formula for sample variance of dependent variable?
Total variation / (n-1)
How to calculate the confidence interval for a regression coefficient?
Coefficient +/- critical t-value * Standard error of the coefficient
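As a rough sketch, the interval is just the estimate plus and minus a margin. The critical value is passed in as if looked up from a t-table, and all three input numbers are hypothetical.

```python
def confidence_interval(coefficient, standard_error, critical_value):
    """Return (lower, upper) = coefficient -/+ critical_value * standard_error."""
    margin = critical_value * standard_error
    return coefficient - margin, coefficient + margin

# Hypothetical inputs: estimate 0.5, standard error 0.1, critical t-value 2.0.
lower, upper = confidence_interval(0.5, 0.1, 2.0)
print(round(lower, 4), round(upper, 4))  # 0.3 0.7
```

If the interval excludes 0, the coefficient is significantly different from 0 at the chosen significance level.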
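What is the formula for the F-test?
F = (RSS/k) / (SSE / (n-(k+1))) = MSR/MSE, where RSS is the regression sum of squares, SSE is the sum of squared errors, and k is the number of independent variables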
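A small sketch of the F-statistic, assuming RSS is the regression (explained) sum of squares and SSE is the sum of squared errors. The input values are illustrative.

```python
def f_statistic(rss, sse, n, k):
    """F = (RSS / k) / (SSE / (n - (k + 1))) = MSR / MSE."""
    msr = rss / k               # mean regression sum of squares
    mse = sse / (n - (k + 1))   # mean squared error
    return msr / mse

# Illustrative values: 5 observations, 1 independent variable.
f_value = f_statistic(rss=3.6, sse=2.4, n=5, k=1)
print(round(f_value, 4))  # 4.5
```

With a single independent variable, F equals the square of the slope's t-statistic.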
In a multiple linear regression model, what is the t-test and how can we interpret its result?
t = (b - 0) / Standard Error
The lower the p-value, the more significant the result
What are the six assumptions of the classical normal multiple linear regression model?
1) Linear relation exists between dependent and independent variables
2) Independent variables are not random. No exact linear relation exists between 2 or more independent variables.
3) Expected value of error term is 0
4) Variance of error term is the same for all observations (homoskedasticity)
5) Error term is uncorrelated across observations
6) Error term is normally distributed
When predicting the dependent variable using a linear regression model, what are the two types of uncertainty we encounter?
1) Uncertainty in the regression model itself (SEE)
2) Uncertainty about the estimates of the regression coefficients
What is the formula for adjusted R^2?
Adjusted R^2 = 1 - ((n-1)/(n-k-1)) * (1 - R^2)
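A one-function sketch of the adjustment, where n is the number of observations and k the number of independent variables. The inputs are illustrative.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R^2 = 1 - ((n - 1) / (n - k - 1)) * (1 - R^2)."""
    return 1 - ((n - 1) / (n - k - 1)) * (1 - r_squared)

# Illustrative values: R^2 of 0.6 from 5 observations and 1 independent variable.
adj = adjusted_r_squared(0.6, n=5, k=1)
print(round(adj, 4))  # 0.4667
```

For k >= 1, adjusted R^2 is never higher than R^2, and it can fall when an added variable explains too little.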
What is conditional heteroskedasticity and what are its consequences?
1) Variance of the errors differs across observations and is correlated with the values of the independent variables
2) F-test is unreliable
3) Standard errors are typically underestimated and t-stats are inflated
4) If ignored, we tend to find significant relationships when none actually exists
How to correct for heteroskedasticity?
1) Computing robust standard errors
2) Generalized least squares (modifies original equation)
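One common form of robust standard error is White's heteroskedasticity-consistent (HC0) estimator. The cards do not say which estimator is meant, so treat this as an assumed example for a regression with one independent variable. Function name and data are illustrative.

```python
import math

def slope_standard_errors(x, y):
    """Return (robust HC0 SE, conventional SE) of the slope in y = b0 + b1*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    sxx = sum(d ** 2 for d in dx)
    b1 = sum(d * (v - mean_y) for d, v in zip(dx, y)) / sxx
    b0 = mean_y - b1 * mean_x
    residuals = [v - (b0 + b1 * a) for a, v in zip(x, y)]
    # HC0: each squared residual is weighted by its own squared x-deviation.
    robust_var = sum((d * e) ** 2 for d, e in zip(dx, residuals)) / sxx ** 2
    # Conventional: one common error variance (SEE^2) for all observations.
    conventional_var = (sum(e ** 2 for e in residuals) / (n - 2)) / sxx
    return math.sqrt(robust_var), math.sqrt(conventional_var)

# Illustrative data only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

robust_se, conventional_se = slope_standard_errors(x, y)
print(round(robust_se, 4), round(conventional_se, 4))  # 0.1855 0.2828
```

Five points are far too few to judge heteroskedasticity. The example only shows the mechanics of the two calculations.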