Session 3 - Quantitative Methods Flashcards
Covariance
Statistical measure of the degree to which 2 variables move together. Its range is unbounded (it can take any value from negative infinity to positive infinity).
= Σ[(X - mean of X)(Y - mean of Y)] / (n - 1)
Correlation Coefficient
A measure of the strength of the linear relationship (correlation) between 2 variables. Range of -1 to 1.
= (covariance of X and Y) / [(sample SD of X)(sample SD of Y)]
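A minimal Python sketch of the covariance and correlation formulas above; the x and y data are hypothetical values used only for illustration.

```python
from math import sqrt

# Hypothetical paired sample (made-up values)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sample covariance: sum of cross-deviations divided by (n - 1)
cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)

# Sample standard deviations
sd_x = sqrt(sum((xi - mean_x) ** 2 for xi in x) / (n - 1))
sd_y = sqrt(sum((yi - mean_y) ** 2 for yi in y) / (n - 1))

# Correlation coefficient: covariance scaled by the product of the SDs
r = cov_xy / (sd_x * sd_y)
print(cov_xy, r)
```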
T-test
Used to determine if a correlation coefficient, r, is statistically significant.
= r√(n - 2) / √(1 - r²)
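A minimal sketch of this significance test; r and n are assumed values, and the statistic is compared to a two-tailed critical t-value with n - 2 degrees of freedom.

```python
from math import sqrt

r = 0.60   # sample correlation (assumed)
n = 30     # sample size (assumed)

# Test statistic for H0: correlation = 0, with n - 2 degrees of freedom
t_stat = (r * sqrt(n - 2)) / sqrt(1 - r ** 2)

# Compare |t_stat| to the two-tailed critical value for n - 2 = 28 df
# (about 2.048 at the 5% significance level)
print(t_stat)
```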
Slope Coefficient
The change in the dependent variable for a 1-unit change in the independent variable.
= (covariance of X and Y) / (variance of X)
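A minimal sketch of the slope (and intercept) of a simple linear regression using the covariance/variance form above; the data are hypothetical.

```python
x = [1.0, 2.0, 3.0, 4.0, 5.0]   # independent variable (made-up)
y = [2.1, 3.9, 6.2, 8.1, 9.8]   # dependent variable (made-up)
n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)
var_x = sum((xi - mean_x) ** 2 for xi in x) / (n - 1)

b1 = cov_xy / var_x          # slope: change in Y per 1-unit change in X
b0 = mean_y - b1 * mean_x    # intercept
print(b1, b0)
```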
Sum of Squared Errors (SSE)
The sum of the squared vertical distances between the estimated and actual Y-values.
Standard Error of Estimate (SEE)
The standard deviation of the regression's error terms (residuals); it gauges the fit of the regression line. Smaller SEE = better fit.
= √[SSE / (n - 2)]
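A minimal sketch of SSE and SEE for a simple regression; the slope and intercept are estimated as in the previous sketch and the data are hypothetical.

```python
from math import sqrt

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
b1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / \
     sum((xi - mean_x) ** 2 for xi in x)
b0 = mean_y - b1 * mean_x

y_hat = [b0 + b1 * xi for xi in x]                      # predicted Y-values
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))   # sum of squared errors
see = sqrt(sse / (n - 2))                               # standard error of estimate
print(sse, see)
```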
Regression Sum of Squares (RSS)
Measures the variation in the dependent variable that is explained by the independent variable.
It is the sum of the squared vertical distances between the predicted Y-values and the mean of Y.
Total Sum of Squares (SST)
Measures the total variation in the dependent variable. It is equal to the sum of the squared differences between the actual Y-values and the mean of Y.
Coefficient of Determination (R²)
The % of the total variation in the dependent variable explained by the independent variable.
= RSS/SST
= (SST - SSE)/SST
Total Variation (ANOVA)
= Explained variation (RSS) + Unexplained variation (SSE)
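A minimal sketch of the ANOVA decomposition and R² for the same hypothetical regression; it checks that SST equals RSS plus SSE.

```python
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
b1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / \
     sum((xi - mean_x) ** 2 for xi in x)
b0 = mean_y - b1 * mean_x
y_hat = [b0 + b1 * xi for xi in x]

sst = sum((yi - mean_y) ** 2 for yi in y)               # total variation
rss = sum((yh - mean_y) ** 2 for yh in y_hat)           # explained variation
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))   # unexplained variation

r_squared = rss / sst                                   # equals (sst - sse) / sst
print(sst, rss + sse, r_squared)                        # sst matches rss + sse
```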
F-statistic
Assesses how well a set of independent variables, as a group, explains the variation in the dependent variable.
= (RSS / k) / [SSE / (n - k - 1)]
where k = # of independent variables
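A minimal sketch of the F-statistic computation; the RSS, SSE, n, and k values are assumed, not taken from the cards above.

```python
rss = 460.0   # explained variation (assumed)
sse = 140.0   # unexplained variation (assumed)
n = 65        # number of observations (assumed)
k = 4         # number of independent variables (assumed)

msr = rss / k               # mean regression sum of squares
mse = sse / (n - k - 1)     # mean squared error
f_stat = msr / mse          # compared to the F critical value with (k, n - k - 1) df
print(f_stat)
```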
P-value
The smallest level of significance for which the null hypothesis can be rejected.
An alternative way to test the significance of a regression coefficient is to compare its p-value to the chosen significance level.
If the p-value is less than the significance level, the null hypothesis can be rejected.
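A minimal sketch of the p-value decision rule; the p-value and significance level are assumed.

```python
p_value = 0.031   # reported p-value for a coefficient (assumed)
alpha = 0.05      # chosen significance level (assumed)

if p_value < alpha:
    print("Reject the null hypothesis: the coefficient is significant.")
else:
    print("Fail to reject the null hypothesis.")
```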
Confidence Interval
Estimated regression coefficient +/- (critical t-value)(coefficient standard error)
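A minimal sketch of a confidence interval for a regression coefficient; the estimate, standard error, and critical t-value are assumed.

```python
b1 = 0.76        # estimated regression coefficient (assumed)
se_b1 = 0.33     # coefficient standard error (assumed)
t_crit = 2.048   # two-tailed 5% critical t-value for 28 df (assumed)

lower = b1 - t_crit * se_b1
upper = b1 + t_crit * se_b1
print(lower, upper)   # reject H0: coefficient = 0 only if 0 falls outside the interval
```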
Adjusted R²
Adjusts R² for the number of independent variables, overcoming the problem that R² overstates the explanatory power of a model as additional variables are added.
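A minimal sketch using the standard adjusted R² formula, 1 - [(n - 1) / (n - k - 1)](1 - R²); the R², n, and k values are assumed.

```python
r_squared = 0.64   # unadjusted R² (assumed)
n = 65             # number of observations (assumed)
k = 4              # number of independent variables (assumed)

adj_r_squared = 1 - ((n - 1) / (n - k - 1)) * (1 - r_squared)
print(adj_r_squared)   # never exceeds r_squared; penalizes adding variables
```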
Dummy Variables
Independent variables that are binary in nature. They are often used to quantify the impact of qualitative events. They are assigned a value of 0 or 1.
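A minimal sketch of dummy-variable coding for a qualitative seasonality factor; the quarter labels are hypothetical.

```python
quarters = ["Q1", "Q2", "Q3", "Q4", "Q1", "Q2"]   # made-up qualitative data

# Use n - 1 dummies for n categories to avoid perfect multicollinearity;
# here Q4 is the omitted (base) category.
q1 = [1 if q == "Q1" else 0 for q in quarters]
q2 = [1 if q == "Q2" else 0 for q in quarters]
q3 = [1 if q == "Q3" else 0 for q in quarters]
print(q1, q2, q3)
```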