Quant Flashcards

1
Q

Correlation & Regression

Terminology - Define the following:

  • Coefficient
  • Correlation coefficient
  • Confidence interval
A

Coefficient - a numerical or constant quantity placed before and multiplying the variable in an algebraic expression (e.g., 4 in 4x^y). It is usually a number, but may be any expression. In the latter case, the variables appearing in the coefficients are often called parameters, and must be clearly distinguished from the other variables.

Correlation coefficient, r, for a sample and ρ for a population, is a measure of the strength of the linear relationship (correlation) between two variables.

A confidence interval is an interval of values that we believe includes the true parameter value, b1, with a given degree of confidence. To compute a confidence interval, we must select the significance level for the test and know the standard error of the estimated coefficient.
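
The confidence interval described above can be sketched as estimated coefficient ± (critical t-value)(standard error of the coefficient). This is a minimal illustration, not a full regression routine: the function name and all input values are hypothetical, and the critical t-value is assumed to be looked up by the caller for the chosen significance level.

```python
def coefficient_confidence_interval(b1_hat, std_error, critical_t):
    """Return (lower, upper) for b1_hat ± critical_t * std_error."""
    margin = critical_t * std_error
    return b1_hat - margin, b1_hat + margin


# Hypothetical inputs: estimated slope 0.64, standard error 0.26,
# two-tailed critical t-value 2.03 supplied from a t-table.
lower, upper = coefficient_confidence_interval(0.64, 0.26, 2.03)
print(f"{lower:.4f} to {upper:.4f}")
```

If the resulting interval excludes zero, the coefficient is statistically different from zero at that significance level.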

2
Q

Correlation & Regression

Everything in simple linear regression also applies to multiple linear regression except the three items below. Define each of these items, which are unique to simple linear regression:

  1. Correlation coefficient (and apply it in the test statistic calculation)
  2. Regression assumptions
  3. Forming prediction interval for dependent (Y) variable
A
  1. The correlation coefficient
  • Answers the question: is X correlated with Y?
  • The correlation coefficient (“r” for a sample and “ρ” for a population) is a measure of the strength of the linear relationship (correlation) between two variables.
  • It ranges from +1 (perfect positive correlation) to -1 (perfect negative correlation).
  • The test statistic for the significance of a correlation coefficient (null is ρ = 0) has a t-distribution with n – 2 degrees of freedom and is calculated as:

t = r√(n – 2) / √(1 – r²)

  2. Regression assumptions
  • Linear relationship exists between the dependent and independent variables.
  • Residual term:
    • Independent variable is uncorrelated with the residual term.
    • Expected value = zero
    • Variance is constant
    • Independently distributed; that is, the residual term for one observation is not correlated with that of another observation (a violation of this assumption is called autocorrelation).
    • Normally distributed.
  • Note that five of the six assumptions are related to the residual term. The residual terms are independently (of each other and the independent variable), identically, and normally distributed with a zero mean.
  3. Confidence interval for a predicted Y-value (also applies to multiple regression, but not needed for the exam)
  • In simple linear regression, you have to know how to calculate a confidence interval for the predicted Y value:
  • Confidence interval = predicted Y value ± (critical t-value)(standard error of forecast)
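
The two formulas in this card can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and sample inputs are hypothetical, and the critical t-value and standard error of forecast are taken as given rather than computed.

```python
import math


def correlation_t_stat(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


def predicted_y_interval(y_hat, critical_t, std_error_forecast):
    """Predicted Y ± (critical t-value)(standard error of forecast)."""
    margin = critical_t * std_error_forecast
    return y_hat - margin, y_hat + margin


# Hypothetical sample: r = 0.6 from n = 27 observations (25 degrees of freedom).
t_stat = correlation_t_stat(0.6, 27)

# Hypothetical forecast: predicted Y = 10, critical t = 2.06,
# standard error of forecast = 1.5.
low, high = predicted_y_interval(10.0, 2.06, 1.5)
print(round(t_stat, 2), round(low, 2), round(high, 2))
```

The computed t-statistic is compared with the critical t-value for n – 2 degrees of freedom; a larger absolute value rejects the null that ρ = 0.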
3
Q

Correlation & Regression

R²a (adjusted R²)

A
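
As a sketch of the usual textbook definition, adjusted R² penalizes R² for the number of independent variables: R²a = 1 – [(n – 1) / (n – k – 1)](1 – R²). The symbols n (observations) and k (independent variables), the function name, and the sample inputs are assumptions introduced here for illustration.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R^2 = 1 - ((n - 1) / (n - k - 1)) * (1 - R^2)."""
    return 1 - ((n - 1) / (n - k - 1)) * (1 - r_squared)


# Hypothetical model: R^2 = 0.80 with n = 30 observations and k = 4 regressors.
r2_adj = adjusted_r_squared(0.80, 30, 4)
print(round(r2_adj, 3))
```

Because (n – 1) / (n – k – 1) is at least 1, adjusted R² is never larger than R².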
4
Q

Correlation & Regression

What is the p-value and how is it used in hypothesis testing?

A

P-value is the smallest level of significance for which the null hypothesis can be rejected. An alternative method of doing hypothesis testing of the coefficients is to compare the p-value to the significance level:

  • P-value less than the significance level: reject the null
  • P-value greater than the significance level: cannot reject the null
  • Remember: small Ps and big Ts to reject the null!
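
The decision rule above can be sketched as follows. One loud assumption: the p-value here is computed from a standard normal (z) statistic as a large-sample approximation, whereas coefficient tests in regression normally use the t-distribution. The function names and inputs are hypothetical.

```python
from statistics import NormalDist


def two_tailed_p_value_from_z(z):
    """Two-tailed p-value for a z-statistic (normal approximation, an assumption)."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def reject_null(p_value, significance_level):
    """Reject the null when the p-value is below the significance level."""
    return p_value < significance_level


# Hypothetical test statistic of 1.96.
p = two_tailed_p_value_from_z(1.96)
print(reject_null(p, 0.05), reject_null(p, 0.01))
```

The same p-value leads to rejection at the 5% level but not at the 1% level, which is why the significance level has to be chosen before the comparison.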
5
Q

Correlation & Regression

Coefficient of Determination, R²

A

R² = RSS / SST

where RSS is the regression (explained) sum of squares and SST is the total sum of squares.
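
A small worked sketch may help make the ratio concrete. It fits a simple least-squares line to made-up data and computes R² as the regression sum of squares over the total sum of squares. The data and function names are hypothetical, and the identity SST = RSS + SSE relied on here holds for a least-squares fit with an intercept.

```python
def fit_simple_ols(x, y):
    """Least-squares intercept and slope for a single independent variable."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


def sums_of_squares(x, y):
    """Return (RSS, SSE, SST): explained, unexplained, and total variation."""
    b0, b1 = fit_simple_ols(x, y)
    mean_y = sum(y) / len(y)
    fitted = [b0 + b1 * xi for xi in x]
    rss = sum((f - mean_y) ** 2 for f in fitted)
    sse = sum((yi - f) ** 2 for yi, f in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return rss, sse, sst


# Made-up sample data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
rss, sse, sst = sums_of_squares(x, y)
r_squared = rss / sst
print(round(r_squared, 2))
```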

6
Q

Correlation & Regression

F-Statistic

A
  • F-test assesses the effectiveness of the model as a whole in explaining the dependent variable.
  • Assesses how well the set of independent variables, as a group, explains the variation in the dependent variable. That is, the F-statistic is used to test whether at least one of the independent variables explains a significant portion of the variation of the dependent variable.
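
For reference, the F-statistic is commonly computed as the mean regression sum of squares divided by the mean squared error: F = (RSS / k) / (SSE / (n – k – 1)). The sketch below assumes that standard form; the symbols n (observations) and k (independent variables), the function name, and the inputs are hypothetical.

```python
def f_statistic(rss, sse, n, k):
    """F = MSR / MSE, with k and n - k - 1 degrees of freedom."""
    msr = rss / k            # mean regression sum of squares
    mse = sse / (n - k - 1)  # mean squared error
    return msr / mse


# Hypothetical sums of squares: RSS = 3.6, SSE = 2.4, n = 5, k = 1.
f_stat = f_statistic(3.6, 2.4, 5, 1)
print(round(f_stat, 2))
```

The result is compared with a one-tailed critical F-value; a larger computed value means at least one independent variable explains a significant part of the variation in the dependent variable.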
7
Q

Correlation & Regression

Multiple Regression flow chart of issues to know for exam.

A
  • The t-test assesses the statistical significance of the individual regression parameters.
  • The F-test assesses the effectiveness of the model as a whole in explaining the dependent variable.
  • Understand the effect that heteroskedasticity, serial correlation, and multicollinearity have on regression results.
8
Q

Regression analysis

Define each of the following problems in regression analysis, its effects, and how it is identified and corrected.

  1. Conditional Heteroskedasticity
  2. Serial Correlation
  3. Multicollinearity
A
9
Q

Regression

ANOVA table for multiple regression

A
10
Q

Correlation and Regression

What are the six types of Model Misspecification and what’s their impact?

A