4. Linear Regression Flashcards

1
Q

Type II Error

A

Fail to reject a null that should be rejected; false negative

2
Q

Explanation of Assumption 6

A

Assumption 6, that the error term is normally distributed, allows us to easily test a particular hypothesis about a linear regression model.

3
Q

Effect on Size of Interval When Increasing Confidence

A

Confidence interval will expand
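A quick numeric check of this effect, as a minimal sketch: it assumes a normal approximation for the critical value (the card does not name a distribution), and the estimate and standard error are hypothetical values chosen only for illustration.

```python
from statistics import NormalDist


def normal_interval(estimate, std_error, confidence):
    """Two-sided interval under a normal approximation (an assumption of this sketch)."""
    # Critical value that leaves (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return estimate - z * std_error, estimate + z * std_error


# Hypothetical estimate (1.0) and standard error (0.2).
widths = {}
for confidence in (0.90, 0.95, 0.99):
    low, high = normal_interval(1.0, 0.2, confidence)
    widths[confidence] = high - low
    print(f"{confidence:.0%} interval: [{low:.3f}, {high:.3f}]")
```

Each step up in confidence requires a larger critical value, so the printed intervals get wider around the same estimate.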

4
Q

Explanation of Assumption 4

A

Assumption 4, that the variance of the error term is the same for all observations, is also known as the homoskedasticity assumption. The reading on multiple regression discusses how to test for and correct violations of this assumption.

5
Q

Necessity of Assumptions 2 and 3

A

Assumptions 2 and 3 ensure that linear regression produces the correct estimates of b0 and b1.

6
Q

Hypothesis testing

A

A way to test the results of a survey or experiment to see whether they are meaningful. In essence, you check whether your results are valid by estimating the probability that they arose by chance. If the results may have happened by chance, the experiment will not be repeatable and so has little use.

7
Q

Type I Error

A

Reject a null that should not be rejected; false positive

8
Q

Dependent variable

A

The variable whose variation about its mean is to be explained by the regression; the left-hand-side variable in a regression equation.

9
Q

Standard error of estimate/Standard error of the regression

A

Like the standard deviation for a single variable, except that it measures the standard deviation of the residual term in the regression.
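As a minimal sketch, the standard error of estimate for a one-variable regression can be computed as the square root of SSE / (n − 2). That formula is the standard one and is not stated on the card; the data and the function name below are hypothetical.

```python
import math


def standard_error_of_estimate(x, y):
    """Fit Y = b0 + b1*X by least squares and return sqrt(SSE / (n - 2))."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Least-squares slope and intercept.
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
        (xi - x_bar) ** 2 for xi in x
    )
    b0 = y_bar - b1 * x_bar
    # Sum of squared residuals; two parameters were estimated, hence n - 2.
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(sse / (n - 2))


# Hypothetical data for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
see = standard_error_of_estimate(x, y)
print(f"SEE = {see:.4f}")
```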

10
Q

Necessity of Assumption 1

A

Assumption 1 is critical for a valid linear regression. If the relationship between the independent and dependent variables is nonlinear in the parameters, then estimating that relation with a linear regression model will produce invalid results. For example, a model such as Yi = b0·e^(b1·Xi) + εi is nonlinear in b1, so we could not apply the linear regression model to it. Even if the dependent variable is nonlinear, linear regression can be used as long as the regression is linear in the parameters.
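To illustrate "linear in the parameters", here is a small sketch that fits Y = b0 + b1·X^2 with ordinary least squares by treating Z = X^2 as the regressor. The model, the noise-free data, and the helper name are assumptions made for this example.

```python
def fit_simple_ols(z, y):
    """Least-squares intercept and slope for Y = b0 + b1*Z."""
    n = len(z)
    z_bar = sum(z) / n
    y_bar = sum(y) / n
    b1 = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y)) / sum(
        (zi - z_bar) ** 2 for zi in z
    )
    b0 = y_bar - b1 * z_bar
    return b0, b1


# Hypothetical noise-free data generated from Y = 1 + 2*X^2.
x = [1.0, 2.0, 3.0, 4.0]
y = [1.0 + 2.0 * xi**2 for xi in x]

# X enters squared, but b0 and b1 still enter linearly, so OLS applies to Z = X^2.
z = [xi**2 for xi in x]
b0, b1 = fit_simple_ols(z, y)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
```

Because the data contain no noise, the fit recovers the generating coefficients up to floating-point rounding.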

11
Q

Classic normal linear regression model assumptions

A
  1. The relationship between the dependent variable, Y, and the independent variable, X, is linear in the parameters b0 and b1. This requirement means that b0 and b1 are raised to the first power only and that neither b0 nor b1 is multiplied or divided by another regression parameter (as in b0/b1, for example). The requirement does not exclude X from being raised to a power other than 1.
  2. The independent variable, X, is not random.
  3. The expected value of the error term is 0: E(ε) = 0.
  4. The variance of the error term is the same for all observations: E(εi^2) = σ^2, i = 1, …, n.
  5. The error term, ε, is uncorrelated across observations. Consequently, E(εiεj) = 0 for all i not equal to j.
  6. The error term, ε, is normally distributed.
12
Q

Four steps to determine the prediction interval for a prediction

A
  1. Make the prediction.
  2. Compute the variance of the prediction error using Equation 12.
  3. Choose a significance level, α, for the forecast. For example, the 0.05 level, given the degrees of freedom in the regression, determines the critical value for the forecast interval, tc.
  4. Compute the (1 − α) percent prediction interval for the prediction.
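The four steps above can be sketched as follows. Equation 12 is not reproduced on this card, so the variance of the prediction error is taken as an input. The coefficients, the variance, and the critical value are hypothetical; 3.182 is the two-tailed t-table value for α = 0.05 with 3 degrees of freedom.

```python
import math


def prediction_interval(b0, b1, x_new, pred_error_variance, t_crit):
    """Return (low, high) for the prediction at x_new."""
    # Step 1: make the prediction.
    y_hat = b0 + b1 * x_new
    # Step 2: standard deviation of the prediction error from its variance.
    s_f = math.sqrt(pred_error_variance)
    # Steps 3 and 4: apply the critical value for the chosen significance level.
    return y_hat - t_crit * s_f, y_hat + t_crit * s_f


# Hypothetical inputs for illustration only.
low, high = prediction_interval(
    b0=2.2, b1=0.6, x_new=6.0, pred_error_variance=1.68, t_crit=3.182
)
print(f"Prediction interval: [{low:.3f}, {high:.3f}]")
```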
13
Q

Degrees of Freedom

A

The number of observations minus the number of parameters estimated.

14
Q

Estimated variance of the prediction error depends on:

A
  1. the squared standard error of estimate, s^2
  2. the number of observations, n
  3. the value of the independent variable, X, used to predict the dependent variable
  4. the estimated mean of the independent variable
  5. the variance of the independent variable.
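These five inputs combine in the standard formula s_f^2 = s^2 [1 + 1/n + (X − X̄)^2 / ((n − 1) s_x^2)]. The card itself does not state the formula, so treat the sketch below as an assumption-labeled illustration with hypothetical numbers and a hypothetical function name.

```python
def prediction_error_variance(s2, n, x, x_bar, var_x):
    """Estimated variance of the prediction error for a one-variable regression.

    s2:    squared standard error of estimate
    n:     number of observations
    x:     value of the independent variable used for the prediction
    x_bar: estimated mean of the independent variable
    var_x: sample variance of the independent variable
    """
    return s2 * (1 + 1 / n + (x - x_bar) ** 2 / ((n - 1) * var_x))


# Hypothetical inputs for illustration only.
sf2 = prediction_error_variance(s2=0.8, n=5, x=6.0, x_bar=3.0, var_x=2.5)
print(f"Variance of the prediction error: {sf2:.4f}")
```

The variance grows as X moves away from the mean of the independent variable, so predictions far from the center of the data are less precise.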
15
Q

Elements necessary to calculate test statistic for ANOVA

A
  1. the total number of observations (n)
  2. the total number of parameters to be estimated (in a one-independent-variable regression, this number is two: the intercept and the slope coefficient)
  3. the sum of squared errors or residuals (SSE, residual sum of squares)
  4. the regression sum of squares (RSS, total variation in Y explained in the regression equation)
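A minimal sketch of how these four inputs produce the F-statistic, using the usual definition F = (RSS / k) / (SSE / (n − (k + 1))), where k is the number of slope coefficients. The function name and the numbers are hypothetical.

```python
def f_statistic(n, num_params, sse, rss):
    """F-statistic for the regression as a whole."""
    k = num_params - 1  # slope coefficients (parameters excluding the intercept)
    mean_regression_ss = rss / k
    mean_squared_error = sse / (n - num_params)
    return mean_regression_ss / mean_squared_error


# Hypothetical one-independent-variable regression: two parameters.
f_stat = f_statistic(n=5, num_params=2, sse=2.4, rss=3.6)
print(f"F = {f_stat:.3f}")
```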
16
Q

95% Confidence Interval

A

The interval, based on the sample value (estimated), that we would expect to include the population (true) value with a 95% degree of confidence.

17
Q

Limitations of Regression Analysis

A
  1. Regression relations can change over time, just as correlations can. This fact is known as the issue of parameter instability, and its existence should not be surprising as the economic, tax, regulatory, political, and institutional contexts in which financial markets operate change.
  2. A second limitation to the use of regression results specific to investment contexts is that public knowledge of regression relationships may negate their future usefulness.
  3. Finally, if the regression assumptions listed in Section 2.2 are violated, hypothesis tests and predictions based on linear regression will not be valid.
18
Q

Independent variable

A

A variable used to explain the dependent variable in a regression; a right-hand-side variable in a regression equation.

19
Q

P-Value

A

The p-value is used as an alternative to rejection points to provide the smallest level of significance at which the null hypothesis would be rejected. A smaller p-value means that there is stronger evidence in favor of the alternative hypothesis.

20
Q

Coefficient of Determination

A

Fraction of the total variation that is explained by the regression.

R^2 = 1 − (unexplained variation / total variation)
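The ratio above can be computed directly from observed and fitted values. This is a sketch with hypothetical data; the fitted values are assumed to come from a regression that has already been estimated.

```python
def r_squared(y, fitted):
    """Coefficient of determination: 1 - unexplained variation / total variation."""
    y_bar = sum(y) / len(y)
    total_variation = sum((yi - y_bar) ** 2 for yi in y)
    unexplained_variation = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return 1 - unexplained_variation / total_variation


# Hypothetical observations and fitted values for illustration only.
y = [2.0, 4.0, 5.0, 4.0, 5.0]
fitted = [2.8, 3.4, 4.0, 4.6, 5.2]
r2 = r_squared(y, fitted)
print(f"R^2 = {r2:.3f}")
```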

21
Q

Unbiased

A

Even though forecasts may be inaccurate, we hope at least that they are unbiased; that is, that the expected value of the forecast error is zero. An unbiased forecast can be expressed as E(Actual change − Predicted change) = 0. In fact, most evaluations of forecast accuracy test whether forecasts are unbiased.
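As a rough sketch, the sample mean of the forecast errors gives a first look at bias. The sample mean only estimates the expectation, and a formal test would also need its standard error. The data below are hypothetical.

```python
from statistics import fmean

# Hypothetical actual and predicted changes for illustration only.
actual = [1.0, -0.5, 2.0, 0.5]
predicted = [0.5, 0.0, 1.5, 1.0]

# Forecast error = actual change - predicted change.
errors = [a - p for a, p in zip(actual, predicted)]
mean_error = fmean(errors)
print(f"Mean forecast error: {mean_error:.3f}")
```

Here every individual forecast misses by 0.5, yet the errors cancel on average, which is the sense in which inaccurate forecasts can still be unbiased.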

22
Q

Regression coefficients

A

The intercept and slope coefficient(s) of a regression.

23
Q

Analysis of variance (ANOVA)

A

The analysis of the total variability of a dataset (such as observations on the dependent variable in a regression) into components representing different sources of variation; with reference to regression, ANOVA provides the inputs for an F-test of the significance of the regression as a whole.

24
Q

Error term

A

The portion of the dependent variable that is not explained by the independent variable(s) in the regression.

25
Q

Elements of Confidence Interval Hypothesis Test

A

1) the estimated parameter value
2) the hypothesized value of the parameter, b0 or b1
3) a confidence interval around the estimated parameter.
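A minimal sketch of the test built from these three elements: form the interval as estimate ± tc × standard error, then reject the hypothesized value if it falls outside. The estimate, standard error, and critical value are hypothetical inputs; 3.182 is the two-tailed t-table value for α = 0.05 with 3 degrees of freedom.

```python
def confidence_interval_test(estimate, std_error, t_crit, hypothesized):
    """Return (low, high, reject) for a two-sided confidence-interval test."""
    low = estimate - t_crit * std_error
    high = estimate + t_crit * std_error
    # Reject the null when the hypothesized value lies outside the interval.
    reject = not (low <= hypothesized <= high)
    return low, high, reject


# Hypothetical slope estimate and standard error; null hypothesis b1 = 0.
low, high, reject = confidence_interval_test(
    estimate=0.6, std_error=0.2828, t_crit=3.182, hypothesized=0.0
)
print(f"Interval: [{low:.3f}, {high:.3f}], reject null: {reject}")
```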

26
Q

F-statistic

A

Measures how well the regression equation explains the variation in the dependent variable. The F-statistic is the ratio of the average regression sum of squares to the average sum of the squared errors.

27
Q

Estimated parameters/Fitted parameters

A

With reference to a regression analysis, the estimated values of the population intercept and population slope coefficient(s) in a regression.

28
Q

Confidence Interval

A

An interval of values that we believe includes the true parameter value, b1, with a given degree of confidence.

To compute a confidence interval, we must select the significance level for the test and know the standard error of the estimated coefficient.

29
Q

Explanation of Assumption 5

A

Assumption 5, that the errors are uncorrelated across observations, is also necessary for correctly estimating the variances of the estimated parameters b0 and b1. The reading on multiple regression discusses violations of this assumption.