Module 4 Flashcards

1
Q

What are the two primary purposes for using regression?

A
  1. Studying the magnitude and structure of a relationship between two variables.
  2. Forecasting a variable based on its relationship with another variable.
2
Q

What is the structure of the single variable linear regression line?

A

ŷ = a + bx
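
As a rough illustration of how a and b might be estimated, the sketch below assumes the line is fit by ordinary least squares (the usual method, though these cards do not name it). The data values and the fit_line helper are made up for the example.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y_hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope: co-variation of x and y divided by variation of x
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept: the fitted line passes through (x_bar, y_bar)
    a = y_bar - b * x_bar
    return a, b


# Made-up data: house size (thousand sq ft) and selling price (thousand $)
sizes = [1.0, 1.5, 2.0, 2.5, 3.0]
prices = [200.0, 260.0, 290.0, 360.0, 390.0]

a, b = fit_line(sizes, prices)

# Point forecast for a size inside the observed range of x
point_forecast = a + b * 2.2
print(a, b, point_forecast)
```

The forecast is taken at x = 2.2, which lies inside the observed range of sizes, so it does not extrapolate.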

3
Q

What symbol denotes the expected value of y, the dependent variable, for a given value of x?

A

ŷ

4
Q

What symbol denotes the independent variable, the variable we use to help predict or better understand the dependent variable?

A

x

5
Q

What symbol denotes the y-intercept, the point at which the regression line intersects the vertical axis? This is the value of ŷ when the independent variable, x, is set equal to 0.

A

a

6
Q

What symbol denotes the slope, the average change in the dependent variable y as the independent variable x increases by one?

A

b

7
Q

The true relationship between two variables is described by the equation y = α + βx + ε, where ε is the ______, y − ŷ.

A

error term

8
Q

The idealized equation that describes the _________ is ŷ = α + βx.

A

true regression line

9
Q

We determine a _________ by entering the desired value of x into the regression equation.

A

point forecast

10
Q

We must be extremely cautious about using regression to forecast for values outside of the historically observed range of the _________ variable (x-values).

A

independent

11
Q

Instead of predicting a single point, we can construct a __________, an interval around the point forecast that is likely to contain, for example, the actual selling price of a house of a given size.

A

prediction interval

12
Q

The ______ of a prediction interval varies based on the standard deviation of the regression (the standard error of the regression), the desired level of confidence, and the location of the x-value of interest in relation to the historical values of the independent variable.

A

width

13
Q

It is important to evaluate several metrics in order to determine whether a __________________ model is a good fit for a data set, rather than looking at single metrics in isolation.

A

single variable linear regression

14
Q

_______ measures the percent of total variation in the dependent variable, y, that is explained by the regression line.

A

R²

15
Q

R² = Regression Sum of Squares / ____________

A

Total Sum of Squares
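
The ratio can be computed directly, as in the sketch below. It assumes a least-squares fit with an intercept; the r_squared helper and the data are made up for illustration.

```python
def r_squared(xs, ys):
    """R² = regression sum of squares / total sum of squares."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Least-squares slope and intercept (assumed fitting method)
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    a = y_bar - b * x_bar
    y_hat = [a + b * x for x in xs]
    # Variation explained by the regression line
    regression_ss = sum((yh - y_bar) ** 2 for yh in y_hat)
    # Total variation of y around its mean
    total_ss = sum((y - y_bar) ** 2 for y in ys)
    return regression_ss / total_ss


# Made-up data: house size (thousand sq ft) and selling price (thousand $)
sizes = [1.0, 1.5, 2.0, 2.5, 3.0]
prices = [200.0, 260.0, 290.0, 360.0, 390.0]

r2 = r_squared(sizes, prices)
print(r2)
```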

16
Q

____ ≤ R² ≤ ____

A

0; 1

17
Q

For a single variable linear regression, R² is equal to the _______ of the correlation coefficient.

A

square
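
A quick numeric check of this identity, again assuming a least-squares fit with an intercept and made-up data. Here R² is computed by a different route, as 1 minus the residual sum of squares over the total sum of squares, which for such a fit equals the regression sum of squares over the total.

```python
import math

# Made-up data for illustration only
xs = [1.0, 1.5, 2.0, 2.5, 3.0]
ys = [200.0, 260.0, 290.0, 360.0, 390.0]

n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)  # total sum of squares
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

# Correlation coefficient between x and y
r = sxy / math.sqrt(sxx * syy)

# Least-squares line, then R² = 1 - residual SS / total SS
b = sxy / sxx
a = y_bar - b * x_bar
residual_ss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
r2 = 1 - residual_ss / syy

print(r ** 2, r2)  # the two values should agree up to rounding
```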

18
Q

In addition to analyzing R², we must test whether the relationship between the dependent and independent variable is significant and whether the linear model is a good fit for the data. We do this by analyzing the _______ (or confidence interval) associated with the independent variable and the regression’s ________.

A

p-value; residual plot

19
Q

The _______ of the independent variable is the result of the hypothesis test that tests whether there is a significant _______ relationship; that is, it tests whether the slope of the regression line is zero, H0: β = 0 and Ha: β ≠ 0.

A

p-value; linear

20
Q

If the coefficient’s p-value is less than _____, we reject the null hypothesis and conclude that we have sufficient evidence to be ______ confident that there is a significant linear relationship between the dependent and independent variables.

A

0.05; 95%

21
Q

Note that the p-value and R² provide different information. A _________ can be significant (have a low p-value) but not explain a large percentage of the variation (not have a high R²).

A

linear relationship

22
Q

A ___________ associated with an independent variable’s coefficient indicates the likely range for that coefficient.

A

confidence interval

23
Q

If the 95% confidence interval does not contain _____, we can be _____ confident that there is a significant linear relationship between the variables.

A

zero; 95%

24
Q

__________ can provide important insights into whether a linear model is a good fit.

A

Residual plots

25
Q

Each observation in a data set has a residual equal to the historically observed value _____ the regression’s predicted value, that is, ______.

A

minus; ε = y − ŷ
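
The residual calculation is simple enough to sketch. The coefficients a and b below are hypothetical values assumed to have been estimated already (by least squares, with an intercept), and the data are made up.

```python
# Hypothetical, already-estimated coefficients of y_hat = a + b*x
a, b = 108.0, 96.0

# Made-up observations: house size (thousand sq ft) and price (thousand $)
sizes = [1.0, 1.5, 2.0, 2.5, 3.0]
prices = [200.0, 260.0, 290.0, 360.0, 390.0]

# Residual = observed value minus the regression's predicted value
predicted = [a + b * x for x in sizes]
residuals = [y - y_hat for y, y_hat in zip(prices, predicted)]

print(residuals)
print(sum(residuals))  # close to zero for a least-squares fit with an intercept
```

Plotting these residuals against the x-values gives the residual plot used to judge whether a linear model fits.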

26
Q

Linear regression models assume that the regression’s residuals follow a normal distribution with a mean of _____ and _____ variance.

A

zero; fixed

27
Q

We can also perform regression analyses using qualitative, or categorical, variables. To do so, we must convert data to ____________.

A

dummy (0, 1) variables

28
Q

A dummy variable is equal to ____ when the variable of interest fits a certain criterion.

A

1

For example, a dummy variable for “Female” would equal 1 for all female observations and 0 for male observations.

29
Q

Where do you go to add the best fit line to a scatter plot in Excel?

A

Using the Insert menu

30
Q

What is a convenient Excel function for calculating point forecasts?

A

=SUMPRODUCT(array1, [array2], [array3],…)
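
To show why SUMPRODUCT suits point forecasts, here is a Python sketch of the same arithmetic: multiply each coefficient by its input value and add the products. The sumproduct helper is a hypothetical stand-in written for this example, not an Excel or library function, and the coefficients are made up.

```python
import math


def sumproduct(*arrays):
    """Multiply corresponding entries across the arrays and sum the products."""
    if len({len(arr) for arr in arrays}) != 1:
        raise ValueError("all arrays must have the same length")
    return sum(math.prod(values) for values in zip(*arrays))


# Hypothetical regression coefficients: intercept a and slope b
coefficients = [108.0, 96.0]
# Inputs: 1 for the intercept, then the x-value of interest
inputs = [1.0, 2.2]

point_forecast = sumproduct(coefficients, inputs)  # a*1 + b*x
print(point_forecast)
```

Pairing the intercept with a constant 1 lets one call cover the whole equation, and the same pattern extends to more than one independent variable.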

31
Q

Where do you go to create a regression output table in Excel?

A

Using the Data Analysis tool

32
Q

How do you create dummy variables for regression models in Excel?

A

=IF(logical_test,[value_if_true],[value_if_false])

→ Returns value_if_true if the specified condition is met, and returns value_if_false if the condition is not met.
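
The same condition-to-value mapping can be sketched in Python, where a conditional expression plays the role of IF. The category labels and variable names are made up for illustration.

```python
# Made-up categorical observations
genders = ["Female", "Male", "Female", "Male"]

# Dummy variable: 1 when the observation meets the criterion, otherwise 0
female_dummy = [1 if g == "Female" else 0 for g in genders]

print(female_dummy)
```

The resulting column of 0s and 1s can then be used as an independent variable in the regression.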