Multiple Regression Flashcards

1
Q

When do we use single variable linear regression?

A

to investigate the relationship between a dependent variable and one independent variable.

2
Q

When do we use multiple regression?

A

to investigate the relationship between a dependent variable and multiple independent variables.

3
Q

Gross relationship =

A

Gross Relationship: A single variable linear regression model determines the gross effect of an independent variable on a dependent variable. For example, the gross effect of house size on selling price is the average change in selling price when house size increases by one square foot. Since no other independent variables are included in the model, the coefficient for house size may pick up the effect of other factors related to selling price.

4
Q

Net relationship =

A

Net Relationship: A multiple regression model determines the net effect of an independent variable on a dependent variable. The net effect controls for all other factors (independent variables) included in the regression model. For example, in a regression model including both distance and house size as independent variables, the coefficient for house size controls for distance. That is, the regression determines the average change in selling price if a house’s size increases by one square foot but its distance from Boston does not change. Coefficients in multiple regression are net with respect to variables included in the model and gross with respect to variables that are omitted from the model.
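
A minimal sketch of the gross vs. net distinction, using made-up numbers rather than the housing data behind these cards. The hypothetical prices are built exactly as 10 + 2(size) - 1(distance), and larger houses tend to sit farther out, so the single variable slope for size absorbs part of the distance effect. The helper names are illustrative only.

```python
from statistics import mean

def centered(values):
    """Subtract the mean from every value."""
    m = mean(values)
    return [v - m for v in values]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gross_slope(x, y):
    """Slope of a single variable least-squares regression of y on x."""
    xc, yc = centered(x), centered(y)
    return dot(xc, yc) / dot(xc, xc)

def net_slopes(x1, x2, y):
    """Slopes of a two-variable least-squares regression (normal equations)."""
    a, b, c = centered(x1), centered(x2), centered(y)
    s11, s22, s12 = dot(a, a), dot(b, b), dot(a, b)
    s1y, s2y = dot(a, c), dot(b, c)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

# Hypothetical data in arbitrary units (assumption, not from the cards).
size = [1, 2, 3, 4, 5, 6]
distance = [2, 1, 4, 3, 6, 5]
price = [10 + 2 * s - 1 * d for s, d in zip(size, distance)]

gross = gross_slope(size, price)                     # size only: picks up distance too
net_size, net_distance = net_slopes(size, distance, price)
print(round(gross, 3), round(net_size, 3), round(net_distance, 3))
```

Here the net coefficient for size recovers the 2 used to build the data, while the gross coefficient comes out lower because it is not controlling for distance.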

5
Q

Forecasting in Excel

A

We can use Excel’s SUMPRODUCT function, =SUMPRODUCT(array1, [array2], [array3], …), to calculate a forecast from Excel’s regression output. SUMPRODUCT multiplies each value of the first array by the corresponding value in the second array and returns the sum of those products.
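
As a rough Python analogue of SUMPRODUCT (a sketch of the idea, not Excel's implementation), the forecast is the sum of each coefficient times its input value. The coefficients are the rounded ones from the two-variable selling price equation in these cards; `sumproduct` is a hypothetical helper name, and the leading 1 in the inputs is an assumed placeholder that multiplies the intercept.

```python
def sumproduct(array1, array2):
    """Multiply corresponding values and return the sum of the products."""
    if len(array1) != len(array2):
        raise ValueError("arrays must have the same length")
    return sum(a * b for a, b in zip(array1, array2))

# Rounded coefficients: intercept, house size, distance from Boston.
coefficients = [194986.59, 244.54, -10840.04]
# Inputs: 1 for the intercept, 2,700 square feet, 15 miles.
inputs = [1, 2700, 15]

forecast = sumproduct(coefficients, inputs)
print(round(forecast, 2))
```

With these rounded coefficients the forecast is about $692,644. A spreadsheet holding unrounded coefficients may give a slightly different figure.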

6
Q

Which model would we use to predict the price of a house that is 2,700 square feet?

A

SellingPrice=13,490.45+255.36(HouseSize)

Since we have data about just one independent variable, house size, we should use the single variable linear regression model.

7
Q

Use the single variable regression model with house size as the independent variable to predict the selling price of a house that is 2,700 square feet.

A

The expected selling price of a 2,700 square foot home is B2+B3*2700=$702,972.54, that is, the intercept coefficient plus the house size coefficient times the house size being predicted (2,700).
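
A quick check of the arithmetic, sketched in Python. It uses the rounded coefficients from SellingPrice=13,490.45+255.36(HouseSize), so it lands about $10 below the $702,972.54 figure. That gap presumably comes from the spreadsheet cells holding unrounded coefficients, which is an assumption because the cell contents are not shown. The function name is hypothetical.

```python
# Rounded coefficients from the single variable model.
INTERCEPT = 13490.45
HOUSE_SIZE_COEFFICIENT = 255.36

def predict_price(house_size_sqft):
    """Intercept plus the house size coefficient times the house size."""
    return INTERCEPT + HOUSE_SIZE_COEFFICIENT * house_size_sqft

print(round(predict_price(2700), 2))
```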

8
Q

Suppose we want to forecast selling price based on house size and distance from Boston. Which equation should we use to forecast the price of a house that is 2,700 square feet and 15 miles from Boston?

A

SellingPrice=194,986.59+244.54(HouseSize)-10,840.04(DistancefromBoston)

Since we have data about two independent variables, house size and distance from Boston, we should use the multiple regression model with those two variables.

9
Q

ŷ = a + bx shows

A

The structure of the single variable linear regression line

10
Q

A coefficient in a single variable linear regression characterizes …

A

the gross relationship between the dependent variable and the independent variable.

11
Q

In single variable regression, to measure the predictive power of a single independent variable we use:

A

R²: the percentage of the variation in the dependent variable explained by the independent variable
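
One common way to compute R² is 1 minus the ratio of unexplained variation to total variation. The sketch below assumes that definition; the observed values are made up, and the predictions are stand-ins for the fitted values a regression would produce.

```python
def r_squared(observed, predicted):
    """R² = 1 - (sum of squared residuals) / (total sum of squares)."""
    mean_observed = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_observed) ** 2 for o in observed)
    return 1 - sse / sst

# Hypothetical values (assumption, not from the cards).
observed = [10.0, 12.0, 15.0, 19.0]
predicted = [10.5, 12.5, 14.5, 18.5]

print(round(r_squared(observed, predicted), 4))
```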

12
Q

In multiple regression, to measure the predictive power of the model we use

A

Adjusted R²

13
Q

What insights do residual plots give to SINGLE vs MULTIPLE regression?

A

The residual plot for a single variable regression gives insight into the gross relationship.

The residual plot for a multiple regression gives insight into the net relationship.

14
Q

How do we test for significance of variables?

A

We analyze the p-value of each independent variable’s coefficient to determine whether there is a significant relationship between that variable and the dependent variable. If an independent variable’s p-value is less than 0.05, we can be 95% confident that there is a significant linear relationship between it and the dependent variable.

15
Q

What are residuals?

A

Residuals are the differences between the historically observed values and the values predicted by the regression model.
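
A small sketch of the calculation: observed minus predicted, one residual per observation. The predictions use the rounded single variable model from these cards; the three observed prices are invented for illustration.

```python
def residuals(observed, predicted):
    """Observed value minus predicted value for each observation."""
    return [o - p for o, p in zip(observed, predicted)]

# Rounded single variable model: SellingPrice = 13,490.45 + 255.36(HouseSize).
house_sizes = [1500, 2000, 2700]
predicted_prices = [13490.45 + 255.36 * size for size in house_sizes]

# Hypothetical observed selling prices (assumption, not from the cards).
observed_prices = [400000.0, 520000.0, 710000.0]

for size, r in zip(house_sizes, residuals(observed_prices, predicted_prices)):
    print(size, round(r, 2))
```

A positive residual means the house sold for more than the model predicted; a negative one means it sold for less.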

16
Q

When does multicollinearity occur?

A

Multicollinearity occurs when two independent variables are so highly correlated that it is difficult for the regression model to separate the effect each variable has on the dependent variable.
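
One simple screen, offered here as an assumption rather than the method these cards prescribe, is to compute the correlation between two candidate independent variables before putting both in a model. The variables and numbers below are hypothetical.

```python
from math import sqrt

def pearson_correlation(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical independent variables: house size (sq ft) and number of rooms.
house_size = [1200, 1500, 1800, 2400, 3000]
rooms = [5, 6, 8, 9, 11]

r = pearson_correlation(house_size, rooms)
print(round(r, 3))
```

A correlation this close to 1 suggests the regression would struggle to separate the effect of size from the effect of room count.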

17
Q

How many dummy variables should we use?

A

We should always use one fewer dummy variable than the number of options in a category.
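
A sketch of the one-fewer rule: pick one option as the baseline and create a 0/1 column for each remaining option. The season data and the `dummy_columns` helper are hypothetical.

```python
def dummy_columns(values, baseline):
    """Build one 0/1 column per category, leaving out the baseline category."""
    categories = sorted(set(values) - {baseline})
    return {c: [1 if v == c else 0 for v in values] for c in categories}

# Hypothetical category with four options (assumption, not from the cards).
seasons = ["Winter", "Spring", "Summer", "Fall", "Summer"]
dummies = dummy_columns(seasons, baseline="Winter")

for name, column in dummies.items():
    print(name, column)
```

Four options yield three dummy variables; an observation with all three equal to 0 belongs to the baseline, Winter.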

18
Q

How can you reduce multicollinearity?

A

We may be able to reduce multicollinearity by either increasing the sample size or removing one (or more) of the collinear variables.

19
Q

What indicates that multicollinearity has occurred?

A

Indications of multicollinearity include seeing an independent variable’s p-value increase when one or more other independent variables are added to a regression model.

20
Q

When is a dummy variable equal to 1?

A

A dummy variable is equal to 1 when the variable of interest fits a certain criterion. For example, a dummy variable for “Saturday” would equal 1 for observations relating to Saturdays and 0 for observations related to all other days of the week.
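
The Saturday example might be coded like this. The dates are arbitrary, chosen only to span a Friday, a Saturday, and a Sunday, and the function name is hypothetical.

```python
from datetime import date

def saturday_dummy(day):
    """Return 1 if the date falls on a Saturday, otherwise 0."""
    # date.weekday() numbers Monday as 0, so Saturday is 5.
    return 1 if day.weekday() == 5 else 0

# Arbitrary example dates: Friday, Saturday, Sunday.
observations = [date(2024, 1, 5), date(2024, 1, 6), date(2024, 1, 7)]
saturday = [saturday_dummy(d) for d in observations]
print(saturday)
```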

21
Q

FreeIndependent, Group, Rate, SpecialEvent, TotalRewards, VIP, and Wholesale

Quantitative OR Qualitative?

A

Quantitative

FreeIndependent, Group, SpecialEvent, VIP, and Wholesale are quantitative because they represent the number of rooms registered on a given day to a particular type of guest.

22
Q

2010, 2012, Christmas, NewYears, MemorialDay, PayDay, SuperBowl, and Thanksgiving

Quantitative OR Qualitative?

A

Qualitative

For example, the variable Christmas tells us whether the observation occurs on Christmas or not. Likewise, 2010 tells us whether the observation occurs during 2010 or not.

23
Q

If we want to investigate the relationship between a dependent variable and one independent variable, what do we use?

A

single variable linear regression

24
Q

If we want to investigate the relationship between a dependent variable and multiple independent variables, what do we use?

A

multiple regression

25
Q

What’s different when forecasting for single vs multiple regression?

A

Forecasting with a multiple regression equation is very similar to forecasting with a single variable linear model. However, instead of entering only one value for a single independent variable, we input a value for each of the independent variables.

26
Q

What happens to R² when independent variables are added to regression?

A

R² never decreases when independent variables are added to a regression, even if the new variables add little explanatory power. For that reason, it is important to use Adjusted R², which applies an adjustment factor, when assessing and comparing the fit of multiple regression models. This adjustment factor compensates for the increase in R² that results solely from increasing the number of independent variables.
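
The cards do not state the formula, so the sketch below assumes the common textbook form, Adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1), where n is the number of observations and k the number of independent variables. The R² values are hypothetical.

```python
def adjusted_r_squared(r_squared, n_observations, n_independent_variables):
    """Adjusted R² = 1 - (1 - R²)(n - 1) / (n - k - 1)."""
    n, k = n_observations, n_independent_variables
    if n - k - 1 <= 0:
        raise ValueError("need more observations than independent variables plus one")
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# Hypothetical: a third variable nudges R² from 0.800 to 0.801 (30 observations).
two_variables = adjusted_r_squared(0.800, 30, 2)
three_variables = adjusted_r_squared(0.801, 30, 3)
print(round(two_variables, 4), round(three_variables, 4))
```

In this made-up case R² rises slightly but Adjusted R² falls, which is the signal that the extra variable is not improving the fit enough to justify its inclusion.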

27
Q

How do we test whether the relationship between the independent and dependent variables is linear and significant?

A

By analyzing the regression’s residual plots and the p-values associated with each independent variable’s coefficient.