Module 3 Flashcards

1
Q

In evaluating a multiple linear regression model,
O The F-test is used to evaluate the overall regression.
O The coefficient of determination is interpreted as the percentage of variability in the response variable explained by the model.
O Residual analysis is used for goodness of fit assessment.
O All of the above

A

All of the above

2
Q

The percentage of variability in the response variable explained by the model

A

Coefficient of Determination

3
Q

True or False

Residual analysis is used for goodness of fit assessment.

A

True.

4
Q

In the presence of near multicollinearity,

O The coefficient of determination decreases
O The regression coefficients will tend to be identified as statistically significant even if they are not
O The prediction will not be impacted
O None of the above.

A

None of the above.

Multicollinearity does not affect R^2 since adding redundant variables c
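
The point about R^2 can be checked numerically. The sketch below is a minimal illustration on made-up data, using only the standard library: it fits the response on one predictor, then adds a second predictor that is almost a copy of the first, and compares the two R^2 values. The helper names `solve` and `r_squared` and all data values are assumptions made for this example.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(X, y):
    """R^2 of the least-squares fit of y on the columns of X (first column is 1s)."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
    y_bar = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - sse / sst


# Made-up data: x2 is nearly a copy of x1 (near multicollinearity).
x1 = [1, 2, 3, 4, 5, 6]
x2 = [1.1, 2.0, 2.9, 4.2, 5.0, 6.1]
y = [2.0, 4.1, 5.9, 8.2, 9.8, 12.1]

r2_one = r_squared([[1, a] for a in x1], y)
r2_two = r_squared([[1, a, b] for a, b in zip(x1, x2)], y)

print(f"R^2 with x1 only:   {r2_one:.6f}")
print(f"R^2 with x1 and x2: {r2_two:.6f}")
```

Adding the redundant predictor cannot lower R^2, because the larger model can always reproduce the smaller model's fit by setting the extra coefficient to zero.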

5
Q

When do we use transformations?

O If the linearity assumption with respect to one or more predictors does not hold, then we use transformations of the corresponding predictors to improve on this assumption
O If the normality assumption does not hold, we transform the response variable, commonly using the Box-Cox transformation
O If the constant variance assumption does not hold, we transform the response variable
O All of the above

A

All of the above.

6
Q

Which one is correct?

O The residuals have constant variance for the multiple linear regression model.
O The residuals vs fitted can be used to assess the assumption of independence.
O The residuals have a t-distribution if the error term is assumed to have a normal distribution
O None of the above

A

None of the above.

In a multiple linear regression model, the true errors have constant var

7
Q

T/F Module 3 Topic 3.1 Lesson 4

In the context of the relationship between a response variable and a series of predicting variables, the estimated regression coefficients for conditional and marginal relationships can differ only in magnitude but the sign or direction of the relationship remains the same.

A

False

Presence of different predicting variables influence the overall relatio

8
Q

T/F Module 3 Topic 3.2 Lesson 7

The partial F-test when adding one additional variable to the multiple linear regression model is equivalent to testing for statistical significance using the t-test on that same variable.

A

True

The partial F-test essentially tests if an additional variable Z has an
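
The equivalence can be verified numerically: for a single added variable, the partial F-statistic equals the square of that variable's t-statistic in the full model. The following is a minimal sketch on made-up data using only the standard library; the helper names `solve` and `fit`, the variable names, and the data are assumptions for illustration.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit(X, y):
    """Least-squares fit of y on the columns of X; returns (beta, SSE, X'X)."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    sse = sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
              for row, yi in zip(X, y))
    return beta, sse, xtx


# Made-up data: x is already in the model, z is the candidate variable.
x = [1, 2, 3, 4, 5, 6, 7, 8]
z = [2, 1, 4, 3, 6, 5, 8, 7]
y = [3.0, 4.5, 6.1, 8.3, 9.2, 12.4, 13.1, 16.0]

reduced = [[1, xi] for xi in x]
full = [[1, xi, zi] for xi, zi in zip(x, z)]

_, sse_reduced, _ = fit(reduced, y)
beta, sse_full, xtx = fit(full, y)

n, k = len(y), 3                 # k = number of parameters in the full model
mse = sse_full / (n - k)

# Partial F for adding the single variable z (1 numerator degree of freedom).
partial_f = (sse_reduced - sse_full) / mse

# t-statistic for z: estimate / standard error, where
# se^2 = MSE * [(X'X)^(-1)]_zz, obtained here by solving X'X v = e_z.
inv_zz = solve(xtx, [0, 0, 1])[2]
t_stat = beta[2] / math.sqrt(mse * inv_zz)

print(f"partial F = {partial_f:.6f}, t^2 = {t_stat ** 2:.6f}")
```

The two printed values agree up to floating-point error, so the two tests lead to the same decision.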

9
Q

Module 3 Topic 3.1 Lesson 2

Consider a multiple linear regression model with 7 quantitative predicting variables, one qualitative variable (with 3 levels D1, D2, D3), and an intercept. Excluding the variance estimate, ____ parameters need to be estimated in the regression model.
O 10
O 11
O 12
O 13

A

10

We remove one level to avoid linear dependencies between qualitative var
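
The count can be reproduced by listing the columns of the design matrix. This is a small sketch with hypothetical column names (`x1` to `x7`, baseline level `D1`); the choice of which level to drop is an assumption and does not change the count.

```python
quantitative = [f"x{i}" for i in range(1, 8)]   # 7 quantitative predictors
levels = ["D1", "D2", "D3"]                     # one qualitative variable, 3 levels
dummies = levels[1:]                            # drop D1 as the baseline level

columns = ["intercept"] + quantitative + dummies
n_parameters = len(columns)                     # one coefficient per column
print(columns)
print(n_parameters)

# Why a level is dropped: with all 3 dummies, every row's dummies sum to 1,
# which duplicates the intercept column (an exact linear dependency).
observed_levels = ["D1", "D2", "D3", "D2", "D1"]
dummy_row_sums = [sum(1 if lvl == d else 0 for d in levels)
                  for lvl in observed_levels]
print(dummy_row_sums)
```

So the model has 1 intercept + 7 slopes + 2 dummy coefficients = 10 parameters.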

10
Q

Module 3 Topic 3.2 Lesson 7

You performed a simple linear regression analysis, but after considering the link between factors, you chose to switch the predicting and response variables. You expect the following after refitting the regression model to the new data:
O The value of the correlation coefficient will change
O The value of the coefficient of determination will change
O The sign of the slope will change
O The value of SSE will change

A

The value of SSE will change

The correlation between X and Y remains the same even if we swap X and Y, hence
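
A quick numerical check of the four options, as a minimal sketch on made-up data: fit the simple regression in both directions and compare the correlation, R^2, slope sign, and SSE. The function name `simple_fit` and the data are assumptions for illustration.

```python
import math


def simple_fit(x, y):
    """Fit y = b0 + b1*x by least squares; return (slope, SSE, correlation)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    return slope, sse, r


x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

slope_yx, sse_yx, r_yx = simple_fit(x, y)   # original model: y on x
slope_xy, sse_xy, r_xy = simple_fit(y, x)   # swapped model: x on y

print(f"correlation: {r_yx:.6f} vs {r_xy:.6f}")
print(f"R^2:         {r_yx ** 2:.6f} vs {r_xy ** 2:.6f}")
print(f"slope:       {slope_yx:.4f} vs {slope_xy:.4f}")
print(f"SSE:         {sse_yx:.4f} vs {sse_xy:.4f}")
```

The correlation and R^2 are identical in both directions and the slopes share a sign, while SSE differs because it is measured in the units of whichever variable is the response.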

11
Q

T/F Module 3.1 Lesson 4

You are interested in understanding the relationship between education level and IQ, with IQ as the response variable. In your model, you also include age. Age would be considered a controlling variable while education level would be an explanatory variable.

A

True

Controlling variables can be used to control for selection bias in a sam

12
Q

T/F Topic 3.1 Lesson 2

If a predicting variable is categorical with 5 categories in a linear regression model without intercept, we will include 5 dummy variables in the model.

A

True

When we have qualitative variables with k levels, we only include k-1 du

13
Q

The ____ model, or simple linear regression, captures the association of one predicting variable to the response variable marginally, that is, without consideration of other factors.

A

Marginal Model

14
Q

The ____ model, or multiple linear regression model, captures the association of a predicting variable to the response variable, conditional on the other predicting variables in the model.

A

Conditional Model

15
Q

T/F 3.4 Model Interpretation

The estimated regression coefficients for the conditional and marginal relationships can be different, not only in magnitude but also in sign or direction of the relationship.

A

True

The two models used to capture the relationship between a predicting var
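
A sign reversal between the marginal and conditional relationship is easy to produce with correlated predictors. The sketch below uses made-up data in which the response decreases with x1 once x2 is held fixed, yet increases with x1 when x2 is ignored. The helper names `solve` and `ols` and the data are assumptions for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(X, y):
    """Least-squares coefficients of y on the columns of X (normal equations)."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# Made-up data: x1 and x2 are positively correlated;
# y is roughly -1*x1 + 3*x2 plus small noise.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [5.2, 0.9, 9.1, 4.8, 13.1, 8.9]

# Marginal model: y on x1 alone.
marginal_slope = ols([[1, a] for a in x1], y)[1]

# Conditional model: y on x1 and x2 together.
conditional_slope = ols([[1, a, b] for a, b in zip(x1, x2)], y)[1]

print(f"marginal coefficient of x1:    {marginal_slope:+.3f}")
print(f"conditional coefficient of x1: {conditional_slope:+.3f}")
```

In the marginal model x1 also stands in for the omitted x2, which is why its coefficient comes out positive even though the conditional coefficient is negative.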

16
Q

T/F 3.11 Assumptions and Diagnostics

For a multiple linear regression model to be a good fit, we need the linearity assumption to hold for all quantitative predictive variables.

A

True

If some of the assumptions do not hold, then we interpret the model fit

17
Q

T/F 3.11 Assumptions and diagnostics

The presence of certain types of outliers can impact the statistical significance of some of the regression coefficients of a multiple linear regression model

A

True

An influential point can change the value of the estimated parameters, the

18
Q

3.11 Assumptions and Diagnostics

Data points that are far from the mean of the x’s are called

A

leverage points
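
Leverage can be quantified with the diagonal of the hat matrix. For simple linear regression it has the closed form h_ii = 1/n + (x_i - x_bar)^2 / Sxx, which grows with the distance of x_i from the mean. A minimal sketch on made-up data follows; the function name `leverages` and the 2p/n flagging threshold are assumptions (the threshold is a common rule of thumb, not something stated in these cards).

```python
def leverages(x):
    """Hat-matrix diagonal h_ii for a simple linear regression with intercept."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]


x = [1, 2, 3, 4, 20]          # the last point is far from the mean of the x's
h = leverages(x)

p = 2                         # parameters: intercept and slope
threshold = 2 * p / len(x)    # rule-of-thumb cutoff (assumption)
flagged = [i for i, hi in enumerate(h) if hi > threshold]

print([round(hi, 3) for hi in h])
print("sum of leverages:", round(sum(h), 6))
print("flagged indices:", flagged)
```

The leverages always sum to the number of parameters, so one point with leverage near 1 is taking most of the fit's flexibility for itself.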

19
Q

A data point far from the mean of the x’s and/or y’s which influences the regression model fit significantly

A

Influential Point

Can change the value of the estimated parameters, the statistical signif

20
Q

3.11 T/F

An outlier, including a leverage point, may or may not impact the regression fit significantly.

A

True

It may or may not be an influential point.

21
Q

3.11 Assumptions and diagnostics

In multiple linear regression, we can assess the assumption of constant-variance by plotting the standardized residuals against fitted values.

A

True

22
Q

What are the assumptions in multiple linear regression?

A
  • Linearity Assumption
  • Constant Variance Assumption
  • Independence Assumption
  • Normality Assumption
23
Q

Residual Analysis

In a multiple linear regression model, what is used to evaluate constant variance and uncorrelated errors?

A

A plot of the residuals against the fitted values (or the response variable)

24
Q

T/F 3.11 Assumptions and diagnostics

Cook’s distance (Di) measures how much the fitted values in a multiple linear regression model change when the ith observation is removed.

A

True
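
Cook's distance can be computed directly from that description: refit the model without observation i, measure how far all fitted values move, and scale by p times the mean squared error. The sketch below does this for a simple linear regression on made-up data and compares the result with the standard closed form based on residuals and leverages. Function names and data are assumptions for illustration.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y = b0 + b1*x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def cooks_by_refitting(x, y):
    """D_i = sum_j (yhat_j - yhat_j(i))^2 / (p * MSE), refitting without point i."""
    n, p = len(x), 2
    b0, b1 = fit_line(x, y)
    fitted = [b0 + b1 * xi for xi in x]
    mse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / (n - p)
    distances = []
    for i in range(n):
        c0, c1 = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        shift = sum((fi - (c0 + c1 * xi)) ** 2 for fi, xi in zip(fitted, x))
        distances.append(shift / (p * mse))
    return distances


def cooks_closed_form(x, y):
    """D_i = e_i^2 / (p * MSE) * h_ii / (1 - h_ii)^2."""
    n, p = len(x), 2
    b0, b1 = fit_line(x, y)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    mse = sum(e * e for e in resid) / (n - p)
    out = []
    for xi, e in zip(x, resid):
        h = 1 / n + (xi - x_bar) ** 2 / sxx
        out.append(e * e / (p * mse) * h / (1 - h) ** 2)
    return out


# Made-up data: the last point has high leverage and is off the trend.
x = [1, 2, 3, 4, 5, 12]
y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]

d_refit = cooks_by_refitting(x, y)
d_formula = cooks_closed_form(x, y)
for i, (a, b) in enumerate(zip(d_refit, d_formula)):
    print(f"obs {i}: refit = {a:.4f}, formula = {b:.4f}")
```

The two columns agree, and the last observation has by far the largest distance, marking it as the influential point in this data set.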
25
Q

For the standard regression model under normality, we use the ________ to test for the overall regression.

A

F-Test
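
For reference, the overall F-statistic compares the variability explained per predictor with the residual variability per degree of freedom, and it can also be written in terms of R^2. The sketch below computes it both ways for a model with p = 2 predictors on made-up data. The helper names and data are assumptions; the p-value lookup against the F(p, n - p - 1) distribution is left out because it needs a statistics library or table.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def sse_and_sst(X, y):
    """Residual and total sums of squares for the least-squares fit of y on X."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    y_bar = sum(y) / len(y)
    sse = sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
              for row, yi in zip(X, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return sse, sst


# Made-up data with two predictors.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [2, 1, 4, 3, 6, 5, 8, 7]
y = [3.0, 4.5, 6.1, 8.3, 9.2, 12.4, 13.1, 16.0]

X = [[1, a, b] for a, b in zip(x1, x2)]
n, p = len(y), 2                       # p = number of predictors (no intercept)

sse, sst = sse_and_sst(X, y)
ssr = sst - sse                        # regression sum of squares
r2 = 1 - sse / sst

f_direct = (ssr / p) / (sse / (n - p - 1))
f_from_r2 = (r2 / p) / ((1 - r2) / (n - p - 1))

print(f"R^2 = {r2:.4f}")
print(f"F = {f_direct:.2f} (from sums of squares), {f_from_r2:.2f} (from R^2)")
```

A large F relative to the F(p, n - p - 1) reference distribution is evidence that at least one predictor has a nonzero coefficient.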