Multiple linear regression Flashcards

1
Q

What is Multiple Linear Regression (MLR)?

A

MLR models the relationship between one dependent variable (y) and multiple independent variables (x1, x2, …, xn).

The model takes the general form y = β0 + β1x1 + β2x2 + … + βnxn + ε, where β0 is the intercept, each βi is the coefficient of xi, and ε is the error term.
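A minimal sketch of fitting such a model by ordinary least squares may help make the card concrete. The function name `fit_mlr` and the demo data are made up for illustration; the data are noise-free and generated from y = 1 + 2*x1 + 3*x2, so the fit should recover those coefficients.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares fit of y = b0 + b1*x1 + ... + bn*xn.

    X has shape (n_observations, n_predictors); returns [b0, b1, ..., bn].
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(X)), X])  # prepend the intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

# Hypothetical, noise-free data generated from y = 1 + 2*x1 + 3*x2
X_demo = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [2, 1], [1, 3]])
y_demo = 1 + 2 * X_demo[:, 0] + 3 * X_demo[:, 1]
coef_demo = fit_mlr(X_demo, y_demo)  # intercept first, then one slope per predictor
```

With a single column in X the same function performs simple linear regression.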

2
Q

What is the difference between Simple Linear Regression and Multiple Linear Regression?

A

Simple Linear Regression involves one dependent variable (y) and one independent variable (x), while MLR involves one dependent variable and two or more independent variables.

Simple Linear Regression is a foundational concept for understanding MLR.

3
Q

What is confounding in the context of regression analysis?

A

Confounding occurs when the relationship between an explanatory variable and an outcome is distorted by a third variable that is related to both.

Example: Free time influences both exercise and weight.

4
Q

How can confounding affect the interpretation of regression results?

A

Confounding can lead to attributing the entire observed relationship to the explanatory variable when part or all of it is actually driven by another variable.

Example: Increased free time leads to more exercise and reduced weight, which can falsely suggest exercise alone causes weight loss.
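The distortion can be sketched with simulated data. Everything here is an assumption made for illustration: the variable names follow the example, but the effect sizes (exercise -0.5, free time -2.0) and the noise are invented. The exercise coefficient is estimated once without and once with free time in the model.

```python
import numpy as np

def ols(predictors, y):
    """Least-squares coefficients [intercept, slopes...] for y on the predictors."""
    design = np.column_stack([np.ones(len(y))] + list(predictors))
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

rng = np.random.default_rng(0)
n = 5000
free_time = rng.normal(size=n)                  # the confounder
exercise = free_time + rng.normal(size=n)       # more free time -> more exercise
# Assumed true effects on weight: exercise -0.5, free time -2.0
weight = 80 - 0.5 * exercise - 2.0 * free_time + rng.normal(size=n)

crude = ols([exercise], weight)[1]                # free time left out of the model
adjusted = ols([exercise, free_time], weight)[1]  # free time included
```

In this setup the crude slope absorbs part of the free-time effect and overstates the benefit of exercise, while the adjusted slope stays close to the assumed -0.5.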

5
Q

What are the steps to identify confounding in regression models?

A
1. Check if the confounder is associated with both the independent and dependent variables.
2. Use regression models to quantify these associations.

Example: Analyzing the relationship between poverty, crime rate, and percentage of single parents.
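The two steps can be sketched as two simple regressions: one of the independent variable on the suspected confounder, and one of the dependent variable on it. The names `z`, `x`, `y` and the numbers below are hypothetical placeholders, not data from the poverty and crime example.

```python
import numpy as np

def slope(predictor, response):
    """Slope from a simple linear regression of response on predictor."""
    predictor = np.asarray(predictor, dtype=float)
    response = np.asarray(response, dtype=float)
    return np.cov(predictor, response, ddof=1)[0, 1] / np.var(predictor, ddof=1)

def confounding_check(z, x, y):
    """Quantify how the suspected confounder z relates to x and to y."""
    return {"z_to_x": slope(z, x), "z_to_y": slope(z, y)}

# Hypothetical values: z = suspected confounder, x = independent, y = dependent
z = [1, 2, 3, 4, 5, 6]
x = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
y = [3.2, 5.9, 9.1, 11.8, 15.2, 17.9]
associations = confounding_check(z, x, y)
```

Two clearly non-zero slopes mean z is a candidate confounder and is worth including in the model.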

6
Q

What is the significance of sample size in multiple regression models?

A

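A sufficient sample size is crucial; a common rule of thumb is at least 10 observations per independent variable.

An adequate sample gives more stable coefficient estimates and reduces overfitting, which supports more reliable inference. It does not by itself remove bias from confounding.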
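The rule of thumb is simple arithmetic, sketched below. The function names are made up for illustration, and the default of 10 per variable is the guideline from this card, not a hard requirement.

```python
def min_observations(n_predictors, per_variable=10):
    """Smallest sample size suggested by the observations-per-variable rule of thumb."""
    return per_variable * n_predictors

def sample_is_adequate(n_obs, n_predictors, per_variable=10):
    """True if the sample meets the rule of thumb."""
    return n_obs >= min_observations(n_predictors, per_variable)

needed_for_four = min_observations(4)  # 4 predictors at 10 each
```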

7
Q

What does the Coefficient of Determination (R²) represent in multiple regression?

A

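R² indicates the proportion of variance in the dependent variable that can be explained by the independent variables.

A higher R² value suggests a better fit, but R² never decreases when predictors are added, so adjusted R² is the fairer comparison between models with different numbers of predictors.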
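As a worked sketch, R² can be computed as 1 minus the ratio of residual to total sum of squares. The adjusted version, which penalizes extra predictors, is included for comparison. The function names and the four data points are invented for illustration.

```python
def r_squared(y, y_hat):
    """Proportion of variance in y explained by the predictions y_hat."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))  # unexplained variation
    ss_tot = sum((a - mean_y) ** 2 for a in y)            # total variation
    return 1 - ss_res / ss_tot

def adjusted_r_squared(r2, n_obs, n_predictors):
    """R² penalized for the number of predictors in the model."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

# Hypothetical observed values and model predictions
y_obs = [2, 4, 6, 8]
y_pred = [2.5, 3.5, 6.5, 7.5]
r2 = r_squared(y_obs, y_pred)               # 1 - 1/20 = 0.95
r2_adj = adjusted_r_squared(r2, 4, 1)       # 1 - 0.05 * 3/2 = 0.925
```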

8
Q

What is the first assumption of Multiple Regression?

A

Linearity: The relationship between the dependent variable and each continuous independent variable must be linear.

9
Q

What is the second assumption of Multiple Regression?

A

Normal Distribution of Residuals: The residuals should be normally distributed.

10
Q

What is the third assumption of Multiple Regression?

A

Homoscedasticity: Residuals should have constant variance across all predicted values.

11
Q

What is the fourth assumption of Multiple Regression?

A

Independence of Observations: Each observation in the dataset must be independent of others.

12
Q

How can you check for homoscedasticity in regression analysis?

A

By plotting standardized residuals against standardized predicted values and looking for a random scatter with no discernible pattern.
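The quantities behind that plot can be sketched as follows. The simulated data, the function names, and the `spread_ratio` summary (residual spread for high versus low predictions) are assumptions added for illustration; the ratio is a rough numeric stand-in for looking at the scatter, not a formal test.

```python
import numpy as np

def standardized_pairs(x, y):
    """Return (standardized predicted values, standardized residuals) from an OLS fit."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coef
    resid = y - fitted
    dof = len(y) - design.shape[1]                    # n minus number of coefficients
    z_pred = (fitted - fitted.mean()) / fitted.std(ddof=1)
    z_resid = resid / np.sqrt(resid @ resid / dof)    # divide by residual standard error
    return z_pred, z_resid

def spread_ratio(z_pred, z_resid):
    """Residual spread above the median prediction divided by spread below it."""
    upper = z_pred > np.median(z_pred)
    return z_resid[upper].std(ddof=1) / z_resid[~upper].std(ddof=1)

rng = np.random.default_rng(1)
x = rng.uniform(1, 10, size=400)
noise = rng.normal(size=400)
y_constant = 2 + 3 * x + noise       # constant error variance
y_fanning = 2 + 3 * x + noise * x    # error spread grows with x

ratio_constant = spread_ratio(*standardized_pairs(x, y_constant))
ratio_fanning = spread_ratio(*standardized_pairs(x, y_fanning))
```

Passing `z_pred` and `z_resid` to any scatter-plot function gives the visual check; a ratio near 1 matches a random scatter, while a ratio well above 1 matches a fan shape.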

13
Q

What is the implication of finding heteroscedasticity in a regression model?

A

It indicates that the variance of residuals is not constant, which violates the assumption of homoscedasticity. Coefficient standard errors, and the tests built on them, become unreliable.

Remedies include transforming the dependent variable or using robust standard errors.
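One of these remedies, robust standard errors, can be sketched with the White (HC0) sandwich estimator. Choosing HC0 is an assumption here, since the card does not say which robust variant is meant; the function names and simulated data are also illustrative.

```python
import numpy as np

def hc0_covariance(design, resid):
    """White (HC0) sandwich covariance: (X'X)^-1 X' diag(e^2) X (X'X)^-1."""
    bread = np.linalg.inv(design.T @ design)
    meat = design.T @ (design * resid[:, None] ** 2)
    return bread @ meat @ bread

def ols_standard_errors(x, y):
    """Return coefficients, classical standard errors, and HC0 robust standard errors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    dof = len(y) - design.shape[1]
    # Classical covariance assumes one constant error variance for all observations
    classical_cov = (resid @ resid / dof) * np.linalg.inv(design.T @ design)
    robust_cov = hc0_covariance(design, resid)
    return coef, np.sqrt(np.diag(classical_cov)), np.sqrt(np.diag(robust_cov))

rng = np.random.default_rng(2)
x = rng.uniform(1, 10, size=300)
y = 2 + 3 * x + rng.normal(size=300) * x   # error spread grows with x
coef, classical_se, robust_se = ols_standard_errors(x, y)
```

The coefficients are the same either way; only the standard errors change, which is why this remedy leaves the fitted model untouched.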

14
Q

In the context of regression, what does a P-P plot indicate?

A

A P-P plot compares the observed cumulative probabilities of the residuals with those expected under a normal distribution, showing how closely the points align with the diagonal.

Points closely aligned with the diagonal support the assumption of normality.
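The points of a P-P plot can be computed directly, as in the sketch below. The plotting position (i + 0.5) / n is one common convention and an assumption here, as are the function names and the two synthetic samples.

```python
import math
from statistics import NormalDist, mean, stdev

def pp_plot_points(residuals):
    """Pairs of (observed, expected) cumulative probabilities for a normal P-P plot."""
    n = len(residuals)
    fitted_normal = NormalDist(mean(residuals), stdev(residuals))
    ordered = sorted(residuals)
    observed = [(i + 0.5) / n for i in range(n)]        # empirical cumulative probability
    expected = [fitted_normal.cdf(r) for r in ordered]  # probability under normality
    return list(zip(observed, expected))

def max_deviation(points):
    """Largest vertical distance of any point from the diagonal."""
    return max(abs(obs - exp) for obs, exp in points)

# Synthetic samples: evenly spaced normal quantiles vs. right-skewed exponential quantiles
normal_like = [NormalDist().inv_cdf((i + 0.5) / 50) for i in range(50)]
skewed = [-math.log(1 - (i + 0.5) / 50) for i in range(50)]

dev_normal = max_deviation(pp_plot_points(normal_like))
dev_skewed = max_deviation(pp_plot_points(skewed))
```

The normal-like sample stays close to the diagonal, while the skewed sample drifts away from it, most visibly in the lower tail.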

15
Q

What is a practical implication of violations in regression assumptions?

A

Violations can lead to inaccurate conclusions and necessitate adjustments in the model.

16
Q

Fill in the blank: The relationship between the dependent variable and each continuous independent variable must be _______.

A

linear

17
Q

True or False: Independence of observations can be tested directly in software.

A

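False. Independence is mainly a matter of study design; software diagnostics can flag some violations (for example, serial correlation in the residuals) but cannot confirm that observations are independent.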