3A Flashcards

1
Q

What is a spurious correlation?

A

A correlation between two variables (X and Y) that appears meaningful but actually arises because a third variable (Z) influences both. X does not truly cause Y.
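
A minimal simulation sketch of the definition above, assuming made-up data in which Z drives both X and Y and X has no effect on Y. The variable names, sample size, and noise levels are illustrative assumptions, not part of the course material.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical data: Z drives both X and Y; X has no effect on Y.
z = rng.normal(size=n)
x = z + rng.normal(size=n)
y = z + rng.normal(size=n)

# The raw correlation between X and Y looks substantial.
raw_corr = np.corrcoef(x, y)[0, 1]

# Regress Y on X and Z (plus an intercept) to control for Z.
design = np.column_stack([np.ones(n), x, z])
coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
x_effect_controlled = coefs[1]

print(f"raw corr(X, Y): {raw_corr:.2f}")
print(f"coefficient on X after controlling for Z: {x_effect_controlled:.2f}")
```

In this setup the raw correlation should come out clearly positive, while the coefficient on X should shrink toward zero once Z is in the model.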

2
Q

Example of a spurious correlation?

A

Watching GTST appears related to support for redistribution, but the actual cause might be a third variable such as gender or income.

3
Q

What is a suppressor variable?

A

A variable that hides or weakens a real relationship between X and Y by affecting them in opposite directions.
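
A rough sketch of suppression on simulated data, assuming a hypothetical Z that raises X and lowers Y while X has a true positive effect on Y. The coefficients (1.0 and -2.0) are chosen for illustration so that the two paths roughly cancel.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Hypothetical data: Z raises X and lowers Y; X truly raises Y.
z = rng.normal(size=n)
x = z + rng.normal(size=n)
y = 1.0 * x - 2.0 * z + rng.normal(size=n)


def ols(columns, target):
    """Return OLS coefficients (intercept first) for the given predictor columns."""
    design = np.column_stack([np.ones(len(target)), *columns])
    coefs, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coefs


slope_without_z = ols([x], y)[1]   # Z omitted: the effect of X is hidden
slope_with_z = ols([x, z], y)[1]   # Z controlled: the effect of X reappears

print(f"slope of X without Z: {slope_without_z:.2f}")
print(f"slope of X with Z:    {slope_with_z:.2f}")
```

With these assumed values, the slope without Z should sit near zero and the slope with Z should sit near the true value of 1.0.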

4
Q

Example of a suppressor variable?

A

Income and left-right identification. Education may suppress the expected relationship between them.

5
Q

What is multiple regression?

A

A statistical technique used to predict a dependent variable (Y) based on multiple independent variables (X’s).

6
Q

How does multiple regression differ from simple regression?

A

Simple regression has one X variable, while multiple regression includes several X variables to control for confounders.

7
Q

Why use multiple regression?

A

It helps isolate the effect of each X variable by controlling for others, reducing omitted variable bias.

8
Q

What is Ordinary Least Squares (OLS)?

A

A method to estimate regression coefficients by minimizing the sum of squared residuals.

9
Q

How does OLS work?

A

It finds the best-fitting line by minimizing the sum of squared differences between predicted and actual Y values.
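
A small worked sketch of OLS on four made-up data points. It solves the normal equations with NumPy and then checks that nudging a coefficient away from the solution increases the sum of squared residuals. The data values are assumptions chosen for easy arithmetic.

```python
import numpy as np

# Hypothetical data: four observations of one predictor.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 3.0, 5.0, 6.0])

# Design matrix with a column of ones for the intercept.
design = np.column_stack([np.ones_like(x), x])


def sum_squared_residuals(coefs):
    """Sum of squared differences between actual and predicted Y."""
    residuals = y - design @ coefs
    return float(residuals @ residuals)


# Solve the normal equations (X'X) b = X'y for the coefficients.
beta = np.linalg.solve(design.T @ design, design.T @ y)

best = sum_squared_residuals(beta)
nudged = sum_squared_residuals(beta + np.array([0.1, 0.0]))

print("intercept, slope:", beta)       # approximately 0.5 and 1.4
print("SSR at the OLS solution:", best)  # approximately 0.2
print("SSR after nudging the intercept:", nudged)
```

Any other choice of intercept and slope gives a larger sum of squared residuals, which is what "best-fitting" means here.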

10
Q

Why should you include control variables?

A

To avoid omitted variable bias and improve the accuracy of estimated effects.

11
Q

What happens if you include too many variables?

A

It can cause interpretation issues, overfitting, and multicollinearity, and may accidentally remove a real effect (for example, by controlling for a mediator).

12
Q

What is multicollinearity?

A

When two or more independent variables are highly correlated, making it difficult to separate their effects.
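
One common way to check for multicollinearity is the variance inflation factor (VIF), which is not mentioned in the cards and is used here as an assumption. The sketch below computes VIF = 1 / (1 - R²) for each predictor by regressing it on the others, using simulated data in which `x2` is nearly a copy of `x1`.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000

# Hypothetical predictors: x2 is almost identical to x1, x3 is unrelated.
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)
x3 = rng.normal(size=n)
predictors = np.column_stack([x1, x2, x3])


def vif(matrix, j):
    """Variance inflation factor of column j, regressed on the other columns."""
    target = matrix[:, j]
    others = np.delete(matrix, j, axis=1)
    design = np.column_stack([np.ones(len(target)), others])
    coefs, *_ = np.linalg.lstsq(design, target, rcond=None)
    residuals = target - design @ coefs
    r_squared = 1.0 - residuals.var() / target.var()
    return 1.0 / (1.0 - r_squared)


vifs = [vif(predictors, j) for j in range(predictors.shape[1])]
for name, value in zip(["x1", "x2", "x3"], vifs):
    print(f"VIF({name}) = {value:.1f}")
```

A frequently quoted rule of thumb treats a VIF above roughly 10 as a warning sign. In this setup `x1` and `x2` should far exceed that, while `x3` should stay near 1.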

13
Q

What is a mediator variable?

A

A variable that lies on the causal path between X and Y and explains how X affects Y (e.g., income → redistribution attitudes → left-right identification).
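
A simulated sketch of mediation, assuming a hypothetical chain X → M → Y in which X affects Y only through M. The path strengths (0.8 and 0.5) are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20000

# Hypothetical chain: X affects M, and M affects Y. X has no direct path to Y.
x = rng.normal(size=n)
m = 0.8 * x + rng.normal(size=n)
y = 0.5 * m + rng.normal(size=n)


def ols(columns, target):
    """Return OLS coefficients (intercept first) for the given predictor columns."""
    design = np.column_stack([np.ones(len(target)), *columns])
    coefs, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coefs


total_effect = ols([x], y)[1]       # expected near 0.8 * 0.5 = 0.4
direct_effect = ols([x, m], y)[1]   # expected near 0 once M is controlled

print(f"effect of X without the mediator: {total_effect:.2f}")
print(f"effect of X with the mediator:    {direct_effect:.2f}")
```

This also shows how adding a mediator as a control can make a real effect of X seem to disappear.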

14
Q

How do you decide which variables to include?

A

Include a variable if:
1) It reduces omitted variable bias.
2) It is a strong predictor of Y.

Avoid variables that are irrelevant, that lead to overfitting, or that are highly collinear with variables already in the model.

15
Q

What are the two stepwise model specification methods?

A

1) Forward selection: start with few variables and add more gradually.
2) Backward elimination: start with many variables and remove non-significant ones.
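
The second method above (start with many variables and remove non-significant ones) can be sketched as follows. This is an illustration under stated assumptions, not a recommended procedure: it uses |t| < 2 as a rough stand-in for "non-significant", and the function names and simulated data are hypothetical.

```python
import numpy as np


def t_statistics(design, y):
    """OLS t-statistics (coefficient / standard error) for each column."""
    n, k = design.shape
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coefs
    sigma2 = residuals @ residuals / (n - k)
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return coefs / np.sqrt(np.diag(cov))


def backward_elimination(X, y, names, threshold=2.0):
    """Drop the weakest predictor until all remaining ones have |t| >= threshold."""
    kept = list(range(X.shape[1]))
    while kept:
        design = np.column_stack([np.ones(len(y)), X[:, kept]])
        t = np.abs(t_statistics(design, y))[1:]  # skip the intercept
        weakest = int(np.argmin(t))
        if t[weakest] >= threshold:
            break
        kept.pop(weakest)
    return [names[i] for i in kept]


# Hypothetical data: y depends on x0 and x2; x1 is pure noise.
rng = np.random.default_rng(4)
n = 1000
X = rng.normal(size=(n, 3))
y = 2.0 * X[:, 0] + 1.5 * X[:, 2] + rng.normal(size=n)

selected = backward_elimination(X, y, ["x0", "x1", "x2"])
print("variables kept:", selected)
```

The strong predictors `x0` and `x2` should survive. A pure-noise variable like `x1` is usually dropped but can occasionally pass the threshold by chance, which is one reason stepwise results need careful interpretation.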

16
Q

What are the strengths of multiple regression?

A
  • Controls for confounders.
  • Helps distinguish real vs. spurious relationships.
  • Can be used to test causal claims.
17
Q

What are the limitations of multiple regression?

A
  • Can’t control for all variables.
  • Correlation still does not imply causation.
  • Requires careful variable selection.