Data Analysis IIb: Advanced Regression Topics (Week 7) Flashcards

1
Q

What is multicollinearity?

A

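One assumption of Ordinary Least Squares is that the IVs are INDEPENDENT of each other (strictly, OLS only requires that no IV is a perfect linear combination of the others, but high imperfect correlation still inflates the standard errors)

  • When this assumption is violated: multicollinearity
  • Multicollinearity has nothing to do with the DV; it is only about IVs being highly related to one another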
2
Q

How do we detect multicollinearity?

A

From correlation matrix

Very high correlations indicate multicollinearity

Cutoff arbitrary: e.g. 0.9 (sign does not matter)

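Limitation: Only BIVARIATE (pairwise) relationships are captured, so an IV that is explained by a combination of several other IVs can go unnoticed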
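
A minimal sketch of the correlation-matrix check, assuming NumPy is available. The data, the variable names (x1, x2, x3) and the helper `high_correlation_pairs` are hypothetical and only serve to illustrate the 0.9 cutoff on the absolute correlation.

```python
import numpy as np

# Hypothetical data: three IVs, where x2 is nearly a copy of x1.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.1, size=200)   # strongly related to x1
x3 = rng.normal(size=200)                   # unrelated to the others
X = np.column_stack([x1, x2, x3])
names = ["x1", "x2", "x3"]

def high_correlation_pairs(X, names, cutoff=0.9):
    """Return (name_a, name_b, r) for every pair of IVs with |r| >= cutoff."""
    corr = np.corrcoef(X, rowvar=False)      # columns are the variables
    flagged = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) >= cutoff:    # sign does not matter
                flagged.append((names[i], names[j], float(corr[i, j])))
    return flagged

pairs = high_correlation_pairs(X, names)
print(pairs)
```

Only the x1/x2 pair should be flagged here. The check looks at one pair at a time, which is exactly the limitation noted in the card.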

3
Q

Apart from the correlation matrix, what is a better approach to detect multicollinearity?

A

Variance Inflation Factors (VIFs)

- Looks at multiple IVs at once (instead of only pairs)

4
Q

How do we obtain VIFs?

A

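Estimate auxiliary regression models in which each IV in turn is used as the DV and regressed on all the other IVs

Example (with k+1 IVs in total):
X_{1i} = β_0 + β_1 X_{2i} + β_2 X_{3i} + ... + β_k X_{k+1,i} + ε_i
X_{2i} = β_0 + β_1 X_{1i} + β_2 X_{3i} + ... + β_k X_{k+1,i} + ε_i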
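
The auxiliary regressions can be sketched as follows, assuming NumPy and ordinary least squares via `np.linalg.lstsq`. The function name `auxiliary_r2` and the simulated data are hypothetical; the point is only that each IV takes a turn as the DV and yields its own R^2.

```python
import numpy as np

def auxiliary_r2(X, j):
    """R^2 from regressing column j of X on all other columns plus an intercept."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    design = np.column_stack([np.ones(len(y)), others])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    ss_res = float(resid @ resid)                  # unexplained variation
    ss_tot = float(((y - y.mean()) ** 2).sum())    # total variation
    return 1.0 - ss_res / ss_tot

# Hypothetical data: x1 is mostly a combination of x2 and x3.
rng = np.random.default_rng(1)
x2 = rng.normal(size=300)
x3 = rng.normal(size=300)
x1 = 0.7 * x2 + 0.7 * x3 + rng.normal(scale=0.2, size=300)
X = np.column_stack([x1, x2, x3])

r2_values = [auxiliary_r2(X, j) for j in range(X.shape[1])]
print(r2_values)
```

In this simulated data the R^2 for x1 comes out high, because x1 was built from x2 and x3.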

5
Q

How do we determine the VIF of each IV?

A

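VIF_j = 1 / (1 - R_j^2), where R_j^2 is the R^2 of the regression that uses IV j as the DV and all other IVs as predictors

Cutoff arbitrary: e.g. 10 (which corresponds to R_j^2 = 0.9)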
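
A small worked example of the formula, as a sketch. The function name `vif_from_r2` is hypothetical, and the R^2 values are made-up inputs chosen to show how quickly the VIF grows as R^2 approaches 1.

```python
def vif_from_r2(r2):
    """Variance inflation factor for an IV whose auxiliary regression has this R^2."""
    if not 0.0 <= r2 < 1.0:
        raise ValueError("R^2 must be in [0, 1); R^2 = 1 means perfect collinearity")
    return 1.0 / (1.0 - r2)

# Made-up R^2 values to show the shape of the relationship.
for r2 in (0.0, 0.5, 0.8, 0.9, 0.95):
    print(f"R^2 = {r2:.2f} -> VIF = {vif_from_r2(r2):.1f}")
```

An IV that is unrelated to the others (R^2 = 0) has a VIF of 1, and an R^2 of 0.9 gives a VIF of about 10.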

6
Q

What does having a high VIF score mean?

A

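Most of the variation in that variable is explained by the other IV(s), so its effect is already captured by them and its coefficient is estimated imprecisely (inflated standard error)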

7
Q

What do we do when VIF>10?

A

If VIF > 10, some variables need to be dropped from the model because they are highly correlated with each other.

Either leave out the variable with the highest VIF score, OR combine the related variables into one variable (e.g. using factor analysis)
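
One way to sketch the "leave out the highest VIF" option, assuming NumPy. The helpers `vifs` and `drop_high_vif` and the simulated data are hypothetical. Re-computing the VIFs after every drop is a design choice of this sketch, not something the card prescribes.

```python
import numpy as np

def vifs(X):
    """VIF for each column of X: regress it on the others, then 1 / (1 - R^2)."""
    out = []
    for j in range(X.shape[1]):
        y = X[:, j]
        design = np.column_stack([np.ones(len(y)), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ coef
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        out.append(float(1.0 / (1.0 - r2)))
    return out

def drop_high_vif(X, names, cutoff=10.0):
    """Repeatedly drop the IV with the highest VIF until all VIFs are <= cutoff."""
    names = list(names)
    while X.shape[1] > 1:
        scores = vifs(X)
        worst = int(np.argmax(scores))
        if scores[worst] <= cutoff:
            break
        X = np.delete(X, worst, axis=1)   # remove the offending column
        del names[worst]
    return X, names

# Hypothetical data: x2 is nearly a copy of x1, x3 is unrelated.
rng = np.random.default_rng(3)
x1 = rng.normal(size=300)
x2 = x1 + rng.normal(scale=0.1, size=300)
x3 = rng.normal(size=300)
X_kept, kept = drop_high_vif(np.column_stack([x1, x2, x3]), ["x1", "x2", "x3"])
print(kept)
```

Here one of x1 and x2 is removed and x3 is kept. Which of the two near-duplicates goes depends on the sample.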

8
Q

What are dummy variables?

A

Binary (0/1) variables that represent the categories of a categorical variable in data analysis, with one category left out as the reference
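
A minimal sketch of dummy coding in plain Python. The function `dummy_code`, the age-group labels and the `is_` column prefix are hypothetical. One category is omitted as the reference so that the dummies are not perfectly collinear with the intercept.

```python
def dummy_code(values, reference):
    """Turn a categorical column into 0/1 dummy columns, omitting the reference category."""
    categories = sorted(set(values))
    if reference not in categories:
        raise ValueError(f"reference {reference!r} not found in the data")
    return {
        f"is_{cat}": [1 if v == cat else 0 for v in values]
        for cat in categories
        if cat != reference
    }

# Hypothetical categorical variable with three age groups.
age_group = ["18-29", "30-49", "50+", "30-49", "18-29"]
dummies = dummy_code(age_group, reference="18-29")
print(dummies)
# {'is_30-49': [0, 1, 0, 1, 0], 'is_50+': [0, 0, 1, 0, 0]}
```

A respondent in the reference group has 0 on every dummy.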

9
Q

What are the advantages of using dummy variables in data analysis?

A

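If age is used as a continuous variable in a regression model, a linear relationship is assumed.

This is a very restrictive assumption: every increment in age is taken to have a constant (equal) effect on the DV.
Dummy variables overcome this by allowing each category its own effect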
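
To illustrate, here is a sketch assuming NumPy and a made-up data set in which the DV rises unevenly across three age groups. All names and numbers are hypothetical. Each dummy coefficient is the difference from the reference group, so the steps between groups do not have to be equal.

```python
import numpy as np

# Hypothetical data: group means of the DV are 10, 12 and 20.
group = np.array([0, 0, 1, 1, 2, 2])   # 0 = youngest (reference), 1 = middle, 2 = oldest
y = np.array([9.0, 11.0, 11.0, 13.0, 19.0, 21.0])

# Dummy variables for the two non-reference groups.
d_middle = (group == 1).astype(float)
d_oldest = (group == 2).astype(float)
design = np.column_stack([np.ones(len(y)), d_middle, d_oldest])

coef, *_ = np.linalg.lstsq(design, y, rcond=None)
intercept, b_middle, b_oldest = coef

print(f"reference mean: {intercept:.1f}")          # mean of the youngest group
print(f"youngest -> middle step: {b_middle:.1f}")
print(f"middle -> oldest step: {b_oldest - b_middle:.1f}")
```

The two steps come out as roughly 2 and 8. A single linear age coefficient would have forced them to be equal.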

10
Q

What are moderators?

A

Variables that affect the relationship between the IV and the DV (its strength or direction)

Their role is to change that relationship, not to affect the DV directly

11
Q

How do we estimate moderation?

A

Include the interaction term complaints * apology in the regression equation, together with the main effects of complaints and apology. A significant coefficient on the interaction term indicates moderation
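
A sketch of the moderation model, assuming NumPy. The DV name `satisfaction`, the coding of the variables and the coefficient values are all hypothetical, and the data are simulated without noise so the estimates match the assumed values. A real analysis would also need standard errors to test the interaction.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
complaints = rng.integers(0, 6, size=n).astype(float)   # hypothetical count, 0 to 5
apology = rng.integers(0, 2, size=n).astype(float)      # hypothetical dummy: 1 = apology given

# Assumed (made-up) true model, noise-free for clarity.
satisfaction = 8.0 - 1.0 * complaints + 0.5 * apology + 0.6 * complaints * apology

# Intercept, both main effects, and the interaction term.
design = np.column_stack([np.ones(n), complaints, apology, complaints * apology])
coef, *_ = np.linalg.lstsq(design, satisfaction, rcond=None)
b0, b_complaints, b_apology, b_interaction = coef

# The effect of complaints depends on the moderator.
slope_without_apology = b_complaints
slope_with_apology = b_complaints + b_interaction
print(f"slope of complaints without apology: {slope_without_apology:.2f}")
print(f"slope of complaints with apology: {slope_with_apology:.2f}")
```

The coefficient on the interaction term is the difference between the two slopes, which is what the moderation effect means in this setup.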
