M1 - Multiple Regression Flashcards

1
Q
Multiple regression can be used when IVs are 1. ________ variables and DVs are 2. ________ variables.

Multiple regression can be used to determine which variables are important for 3. ________ and which 4. ________ variable explains the most 5. ________ variance in the 6. ________ variable.

Options: 1. prediction; 2. ratio; 3. between groups; 4. discrimination; 5. continuous; 6. unique; 7. independent; 8. continuous or categorical; 9. dependent

A
1 - continuous or categorical
2 - continuous
3 - prediction
4 - independent
5 - unique
6 - dependent
2
Q

When would hierarchical regression analysis be preferred over standard regression analysis? (Choose all that apply)

  1. The researcher is interested only in the R2 change statistic.
  2. The researcher would like to check whether the variance explained in a categorical DV is increased following the inclusion of several continuous IVs.
  3. The researcher wishes to determine the effects of one particular IV relative to all others.
  4. The researcher is interested in examining the effects of three IVs on the DV whilst controlling for the variance explained by two others.
  5. The researcher is interested in semi-partial correlations and the overall R2.
A
  1. The researcher is interested only in the R2 change statistic (as there is a change from step 1 of the hierarchy to step 2).
  4. The researcher is interested in examining the effects of three IVs on the DV whilst controlling for the variance explained by two others.
3
Q

What is the semi-partial correlation for an IV and how is it represented in SPSS output?

A

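The semi-partial correlation (sr) expresses the unique relationship between the IV and the DV, controlling for the other IVs in the model. Squared (sr2), it gives the proportion of DV variance uniquely explained by that IV.
In SPSS it is reported in the Coefficients table under the ‘Part’ heading.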
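
As a rough illustration (not SPSS output), sr2 for an IV equals the drop in R2 when that IV is removed from the model. The sketch below assumes two hypothetical R2 values; the function name and the figures are made up for the example.

```python
import math

def squared_semipartial(r2_full: float, r2_without_iv: float) -> float:
    """sr2 = R2 of the full model minus R2 of the model without the IV."""
    return r2_full - r2_without_iv

# Hypothetical values: full model R2 = .40, model without the IV R2 = .31
sr2 = squared_semipartial(0.40, 0.31)
sr = math.sqrt(sr2)  # magnitude of the 'Part' value; its sign follows the b weight
print(round(sr2, 2), round(sr, 2))
```

Here the IV uniquely explains about 9% of the variance in the DV.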

4
Q

In a scatterplot, what sort of graph indicates there is an issue with collinearity?

A

If a scatterplot of one IV against another shows a clear pattern, such as points falling close to a straight line, there is a potential issue with collinearity.

5
Q

CLO 1
What is Multiple Regression?
What variables work in MR?
What four questions is MR useful in answering?

A

Multiple regression is a statistical analysis that looks at the predictive relationships between variables where there are two or more IVs and a single continuous DV.

MR is useful in answering questions about

  • the combined effect of the IVs on the variance of a DV (Multiple R2)
  • the relative strength of different predictors in contributing to DV variance
  • the unique variance of each predictor (semi-partial, sr2)
  • prediction improvement (hierarchical regression)
6
Q

CLO 2 - How do various methods of MR differ? (standard, hierarchical, stepwise)

A

Standard

  • Forced entry of all variables at once - unconcerned about order
  • used when the research question does not specify that the variables influence the DV in a particular order
  • overall combined, relative importance and unique importance of the IVs on the DV

Hierarchical

  • IVs entered in blocks - order determined by research question
  • overall combined, relative importance and unique importance of the IVs on the DV
  • PLUS prediction improvement (additional variance explained after controlling for certain IVs in previous blocks)

Stepwise

  • Statistical approach where IVs are entered based on statistical criteria, e.g. the p value of the t test for each IV (remove if p …)

7
Q

CLO 3 - What are the five key pieces of output for overall model and how do they relate to my research question?

A

Key output for overall model

R - multiple correlation between the set of IVs (combined) and the DV

R2 - overall variance explained in the DV by all IVs

Adjusted R2 - adjusts R2 for sample size and # of predictors (greater number of predictors is likely to inflate R2)

F test - significance test for R2 and R2 change

R2 change - difference between steps (blocks) in explained variance for Hierarchical regression (prediction improvement questions)
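
A minimal sketch of how two of these statistics can be computed by hand, using the standard formulas for adjusted R2 and for the F test of R2 change. The sample size, R2 values and function names are hypothetical and chosen only for illustration.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R2 for n cases and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

def f_change(r2_step1: float, r2_step2: float, n: int, k_step2: int, k_added: int) -> float:
    """F for R2 change when k_added predictors enter at step 2.

    Degrees of freedom are (k_added, n - k_step2 - 1).
    """
    r2_change = r2_step2 - r2_step1
    return (r2_change / k_added) / ((1 - r2_step2) / (n - k_step2 - 1))

# Hypothetical hierarchical model: n = 100,
# step 1 has 2 control IVs (R2 = .20), step 2 adds 3 IVs (R2 = .35)
adj = adjusted_r2(0.35, n=100, k=5)
f = f_change(0.20, 0.35, n=100, k_step2=5, k_added=3)
print(round(adj, 3), round(f, 2))
```

The adjusted value is lower than the raw R2 of .35, reflecting the penalty for five predictors.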

8
Q

CLO 3 - What are the five key pieces of output for individual predictors in model and how do they relate to my research question?

A

Key output for Individual predictors

b weight - unstandardised, the amount of change in the DV for every 1 unit change in the predictor (holding the other predictors constant)

Beta weight - standardised b weight, so predictors can be compared with each other. Useful for questions about the relative importance of each predictor

r - zero order correlation between IV and DV

sr - semi-partial correlation, used to determine sr2 (unique variance explained)

t test - significance test of b weight for individual predictors
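
To show how the b weight and Beta weight relate, here is a small sketch using the standard conversion Beta = b x (SD of IV / SD of DV). The numbers and the function name are hypothetical.

```python
def standardised_beta(b: float, sd_iv: float, sd_dv: float) -> float:
    """Convert an unstandardised b weight to a standardised Beta weight."""
    return b * sd_iv / sd_dv

# Hypothetical predictor: b = 2.0, SD of the IV = 1.5, SD of the DV = 6.0
beta = standardised_beta(2.0, 1.5, 6.0)
print(beta)  # 0.5
```

A Beta of 0.5 means a 1 SD increase in the IV is associated with a 0.5 SD increase in the DV, with the other predictors held constant.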

9
Q

CLO4 - What are the six key multiple regression assumptions?

A

1 - Normal distribution - the residuals are normally distributed - outliers not present

2 - Linearity - the relationship between each IV and the DV is linear

3 - Independence of Errors - residuals are not correlated (violation biases SEs -> affects CIs and sig tests)

4 - Homoscedasticity - variability in residuals should be constant at each level of predictor

5 - Singularity and multicollinearity - IVs are not strongly or perfectly correlated

6 - Sample size - n is large enough to detect the effect

10
Q

CLO4 How are the key MR assumptions and influencers checked?

1 - Normal distribution

A

1 - Normal distribution - the residuals are normally distributed - outliers not present

Overall model checks

  • check residuals for outliers
  • Standardised residuals outside of +/-3.29 are cause for concern
  • Cook's distance > 1 suggests outliers may be present
  • Leverage > 3(k+1)/n, i.e. three times the average leverage of (k+1)/n, suggests outliers may be present

Individual case and predictor check

  • Absolute standardised DFBeta > 1 indicates substantial influence. Useful for seeing, case by case, whether the influence falls on the intercept or on the b weight of a particular IV

11
Q

CLO4 How are the key MR assumptions and influencers checked?

2 - Linearity

A

2 - Linearity - the relationship between the IV and the DV is linear

Assess through scatterplots of residuals and predicted scores

12
Q

CLO4 How are the key MR assumptions and influencers checked?

3 - Independence of Errors

A

3 - Independence of Errors - residuals are not correlated

Violation biases SEs -> impacts CIs and sig tests

Check with the Durbin-Watson test

  • tests the serial correlation between residuals for adjacent cases in the dataset
  • no correlation is desirable
  • range 0-4
  • < 2 positive correlation -> increased Type 1 error (underestimates SEs)
  • 2 no correlation
  • > 2 negative correlation -> increased Type 2 error (overestimates SEs)
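
A minimal sketch of the Durbin-Watson statistic, computed as the sum of squared differences between adjacent residuals divided by the sum of squared residuals. The residual lists are invented to show the two directions of departure from 2.

```python
def durbin_watson(residuals: list[float]) -> float:
    """Durbin-Watson d for residuals in case (row) order."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals that drift steadily (positive serial correlation)
d_positive = durbin_watson([-2.0, -1.0, 1.0, 2.0])
# Hypothetical residuals that flip sign every case (negative serial correlation)
d_negative = durbin_watson([1.0, -1.0, 1.0, -1.0])
print(d_positive, d_negative)
```

The first value falls below 2 and the second above 2, matching the interpretation listed above.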

13
Q

CLO4 How are the key MR assumptions and influencers checked?

4 - Homoscedasticity

A

4 - Homoscedasticity - variability in residuals should be constant at each level of predictor

Check with a scatterplot of standardised residuals against predicted scores

  • most cases should fall within the range -2 to 2
  • no discernible pattern is desirable

14
Q

CLO4 How are the key MR assumptions and influencers checked?

6 - Sample size

A

Sample size needs to be large enough to detect an effect of the expected strength

Rules of thumb are based on moderate effect size

N >= 50 + 8k for the overall model
N >= 104 + k for individual predictors

These rules of thumb are likely to overestimate the n required if the expected effect is large and underestimate it if the expected effect is small
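
The two rules of thumb can be sketched as below, where k is the number of predictors. Taking the larger of the two when both the overall model and individual predictors are of interest is a common recommendation rather than something stated on this card; the function names are hypothetical.

```python
def min_n_overall(k: int) -> int:
    """Minimum n for testing the overall model (moderate effect size)."""
    return 50 + 8 * k

def min_n_individual(k: int) -> int:
    """Minimum n for testing individual predictors (moderate effect size)."""
    return 104 + k

k = 5  # hypothetical number of predictors
n_overall = min_n_overall(k)
n_individual = min_n_individual(k)
n_required = max(n_overall, n_individual)  # assumption: use the larger value
print(n_overall, n_individual, n_required)  # 90 109 109
```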

15
Q

CLO4 How are the key MR assumptions and influencers checked?

5 - Singularity and multicollinearity

A

5 - Singularity and multicollinearity - IVs are not strongly or perfectly correlated

Perfect or strong correlations (r of .8 or .9 and above) suggest one of the variables is redundant

  • inflates SEs -> impacts CIs and sig tests
  • results become less stable

Check bivariate correlations for simple relationships
For more complex relationships in MR, check Tolerance and VIF (Variance Inflation Factor)

Tolerance - range 0-1

  • higher scores desirable, indicating the variable has a unique contribution
  • < .1 is serious (less than 10% of the IV's variance is unique)
  • < .2 is problematic

VIF = 1/Tolerance

  • > 10 serious
  • > 5 problematic

sqrt(VIF) = factor by which the SE is inflated
e.g. a VIF of 4 means the SE is double what it would be if the IV were uncorrelated with the other IVs (r = 0)
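
A short sketch tying these quantities together, assuming Tolerance is computed in the usual way as 1 minus the R2 obtained when the IV is regressed on the other IVs. The R2 value of .75 is hypothetical and chosen to reproduce the VIF of 4 example.

```python
import math

def collinearity_stats(r2_iv_on_other_ivs: float) -> tuple[float, float, float]:
    """Return (tolerance, VIF, SE inflation factor) for one IV."""
    tolerance = 1 - r2_iv_on_other_ivs  # share of the IV's variance that is unique
    vif = 1 / tolerance
    se_inflation = math.sqrt(vif)
    return tolerance, vif, se_inflation

# Hypothetical IV: 75% of its variance is explained by the other IVs
tolerance, vif, se_inflation = collinearity_stats(0.75)
print(tolerance, vif, se_inflation)  # 0.25 4.0 2.0
```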
