M1 - Multiple Regression Flashcards
Multiple regression can be used when IVs are 1.________________ variables and DVs are 2.________________ variables.
Multiple regression can be used to determine which variables are important for
3.________________ and which 4.________________ variable explains the most 5.________________ variance in the 6.________________ variable.
Options: 1. prediction; 2. ratio; 3. between groups; 4. discrimination; 5. continuous; 6. unique; 7. independent; 8. continuous or categorical; 9. dependent
1 - continuous or categorical; 2 - continuous; 3 - prediction; 4 - independent; 5 - unique; 6 - dependent
When would hierarchical regression analysis be preferred over standard regression analysis? (Choose all that apply)
- The researcher is interested only in the change in R2 statistic.
- The researcher would like to check whether the variance explained in a categorical DV is increased following the inclusion of several continuous IVs.
- The researcher wishes to determine the effects of one particular IV relative to all others.
- The researcher is interested in examining the effects of three IVs on the DV whilst controlling for the variance explained by two others.
- The researcher is interested in semi-partial correlations and the overall R2.
- The researcher is interested only in the change in R2 statistic. (as there is a change from step 1 of the hierarchy to step 2)
- The researcher is interested in examining the effects of three IVs on the DV whilst controlling for the variance explained by two others.
What is the semi-partial correlation for an IV and how is it represented in SPSS output?
The semi-partial correlation describes the unique relationship between the IV and the DV after controlling for the other IVs in the model.
In SPSS it is reported in the Coefficients table under the ‘Part’ heading
In a scatterplot, what sort of graph indicates there is an issue with collinearity?
If a scatterplot of one IV against another shows a clear, discernible pattern (i.e. the IVs are strongly related), there is a potential collinearity issue
CLO 1
What is Multiple Regression?
What variables work in MR?
What four questions is MR useful in answering?
Multiple regression is a statistical analysis that examines the predictive relationships between variables where there are two or more IVs (continuous or categorical) and a single continuous DV.
MR is useful in answering questions about
- the combined effect of the IVs on the variance of the DV (Multiple R2)
- the relative strength of different predictors in contributing to DV variance
- the unique variance explained by each predictor (semi-partial correlation, sr2)
- prediction improvement (hierarchical regression)
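The unit's examples use SPSS, but the same statistics can be made concrete with a short Python/statsmodels sketch. A minimal, hypothetical standard multiple regression (the variable names wellbeing, stress and support and the data are invented purely for illustration):
```python
# Minimal sketch of a standard multiple regression on made-up data.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "wellbeing": [4.2, 3.8, 5.1, 2.9, 4.7, 3.5, 4.0, 5.3],  # continuous DV
    "stress":    [3.0, 3.5, 2.0, 4.5, 2.5, 4.0, 3.2, 1.8],  # IV 1
    "support":   [4.0, 3.0, 5.0, 2.0, 4.5, 2.5, 3.5, 5.0],  # IV 2
})

model = smf.ols("wellbeing ~ stress + support", data=df).fit()
print(model.rsquared)   # combined effect of the IVs (R2)
print(model.params)     # unstandardised b weights
print(model.summary())  # full output, including the F test and t tests
```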
CLO 2 - How do various methods of MR differ? (standard, hierarchical, stepwise)
Standard
- Forced entry of all variables at once - unconcerned about order
- used when the research question does not specify that certain variables should be entered before others
- answers questions about the overall combined, relative and unique importance of the IVs on the DV
Hierarchical
- IVs entered in blocks - order determined by the research question
- answers questions about the overall combined, relative and unique importance of the IVs on the DV
- PLUS prediction improvement (additional variance explained after controlling for the IVs entered in previous blocks) - see the R2 change sketch after this list
Stepwise
- Statistical approach where IVs are entered (or removed) based on statistical criteria, e.g. the p value of the t test for each IV; an IV is removed if it no longer meets the criterion
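A sketch of the hierarchical (blockwise) approach, continuing the hypothetical df above and adding an invented control variable age; the difference in R2 between blocks is the prediction-improvement statistic:
```python
# Sketch of hierarchical regression: Block 1 = control variable, Block 2 adds the IVs of interest.
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

df["age"] = [23, 31, 45, 29, 52, 38, 26, 41]  # hypothetical control variable

block1 = smf.ols("wellbeing ~ age", data=df).fit()
block2 = smf.ols("wellbeing ~ age + stress + support", data=df).fit()

print(block2.rsquared - block1.rsquared)  # R2 change (prediction improvement)
print(anova_lm(block1, block2))           # F test of the R2 change
```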
CLO 3 - What are the five key pieces of output for overall model and how do they relate to my research question?
Key output for overall model
R - correlation between all IVs and DV
R2 - overall variance explained in the DV by all IVs
Adjusted R2 - adjusts R2 for sample size and number of predictors (a larger number of predictors tends to inflate R2) - see the sketch after this list
F test - significance test for R2 and R2 change
R2 change - difference between steps (blocks) in explained variance for Hierarchical regression (prediction improvement questions)
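The adjusted R2 correction follows the standard shrinkage formula; a small sketch with invented values:
```python
# Adjusted R2 corrects R2 for sample size (n) and number of predictors (k).
def adjusted_r2(r2, n, k):
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(adjusted_r2(r2=0.30, n=100, k=5))  # ~0.26 - R2 shrinks once predictors are penalised
```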
CLO 3 - What are the five key pieces of output for individual predictors in model and how do they relate to my research question?
Key output for Individual predictors
b weight - unstandardised; the amount of change in the DV for every 1 unit change in the predictor
Beta weight - standardised b weight, so predictors can be compared with each other. Useful for questions about the relative importance of each predictor (see the sketch after this list)
r - zero order correlation between IV and DV
sr - used to determine sr2 - semi-partial correlation, unique variance explained
t test - significance test of b weight for individual predictors
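A sketch of how the beta weight and sr2 for one predictor can be reproduced by hand, continuing the hypothetical df and model from the standard-regression sketch above:
```python
# Individual-predictor statistics for the hypothetical predictor 'stress'.
import statsmodels.formula.api as smf

# Standardised beta: unstandardised b scaled by SD(IV) / SD(DV)
beta_stress = model.params["stress"] * df["stress"].std() / df["wellbeing"].std()

# Squared semi-partial correlation (sr2): drop in R2 when 'stress' is removed
reduced = smf.ols("wellbeing ~ support", data=df).fit()
sr2_stress = model.rsquared - reduced.rsquared

print(beta_stress, sr2_stress)
```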
CLO4 - What are the six key multiple regression assumptions?
1 - Normal distribution - the residuals are normally distributed - outliers not present
2 - Linearity - the relationship between the IV and the DV is linear
3 - Independence of Errors - residuals are not correlated (violation inflates SEs -> affects CIs and sig tests)
4 - Homoscedasticity - variability in residuals should be constant at each level of predictor
5 - Singularity and multicollinearity - IVs are not strongly or perfectly correlated
6 - Sample size - n is large enough to detect the effect
CLO4 How are the key MR assumptions and influencers checked?
1 - Normal distribution
1 - Normal distribution - the residuals are normally distributed - outliers not present
Overall model checks
- check residuals for outliers
- Standardised residuals outside of +/-3.29 are cause for concern
- Cook's distance > 1 suggests outliers may be present
- Leverage > 3(k + 1)/n (k = number of predictors) suggests outliers may be present
Individual Case and predictor check
- Standardised DFBeta outside +/-1 indicates substantial influence. Useful for seeing, case by case, whether the influence is on the intercept or on the b weight of a particular IV
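A sketch of how these residual and influence checks could be pulled out with statsmodels, reusing the hypothetical model fitted in the earlier sketch:
```python
# Residual / influence diagnostics for the hypothetical 'model' above.
influence = model.get_influence()

std_resid = influence.resid_studentized_internal  # concern if outside +/-3.29
cooks_d   = influence.cooks_distance[0]           # concern if > 1
leverage  = influence.hat_matrix_diag             # compare against 3(k + 1)/n
dfbetas   = influence.dfbetas                     # concern if outside +/-1

print(abs(std_resid).max(), cooks_d.max(), leverage.max(), abs(dfbetas).max())
```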
CLO4 How are the key MR assumptions and influencers checked?
2 - Linearity
2 - Linearity - the relationship between the IV and the DV is linear
Assess through scatterplots of residuals and predicted scores
CLO4 How are the key MR assumptions and influencers checked?
3 - Independence of Errors
3 - Independence of Errors - residuals are not correlated
Violation inflates SEs -> impacts CIs and sig tests
Check with the Durbin-Watson test
- tests the serial correlation between residuals for adjacent cases in the dataset
- no correlation is desirable
- range 0-4
- < 2: positive correlation -> increased Type I error (SEs underestimated)
- 2: no correlation
- > 2: negative correlation -> increased Type II error (SEs overestimated)
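A sketch of the Durbin-Watson check in statsmodels, reusing the hypothetical model fitted earlier:
```python
from statsmodels.stats.stattools import durbin_watson

# Roughly 2 = no serial correlation; < 2 positive, > 2 negative.
print(durbin_watson(model.resid))
```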
CLO4 How are the key MR assumptions and influencers checked?
4 - Homoscedasticity
4 - Homoscedasticity - variability in residuals should be constant at each level of predictor
check with a scatterplot of standardised residuals against standardised predicted values
- most standardised residuals should fall within -2 to 2
- no discernible pattern is desirable
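A sketch of that residual scatterplot, reusing the hypothetical model fitted earlier; the same plot doubles as the linearity check:
```python
import matplotlib.pyplot as plt

fitted  = model.fittedvalues
z_pred  = (fitted - fitted.mean()) / fitted.std()            # standardised predicted values
z_resid = model.get_influence().resid_studentized_internal   # standardised residuals

plt.scatter(z_pred, z_resid)
plt.axhline(0, linestyle="--")
plt.xlabel("Standardised predicted value")
plt.ylabel("Standardised residual")
plt.show()  # want a random cloud, mostly within -2 to 2, with no funnel or curve
```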
CLO4 How are the key MR assumptions and influencers checked?
6 - Sample size
Sample size needs to be large enough to detect an effect of the expected strength
Rules of thumb are based on moderate effect size
N>= 50 + 8k for overall model
N>=104 + k for individual predictors
these rules of thumb are likely to overestimate the n required if the expected effect is large and to underestimate it if the expected effect is small
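The two rules of thumb as a quick calculation (k is the number of predictors; the helper names are invented for illustration):
```python
def n_overall_model(k):          # rule of thumb for testing the overall model
    return 50 + 8 * k

def n_individual_predictors(k):  # rule of thumb for testing individual predictors
    return 104 + k

k = 5
print(n_overall_model(k), n_individual_predictors(k))  # 90, 109 - plan for the larger value
```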
CLO4 How are the key MR assumptions and influencers checked?
5 - Singularity and multicollinearity
5 - Singularity and multicollinearity - IVs are not strongly or perfectly correlated
Perfect or very strong correlations (around .8 or .9) suggest one of the variables is redundant
- inflates SEs -> impacts CIs and sig tests
- results become less stable
Check bivariate correlations for simple relationships
For more complex relationships in MR, check Tolerance and VIF (Variance Inflation Factor)
Tolerance - range 0-1
- higher scores desirable, indicating the variable makes a unique contribution
- < .1 is serious (only 10% of the IV's variance is unique)
- < .2 is problematic
VIF = 1/Tolerance
> 10 serious
> 5 problematic
sqrt(VIF) = the factor by which the SE is inflated
i.e. a VIF of 4 means the SE is double what it would be if the IV were uncorrelated with the other IVs (r = 0)
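A sketch of the Tolerance / VIF check with statsmodels, reusing the hypothetical df from the earlier sketches:
```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

X = sm.add_constant(df[["stress", "support"]])  # predictors plus intercept
for i, name in enumerate(X.columns):
    if name == "const":
        continue
    vif = variance_inflation_factor(X.values, i)
    print(name, vif, 1 / vif, np.sqrt(vif))  # VIF, Tolerance, SE inflation factor
```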