Multiple Regression Flashcards

1
Q

When is multiple regression used?

A

When an outcome variable is predicted from two or more (continuous) predictor variables

2
Q

What does B1X1 mean in the model?

A

The slope of variable 1 (b1) multiplied by the predictor X1: variable 1's contribution to the predicted outcome

3
Q

What does B1X1i mean in the model?

A

The slope b1 multiplied by person i's score on variable 1: variable 1's contribution to person i's predicted value

4
Q

What additional assumptions are in multiple regression?

A

No multicollinearity- the predictor variables must not be too highly linearly related to each other (i.e. they must not measure the same thing)
Linearity- each predictor must have a linear relationship with the outcome variable

5
Q

How can multicollinearity be assessed? (4)

A

Correlations
Matrix scatterplot
VIF: largest VIF < 10, average VIF not substantially greater than 1
Tolerance: > 0.2
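With just two predictors these diagnostics reduce to simple formulas: VIF = 1/(1 − r²), where r is the correlation between the predictors, and tolerance = 1 − r². A minimal pure-Python sketch (all numbers invented; these two predictors are deliberately near-duplicates, so the check fails):

```python
# Two hypothetical predictors that largely "measure the same thing"
x1 = [2.0, 4.0, 5.0, 7.0, 9.0, 10.0]
x2 = [1.0, 3.0, 4.0, 8.0, 9.0, 12.0]

def pearson(u, v):
    """Pearson correlation, computed from scratch."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    sxy = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    sxx = sum((a - mu) ** 2 for a in u)
    syy = sum((b - mv) ** 2 for b in v)
    return sxy / (sxx * syy) ** 0.5

r = pearson(x1, x2)      # linear relation between the two predictors (~0.985 here)
tolerance = 1 - r ** 2   # share of x1's variance NOT shared with x2
vif = 1 / tolerance      # variance inflation factor

# Rules of thumb from the card: VIF < 10 and tolerance > 0.2
multicollinearity_ok = vif < 10 and tolerance > 0.2   # False for these data
```

With more than two predictors, VIF for predictor j is computed the same way but with r² replaced by the R² from regressing predictor j on all the other predictors.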

6
Q

How can linearity be checked? (2)

A

Correlations
Matrix scatterplot

7
Q

How do you check whether a further variable adds anything to your model in SPSS?

A

Enter it in Block 2 of the linear regression dialog in SPSS (hierarchical entry), then compare the two models

8
Q

What save options do you add?

A

All the distances; you can also add unstandardised predicted values to see the predicted value for each participant

9
Q

How do you check for Homoscedasticity?

A

Plots > put *ZRESID (standardised residuals) on the Y axis and *ZPRED (standardised predicted values) on the X axis; use both standardised (Z) versions

10
Q

How do you know which model explains the variance more?

A

Check the Model Summary: the second model adds explained variance if its R² change is significant (Sig. F Change); the first model will almost always be significant

11
Q

In a 3D scatter plot, where is the explained variance and the unexplained variance?

A

Explained variance- the distance between the plane of the average score and the (green) regression plane; unexplained variance- the distance between the data points (red dots) and the regression plane

12
Q

How do we get the expected values for person i based on a model?

A

Fill in the slopes b1, b2, etc. in the regression equation, enter person i's scores on each predictor, add the intercept b0, and calculate the result for person i
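As a worked example with made-up numbers (the intercept, slopes, and scores below are all hypothetical):

```python
# Hypothetical model: Y_hat = b0 + b1*X1 + b2*X2
b0 = 10.0   # intercept
b1 = 0.5    # slope for predictor 1
b2 = 2.0    # slope for predictor 2

# Person i's (invented) scores on the two predictors
x1_i = 4.0
x2_i = 3.0

# Fill everything into the regression equation
y_hat_i = b0 + b1 * x1_i + b2 * x2_i   # 10 + 0.5*4 + 2*3 = 18
```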

13
Q

What does the R represent under model summary?

A

The correlation between the model (predicted values) and the actual values

14
Q

What do you get if you square this R?

A

R²: the proportion of variance in the outcome explained by the model
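This can be checked numerically: fit a regression, then the squared correlation between predicted and observed values equals R² computed as 1 − SSresidual/SStotal. A sketch with one predictor and invented data:

```python
def mean(v):
    return sum(v) / len(v)

# Invented data: one predictor x, outcome y
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

# Ordinary least-squares slope and intercept
mx, my = mean(x), mean(y)
b1 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
b0 = my - b1 * mx
pred = [b0 + b1 * a for a in x]   # the model's predicted values

def corr(u, v):
    """Pearson correlation between two lists."""
    mu, mv = mean(u), mean(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = (sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v)) ** 0.5
    return num / den

R = corr(pred, y)                 # the R from the Model Summary

# R squared as "explained variance": 1 - SSresidual / SStotal
ss_res = sum((b - p) ** 2 for b, p in zip(y, pred))
ss_tot = sum((b - my) ** 2 for b in y)
r_squared = 1 - ss_res / ss_tot   # agrees with R**2 up to floating-point error
```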

15
Q

What is meant by the assumption of normality

A

the residuals of the model are normally distributed, or the sampling distribution of the parameter is. This assumption doesn’t refer to the data themselves being normally distributed.

16
Q

What is meant by the assumption of additivity and linearity?

A

the relationship between the dependent variable and the independent variable(s) is linear. This is the most important assumption.

17
Q

How can you test normality in spss?

A

Probability-probability plot (P-P plot): Analyze > Descriptive Statistics > P-P Plots, then drag the dependent variable to Variables.
Shapiro-Wilk and Kolmogorov-Smirnov tests: Analyze > Descriptive Statistics > Explore, then under Plots select "Normality plots with tests" to produce the K-S and S-W tests.

18
Q

What does the Probability- probability plot do?

A

a plot of the cumulative probability of a variable against the cumulative probability of a particular distribution (normal distribution in this case). If the data are normally distributed you’ll get a straight diagonal line.

19
Q

When do you use the Shapiro-Wilk test and when do you use the Kolmogorov-Smirnov test?

A

The S-W test is used for samples smaller than 2000; otherwise the K-S test is used. The test should be non-significant: then the distribution of the sample is not significantly different from a normal distribution.

20
Q

How do you test for linearity and homoscedasticity?

A

Scatterplot- Both can be tested at the same time using a scatterplot plotting the values of the residuals (ZRESID) against the corresponding values of the outcome predicted by our model (ZPRED). There should be no systematic relationship between the errors and the predicted values (just a straight horizontal line).

21
Q

How can independence be tested?

A

Durbin-Watson test- a test that looks for serial correlations between errors. It varies between 0 and 4, with a value of 2 meaning the residuals are uncorrelated. Values less than 1 or more than 3 are cause for concern.
Access the test via Analyze > Regression > Linear > Statistics > Durbin-Watson.
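The statistic itself is simple: the sum of squared differences between successive residuals, divided by the sum of squared residuals. A sketch on made-up residuals (the alternating signs below are deliberate, to show negative serial correlation pushing the value above 2):

```python
# Hypothetical residuals, in case (e.g. time) order
residuals = [0.5, -0.3, 0.8, -0.6, 0.2, -0.4, 0.7, -0.5]

# Durbin-Watson: sum of squared successive differences / sum of squares
num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
den = sum(e ** 2 for e in residuals)
dw = num / den

# Interpretation from the card: ~2 = uncorrelated; < 1 or > 3 = cause for concern
concerning = dw < 1 or dw > 3   # True here: the residuals alternate in sign
```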

22
Q

How can you reduce bias?

A

TWAT
Trimming- deleting a quantity of data from the extremes. It could be deleting the data from the person who contributed the outlier.
Winsorizing- substituting outliers with the highest value that isn't an outlier.
Applying robust methods- A commonly used robust method is bootstrapping. This can be done using SPSS.
Transforming- applying a mathematical function to all scores. There are various transformations that you basically try out until one helps.
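Winsorizing, for example, can be sketched in a few lines. The scores are invented, and the "more than 2 SDs above the mean" outlier rule is just one arbitrary choice for illustration:

```python
from statistics import mean, stdev

# Made-up scores with one clear outlier
scores = [12, 14, 15, 15, 16, 17, 18, 45]

m, s = mean(scores), stdev(scores)
cutoff = m + 2 * s                          # outlier rule: > 2 SDs above the mean
non_outliers = [x for x in scores if x <= cutoff]
ceiling = max(non_outliers)                 # highest value that isn't an outlier

# Winsorizing: replace each outlier with that ceiling value
winsorized = [x if x <= cutoff else ceiling for x in scores]
```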

23
Q

When should trimming be done?

A

this should only be done if there is a good reason to believe this case is not from the population you intended to sample.

24
Q

What rules may trimming abide by? (2)

A

(1) a percentage-based rule (e.g. deleting the 10% highest and lowest scores), or;
(2) a standard-deviation-based rule: calculating the mean and standard deviation of a set of scores, and then removing values that are a certain number of standard deviations greater than the mean.
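Both rules can be sketched directly. The scores are invented; the 10% figure matches the card's example, and the 2-SD cutoff is an arbitrary choice for illustration:

```python
from statistics import mean, stdev

# Invented scores, sorted for convenience
scores = sorted([3, 5, 6, 6, 7, 7, 8, 8, 9, 30])

# (1) Percentage-based rule: drop the highest and lowest 10% of scores
k = int(len(scores) * 0.10)                  # number to drop at each end
trimmed_pct = scores[k:len(scores) - k]

# (2) SD-based rule: drop values more than 2 SDs greater than the mean
m, s = mean(scores), stdev(scores)
trimmed_sd = [x for x in scores if x <= m + 2 * s]
```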