Week 4 Flashcards

1
Q

Regression

A

Extends correlation to examine whether we can estimate the values of an outcome variable on the basis of one or more predictor variables

2
Q

Multiple regression

A

Uses two or more predictor variables to estimate the outcome variable (see the sketch below)
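A minimal sketch in Python with statsmodels (not necessarily the package used in the module; data and variable names are hypothetical):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 100
anxiety = rng.normal(size=n)    # hypothetical predictor 1
revision = rng.normal(size=n)   # hypothetical predictor 2
exam_score = 50 - 2 * anxiety + 3 * revision + rng.normal(size=n)  # outcome

X = sm.add_constant(np.column_stack([anxiety, revision]))  # intercept + predictors
model = sm.OLS(exam_score, X).fit()
print(model.summary())  # betas, R-squared, F-test
```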

3
Q

Factorial ANOVA

A

Focuses on differences in scores on the dependent variable according to two or more independent variables (see the sketch below)
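A minimal sketch of a 2x2 factorial ANOVA using statsmodels' formula API (the data and factor names are hypothetical):

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical scores under two independent variables (2 x 2 design)
df = pd.DataFrame({
    "score":   [5, 6, 7, 8, 4, 5, 9, 10, 6, 7, 8, 9],
    "gender":  ["m", "m", "f", "f"] * 3,
    "therapy": ["cbt", "none"] * 6,
})

# Main effects of both factors plus their interaction
model = smf.ols("score ~ C(gender) * C(therapy)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))  # F-test for each effect
```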

4
Q

Types of multiple regression

A

Forced entry
Hierarchical multiple regression
Stepwise multiple regression

5
Q

Forced entry

A

Predictors are chosen based on previous research and theory.
The researcher does not specify a particular order for the variables to be entered.
All variables are forced into the model at the same time.

6
Q

Hierarchical regression

A

Predictors are chosen based on previous research.
The researcher decides the order in which the predictors are entered into the model.
Known predictors (based on previous research) are entered first, then new predictors (new/more exploratory hypotheses) are added (see the sketch below).
New predictors can be entered:
All at once (as in the enter method).
In a hierarchical manner.
In a stepwise manner.
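A minimal sketch of the two-step idea, assuming hypothetical data: fit the known predictor first, add the new one, and look at the change in R²:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 100
known = rng.normal(size=n)                       # predictor from previous research
new = rng.normal(size=n)                         # new, exploratory predictor
y = 2 * known + 0.5 * new + rng.normal(size=n)   # hypothetical outcome

step1 = sm.OLS(y, sm.add_constant(known)).fit()
step2 = sm.OLS(y, sm.add_constant(np.column_stack([known, new]))).fit()

print(step1.rsquared, step2.rsquared)   # R-squared at each step
print(step2.compare_f_test(step1))      # F-test for the R-squared change
```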

7
Q

Stepwise methods

A

The most controversial method (for psychologists), as the order in which the variables are entered is based on maths rather than previous research/theory.
There are both forward and backward methods.
In forward methods, the computer program selects the predictor that best predicts the outcome and enters it into the model first (sketched below).
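A rough sketch of one forward step, assuming hypothetical data: the program tries each remaining candidate and keeps the one giving the highest R², a purely mathematical criterion:

```python
import numpy as np
import statsmodels.api as sm

def forward_step(y, candidates, selected):
    """Pick the candidate predictor that most improves R-squared when added."""
    best_name, best_r2 = None, -np.inf
    for name, col in candidates.items():
        X = sm.add_constant(np.column_stack(selected + [col]))
        r2 = sm.OLS(y, X).fit().rsquared
        if r2 > best_r2:
            best_name, best_r2 = name, r2
    return best_name, best_r2

rng = np.random.default_rng(1)
n = 80
cands = {"a": rng.normal(size=n), "b": rng.normal(size=n)}
y = 3 * cands["a"] + rng.normal(size=n)   # only "a" truly predicts y
print(forward_step(y, cands, []))         # "a" is entered first
```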

8
Q

Parts of a regression

A

The regression line (the model): the line of best fit.
Identify how well the model (the regression line) represents the data:
  Is it significant? Assess this using an ANOVA (the F-test).
  How much variance is accounted for by the model (effect size)? The R² value.
Examine the relationship between predictor and outcome:
  The intercept (the value of Y when X = 0).
  Betas (standardised and unstandardised): how does Y change in relation to a change in X?
(See the sketch below for where each part appears in the output.)
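A minimal sketch, assuming hypothetical data, showing where each part of the regression lives in statsmodels output:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
x = rng.normal(size=100)
y = 1.5 + 0.8 * x + rng.normal(size=100)   # hypothetical outcome

fit = sm.OLS(y, sm.add_constant(x)).fit()

print(fit.fvalue, fit.f_pvalue)  # the ANOVA: is the model significant?
print(fit.rsquared)              # R²: variance accounted for (effect size)
print(fit.params[0])             # intercept: value of Y when X = 0
print(fit.params[1])             # unstandardised beta: change in Y per unit change in X
```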

9
Q

Sample size

A

The old rule: ten participants for every predictor variable.
This is only a rule of thumb, with no empirical evidence to support it.
More is better.
The required sample size depends on the size of effect you want to detect.
Field (2010) suggests the following equations to identify an appropriate sample size (worked example below):
Equation 1: 50 + 8k, where k = the number of predictor variables
Equation 2: 104 + k
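A worked example, assuming k = 5 predictors (in Field's presentation, the first rule is usually tied to testing the overall model and the second to testing individual predictors):

```python
k = 5                # number of predictor variables
print(50 + 8 * k)    # Equation 1: 90 participants
print(104 + k)       # Equation 2: 109 participants
```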

10
Q

Multicollinearity

A

Strong correlation between predictor variables.
Perfect collinearity: a correlation of 1 between predictors.
It becomes difficult to interpret the results:
  Difficult to identify the predictive value of each individual predictor variable.
  Untrustworthy b's: the beta values indicate the change in the outcome for every unit change in the predictor; if the predictors are correlated, the betas will be unreliable.
  Importance of predictors: similarly, we can't identify the individual importance of each predictor.
  Limits the size of R²: difficult to identify the proportion of variance accounted for by a particular variable.
Threatens the validity of the model produced.

11
Q

Identifying Multicollinearity

A

Collinearity statistics (computed in the sketch below):
VIF (variance inflation factor)
  If the average VIF is substantially greater than 1, the regression may be biased.
  If the largest VIF is greater than 10, there is definitely a problem.
Tolerance (the reciprocal of VIF, i.e. 1/VIF)
  Below 0.2: a potential problem.
  Below 0.1: a serious problem.
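A minimal sketch with statsmodels, assuming hypothetical data in which x2 is built to be nearly collinear with x1:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(3)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)   # nearly collinear with x1

X = sm.add_constant(np.column_stack([x1, x2]))
for i in (1, 2):                          # skip the constant column
    vif = variance_inflation_factor(X, i)
    print(f"x{i}: VIF = {vif:.1f}, tolerance = {1 / vif:.3f}")
```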

12
Q

To understand homoscedasticity, you need to understand residuals…

A

When we draw a regression line, there will be differences between the data points and the line.
The distances between the line and the individual data points are the RESIDUALS (computed in the sketch below).
Where a data point lies above the line, the line underestimates the value of Y.
Where a data point lies below the line, the line overestimates the value of Y.
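A minimal sketch, assuming hypothetical data: the residuals are simply the observed Y values minus the values predicted by the line:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
x = rng.normal(size=50)
y = 2 + 0.5 * x + rng.normal(size=50)   # hypothetical outcome

fit = sm.OLS(y, sm.add_constant(x)).fit()
residuals = y - fit.fittedvalues   # positive -> line underestimates Y; negative -> overestimates
print(residuals[:5])               # statsmodels also exposes these as fit.resid
```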

13
Q

Homoscedasticity

A

At each level of the predictor variable, the variance of the residuals should be constant.
It is not the actual residual values that matter, but the variance in the residual values.
This is what is meant by homoscedasticity.
If the variance of the residuals differs across levels of the predictor, we have heteroscedasticity (a common visual check is sketched below).
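One common visual check (a sketch, not the only approach), assuming hypothetical data in which the residual variance grows with the predictor: plot residuals against fitted values and look for a fan or funnel shape:

```python
import matplotlib.pyplot as plt
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)
x = rng.normal(size=100)
y = 1 + x + rng.normal(size=100) * (1 + np.abs(x))   # variance grows with x

fit = sm.OLS(y, sm.add_constant(x)).fit()
plt.scatter(fit.fittedvalues, fit.resid)   # fan shape -> heteroscedasticity
plt.axhline(0, linestyle="--")
plt.xlabel("Fitted values")
plt.ylabel("Residuals")
plt.show()
```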

14
Q

Independent Errors

A

For any two observations (data points), the residuals should not correlate; they should be independent.
To identify whether this is an issue in your analysis, use the Durbin-Watson test.
This tests for correlation across the error terms.

Durbin-Watson test (see the sketch below)
Tests whether residuals next to each other are correlated.
The test statistic varies between 0 and 4.
A value of 2 means the residuals are uncorrelated.
A value lower than 2 indicates a positive correlation between adjacent residuals.
A value greater than 2 indicates a negative correlation.
Values greater than 3 or less than 1 indicate a definite problem.
Values close to 2 suggest there is no issue.
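A minimal sketch, assuming hypothetical data, of running the Durbin-Watson test on a fitted model's residuals with statsmodels:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(11)
x = rng.normal(size=100)
y = 1 + 2 * x + rng.normal(size=100)   # hypothetical outcome

fit = sm.OLS(y, sm.add_constant(x)).fit()
print(durbin_watson(fit.resid))   # close to 2 -> residuals look uncorrelated
```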

15
Q

Normally distributed errors

A

Often confused with the assumption that the data for the predictors are normally distributed.
That is not what this assumption means.

Technically, it means that the residuals in the regression model are random and normally distributed, with a mean of 0.
In other words, there is an even chance of data points lying above and below the best-fit line (see the sketch below).
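A minimal sketch, assuming hypothetical data: check the residuals (not the predictors) for a mean of 0 and normality, here with a Shapiro-Wilk test:

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(13)
x = rng.normal(size=100)
y = 3 + x + rng.normal(size=100)   # hypothetical outcome

fit = sm.OLS(y, sm.add_constant(x)).fit()
print(fit.resid.mean())          # should be close to 0
print(stats.shapiro(fit.resid))  # non-significant p -> no evidence of non-normality
```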
