10. Correlation and Regression Flashcards

1
Q

what is the bivariate case?

A

the case where there is one predictor (IV) and one criterion variable (DV)

2
Q

what is the abbreviation for Pearson’s correlation coefficient?

A

r

3
Q

what is the range of Pearson’s correlation coefficient?

A

from -1 to +1

4
Q

what do negative and positive values of Pearson’s correlation coefficient indicate?

A

the sign indicates the direction, and the size of the value the strength, of a linear relationship between an X and Y variable

5
Q

what does a negative Pearson’s correlation indicate?

A

indicates a relationship where increases in one variable are associated with decreases in the other and vice versa

6
Q

what does a positive Pearson’s correlation indicate?

A

indicates a relationship where increases in one variable are associated with increases in another and decreases with decreases

7
Q

what is a Pearson’s correlation coefficient designed to test?

A

for a linear relationship

8
Q

what is Pearson’s correlation coefficient based on?

A

the ability to invisibly draw a straight line through the data points

9
Q

what kind of relationship must Pearson’s correlation coefficient be used for?

A

a linear relationship

10
Q

what is the formula for Pearson’s correlation coefficient (r)?

A

r = (ΣZ_XZ_Y) / N

11
Q

what is the procedure for calculating r?

A

for each score, calculate its corresponding Z score (Z_x and Z_y), ensuring that you use the correct mean and SD

multiply each Z_x by its Z_y to get the cross product, ensuring that the sign (+ or -) is correct

add all the cross products (ΣZ_XZ_Y)

divide by the number of pairs of scores (N)
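
The procedure above can be sketched in Python. This is a minimal illustration, not a definitive implementation: the function name `pearson_r` and the `hours` and `marks` scores are made up for the example, and the SDs are computed by dividing by N so that they match the "divide by the number of pairs" step.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r as the mean cross product of Z scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # SDs computed with N in the denominator (an assumption of this sketch).
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    # Step 1: convert every score to a Z score using its own mean and SD.
    z_x = [(x - mean_x) / sd_x for x in xs]
    z_y = [(y - mean_y) / sd_y for y in ys]
    # Step 2: multiply each pair of Z scores to get the cross products.
    cross_products = [zx * zy for zx, zy in zip(z_x, z_y)]
    # Steps 3 and 4: add the cross products and divide by the number of pairs.
    return sum(cross_products) / n


hours = [1, 2, 3, 4, 5]  # hypothetical X scores
marks = [2, 4, 5, 4, 5]  # hypothetical Y scores
r = pearson_r(hours, marks)
print(round(r, 3))
```

With these made-up scores every non-zero cross product is positive, so r comes out positive (about 0.77).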

12
Q

what do Z scores tell us

A

whether a score (X or Y) is above (+Z) or below (-Z) its mean (M), and by how many standard deviations

13
Q

what do we get if we multiply Z values on X and Y for each person?

A

cross products

14
Q

what does it mean when we get mostly positive cross products?

A

positive correlation

15
Q

what does it mean when we get mostly negative cross products?

A

negative correlation

16
Q

what does it mean when we get an equal number of + and - cross products?

A

no correlation

17
Q

what are the characteristics of a positive correlation?

A
  • Above mean on X (+ZX), above mean on Y (+ZY)
  • Below mean on X (-ZX), below mean on Y (-ZY)
  • ZXZY values are mostly positive (multiply two negatives = positive)
18
Q

what are the characteristics of a negative correlation?

A
  • Above mean on X (+ZX), below mean on Y (-ZY)
  • Below mean on X (-ZX), above mean on Y (+ZY)
  • ZXZY values are mostly negative
19
Q

what are the characteristics of no correlation?

A
  • Position on X (+ZX or -ZX) not linked to position on Y (+ZY or -ZY)
  • ZXZY values are equally positive and negative
20
Q

what does it mean if r is closer to +1

A

there is a strong positive relationship

21
Q

what does it mean if r is close to -1

A

strong negative relationship

22
Q

what does it mean if r is close to 0

A

weak or no linear relationship

23
Q

what can correlation tell us?

A

the direction and strength of our relationship

24
Q

what can regression help us with?

A

helps us actually plot the line that the correlation metaphorically draws through the data

25
Q

what can the line that regression assists in drawing help with?

A

helps us to predict scores on our DV

26
Q

what is correlation coefficient a measure of

A

how close the data points are to the line

27
Q

why is it called the line of best fit?

A

because it is drawn in the position that minimises the overall (squared) distance of all the data points from the line

28
Q

what is the line of best fit influenced by?

A

every data point in the scatterplot

29
Q

where does the line of best fit carefully place itself?

A

in the position that is, overall, as close as it can be to all the data points

30
Q

what are we calculating with regression

A

the ordinary least squares (OLS) regression line

31
Q

what are the two pieces of information that we need to calculate the line of best fit based on our x and Y scores?

A

the slope and the Y-axis intercept

32
Q

what is the slope?

A

how steep it is

33
Q

what is the Y-axis intercept?

A

where the line starts from on the y axis
where the line of best fit intercepts the y axis
it is the predicted value of Y when X is zero

34
Q

what is a slope?

A

an indication of the gradient of the line OR its steepness

35
Q

what is the slope also known as?

A

the regression coefficient

36
Q

what is the slope in mathematical terms

A

it is how many units the Y variable (DV) changes for every one-unit increase in the X variable (IV)

37
Q

what is the formula for the slope?

A

b = (r)(S_y / S_x)

where S_y and S_x are the standard deviations of Y and X

38
Q

what does b stand for?

A

the gradient

39
Q

what does a larger b figure mean?

A

a stronger gradient between X and Y - the steeper the line

40
Q

what is the gradient another term for?

A

steepness or the slope

41
Q

instead of Z scores to represent movement on our variables, what are we using?

A

deviation scores

42
Q

what is the equation for the sum of squared deviations for the X variable?

A

SS_x =

∑ (X - M_x)^2

43
Q

what don’t we need to do for the exam?

A

you don’t need to know how to calculate the linear regression equation from scratch, but you do need to be able to interpret regression on an SPSS output

44
Q

what is the letter that represents intercept?

A

a

45
Q

how do we find the Y intercept?

A

multiply our slope (b) by the mean of our IV (X) and then subtract that from the mean of our DV (Y)

46
Q

what is the equation for y intercept?

A

a =

M_y - (b)(M_x)
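
As a rough sketch, the slope and intercept could be computed from raw scores as follows. The slope is written here as the sum of cross products of deviation scores divided by SS_x, which is one standard way to express it and an assumption about the form these cards intend. The function name and the scores are hypothetical.

```python
def slope_and_intercept(xs, ys):
    """Return (b, a) for the least-squares line through paired scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # SS_x: sum of squared deviations of X from its mean.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    # Sum of cross products of the X and Y deviation scores.
    sp_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sp_xy / ss_x           # slope (assumed deviation-score form)
    a = mean_y - b * mean_x    # intercept: a = M_y - (b)(M_x)
    return b, a


b, a = slope_and_intercept([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(b, 2), round(a, 2))
```

For these made-up scores the slope is about 0.6 and the intercept about 2.2.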

47
Q

what is the regression equation?

A

Ŷ = a + (b)(X)

48
Q

what does the regression formula allow us to do

A

allows us to predict what a person’s score on the Y variable is likely to be if you know their score on the X variable and the relationship between the X and Y variables
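
A minimal sketch of using the regression equation to make a prediction. The values of `a` and `b` below are hypothetical and would normally come from a fitted line (for example, an SPSS output).

```python
def predict(x, a, b):
    """Predicted Y score for a given X score: a + (b)(X)."""
    return a + b * x


a, b = 2.2, 0.6  # hypothetical intercept and slope
y_hat = predict(4, a, b)
print(round(y_hat, 2))
```

Here a person scoring 4 on X would be predicted to score about 4.6 on Y.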

49
Q

what is Ŷ?

A

the predicted score on the Y variable

50
Q

what does the hat on top of the Y indicate?

A

it is “predicted” as opposed to an actual score

51
Q

what is a?

A

the Y-intercept, or the point at which the regression line crosses the Y axis (where X equals 0)

52
Q

what is b?

A

the slope: the gradient or steepness of the regression line

53
Q

what is X?

A

the person’s score on the X variable

54
Q

what is a residual score?

A

the difference between the actual score and the predicted score

55
Q

what is the equation for residual?

A

R_y = Y - Ŷ
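
To illustrate, residuals can be computed by subtracting each predicted score from the actual score. The scores below are hypothetical, and `a` and `b` are the least-squares intercept and slope for exactly these scores.

```python
xs = [1, 2, 3, 4, 5]  # hypothetical X scores
ys = [2, 4, 5, 4, 5]  # hypothetical Y scores
a, b = 2.2, 0.6       # least-squares intercept and slope for these scores

# Predicted score for each person from the regression equation.
predicted = [a + b * x for x in xs]
# Residual = actual score minus predicted score.
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]

print([round(res, 2) for res in residuals])
print(round(sum(residuals), 10))
```

Some residuals are positive and some negative, and around a least-squares line they cancel out, which is why the raw residuals sum to (approximately, in floating point) zero.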

56
Q

what does a residual score indicate

A

how far away from the predicted score the actual or raw score is

57
Q

what is the standard error of estimate (SEE)

A

a measure of the typical size of the residual scores, i.e. roughly the average distance of our data points from the regression line
it also represents the average amount by which we would be in error if we used the regression line to predict a person’s score

58
Q

what would a high value of the SEE indicate

A

that on average the prediction is inaccurate

59
Q

low SEE means…

A

good for making predictions

60
Q

what unit is the SEE always expressed in?

A

the unit that the DV is measured in

61
Q

what do we get if we sum all the residuals?

A

0

62
Q

what is the formula for SEE?

A

SEE = √( ∑ (Y - Ŷ)^2 / N )

63
Q

what is the SEE an underestimate of?

A

the true population SEE, so a correction has to be made

64
Q

what is the correction made when calculating the population estimate of the SEE?

A

instead of N you use N-2
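
A small sketch of both versions of the SEE, assuming it is the square root of the summed squared residuals divided by N (sample) or N-2 (population estimate). The function name, the `correction` parameter, and the scores are hypothetical.

```python
import math


def standard_error_of_estimate(ys, predicted, correction=0):
    """Square root of SS_error divided by (N - correction)."""
    ss_error = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, predicted))
    return math.sqrt(ss_error / (len(ys) - correction))


xs = [1, 2, 3, 4, 5]  # hypothetical X scores
ys = [2, 4, 5, 4, 5]  # hypothetical Y scores
a, b = 2.2, 0.6       # least-squares intercept and slope for these scores
predicted = [a + b * x for x in xs]

see_sample = standard_error_of_estimate(ys, predicted)                   # divides by N
see_population = standard_error_of_estimate(ys, predicted, correction=2)  # divides by N-2
print(round(see_sample, 3), round(see_population, 3))
```

Dividing by N-2 gives the larger value (about 0.89 against about 0.69 here), which reflects the point that the uncorrected SEE underestimates the population value.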

65
Q

what is the equation for the population SEE?

A

SEE = √( ∑ (Y - Ŷ)^2 / (N - 2) )

66
Q

what are the two possible partitions of the variability in Y scores?

A

variability due to error (SS_error)

variability due to regression (SS_reg)

67
Q

what will be the total variability (SS_total)?

A

SS_Y

68
Q

what is SS_y

A

the sum of squared deviations between each Y score and the mean of Y

69
Q

what will be the SS_error?

A

the sum of squared deviations between each Y score and the predicted Y score

70
Q

what is the regression variability (SS_reg)

A

the sum of squared deviations between each predicted Y value and the mean of Y

71
Q

what df do we use to calculate the MS for these SSs?

A

1 for SS_reg and N-2 for SS_error

72
Q

what is the F-statistic for regression?

A

MS_reg / MS_error

73
Q

what does the F-statistic for regression ask?

A

is our model significantly predicting?

74
Q

what is r^2 within an ANOVA context equivalent to?

A

SS_regression / SS_total
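
The partition of variability, the F-statistic and r^2 can be sketched together. This assumes df of 1 for regression and N-2 for error, the usual values with a single predictor. The scores are hypothetical, and `a` and `b` are the least-squares values for them.

```python
xs = [1, 2, 3, 4, 5]  # hypothetical X scores
ys = [2, 4, 5, 4, 5]  # hypothetical Y scores
a, b = 2.2, 0.6       # least-squares intercept and slope for these scores

n = len(ys)
mean_y = sum(ys) / n
predicted = [a + b * x for x in xs]

# SS_total (SS_Y): each Y score against the mean of Y.
ss_total = sum((y - mean_y) ** 2 for y in ys)
# SS_error: each Y score against its predicted score.
ss_error = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, predicted))
# SS_reg: each predicted score against the mean of Y.
ss_reg = sum((y_hat - mean_y) ** 2 for y_hat in predicted)

ms_reg = ss_reg / 1            # df_reg = 1 (assumed: one predictor)
ms_error = ss_error / (n - 2)  # df_error = N-2 (assumed)
f_stat = ms_reg / ms_error
r_squared = ss_reg / ss_total

print(round(ss_reg + ss_error, 3), round(ss_total, 3))
print(round(f_stat, 3), round(r_squared, 3))
```

SS_reg and SS_error add up to SS_total, and r^2 (about 0.6 here) is the share of the total variability attributed to the regression. The F value would still need to be compared against F critical for the same df.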

75
Q

what is an r^2

A

the proportion of variability in Y that is explained by the regression; similar to eta squared in ANOVA

76
Q

when F observed is greater than F critical what does this indicate?

A

that the regression is significant

77
Q

what are the assumptions for correlation and regression?

A

normality and linearity

78
Q

what is the assumption of normality?

A

regression and correlation assume that both the X and Y variables have relatively normal distributions

79
Q

what is the assumption of linearity

A

Pearson’s correlation and linear regression both assume that the relationship they are testing is linear

80
Q

what is the potential problem for correlation and regression

A

impact of outliers / extreme scores

81
Q

what is an outlier

A

data point that lies away from the rest of the pack of data
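
A brief sketch of how a single extreme score can change a correlation. The scores are made up, and `pearson_r` is a hypothetical helper that computes r from deviation scores.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from deviation scores (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp_xy / math.sqrt(ss_x * ss_y)


xs = [1, 2, 3, 4, 5]  # hypothetical X scores
ys = [2, 4, 5, 4, 5]  # hypothetical Y scores
r_without = pearson_r(xs, ys)

# Add one person who scores very high on X but very low on Y.
r_with = pearson_r(xs + [10], ys + [0])

print(round(r_without, 2), round(r_with, 2))
```

In this made-up case one outlier is enough to turn a positive correlation into a negative one, which is why scatterplots are worth checking before interpreting r.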

82
Q

what does correlation not imply?

A

causality