Correlations and linear regression Flashcards

1
Q

Define

Correlations

A

A statistical technique for measuring the extent to which two variables are associated. It measures the pattern of responses across variables.

2
Q

Define

Linear regression

A

a linear model to predict the value of one variable from another

3
Q

Define

One-tailed

A

a statistical test in which the critical area of a distribution is one-sided so that it is either greater than or less than a certain value, but not both

4
Q

Define

Two-tailed

A

a method in which the critical area of a distribution is two-sided and tests whether a sample is greater than or less than a certain range of values

5
Q

Define

Variance

A

tells us how much scores deviate from the mean

6
Q

Define

Covariance

A

similar to the variance, but tells us how much scores on two variables deviate from their respective means together

7
Q

Define

Correlation coefficient

A

The standardised version of covariance

8
Q

Define

Pearson correlation coefficient

A

a parametric statistic that measures linear correlation between two variables X and Y. It has a value between +1 and −1.

9
Q

Define

Correlation matrix

A

a table showing correlation coefficients between variables

10
Q

Define

Post hoc

A

statistical analyses that were specified after the data were seen

11
Q

Define

Spearman correlation

A

a nonparametric measure of rank correlation. It assesses how well the relationship between two variables can be described using a monotonic function

12
Q

Define

Monotonic

A

relationships that are consistently one-directional

13
Q

Define

Coefficient of determination (r^2)

A

the proportion of the variance in the dependent variable that is predictable from the independent variable

14
Q

Define

Shared variance

A

the extent to which two variables vary together

15
Q

Define

Partial correlation

A

Measures the relationship between two variables, controlling for the effect that a third variable has on them both

16
Q

Define

Semi-partial (part) correlation

A

Measures the relationship between two variables, controlling for the effect that a third variable has on one of the others

17
Q

Define

Zero order correlation

A

the correlation between two variables when you do not control for anything

18
Q

Define

Directionality problem

A

it is not possible to determine which variable is the cause, and which is the effect

19
Q

Define

Residual

A

The difference between the observed value of the dependent variable (y) and the predicted value (ŷ)

20
Q

Definition

A statistical technique for measuring the extent to which two variables are associated. It measures the pattern of responses across variables.

A

Correlations

21
Q

Definition

a linear model to predict the value of one variable from another

A

Linear regression

22
Q

Definition

a statistical test in which the critical area of a distribution is one-sided so that it is either greater than or less than a certain value, but not both

A

One-tailed

23
Q

Definition

a method in which the critical area of a distribution is two-sided and tests whether a sample is greater than or less than a certain range of values

A

Two-tailed

24
Q

Definition

tells us how much scores deviate from the mean

A

Variance

25
Q

Definition

similar to the variance, but tells us how much scores on two variables deviate from their respective means together

A

Covariance

26
Q

Definition

The standardised version of covariance

A

Correlation coefficient

27
Q

Definition

a parametric statistic that measures linear correlation between two variables X and Y. It has a value between +1 and −1.

A

Pearson correlation coefficient

28
Q

Definition

a table showing correlation coefficients between variables

A

Correlation matrix

29
Q

Definition

statistical analyses that were specified after the data were seen

A

Post hoc

30
Q

Definition

a nonparametric measure of rank correlation. It assesses how well the relationship between two variables can be described using a monotonic function

A

Spearman correlation

31
Q

Definition

relationships that are consistently one-directional

A

Monotonic

32
Q

Definition

the proportion of the variance in the dependent variable that is predictable from the independent variable

A

Coefficient of determination (r^2)

33
Q

Definition

the extent to which two variables vary together

A

Shared variance

34
Q

Definition

Measures the relationship between two variables, controlling for the effect that a third variable has on them both

A

Partial correlation

35
Q

Definition

Measures the relationship between two variables, controlling for the effect that a third variable has on one of the others

A

Semi-partial (part) correlation

36
Q

Definition

the correlation between two variables when you do not control for anything

A

Zero order correlation

37
Q

Definition

it is not possible to determine which variable is the cause, and which is the effect

A

Directionality problem

38
Q

Definition

The difference between the observed value of the dependent variable (y) and the predicted value (ŷ)

A

Residual

39
Q

____________exists when changes in one variable tend to be accompanied by persistent and predictable changes in the other variable

A

Association/Relationship exists when changes in one variable tend to be accompanied by persistent and predictable changes in the other variable

40
Q

What is the range of correlation values?

A

-1 to +1

41
Q

What does a correlation of 0 indicate?

A

No linear association (this is what the null hypothesis states)

42
Q

What does the significance of a correlation depend on?

A

Sample size (n)

Alpha value (one vs two-tailed)
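
Sample size enters through the t statistic for a correlation, t = r * sqrt(n - 2) / sqrt(1 - r^2), with df = n - 2. A minimal Python sketch of that idea follows; the sample sizes are made up, and the two-tailed critical values are assumed to be copied from a standard t table at alpha = .05.

```python
from math import sqrt


def t_for_correlation(r, n):
    """t statistic for testing H0: no correlation, with df = n - 2."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


r = 0.411  # the same size of correlation in both samples (illustrative)

t_small = t_for_correlation(r, n=10)   # df = 8
t_large = t_for_correlation(r, n=100)  # df = 98

# Assumed two-tailed critical values at alpha = .05 from a standard t table.
CRITICAL_T_DF8 = 2.306
CRITICAL_T_DF98 = 1.984

# The same r falls short of the critical value with n = 10 but exceeds it with n = 100.
print(abs(t_small) > CRITICAL_T_DF8)
print(abs(t_large) > CRITICAL_T_DF98)
```

A one-tailed test at the same alpha would use a smaller critical value, which is why the choice of tails also matters.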

43
Q

Size of correlation (ignoring direction), must be ___ critical value given for that df

A

Size of correlation (ignoring direction), must be greater than (>) the critical value given for that df

44
Q

The _______tells us how much scores deviate from the mean

A

The variance tells us how much scores deviate from the mean

45
Q

The ___________ of the two variables is similar to the variance, but tells us how much scores on two variables deviate from their respective means together

A

The covariance of the two variables is similar to the variance, but tells us how much scores on two variables deviate from their respective means together

46
Q

What is the variance equation?

A
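
A common textbook form of the sample variance is sketched here. The notation is assumed rather than taken from the card: x_i is each score, x-bar is the mean, and N is the number of scores.

```latex
s^2 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}
```
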
47
Q

What is the covariance equation?

A
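
A common textbook form of the sample covariance is sketched here. The notation is assumed rather than taken from the card: x_i and y_i are paired scores, x-bar and y-bar are the two means, and N is the number of pairs.

```latex
\mathrm{cov}_{xy} = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{N - 1}
```
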
48
Q

What are some of the issues with covariance?

A

It depends upon the units of measurement

e.g., The covariance of two variables measured in miles might be 4.25, but if the same scores are converted to kilometres, the covariance is 11
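
The miles and kilometres figures can be checked with a short Python sketch. The distances below are made up for illustration; the point is only that converting both variables multiplies the covariance by the squared conversion factor.

```python
def sample_covariance(xs, ys):
    """Sample covariance with the N - 1 denominator."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


KM_PER_MILE = 1.609344

# Made-up paired distances in miles (illustrative only).
miles_x = [1.0, 2.0, 4.0, 5.0, 8.0]
miles_y = [2.0, 1.0, 5.0, 6.0, 9.0]

km_x = [m * KM_PER_MILE for m in miles_x]
km_y = [m * KM_PER_MILE for m in miles_y]

cov_miles = sample_covariance(miles_x, miles_y)
cov_km = sample_covariance(km_x, km_y)

# The ratio equals KM_PER_MILE ** 2 (about 2.59), so a covariance of 4.25
# in miles becomes roughly 11 in kilometres.
print(cov_km / cov_miles)
print(4.25 * KM_PER_MILE ** 2)
```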

49
Q

How do you get around the issue of units in covariance?

A

Standardise it

(divide the covariance by the product of the SDs of both variables)
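
As a rough Python sketch of standardising: dividing the covariance by the product of the two standard deviations gives a value that no longer depends on the units. The revision times and exam scores below are made up, and the function names are illustrative.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def sample_sd(values):
    """Sample standard deviation with the N - 1 denominator."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def sample_covariance(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def correlation(xs, ys):
    """Covariance divided by the product of the two SDs."""
    return sample_covariance(xs, ys) / (sample_sd(xs) * sample_sd(ys))


# Made-up data: revision time in hours and exam scores.
hours = [2.0, 4.0, 5.0, 7.0, 9.0, 12.0]
scores = [50.0, 47.0, 58.0, 52.0, 66.0, 60.0]
minutes = [h * 60 for h in hours]  # same data, different unit

r_hours = correlation(hours, scores)
r_minutes = correlation(minutes, scores)

# The covariance changes with the unit, but the standardised value does not.
print(sample_covariance(hours, scores), sample_covariance(minutes, scores))
print(r_hours, r_minutes)
```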

50
Q

The standardised version of covariance is known as the ______________

A

The standardised version of covariance is known as the correlation coefficient

51
Q

What happens to the range when you standardise the covariance?

A

It goes from being unrestricted to between -1 and +1

52
Q

What is the parametric correlation coefficient?

A

Pearson

53
Q

What is the formula for the Pearson/Spearman correlation coefficient?

A
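
The Pearson coefficient is usually written as the covariance divided by the product of the two standard deviations; Spearman's coefficient applies the same formula to ranked scores. The notation below is assumed rather than taken from the card.

```latex
r = \frac{\mathrm{cov}_{xy}}{s_x s_y}
  = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{(N - 1)\, s_x s_y}
```
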
54
Q

Why is it not necessarily a good idea to run a correlation matrix for all your variables?

A

The more tests you run, the more likely a false positive will occur
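
A rough Python sketch of why this happens: with k tests each run at alpha = .05, the chance of at least one false positive is 1 - (1 - alpha)^k. This assumes the tests are independent, which correlations from the same matrix generally are not, so treat the number as an approximation. The choice of 10 variables is illustrative.

```python
from math import comb


def familywise_error_rate(n_tests, alpha=0.05):
    """Probability of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_tests


n_variables = 10                        # illustrative
n_correlations = comb(n_variables, 2)   # number of distinct pairs in the matrix

fwer = familywise_error_rate(n_correlations)
print(n_correlations, round(fwer, 2))
```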

55
Q

What are the assumptions of a Pearson correlation?

A

Both variables measured on an interval or ratio scale

Both variables should be normally distributed (For statistical inference)

Relationship between the variables must be linear

56
Q

What happens if normality is violated when you want to run a Pearson correlation?

A

If N > 30, can use central limit theorem to justify proceeding, despite violation

Otherwise, use a Spearman correlation

57
Q

What happens if linearity is violated when you want to run a Pearson correlation?

A

If relationship is monotonic, use a Spearman correlation

Otherwise, try transforming the data to achieve linearity

58
Q

What does a Spearman correlation measure?

A

It measures the association between two ordinal variables; that is, X and Y both consist of ranks

It measures the consistency of direction of the association between two interval/ratio variables

59
Q

Why is a Spearman correlation less powerful than a Pearson correlation?

A

The continuous data must be converted into ranks before conducting the correlation, which discards information about the size of the differences between scores
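
A minimal Python sketch of the ranking step, assuming the usual approach of computing a Pearson correlation on average ranks. The data are made up to be monotonic but not linear, and the function names are illustrative.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def spearman_rho(xs, ys):
    """Pearson correlation computed on the ranks of each variable."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 100]  # always increasing, but not in a straight line

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

Ranking treats the jump from 16 to 100 as a single step, so rho reflects only the consistency of direction, while r is pulled below 1 by the departure from a straight line.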

60
Q

True or False:

You could conduct a Spearman correlation on this data

A

False

It is nonmonotonic

61
Q

What statistic do you use to measure the relationship strength of a correlation?

A

Coefficient of determination (r^2)

62
Q

If r = .411, what is the coefficient of determination?

A

r^2 = .169

Therefore, 16.9% shared variability
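
The arithmetic can be checked in a couple of lines of Python (the variable names are illustrative):

```python
r = 0.411
r_squared = r ** 2                                   # coefficient of determination
shared_variance_percent = round(r_squared * 100, 1)  # as a percentage

print(round(r_squared, 3), shared_variance_percent)
```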

63
Q

a ____________ is the correlation between two variables when you control (i.e., hold constant) the effects of a third variable on both of the other variables

A

a partial correlation is the correlation between two variables when you control (i.e., hold constant) the effects of a third variable on both of the other variables

64
Q

What does this equation show?

A

Shows the association between Y and X1 after removing the overlap of X2 with X1, and with Y

65
Q

Why is semi-partial correlation used in multiple regression?

A

Used in multiple regression because the squared semi-partial correlation tells you the proportion of variability in the outcome that is uniquely accounted for by one specific predictor variable

i.e., controls for the relationships between predictors, so the outcome variable’s relationship with other predictors is still taken into account

66
Q

What does this equation show?

A

How much of the total variability in Y is uniquely explained by X1

67
Q

What is the difference between partial correlation and semi-partial correlation?

A

Partial correlation:

Measures the relationship between two variables, controlling for the effect that a third variable has on them both

Semi-partial correlation:

Measures the relationship between two variables, controlling for the effect that a third variable has on only one of the others
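
The difference can be sketched in Python using the standard first-order formulas, which build both statistics from the three zero-order correlations. The correlation values below are made up, and the function and variable names are illustrative.

```python
from math import sqrt


def partial_correlation(r_y1, r_y2, r_12):
    """Correlation of Y and X1 with X2 removed from BOTH Y and X1."""
    return (r_y1 - r_y2 * r_12) / sqrt((1 - r_y2 ** 2) * (1 - r_12 ** 2))


def semipartial_correlation(r_y1, r_y2, r_12):
    """Correlation of Y and X1 with X2 removed from X1 ONLY."""
    return (r_y1 - r_y2 * r_12) / sqrt(1 - r_12 ** 2)


# Made-up zero-order correlations: Y with X1, Y with X2, X1 with X2.
r_y1, r_y2, r_12 = 0.50, 0.40, 0.30

partial = partial_correlation(r_y1, r_y2, r_12)
semipartial = semipartial_correlation(r_y1, r_y2, r_12)

print(round(partial, 3), round(semipartial, 3))
print(round(semipartial ** 2, 3))  # proportion of Y's variance unique to X1
```

The two formulas share a numerator; the partial version also divides by sqrt(1 - r_y2^2), so its absolute value is never smaller than the semi-partial's.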

68
Q

What is a 1st and 2nd order correlation?

A

1st order correlation = partial correlation that controls for 1 variable

2nd order correlation = partial correlation that controls for 2 variables

69
Q

If I find that the Pearson correlation between exam performance and revision time is .384 for males and .442 for females, can I test whether the correlation is ‘significantly’ stronger for females than males?

A

The best way to test whether the association between two variables differs by group is to use multiple regression with interactions, which we will discuss later

70
Q

Can partial correlations be performed on non-parametric data? (i.e., is there a way to do Spearman partial correlations?)

A

Yes! Use Spearman’s partial rank order correlation

71
Q

What are the two options for dealing with missing data?

A

Exclude cases pairwise

Exclude cases listwise

72
Q

What happens if we exclude cases pairwise?

A

For EACH correlation, exclude participants who do not have a score for both variables

73
Q

What happens if we exclude cases listwise?

A

Across ALL correlations, exclude participants who do not have a score for every variable
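
A small Python sketch of the two exclusion rules, using made-up records in which None marks a missing score. The variable names are illustrative.

```python
# Made-up participants; None marks a missing score.
records = [
    {"exam": 65, "revision": 10, "anxiety": 70},
    {"exam": 72, "revision": 14, "anxiety": None},
    {"exam": None, "revision": 8, "anxiety": 85},
    {"exam": 58, "revision": 6, "anxiety": 90},
]
variables = ["exam", "revision", "anxiety"]


def listwise(rows, names):
    """Keep only participants with a score on every variable."""
    return [row for row in rows if all(row[name] is not None for name in names)]


def pairwise(rows, var_a, var_b):
    """Keep participants with a score on both variables of one correlation."""
    return [row for row in rows if row[var_a] is not None and row[var_b] is not None]


n_listwise = len(listwise(records, variables))
n_exam_revision = len(pairwise(records, "exam", "revision"))
n_revision_anxiety = len(pairwise(records, "revision", "anxiety"))
n_exam_anxiety = len(pairwise(records, "exam", "anxiety"))

print(n_listwise, n_exam_revision, n_revision_anxiety, n_exam_anxiety)
```

Pairwise exclusion keeps more data per correlation, but each correlation may then be based on a different set of participants.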

74
Q

What is the linear regression equation?

A

Yi = b0 + b1X1 + εi

75
Q

What do the dots and the lines represent?

A

Dots = actual scores

lines = difference between actual scores and predicted scores (residuals)

76
Q

What is the regression line equation of this data?

A

Yi = b0 + b1X1 + εi

Yi = 43.9 + 0.65X + εi
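
A minimal Python sketch of how the fitted line is used: plug in X to get a predicted score, and subtract the prediction from an observed score to get the residual. The X value of 20 and the observed score of 60 are made-up illustrations, not values from the data.

```python
B0 = 43.9  # intercept from the fitted line
B1 = 0.65  # slope from the fitted line


def predict(x):
    """Predicted Y for a given X on the fitted line."""
    return B0 + B1 * x


def residual(observed_y, x):
    """Observed score minus predicted score."""
    return observed_y - predict(x)


predicted = predict(20)          # hypothetical X value
error = residual(60, 20)         # hypothetical observed Y for that X

print(round(predicted, 1), round(error, 1))
```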

77
Q

What does this value tell us?

A

Here .411 means that a 1 SD increase in revision time is expected to relate to a .411 SD increase in exam performance
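
In a regression with a single predictor, the standardised slope equals the Pearson correlation, which is why it can be read in SD units. A rough Python sketch with made-up data (not the data behind the .411 value) illustrates this; the function names are illustrative.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def sample_sd(values):
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def least_squares(xs, ys):
    """Intercept and slope of the least-squares line predicting y from x."""
    mx, my = mean(xs), mean(ys)
    b1 = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    b0 = my - b1 * mx
    return b0, b1


def pearson_r(xs, ys):
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))


# Made-up revision times and exam scores.
revision = [2.0, 4.0, 5.0, 7.0, 9.0, 12.0]
exam = [50.0, 47.0, 58.0, 52.0, 66.0, 60.0]

b0, b1 = least_squares(revision, exam)
standardised_slope = b1 * sample_sd(revision) / sample_sd(exam)
r = pearson_r(revision, exam)

# Rescaling the raw slope by SD(x) / SD(y) gives the same number as r.
print(standardised_slope, r)
```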