Correlation Flashcards
Why are scatter plots useful?
They give a visual representation of the relationship between two variables, so its direction, form and strength can be seen at a glance.
List a weakness of scatter plots.
Interpretation is subjective.
What is correlation?
The degree to which two variables ‘vary together’ or are ‘related’.
When are variables correlated?
When a change in one variable tends to be accompanied by a change in the other variable.
What does Pearson’s Correlation Coefficient (r) measure?
The strength and direction of the linear (straight-line) relationship between two variables.
What two things are required to calculate r?
- the spread of values for each individual variable (standard deviation)
- the extent to which the variables co-vary (called covariance)
What is covariance?
A measure of how two variables vary together, based on how far each data point lies from the mean of each variable.
How do we calculate covariance?
cov(x,y) = [the sum of (xi - mean of x) * (yi - mean of y)] / (n - 1)
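As a rough illustration of the formula above, here is a minimal Python sketch. The function name covariance and the hours/score data are made up for this example and are not part of the original cards.

```python
from statistics import mean


def covariance(xs, ys):
    """Sample covariance: sum of cross-deviations divided by n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x, mean_y = mean(xs), mean(ys)
    # Multiply each point's deviation from the mean of x by its deviation
    # from the mean of y, add the products up, then divide by n - 1.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross / (len(xs) - 1)


# Hypothetical data, invented for illustration only.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
print(covariance(hours, score))  # prints 1.5
```

A positive result means the two variables tend to deviate from their means in the same direction. The size of the number depends on the units of measurement, which is why r standardises it.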
How do we calculate r?
r = cov(x,y) / (s.d.(x) * s.d.(y))
r = the covariance divided by the product of the two standard deviations
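The calculation can be sketched in Python as follows. This is only an illustration: the function name pearson_r and the data are hypothetical, and stdev from the standard library is the sample standard deviation (n - 1 in the denominator), which matches the covariance formula used in these cards.

```python
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Pearson's r: covariance divided by the product of the two s.d.s."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x, mean_y = mean(xs), mean(ys)
    # Sample covariance (n - 1 in the denominator).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (len(xs) - 1)
    # Dividing by both standard deviations removes the units,
    # so the result always lies between -1 and +1.
    return cov / (stdev(xs) * stdev(ys))


# Hypothetical data, invented for illustration only.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]
print(round(pearson_r(hours, score), 3))  # prints 0.775
```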
What assumptions are made when determining statistical significance?
The sampling distribution must be normally distributed.
When do we use Spearman’s correlation?
When the relationship is not linear, or when one or both variables are ordinal.
What is the difference between Spearman’s correlation and Pearson’s correlation?
Spearman’s calculation is based on the ranks of the values rather than the actual values.
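One way to see the difference is to compute Pearson's r on the ranks. The sketch below is a minimal illustration, not a reference implementation: the names ranks, pearson_r and spearman_rho are hypothetical, the data are invented, and giving tied values the average of their ranks is an assumption made here.

```python
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Pearson's r: sample covariance over the product of the s.d.s."""
    mean_x, mean_y = mean(xs), mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / (stdev(xs) * stdev(ys))


def ranks(values):
    """Rank values from 1 (smallest); ties share the average rank (assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rho(xs, ys):
    """Spearman's correlation: Pearson's r applied to the ranks."""
    return pearson_r(ranks(xs), ranks(ys))


# Hypothetical data: y always rises with x, but along a curve (y = x cubed).
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(round(pearson_r(x, y), 3))     # below 1: the points do not lie on a straight line
print(round(spearman_rho(x, y), 3))  # 1.0: the rank orders agree perfectly
```

Because only the ordering matters, Spearman's correlation is unaffected by how far apart the actual values are, which is also why it suits ordinal data.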
List a weakness of correlation.
Correlation does not imply causation, but it is one important aspect of inferring causality.
List Mill’s 3 conditions for causality.
- X must be correlated with Y
- X must precede Y
- No plausible alternative explanations for Y