LEC 10 Correlation Flashcards
Correlation definition (3)
- the quantification of the degree to which 2 random variables (continuous or ordinal) are related
- variables must be numerical
- provided that the relationship is linear
(use a scatter plot to check for a potential linear relationship)
Scatter plot
- plot y against x
- useful for visually examining whether a relationship exists between 2 numerical variables
Correlation coefficient
- quantitative measure of the strength and direction of a linear relationship between 2 variables
Types of correlation coefficient (2) & the type of data analysed
- Pearson product-moment correlation coefficient (r)
  - continuous, normally distributed variables
- Spearman rank correlation coefficient (rs)
  - continuous, non-normally distributed variables
  - ordinal variables
  - less sensitive to outlying values, as it uses ranks rather than the actual values
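A minimal sketch of how the two coefficients could be computed by hand, assuming rs is obtained by applying the Pearson formula to the ranks (ties share an average rank). The function names and the data are hypothetical, chosen only to show how a single outlier affects r but not rs.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank from 1 (smallest); tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_rs(x, y):
    """Spearman rank correlation: Pearson's formula applied to the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Hypothetical data: y always rises with x, but the last y value is an outlier.
x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 5, 7, 8, 100]

r = pearson_r(x, y)
rs = spearman_rs(x, y)
print(f"r = {r:.2f}, rs = {rs:.2f}")
```

With this data the outlier pulls r down to roughly 0.70, while rs stays at 1 because the ranks of y still increase in step with the ranks of x.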
Are r and rs dimensionless?
Yes, no units
Range of possible r and rs values
-1 to 1
What does the sign of r and rs indicate?
The direction of the linear relationship between the 2 variables
What does the magnitude of r and rs indicate?
The strength of the linear relationship between the 2 variables
(compare the absolute value, ignoring the sign)
< 0.5 : weak
0.5 to < 0.7 : strong
>= 0.7 : very strong
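The cut-offs on this card can be sketched as a small helper, assuming they apply to the absolute value of the coefficient. The function name is hypothetical, and the labels are the ones used in these notes (other sources use different cut-offs).

```python
def strength_label(coefficient):
    """Label the strength of a correlation using the cut-offs on this card."""
    size = abs(coefficient)  # the sign only gives direction, not strength
    if size < 0.5:
        return "weak"
    if size < 0.7:
        return "strong"
    return "very strong"

for value in (0.3, -0.6, -0.85):
    print(value, strength_label(value))
```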
If r/rs = 0?
- means no linear correlation
- DOES NOT mean no relationship, because the relationship could be non-linear (e.g. a curve)
- check scatter plot for relationship
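A short illustration of this card, assuming hypothetical data where y depends exactly on x through a curve (y = x squared). The coefficient comes out as 0 even though the two variables are clearly related.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: a symmetric U-shaped curve, y = x^2.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [value ** 2 for value in x]

r = pearson_r(x, y)
print(f"r = {r:.2f}")  # no LINEAR correlation, yet y is fully determined by x
```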
If r/rs = 1
Perfect positive correlation
If r/rs = -1
Perfect negative correlation
OR
Perfect inverse correlation
If r/rs > 0 (positive)
- means positive correlation
- both variables tend to increase together
If r/rs < 0 (negative)
- means negative/inverse correlation
- one variable increases as the other decreases
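A minimal sketch of how the sign follows the direction of the relationship, using hypothetical data. The same x values are paired once with y values that tend to rise and once with y values that tend to fall.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
y_rising = [2, 3, 5, 4, 6]   # tends to increase as x increases
y_falling = [6, 4, 5, 3, 2]  # tends to decrease as x increases

r_rising = pearson_r(x, y_rising)
r_falling = pearson_r(x, y_falling)
print(f"rising: r = {r_rising:.2f}, falling: r = {r_falling:.2f}")
```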
Potential misuse of the correlation coefficient (3)
- A correlation coefficient of 0 does not mean there is no relationship, only that there is no LINEAR relationship
- A strong correlation coefficient does not necessarily imply linearity, as some parts of the graph might be non-linear
- Correlation does not imply causation (a cause-and-effect relationship)
Statistical test for correlation if BOTH variables are:
- continuous
- normally distributed
Pearson product-moment correlation
(parametric test)
To test the null hypothesis that there is no correlation between the 2 numerical variables
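One common way to carry out this test, sketched under assumptions: the statistic t = r * sqrt((n - 2) / (1 - r^2)) is compared against a t distribution with n - 2 degrees of freedom. The data are hypothetical, and the critical value 2.306 is the tabulated two-sided 5% value for 8 degrees of freedom, hard-coded here rather than computed.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical sample of n = 10 paired observations.
x = list(range(1, 11))
y = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]

n = len(x)
r = pearson_r(x, y)
df = n - 2                            # degrees of freedom
t = r * math.sqrt(df / (1 - r ** 2))  # test statistic under the null hypothesis

T_CRITICAL = 2.306                    # assumption: two-sided 5% value for df = 8
reject_null = abs(t) > T_CRITICAL
print(f"r = {r:.3f}, t = {t:.2f}, df = {df}, reject null: {reject_null}")
```

In practice a statistics package would usually report r together with a p-value (for example, `scipy.stats.pearsonr` returns both), so the manual calculation is only meant to show where the test statistic comes from.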