4 - Correlation Flashcards
What is correlation?
A way of measuring the extent to which 2 variables are related - a change in one variable results in a change in another.
What is covariance?
Measures the direction of the relationship between 2 variables.
What is the variance?
A measure of how much scores deviate from the mean for a single variable.
What is the difference between variance and covariance?
Variance gives the measure of how much scores deviate from the mean of a single variable. Covariance tells us how much scores on 2 variables differ from their respective means
What is a major problem faced when using covariance and how can we solve it?
Problem: covariance depends on the unit of measurement
Solution: standardize it by dividing by the standard deviations of both variables
What is the correlation coefficient?
The standardized version of the covariance.
AKA. Pearson’s Correlation Coefficient
What is the range of values for correlation?
[-1, 1] where 0 indicates no relationship
What are the effect sizes associated with correlation?
0.1 is a small effect.
0.3 is a medium effect.
0.5 is a large effect.
What is the function cor.test() used for?
To test the null hypothesis that the correlation in the population is 0.
What is the third-variable problem?
In any correlation, the causality between 2 variables cannot be assumed because there may be other measured or unmeasured variables affecting the results.
Do correlation coefficients say anything about which variable causes a change in the other?
No. Correlation is different than causation.
What are the assumptions of Pearson’s Correlation?
Data is on an interval or ratio scale.
Data is normally distributed
What alternatives are there for Pearson’s Correlation Coefficient, and when are they used?
Spearman’s rho and Kendall’s tau are used when the data is non-parametric.
What is a partial correlation?
A measure of the relationship between 2 variables, controlling for the effect that a third variable has on the other 2.
What is a semi-partial correlation?
A measure of the relationship between 2 variables, controlling for the effect that a third variable has on only one of these.