Correlation & Partial correlation (W8)✅ Flashcards
What is meant by bivariate linear correlation?
Examines the relationship between two variables
Relationships vary in:
- Form
* Linear
* Curvilinear
- Direction
* Positive
* Negative
- Magnitude/strength
* r = -1 (perfect negative relationship)
* r = +1 (perfect positive relationship)
* r = 0 (no linear relationship)
How does hypothesis testing work in linear correlation?
- Measure the relationship between two variables in a sample
=> use sample statistics to estimate the population parameters
- Always start by assuming the null hypothesis is true: there is no relationship between the population variables.
- Once we’ve determined the relationship in our sample, inferential analyses tell us the CHANCE (p) of measuring a relationship of at least that magnitude when the null hypothesis is true
-> if p < 0.05, we can reject the null hypothesis
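The idea of "the chance of a relationship this strong when the null is true" can be illustrated with a permutation test. This is a swapped-in technique for illustration, not the parametric test these notes refer to: shuffling y breaks any real pairing with x, so the shuffled r-values show what chance alone produces. The data (`hours`, `score`) and the function names are made up for the sketch.

```python
import random
from math import sqrt

def pearson_r(x, y):
    """Pearson's r from paired lists x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def permutation_p_value(x, y, n_perm=5000, seed=0):
    """Two-tailed p: share of shuffles with |r| at least as large as observed."""
    rng = random.Random(seed)          # fixed seed so the result is repeatable
    observed = abs(pearson_r(x, y))
    y_shuffled = list(y)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(y_shuffled)        # simulate "no relationship" (the null)
        if abs(pearson_r(x, y_shuffled)) >= observed:
            extreme += 1
    return (extreme + 1) / (n_perm + 1)

# Hypothetical sample: hours revised vs exam score for 10 participants
hours = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
score = [52, 55, 61, 58, 66, 70, 68, 75, 79, 84]

p = permutation_p_value(hours, score)
print(f"r = {pearson_r(hours, score):.2f}, p = {p:.4f}")
print("reject null" if p < 0.05 else "fail to reject null")
```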
What are the 4 main parametric assumptions of correlational studies (and 1 thing to look out for)?
- Both variables are continuous
-> if one or both is ordinal, use the non-parametric alternative
- Related pairs: each participant should have a pair of values (x, y)
- Absence of outliers (can skew the correlation if present)
- Linearity: the scatterplot should be best explained by a straight line -> NOT a curved line
Note: r is sensitive to range restriction
=> If these assumptions are seriously violated: use the non-parametric equivalent (Spearman’s rho)
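A minimal sketch of how Spearman's rho can be computed, assuming the standard definition (Pearson's r applied to the ranks of the data, with tied values sharing their average rank). The function names and example data are made up for illustration.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..N; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1    # +1 because ranks start at 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: y always increases with x, but not in a straight line,
# and the last y value is an outlier
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 200]
print(f"Pearson r    = {pearson_r(x, y):.2f}")
print(f"Spearman rho = {spearman_rho(x, y):.2f}")
```

Because only the ordering of the values matters, the outlier does not pull rho away from a perfect monotonic relationship, whereas Pearson's r drops below 1.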
What is Pearson’s correlation coefficient (r)?
- Long name – Pearson product-moment correlation coefficient (PPMCC)
- Correlation coefficient (r) is a measure of the strength (and direction) of the linear association between two quantitative, continuous variables
What is the relationship between covariance and correlation coefficient (r)?
- Covariance (cov): measure of the variance shared between the x and y variables
1. For each datapoint, calculate its difference from the mean of x and from the mean of y (e.g. xi - xm AND yi - ym)
2. Multiply the two differences
3. Sum the multiplied differences
4. Divide by N-1
- Correlation coefficient (r): the ratio of the covariance to the separate variability of x and y (the product of their SDs)
-> r = cov(x,y) / (Sx*Sy)
- r reflects how well a straight line fits the datapoints (strength of correlation)
-> If datapoints cluster closely around the line, r is FURTHER from 0 (STRONGER)
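The four covariance steps and the r formula above can be sketched as follows. The data and function names are made up for illustration.

```python
from math import sqrt

def covariance(x, y):
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Steps 1-2: difference from each mean, multiplied per datapoint
    products = [(xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)]
    # Steps 3-4: sum the products and divide by N-1
    return sum(products) / (n - 1)

def sample_sd(values):
    # The variance of a variable is its covariance with itself
    return sqrt(covariance(values, values))

def correlation(x, y):
    # r = cov(x,y) / (Sx*Sy)
    return covariance(x, y) / (sample_sd(x) * sample_sd(y))

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(f"cov = {covariance(x, y):.2f}")   # cov = 1.50
print(f"r   = {correlation(x, y):.2f}")  # r   = 0.77
```

Dividing by Sx*Sy removes the units of measurement, which is why r always falls between -1 and +1 whereas the covariance depends on the scale of x and y.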
What is sampling error in correlational studies?
The r-value would differ if we recruited another sample from the same population and measured the strength of the correlation again.
The sampling distribution of r gives us a 95% confidence interval (CI) around the sample r
-> there is a 5% chance that the population’s r falls above/below the limits of the 95% CI
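One common way to get an approximate 95% CI for r is the Fisher z-transformation. These notes do not say which method is used, so treat this as an assumed approach for illustration: transform r with atanh, add and subtract 1.96 standard errors (SE = 1/sqrt(N-3)), then transform back with tanh. The function name and the example values are made up.

```python
from math import atanh, tanh, sqrt

def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for a correlation r from a sample of size n."""
    z = atanh(r)                  # Fisher z-transform of r
    se = 1 / sqrt(n - 3)          # standard error on the z scale
    lower = tanh(z - z_crit * se) # back-transform the limits to the r scale
    upper = tanh(z + z_crit * se)
    return lower, upper

# Same sample r, two hypothetical sample sizes
for n in (30, 300):
    lower, upper = fisher_ci(0.5, n)
    print(f"N = {n}: r = .50, 95% CI [{lower:.2f}, {upper:.2f}]")
```

The interval narrows as N grows, which reflects smaller sampling error in larger samples.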
What is meant by shared variance? (r^2)
r^2: expresses the proportion of the separate variances that is shared
NOTE! r = .8 is 4x as strong as r = .4 in terms of shared variance, because r is squared (.8^2 = .64 vs .4^2 = .16)
r is another useful measure of effect size -> squared to give a measure of shared variance (similar to partial eta squared)
=> Tells us how much of the variance in y can be ‘explained by’ x
What is meant by partial correlation?
- Allows us to examine the relationship between two variables while removing the influence of a 3rd variable
- We can control for the 3rd variable by statistical means
How do we interpret a partial correlation (the change in r when controlling for the 3rd variable)?
- If the r-value decreases AND is no longer significant -> suggests the relationship between V1 and V2 may be explained by the influence of V3 on both V1 and V2.
- If the r-value decreases but remains significant -> suggests the relationship is partially explained by V3.
- If the r-value does not decrease and remains significant -> suggests the relationship is not influenced by V3.
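A minimal sketch of the standard first-order partial correlation formula, which gives the V1-V2 correlation with V3 held constant from the three pairwise correlations. The r-values below are made up for illustration, and the sketch does not compute significance.

```python
from math import sqrt

def partial_r(r_12, r_13, r_23):
    """Correlation between V1 and V2, controlling for V3."""
    numerator = r_12 - r_13 * r_23
    denominator = sqrt((1 - r_13 ** 2) * (1 - r_23 ** 2))
    return numerator / denominator

# Case A: V3 correlates strongly with both V1 and V2
case_a = partial_r(r_12=0.5, r_13=0.7, r_23=0.7)
# Case B: V3 is only weakly related to V1 and V2
case_b = partial_r(r_12=0.5, r_13=0.1, r_23=0.1)

print(f"Case A: r = .50 -> partial r = {case_a:.2f}")
print(f"Case B: r = .50 -> partial r = {case_b:.2f}")
```

In Case A the partial r falls to almost 0, the pattern expected when V3 explains the V1-V2 relationship. In Case B it barely changes, the pattern expected when V3 has little influence.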