Week 3 Correlation and Regression Flashcards
What is correlation?
-Correlation deals with the association between two variables and is one of the most important data analytic techniques in psychology
•Correlation is a form of bivariate analysis
-Correlation quantifies a linear relationship between two variables X and Y in terms of
•Direction
•Degree
Why is correlation important?
- One of the most frequently used statistics, especially in social psychology. It is also used in other fields, such as medical studies
- Building block for more sophisticated methods
Understanding correlational information
•Correlation can be positive or negative
•An association between two variables can be linear or non-linear
•Correlation coefficients (r) range from -1 to 1
–A correlation of zero indicates no linear association between the variables
Direction of correlational information
-Positive relationship: increases in X are accompanied by increases in Y
-Negative relationship: increases in X are accompanied by decreases in Y
-No relationship: knowing something about one variable tells you nothing about the other variable
Form of the relationship in correlation
- Correlation measures the linear relationship between two variables.
- If there is a nonlinear relationship, the correlation value may be deceptive.
- If the two variables are independent of one another, the correlation will be approximately zero.
Degree of relationship
-Perfect linear relation: every change in the X variable is accompanied by a corresponding change in the Y variable
•Rough rules of thumb for how big/small correlations are
–Small effect: .1 < r < .3 or -.3 > r > -.1
–Medium effect: .3 < r < .5 or -.3 > r > -.5
–Large effect: .5 < r < .7 or -.5 > r > -.7
However, these cutoffs are not formal, and context does matter
Pearson correlation coefficient
The Pearson correlation coefficient (r) is most commonly used in psychology and measures the linear association between two continuous variables.
•It compares how much the two variables vary together to how much they vary separately.
Variability vs. Covariability
Variability: how much a given variable varies from observation to observation
Covariability: how much two variables vary together
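The comparison of "varying together" with "varying separately" can be sketched in Python. This assumes the usual textbook form r = SP / sqrt(SSx * SSy), where SP is the sum of products of deviations (covariability) and SSx and SSy are the sums of squared deviations (variability). The function name pearson_r and the data are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson r: covariability of X and Y relative to their separate variability."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # SP: sum of products of deviations (how much X and Y vary together)
    sp = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # SS: sums of squared deviations (how much each variable varies on its own)
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    return sp / math.sqrt(ss_x * ss_y)

hours = [1, 2, 3, 4, 5]  # made-up X values
score = [2, 4, 5, 4, 5]  # made-up Y values
r = pearson_r(hours, score)
print(round(r, 3))  # SP = 6, SSx = 10, SSy = 6, so r = 6 / sqrt(60), about 0.775
```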
Correlation and Causality
Most important lesson of the day: correlation does not imply causation!
–A significant correlation does not mean that one variable causes the other
Effect of extreme scores on correlation
Extreme scores or outliers can greatly influence the value of a correlation
Regression Toward the Mean
- With imperfect correlation, an extreme score on one measure tends to be followed by a less extreme score on the other measure
- Extreme scores are often (but not always) due to chance
- If it’s due to chance, it’s extremely unlikely that the other value will also be extreme
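Regression toward the mean can be seen in a short simulation. This is a sketch under stated assumptions: each simulated person has a stable true level plus independent chance error on two tests, and all numbers (means, standard deviations, sample size, the top 10% cutoff) are arbitrary choices for illustration.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 10_000

# Each simulated person: stable true level plus independent chance error per test
true_level = [rng.gauss(100, 10) for _ in range(n)]
test1 = [t + rng.gauss(0, 10) for t in true_level]
test2 = [t + rng.gauss(0, 10) for t in true_level]

# Select the people with the most extreme (top 10%) scores on the first test
cutoff = sorted(test1)[int(0.9 * n)]
top = [i for i in range(n) if test1[i] >= cutoff]

mean_first = sum(test1[i] for i in top) / len(top)
mean_second = sum(test2[i] for i in top) / len(top)
overall_mean = sum(test2) / n

print(round(mean_first, 1), round(mean_second, 1), round(overall_mean, 1))
```

The selected group still scores above average on the second test, but noticeably closer to the overall mean than on the first test, because the chance part of their extreme first scores does not repeat.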
Null hypothesis testing for correlation
• The null hypothesis for correlation is: The correlation in the population is zero.
• If the probability associated with this null hypothesis is small (p < .05), then we reject the null hypothesis.
• So, we infer that the correlation value for the population is NOT zero
• There is a significant association between the two variables.
• For correlation, df = n - 2.
Null hypothesis significance testing
- Measure the variables for participants in the sample, calculate the correlation r
- What is the probability of finding an r at least this big, if the real association in the population (ρ) is zero?
- If this probability is small (p < .05), our initial assumption is in doubt.
- We REJECT the null hypothesis.
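One common way to carry out this test by hand is to convert r to a t statistic, t = r * sqrt(n - 2) / sqrt(1 - r^2), with df = n - 2, and compare it to a critical value. The Python sketch below assumes a two-tailed test at alpha = .05; the function name t_for_r and the example values of r and n are hypothetical, and the critical values are taken from a standard t table.

```python
import math

def t_for_r(r, n):
    """t statistic for testing H0: population correlation = 0, with df = n - 2."""
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1 - r ** 2)
    return t, df

# Two-tailed critical t values for alpha = .05, from a standard t table
T_CRIT_05 = {3: 3.182, 28: 2.048}

for r, n in [(0.77, 5), (0.50, 30)]:
    t, df = t_for_r(r, n)
    reject = abs(t) > T_CRIT_05[df]
    print(f"r = {r}, n = {n}, df = {df}, t = {t:.2f}, reject H0: {reject}")
```

With only 5 participants, even r = .77 does not reach significance, while r = .50 with 30 participants does. Significance depends on sample size as well as on the size of r.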
Spearman correlation coefficient
- The Spearman correlation coefficient (rs) may be used when the data are ordinal (ranked).
- It may also be used when the relationship is consistently one-directional (monotonic) but not linear.
- Convert the data to ranks before calculating the correlation; ranking can linearize a monotonic nonlinear relationship