Introduction to Correlation (scatter plot; Pearson’s r; non-linearity; outliers; Spearman’s rS; range restriction; correlation as effect size) Flashcards
What are two designs for research studies?
- Compare groups against each other (e.g. experimental vs. control).
- Study two continuous variables in the same people (correlational design).
What is the typical analysis for a compare-groups research design?
In which direction and by how much do group means differ? (Express the difference between means as Cohen’s d and/or in original units.)
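As a rough illustration of expressing a group difference both in original units and as Cohen’s d, here is a minimal sketch. It assumes the common pooled-standard-deviation version of d (the card does not say which variant is meant), and the scores are invented for illustration only.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation (an assumption)."""
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance: each sample variance weighted by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical scores, invented for illustration only.
experimental = [12, 14, 15, 17, 18]
control = [10, 11, 13, 14, 15]

raw_difference = mean(experimental) - mean(control)  # difference in original units
d = cohens_d(experimental, control)                  # difference in standard-deviation units
print(f"difference = {raw_difference:.2f}, d = {d:.2f}")
```

The sign of both numbers gives the direction of the difference; d rescales the raw difference by the spread of scores within the groups.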
What is the typical analysis for a correlational research design?
Describe the direction and strength of the relationship (correlation) between the two variables.
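What is one way to describe the relationship between two variables?
Scatter plots:
- Each dot represents one data point (one participant’s scores on both variables).
- Which variable goes on the x axis and which on the y axis is arbitrary.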
What is a positive relationship?
It is when high scores on one variable tend to go with high scores on the other (and low scores with low scores).
How can you assess the strength of a relationship on a scatter plot?
- The more “cloudy” or round the shape of the scatter plot is, the weaker the correlation.
- The more tightly the data points cluster around a straight line, the stronger the correlation. (The steepness of the line itself does not indicate strength.)
How can we measure the strength of the relationship shown in a scatter plot?
Properties of r:
- Correlation coefficient r (“Pearson’s r”) measures strength of correlation (for interval or ratio scale data).
What are the characteristics of the coefficient r?
- Ranges from -1 to 1.
- Sign of r indicates direction of relationship.
- Positive sign (‘positive relationship’): highs tend to go with highs and lows tend to go with lows.
- Negative sign (‘negative relationship’): highs tend to go with lows and lows tend to go with highs.
- r = 1 or r = -1: Perfect linear relationship. If you know a participant’s score on one variable, you also know their score on the other variable.
- r = 0: Variables are linearly unrelated. Knowledge of a participant’s score on one variable does not help in guessing their score on the other variable.
- r is independent of the unit of measurement (e.g. weight measured in kg or lbs).
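The properties above can be checked numerically. The following is a minimal sketch of Pearson’s r written out from its textbook definition; the function name `pearson_r` and the height and weight values are assumptions made for illustration, not data from the course.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r: co-variation of x and y scaled by the variation of each variable."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / sqrt(sum_xx * sum_yy)


# Hypothetical data, invented for illustration only.
height_cm = [158, 163, 170, 175, 181, 188]
weight_kg = [55, 61, 64, 72, 80, 83]
weight_lbs = [kg * 2.20462 for kg in weight_kg]  # same weights in a different unit

r_kg = pearson_r(height_cm, weight_kg)
r_lbs = pearson_r(height_cm, weight_lbs)                       # unit change leaves r unchanged
r_reversed = pearson_r(height_cm, [-kg for kg in weight_kg])   # reversing one variable flips the sign only
print(r_kg, r_lbs, r_reversed)
```

Changing the unit rescales the numerator and denominator by the same factor, which is why r stays the same.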
What does the absolute magnitude of r indicate about the relationship?
The absolute magnitude of r indicates the strength of the relationship.
What can we use to analyse the ranks of the data when outliers are present?
Because outliers strongly impact r, we use Spearman’s ρ (“rho”, rS), which is Pearson’s r computed on the ranks of the scores.
How can we deal with outliers?
- Determine if the score is faulty. If it is, correct the score if possible; otherwise discard the score from the analysis.
- Otherwise use rS instead of r.
- Compute and report r for the whole sample and after exclusion of the outlier. (Discuss which result appears more meaningful.)
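The outlier options above can be sketched as follows: r for the whole sample, r after excluding the outlier, and rS on the ranks. The data are invented for illustration, the helper names are assumptions, and the `ranks` helper assumes there are no tied scores to keep the sketch short.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r from its textbook definition."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / sqrt(sum_xx * sum_yy)


def ranks(values):
    """Rank 1 = smallest value. Assumes no tied values (a simplification)."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_rs(x, y):
    """Spearman's rS: Pearson's r applied to the ranks of the scores."""
    return pearson_r(ranks(x), ranks(y))


# Hypothetical scores; the last y value (40) is an extreme outlier.
x = [2, 4, 5, 7, 8, 10, 11]
y = [3, 5, 4, 8, 7, 10, 40]

r_all = pearson_r(x, y)                  # whole sample
r_without = pearson_r(x[:-1], y[:-1])    # after excluding the outlier
rs_all = spearman_rs(x, y)               # ranks limit the outlier's influence
print(f"r (all) = {r_all:.2f}, r (outlier excluded) = {r_without:.2f}, rS (all) = {rs_all:.2f}")
```

Because the outlier only counts as "the largest score" once the data are ranked, rS stays close to the value of r obtained without the outlier.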
How does the range of the data affect the strength of correlation?
Strength of correlation depends on variability in scores: the less the scores vary, the weaker the observed correlation tends to be.
What happens if the range of scores in the population is restricted?
- Range restriction will usually reduce the magnitude of r.
- We will usually choose an available population that reduces the restriction of range.
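A small simulation can illustrate range restriction. This is only a sketch: the data-generating process (y = 0.6x plus noise), the seed, and the cut-off at x > 0 are all assumptions chosen for illustration.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson's r from its textbook definition."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / sqrt(sum_xx * sum_yy)


rng = random.Random(42)  # fixed seed so the sketch is reproducible
n = 2000

# Simulated scores: y depends on x plus random noise (invented process).
x = [rng.gauss(0, 1) for _ in range(n)]
y = [0.6 * xi + 0.8 * rng.gauss(0, 1) for xi in x]

r_full = pearson_r(x, y)

# Restrict the range: keep only cases with above-average x scores.
restricted = [(xi, yi) for xi, yi in zip(x, y) if xi > 0]
x_res = [pair[0] for pair in restricted]
y_res = [pair[1] for pair in restricted]
r_restricted = pearson_r(x_res, y_res)

print(f"full range: r = {r_full:.2f}, restricted range: r = {r_restricted:.2f}")
```

The underlying relationship is identical in both samples; only the variability in x differs, and the restricted sample shows the smaller r.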
How can we convert between Cohen’s d and the coefficient r?
They are both effect size measures, so one can be converted into the other:
r = √(d^2 / (d^2 + 4))
d = √(4r^2 / (1 - r^2))
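The two conversion formulas can be written as a pair of small functions. This is a minimal sketch with hypothetical function names; it uses the algebraically equivalent forms d / √(d^2 + 4) and 2r / √(1 - r^2), which keep the sign of the effect, and it assumes the two groups are of equal size (which is where the constant 4 comes from).

```python
from math import sqrt


def d_to_r(d):
    """Convert Cohen's d to r (assumes two groups of equal size)."""
    return d / sqrt(d ** 2 + 4)


def r_to_d(r):
    """Convert r to Cohen's d (assumes two groups of equal size; requires |r| < 1)."""
    return 2 * r / sqrt(1 - r ** 2)


# Convert a few illustrative values in both directions.
for d in (0.2, 0.5, 0.8):
    print(f"d = {d:.1f}  ->  r = {d_to_r(d):.3f}")

for r in (0.1, 0.3, 0.5):
    print(f"r = {r:.1f}  ->  d = {r_to_d(r):.3f}")
```

Converting a value one way and then back returns the original number, which is a quick check that the two formulas are inverses of each other.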
What is the link between d and r across studies?
For two studies to show the same effect strength, d needs to be roughly double the size of r (d ≈ 2r). This is an approximation that holds for small effects.
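The "d is about double r" rule can be read off the conversion formula. As a sketch, assuming equal group sizes as in the formula above, expanding the square root for small r gives:

```latex
d = \frac{2r}{\sqrt{1 - r^2}}
  = 2r\left(1 + \tfrac{1}{2}r^2 + \tfrac{3}{8}r^4 + \cdots\right)
  \approx 2r \quad \text{for small } |r|
```

For example, r = 0.1 gives d ≈ 0.201 and r = 0.3 gives d ≈ 0.629, both close to 2r, whereas r = 0.5 gives d ≈ 1.155, noticeably more than 2r = 1.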