Correlation Flashcards
What is the Pearson product-moment correlation coefficient?
What symbol is it represented by?
It looks at whether variables deviate from the mean in a similar way (co-vary)
BUT it is on a standardized scale to facilitate comparisons across units of measurement
denoted by “r”
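A minimal sketch of the idea above, assuming the usual definition of r as the covariance of two variables divided by the product of their standard deviations. The function name pearson_r and the hours/score data are made up for illustration. Expressing the same scores in different units leaves r unchanged, which is what the standardized scale buys you.

```python
import math

def pearson_r(x, y):
    """Pearson r: covariance of x and y divided by the product of their SDs."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations: do x and y deviate from their means together?
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    # The (n - 1) terms of the covariance and the SDs cancel out
    return cross / math.sqrt(ss_x * ss_y)

# Hypothetical data: hours studied and exam score
hours = [1, 2, 3, 4, 5]
score_pct = [52, 58, 61, 70, 74]
score_prop = [s / 100 for s in score_pct]  # same scores as proportions

print(pearson_r(hours, score_pct))
print(pearson_r(hours, score_prop))  # same r despite the change of units
```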
What type of data can the Pearson correlation coefficient (or its adaptations) be used for?
Typically associated with looking at the relationship between variables measured on an interval or ratio scale
HOWEVER it can also be used for:
1. Ordinal variables when the raw scores have been converted to ranks
2. An interval or ratio scale variable with a dichotomous variable
3. Two dichotomous variables
Note that these have become adaptations with different names
List all types of variable combinations that adaptations of Pearson correlation coefficients can be used for
- Interval/Ratio + Interval/Ratio
- Ordinal (ranked) + Ordinal (ranked)
- Nominal (dichotomous) + Nominal (dichotomous)
- Nominal (dichotomous) + Interval/Ratio
What are the names of the simplified Pearson correlation coefficient formulas?
- Phi
- Spearman
- Point-biserial correlation
- Pearson correlation
What type of data is the Phi correlation coefficient used for?
Nominal (dichotomous) + Nominal (dichotomous)
What type of data is the Point-biserial correlation coefficient used for?
Nominal (dichotomous) + Interval/Ratio
What type of data is the Spearman correlation coefficient used for?
Ordinal (ranked) + Ordinal (ranked)
What type of data is the Pearson correlation coefficient used for?
Interval/Ratio + Interval/Ratio
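The sketch below illustrates why Phi, Spearman and point-biserial can be described as simplified Pearson formulas: each shortcut formula gives the same value as the ordinary Pearson r computed on ranked or 0/1-coded data. All function names and data are hypothetical, and the Spearman shortcut shown here assumes there are no tied ranks.

```python
import math

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cross / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def ranks_no_ties(values):
    """Ranks 1..n, assuming every value is distinct."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman_shortcut(x, y):
    """1 - 6 * sum(d^2) / (n * (n^2 - 1)), valid only without ties."""
    rx, ry = ranks_no_ties(x), ranks_no_ties(y)
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

def point_biserial_shortcut(group, score):
    """(M1 - M0) / s * sqrt(p * q), with s the SD computed using n (not n - 1)."""
    ones = [s for g, s in zip(group, score) if g == 1]
    zeros = [s for g, s in zip(group, score) if g == 0]
    n = len(score)
    mean_all = sum(score) / n
    sd = math.sqrt(sum((s - mean_all) ** 2 for s in score) / n)
    p, q = len(ones) / n, len(zeros) / n
    return (sum(ones) / len(ones) - sum(zeros) / len(zeros)) / sd * math.sqrt(p * q)

def phi_shortcut(x, y):
    """Phi from the four cell counts of the 2x2 table of two 0/1 variables."""
    a = sum(1 for u, v in zip(x, y) if (u, v) == (1, 1))
    b = sum(1 for u, v in zip(x, y) if (u, v) == (1, 0))
    c = sum(1 for u, v in zip(x, y) if (u, v) == (0, 1))
    d = sum(1 for u, v in zip(x, y) if (u, v) == (0, 0))
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical data for each variable combination
x_vals, y_vals = [3.1, 1.2, 5.4, 2.2, 4.8], [30, 10, 45, 28, 60]
group, score = [0, 0, 0, 1, 1, 1], [4, 5, 6, 7, 8, 9]
bin_a, bin_b = [1, 1, 1, 0, 0, 0, 1, 0], [1, 1, 0, 0, 0, 1, 1, 0]

print(spearman_shortcut(x_vals, y_vals), pearson_r(ranks_no_ties(x_vals), ranks_no_ties(y_vals)))
print(point_biserial_shortcut(group, score), pearson_r(group, score))
print(phi_shortcut(bin_a, bin_b), pearson_r(bin_a, bin_b))
```

Each printed pair should agree up to floating-point rounding.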
What is the Spearman correlation coefficient denoted by?
ρ (rho) or r with a subscript s (r_s)
When ranking data for a Spearman correlation coefficient, what do you do if two scores have the same value?
You give both scores the same averaged rank
e.g. if those values would be ranks 5 and 6 (but are the same, so it wouldn't be possible to order them), you would take the average of 5 and 6 (5.5) and assign both values the rank of 5.5
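The averaged-rank rule can be sketched as follows. The function name average_ranks and the scores are made up for illustration. Rank 1 goes to the smallest value, and each run of tied values receives the mean of the ranks it would have occupied.

```python
def average_ranks(values):
    """Rank values from smallest (rank 1) upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        # Sorted positions are 0-based, ranks are 1-based
        mean_rank = ((i + 1) + (j + 1)) / 2
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

scores = [10, 12, 15, 18, 20, 20, 25]
print(average_ranks(scores))  # [1.0, 2.0, 3.0, 4.0, 5.5, 5.5, 7.0]
```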
How do you interpret a correlation coefficient?
The correlation coefficient has to lie between -1 and +1
A coefficient of +1 indicates a perfect positive relationship, a coefficient of -1 indicates a perfect negative relationship, and a coefficient of 0 indicates no linear relationship at all
When used as a measure of effect size, what would small, medium, and large effect sizes of a correlation coefficient be?
The correlation coefficient is a commonly used measure of the size of an effect: values of ±.1 represent a small effect, ±.3 is a medium effect and ±.5 is a large effect
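As a rough illustration, the benchmarks above can be turned into a labelling helper. Treating the benchmarks as lower cut-offs, and the label used for values under .1, are assumptions of this sketch rather than part of the rule itself. The function name is hypothetical.

```python
def effect_size_label(r):
    """Label |r| using the .1 / .3 / .5 benchmarks as lower cut-offs (an assumption)."""
    size = abs(r)  # the sign gives direction, not size
    if size >= 0.5:
        return "large"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "small"
    return "below small"  # hypothetical label for values under .1

print(effect_size_label(-0.35))  # medium
```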
What type of correlation coefficient can be used for non-parametric data?
Spearman
Kendall’s tau
Both used for ranked ordinal data
When should Kendall’s tau be used instead of Spearman?
When you have a small data set and many of the scores in your dataset have the same rank
What does the Kendall’s tau correlation coefficient compare?
It looks at the number of concordant and discordant pairs
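A minimal sketch of counting concordant and discordant pairs. It computes the simplest variant, tau-a, which divides the difference in counts by the total number of pairs. Pairs tied on either variable count as neither concordant nor discordant, and the tie-adjusted variants reported by most software are not shown. The function name and data are made up.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """Return (concordant, discordant, tau_a) for paired observations."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:      # both variables order the pair the same way
            concordant += 1
        elif product < 0:    # the variables order the pair in opposite ways
            discordant += 1
        # product == 0 means a tie on x or y: counted as neither
    n_pairs = len(x) * (len(x) - 1) // 2
    return concordant, discordant, (concordant - discordant) / n_pairs

rank_x = [1, 2, 3, 4, 5]
rank_y = [2, 1, 4, 3, 5]
print(kendall_tau_a(rank_x, rank_y))  # (8, 2, 0.6)
```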
What does the biserial correlation coefficient do?
Estimates the degree of association between a dichotomized continuous measure and a continuous measure
What is the difference between the biserial correlation coefficient and the point-biserial correlation coefficient?
Give 2 examples
Biserial is used if the dichotomous variable has an underlying continuum between categories
e.g. a dichotomous variable with an underlying continuum is passing or failing an exam -> some people might fail by a little bit vs a lot
e.g. a dichotomous variable without an underlying continuum is being alive or dead (can't be a bit dead)
How does the interpretation of the biserial correlation coefficient differ from that of the point-biserial correlation coefficient?
Interpretation of biserial is slightly different because it is an estimate of the degree of association that would have been found had the artificially dichotomized continuous variable been evaluated as a true continuous variable
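One commonly cited way to obtain the biserial estimate is to rescale the point-biserial coefficient: r_b = r_pb * sqrt(p * q) / y, where p and q are the proportions of cases in the two categories and y is the height of the standard normal curve at the point that splits it into those proportions. The conversion is not part of the card above, and it assumes the underlying continuum is normally distributed. The function name and the example values are hypothetical.

```python
import math
from statistics import NormalDist

def biserial_from_point_biserial(r_pb, p):
    """Rescale a point-biserial r, where p is the proportion of cases in one category."""
    q = 1 - p
    standard_normal = NormalDist()       # mean 0, SD 1
    z = standard_normal.inv_cdf(p)       # cut point splitting the curve into p and q
    y = standard_normal.pdf(z)           # ordinate (height) of the curve at the cut point
    return r_pb * math.sqrt(p * q) / y

# Hypothetical example: r_pb = .40 with 60% of cases in one category
print(biserial_from_point_biserial(0.40, 0.60))
```

The scaling factor is larger than 1, so the biserial estimate is larger in magnitude than the point-biserial value it is derived from.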
What is the rank biserial correlation coefficient used for?
What is a caution for interpretation?
To estimate the degree of association between a dichotomized continuous variable and an ordinal (ranked) variable
same caution as biserial -> estimate of the association had the artificially dichotomized variable been examined as continuous
What is the phi-biserial correlation used for?
used to examine the association between a true dichotomous variable and a continuous variable that has been dichotomized
What is the tetrachoric correlation used for?
to examine the association between 2 artificially dichotomized continuous variables
What are 2 other types of correlation coefficients that are often seen in structural equation modeling?
Polyserial correlation
Polychoric correlation
What is the Polyserial correlation used for?
an interval-scaled variable and a continuous variable that has been dichotomized or categorized (ordinal)
What is the Polychoric correlation used for?
two continuous variables that have been dichotomized or categorized