Wk 11 - Correlation Flashcards
What is covariance? (x2)
Calculated by… (x2)
Measure of the linear relationship between two variables that tells you how strong the relationship is, and its direction
Add the products of all x and y deviation scores, divide by N - 1
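The calculation above can be sketched in Python. This is only a minimal illustration: the function name `covariance` and the `hours`/`marks` data are made up for the example and are not from the course material.

```python
def covariance(xs, ys):
    """Sample covariance: sum of products of deviation scores over N - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Product of the x and y deviation scores for each pair
    products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)


# Hypothetical data: hours studied and marks for five students
hours = [1, 2, 3, 4, 5]
marks = [2, 4, 5, 4, 5]

cov_xy = covariance(hours, marks)
print(cov_xy)
```

A positive result indicates the two variables tend to move in the same direction; a negative result indicates opposite directions.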
How does covariance relate to correlation? (x1)
Correlation is the standardised form of covariance
Why do we need to test r for significance? (x1)
So that we can use our knowledge of the relationship between two variables within a sample to make inferences about the population
What does r-squared mean? (x3)
It’s the coefficient of determination:
The proportion of shared variability
The proportion of variance in one variable that is explained by the variance in the other
What does k-squared mean? (x3)
It’s the coefficient of non-determination, or the error/residual variance:
The proportion of variance that cannot be predicted from the other
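As a small worked example, r-squared and k-squared always sum to 1, because k-squared is whatever variance r-squared leaves unexplained. The value of `r` below is hypothetical.

```python
r = 0.6  # hypothetical correlation coefficient

r_squared = r ** 2         # coefficient of determination (shared variance)
k_squared = 1 - r_squared  # coefficient of non-determination (unexplained variance)

print(f"r-squared = {r_squared:.2f}")  # r-squared = 0.36
print(f"k-squared = {k_squared:.2f}")  # k-squared = 0.64
```

So with r = 0.6, about 36% of the variance is shared and about 64% cannot be predicted from the other variable.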
What does the point-biserial correlation test? (x1)
Calculate by… (x5)
The relationship between one dichotomous variable and one continuous variable
Scoring one level of the dichotomous variable as 1, and the other as 0
Then as for Pearson’s:
Calculate deviations, squared deviations and products of deviations for x and y
Substitute into formulas
Then test for significance
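The steps above might look like this in Python. The group labels and scores are invented for illustration, and the t formula used in the last step is the standard one for testing r (t = r × √(N − 2) / √(1 − r²)), assumed here to match the formula given in class.

```python
import math

# Hypothetical data: a dichotomous variable and a continuous score
groups = ["control", "control", "control", "treatment", "treatment", "treatment"]
scores = [2, 3, 4, 6, 7, 8]

# Step 1: score one level as 1 and the other as 0
x = [1 if g == "treatment" else 0 for g in groups]
y = scores

# Step 2: deviations, squared deviations and products of deviations
n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
ss_x = sum((xi - mean_x) ** 2 for xi in x)
ss_y = sum((yi - mean_y) ** 2 for yi in y)
sp_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# Step 3: substitute into the Pearson formula
r_pb = sp_xy / math.sqrt(ss_x * ss_y)

# Step 4: test for significance using t with N - 2 degrees of freedom
t = r_pb * math.sqrt(n - 2) / math.sqrt(1 - r_pb ** 2)
df = n - 2

print(round(r_pb, 3), round(t, 3), df)
```

Which level gets the 1 only changes the sign of r-pb, not its size.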
What is Spearman’s rank correlation (rho) the non-parametric equivalent of?
What data can it test? (x4)
Calculate by… (x1)
Pearson’s r
Naturally ranked (ordinal) data, or interval/ratio data that are skewed, contain outliers, or show a monotonic relationship (not a constant rate of increase)
As for Pearson’s, but use the ranks as the x and y scores, not the raw data
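A rough sketch of "Pearson's on the ranks" follows. The helper names (`pearson_r`, `average_ranks`, `spearman_rho`) and the data are hypothetical, and giving tied scores the average of their ranks is an assumption about how ties are handled.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from sums of squares and the sum of products."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)


def average_ranks(values):
    """Rank from 1 (smallest); tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across any run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r computed on the ranks, not the raw data."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Hypothetical monotonic but non-linear data with one extreme score
x = [10, 20, 30, 40, 50]
y = [1, 4, 9, 16, 100]

rho = spearman_rho(x, y)
r_raw = pearson_r(x, y)
print(rho, round(r_raw, 3))
```

Because y always increases with x, the ranks match perfectly and rho comes out higher than Pearson's r on the raw scores, which is pulled down by the extreme value.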
Why is it not always possible to compare groups/levels of IV? (x3)
Ethical issues, eg the relationship of smoking to general fitness
Credibility issues, eg the relationship of extraversion to use of social media (can you randomly assign people to hi or lo extraversion groups?)
Practical issues of measurement
What is correlation? (x1)
The degree of correspondence between 2 variables
What is the criterion? (x1)
And what is the best predictor of it, absent any other information? (x1)
It’s the variable you’re trying to predict, the y-axis
The mean of the criterion
A positive correlation is when… (x1)
Higher scores on one variable associated with higher scores on the other
A negative correlation is when… (x1)
Higher scores on one variable associated with lower scores on the other
What is the disadvantage of measuring association with covariance? (x1)
How to remedy? (x1)
It’s affected by scale: using cm or inches will change it
Standardise by transforming to r
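To see the scale problem in action, the sketch below computes covariance and r for the same heights expressed in cm and in inches. The data and function names are hypothetical.

```python
import math


def covariance(xs, ys):
    """Sample covariance: sum of products of deviations over N - 1."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def pearson_r(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    sd_x = math.sqrt(covariance(xs, xs))  # covariance of x with itself is its variance
    sd_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sd_x * sd_y)


# Hypothetical heights and weights
height_cm = [150, 160, 170, 180, 190]
weight_kg = [50, 60, 65, 80, 85]
height_in = [h / 2.54 for h in height_cm]  # same heights, different unit

cov_cm = covariance(height_cm, weight_kg)
cov_in = covariance(height_in, weight_kg)
r_cm = pearson_r(height_cm, weight_kg)
r_in = pearson_r(height_in, weight_kg)

print(cov_cm, cov_in)  # covariance changes with the unit
print(r_cm, r_in)      # r does not
```

The covariance shrinks by the conversion factor when heights are in inches, while r stays the same (up to rounding), which is why r can be compared across studies.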
Explain in words the calculation of Pearson’s correlation/zero-order correlation (x2)
And its advantage over covariance is? (x1)
Divide the covariance (the degree to which x and y vary together) by the SD of x times the SD of y (the degree to which they vary separately)
It is comparable across studies/scales
How do you calculate the standard deviation of a set of scores? (x5)
Subtract the mean from each raw score
Square the result
Sum all of those
Divide by N - 1
Take the root of that
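The five steps can be followed one at a time in Python. The scores are hypothetical; `statistics.stdev` from the standard library is used only as a cross-check, since it also divides by N - 1.

```python
import math
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical raw scores

mean = sum(scores) / len(scores)
deviations = [s - mean for s in scores]         # 1. subtract the mean
squared = [d ** 2 for d in deviations]          # 2. square the result
sum_of_squares = sum(squared)                   # 3. sum all of those
variance = sum_of_squares / (len(scores) - 1)   # 4. divide by N - 1
sd = math.sqrt(variance)                        # 5. take the root

print(sd)
print(statistics.stdev(scores))  # library cross-check, should agree
```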
How do you calculate a z-score? (x2)
Subtract the mean from the score
Divide by the SD
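A short sketch of the same two steps applied to every score in a set (hypothetical data). One useful check is that a full set of z-scores should have a mean of 0 and an SD of 1.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical raw scores

mean = statistics.mean(scores)
sd = statistics.stdev(scores)  # sample SD (divides by N - 1)

# Subtract the mean from each score, then divide by the SD
z_scores = [(s - mean) / sd for s in scores]

print([round(z, 2) for z in z_scores])
```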
What 3 methods can be used to calculate Pearson’s r?
And how do you choose?
Through the covariance divided by the product of standard deviations of x and y (conceptual formula)
Dividing the sum of the products of paired z-scores (zx times zy) by N - 1 (standardised formula)
Dividing the sum of products, ie (X - Xbar)(Y - Ybar) totalled, by the root of SSx times SSy
Depends whether you are given SD, z-scores, or raw data in the question
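As a check that the three routes are equivalent, the sketch below computes r each way from the same hypothetical raw data. Variable names are illustrative only.

```python
import math

# Hypothetical raw data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mean_x, mean_y = sum(x) / n, sum(y) / n
ss_x = sum((xi - mean_x) ** 2 for xi in x)  # sum of squares for x
ss_y = sum((yi - mean_y) ** 2 for yi in y)  # sum of squares for y
sp_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))  # sum of products

sd_x = math.sqrt(ss_x / (n - 1))
sd_y = math.sqrt(ss_y / (n - 1))

# 1. Conceptual formula: covariance over the product of the SDs
cov_xy = sp_xy / (n - 1)
r_conceptual = cov_xy / (sd_x * sd_y)

# 2. Standardised formula: sum of products of paired z-scores over N - 1
z_x = [(xi - mean_x) / sd_x for xi in x]
z_y = [(yi - mean_y) / sd_y for yi in y]
r_standardised = sum(zx * zy for zx, zy in zip(z_x, z_y)) / (n - 1)

# 3. Sums of products over the root of SSx times SSy
r_sums = sp_xy / math.sqrt(ss_x * ss_y)

print(r_conceptual, r_standardised, r_sums)
```

All three values agree apart from floating-point rounding, so the choice really is just about which quantities the question hands you.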
What is the issue regarding interpretation of the significance of r?
Significance of r only tells us whether a relationship is likely to have occurred by chance; it doesn’t tell you how big the relationship is
How do you test r for significance? (x1)
Steps in calculation… (x2)
Use a t-calculation
Calculate t from the given formula using r and r-squared: t = r × √(N − 2) / √(1 − r²)
Find df at N - 2
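A minimal sketch of the calculation, assuming the given formula is the standard t = r × √(N − 2) / √(1 − r²). The function name `t_for_r` and the example values of r and N are hypothetical.

```python
import math


def t_for_r(r, n):
    """Return (t, df) for testing whether r differs from zero."""
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1 - r ** 2)
    return t, df


# Hypothetical result: r = .50 from a sample of 27 people
t, df = t_for_r(0.5, 27)
print(round(t, 3), df)
```

The obtained t is then compared with the critical t for that df from a t table.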
What alternative calculation could you do for a point-biserial correlation, and why?
Independent groups t-test -
Use each level of the dichotomous variable as one group
Significance for r-pb and t-test will be identical
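The equivalence can be checked numerically. The sketch below uses invented scores and assumes the independent groups t-test is the pooled-variance (equal variances) version, which is the one that matches the t obtained from r-pb.

```python
import math

# Hypothetical scores for the two levels of the dichotomous variable
group_0 = [2, 3, 4]
group_1 = [6, 7, 8]

# Route 1: point-biserial r, then t from r
x = [0] * len(group_0) + [1] * len(group_1)
y = group_0 + group_1
n = len(y)
mean_x, mean_y = sum(x) / n, sum(y) / n
sp_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
ss_x = sum((xi - mean_x) ** 2 for xi in x)
ss_y = sum((yi - mean_y) ** 2 for yi in y)
r_pb = sp_xy / math.sqrt(ss_x * ss_y)
t_from_r = r_pb * math.sqrt(n - 2) / math.sqrt(1 - r_pb ** 2)


# Route 2: independent groups t-test with pooled variance
def mean(values):
    return sum(values) / len(values)


def sum_of_squares(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


n0, n1 = len(group_0), len(group_1)
pooled_var = (sum_of_squares(group_0) + sum_of_squares(group_1)) / (n0 + n1 - 2)
se_diff = math.sqrt(pooled_var * (1 / n0 + 1 / n1))
t_independent = (mean(group_1) - mean(group_0)) / se_diff

print(t_from_r, t_independent)  # same t, and both use df = N - 2
```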
What are the assumptions of Pearson’s correlation?
Interval or ratio data
Normal distribution of X and Y, and
Linear relationship between X and Y
And so we assume a bivariate normal distribution based on interval or ratio scales