Correlation Flashcards
Cross-product deviations
Multiply each deviation of one variable from its mean by the corresponding deviation of the 2nd variable. Summed, these give the total relationship btwn the 2 variables.
Covariance
How much two variables vary together…the average relationship btwn them (sum of cross-product deviations divided by N - 1). Not a standardised measure, so its size depends on the units of measurement.
Negative covariance
As one variable deviates from its mean in one direction (e.g. above), the other deviates in the opposite direction (below)
Positive covariance
Variables deviate from mean in same direction
Covariance standardisation
Convert covariance into a standard set of units: divide it by the product of the two variables' SDs
Pearson's correlation coefficient r
Covariance divided by the product of the two SDs (standardised covariance). Its sampling distribution is not normally distributed.
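The chain from cross-product deviations to covariance to r can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names and the two small data lists are made up for the example, and the N - 1 denominator assumes sample (not population) statistics.

```python
from statistics import mean, stdev

def cross_product_sum(x, y):
    """Sum of cross-product deviations: total relationship between x and y."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y))

def covariance(x, y):
    """Average cross-product deviation, using N - 1 (sample covariance)."""
    return cross_product_sum(x, y) / (len(x) - 1)

def pearson_r(x, y):
    """Covariance standardised by the product of the two sample SDs."""
    return covariance(x, y) / (stdev(x) * stdev(y))

# Hypothetical data for illustration only
adverts = [5, 4, 4, 6, 8]
packets = [8, 9, 10, 13, 15]

print(cross_product_sum(adverts, packets))  # total cross-product
print(covariance(adverts, packets))         # unstandardised, depends on units
print(pearson_r(adverts, packets))          # standardised, always within -1..+1
```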
If you find a correlation coefficient less than -1 or more than +1 you can be sure that…
Something has gone hideously WRONG
Coefficient of +1 means
Two variables are perfectly positively correlated. Both increase or decrease together
Coefficient of -1 means
Perfect negative relationship. One decreases as the other increases
Coefficient of 0 means
No linear relationship. As one changes, the other does not change systematically
Coefficient of ±.1 means
Small effect
±.3 coefficient means
Medium effect
±.5 coefficient means
Large effect
Bivariate correlation
Correlation btwn 2 variables
Confidence intervals for r
Convert r to a z score (Fisher's transformation, which makes the sampling distribution normal). Construct the interval for z in the usual way, then convert the limits back to r.
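A minimal sketch of that procedure, assuming the standard Fisher transformation (z = atanh(r), standard error 1/sqrt(N - 3)) and a 95% interval. The function name and the example values are illustrative only.

```python
import math

def r_confidence_interval(r, n, z_crit=1.96):
    """Confidence interval for r via Fisher's z (z_crit=1.96 gives ~95%)."""
    z = math.atanh(r)              # r -> z, approximately normal
    se = 1 / math.sqrt(n - 3)      # standard error of z
    lower_z = z - z_crit * se      # interval built the normal way
    upper_z = z + z_crit * se
    return math.tanh(lower_z), math.tanh(upper_z)  # back to the r scale

# Hypothetical example: r = .5 from a sample of 30
lower, upper = r_confidence_interval(0.5, 30)
print(lower, upper)
```

Because the limits are converted back with tanh, they always stay inside -1 and +1, and the interval is not symmetric around r unless r is 0.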
Correlation coefficients say nothing about…
Direction of causality
Coefficient of determination
Correlation coefficient squared (r²); multiply by 100 for a %. Measure of the amount of variability in one variable that is shared by the other.
Spearman’s correlation coefficient
Non-parametric stat based on ranked (ordinal) data. Useful to minimise the effects of extreme scores or of violations of assumptions. 1st rank the data, then apply Pearson's equation to the ranks.
Kendall's tau (non-parametric)
Used with a small data set with a large number of tied ranks. Gives a more accurate estimate of the correlation in the population than Spearman's
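The "rank first, then Pearson" recipe can be sketched as below. Tied scores are given the average of the ranks they span, which is a common convention but an assumption here; the function names and data are illustrative.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Pearson's r from sums of squares and cross-products."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman_rho(x, y):
    """Spearman's coefficient: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data with one extreme score in y
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 100]
print(pearson_r(x, y))     # pulled around by the extreme score
print(spearman_rho(x, y))  # only the ordering matters
```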
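One way to see how tied ranks are handled is to count pairs directly. The sketch below assumes the tau-b variant, which adjusts the denominator for ties; the card does not say which variant is meant, so treat that choice, the function name, and the data as assumptions.

```python
import math
from itertools import combinations

def kendall_tau_b(x, y):
    """Kendall's tau-b from concordant, discordant, and tied pairs."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue              # tied on both variables: ignored
        elif dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1       # pair ordered the same way on x and y
        else:
            discordant += 1       # pair ordered in opposite ways
    untied = concordant + discordant
    denom = math.sqrt((untied + tied_x_only) * (untied + tied_y_only))
    return (concordant - discordant) / denom  # undefined if every pair is tied

# Hypothetical small data set with tied scores
x = [1, 2, 2, 3]
y = [1, 2, 3, 3]
print(kendall_tau_b(x, y))
```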
Biserial and point-biserial correlations
Used when one of the two variables is dichotomous, i.e. has two categories (e.g. pregnant or not pregnant)
Point-biserial correlation
One variable is a discrete dichotomy (pregnancy - you either are or aren't)
Biserial correlation
One variable is a continuous dichotomy (passing or failing an exam - the two categories sit on an underlying continuum)
Partial correlation
Correlation btwn two variables in which the effects of other variables are held constant. Measures the unique relationship btwn the 2 variables once a 3rd has been controlled for.
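For the case of a single control variable, the partial correlation can be computed from the three pairwise correlations using the standard first-order formula. The sketch below assumes exactly one controlled variable z; the function name and the example values are hypothetical.

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y with z held constant (one control variable)."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator

# Hypothetical pairwise correlations among three variables
print(partial_r(0.5, 0.5, 0.5))  # x-y relationship shrinks once z is controlled
print(partial_r(0.5, 0.0, 0.0))  # z unrelated to both: nothing changes
```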
Dichotomous variable
One with only two categories
Partial correlations used for…
Looking at unique relationship btwn 2 variables when other variables are ruled out.
Correlation relationships displayed using a
Scatter plot
Scatter plots work best with
Interval or ratio measures
Accuracy of correlation is contingent on several assumptions:
Random sampling (N > 30); linear relationship; homoscedastic relationship; no restriction of range; outliers dealt with
Homoscedastic is
At every value of the predictor variable, the spread of scores on the other variable should be fairly constant
Heteroscedasticity is
Opposite of homoscedastic - the spread is not constant across values of the predictor
Chi square measures
How well the actual observed frequencies match the frequencies expected under the null hypothesis (assesses goodness of fit)
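The comparison of observed with expected frequencies is usually summarised as the sum of (observed - expected)² / expected over the categories. A minimal sketch of that statistic, with made-up counts and an illustrative function name:

```python
def chi_square_statistic(observed, expected):
    """Sum over categories of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts: 60 cases, null hypothesis of equal frequencies
observed = [10, 20, 30]
expected = [20, 20, 20]
print(chi_square_statistic(observed, expected))
```

The larger the statistic, the worse the fit between the observed counts and the null-hypothesis expectations; judging significance additionally needs the chi-square distribution with the appropriate degrees of freedom, which this sketch does not cover.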