Correlation Module Flashcards
What is the difference between univariate and bivariate?
univariate = one variable; bivariate = two variables
Name one method that can be used to examine the correlation between two variables
Pearson’s r
We can only use Pearson’s r when the data are _____
bivariate and continuous
Does swapping the X and Y axes affect the value of Pearson’s r, and what is this kind of relationship called?
No, swapping the x and y axes does not affect the r value, as Pearson’s r is bi-directional (i.e., the correlation of A with B equals the correlation of B with A)
symmetrical relationship
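The symmetry can be checked numerically. The sketch below is only an illustration: the helper name pearson_r and the hours/score data are made up for this card, and r is computed directly from deviations about the means.

```python
import math

def pearson_r(x, y):
    """Pearson's r for two equal-length sequences of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations (numerator of r)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable (denominator of r)
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, invented for illustration only
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 60, 68]

r_xy = pearson_r(hours, score)  # hours on the x axis
r_yx = pearson_r(score, hours)  # axes swapped
print(r_xy, r_yx)               # the two values match
```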
Explain positive, negative and no linear correlation
positive
r is between 0 and 1
smaller x values tend to go with smaller y values
as x increases, y also tends to increase
negative
r is between -1 and 0
smaller x values tend to go with larger y values
as x increases, y tends to decrease
no correlation
r = 0
there is no linear association between x and y
x and y are not linearly related (r = 0 on its own does not guarantee that they are independent)
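A small sketch of the three cases, using invented data and a hypothetical helper pearson_r. The curved case is a reminder that r only measures linear association: y is fully determined by x there, yet r comes out as 0.

```python
import math

def pearson_r(x, y):
    """Pearson's r computed from deviations about the means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, invented for illustration only
x = [-2, -1, 0, 1, 2]
rising = [2 * v + 1 for v in x]    # y goes up as x goes up
falling = [-3 * v + 4 for v in x]  # y goes down as x goes up
curved = [v ** 2 for v in x]       # y depends on x, but not linearly

r_pos = pearson_r(x, rising)    # perfect positive linear relationship
r_neg = pearson_r(x, falling)   # perfect negative linear relationship
r_zero = pearson_r(x, curved)   # no linear relationship
print(r_pos, r_neg, r_zero)
```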
What is the step after obtaining an r value?
perform a null hypothesis significance test (NHST) to check whether the observed r value is consistent with there being no correlation in the population
H0 assumes that rho (the population correlation coefficient) is 0; if H0 is rejected, we accept H1 (there is a correlation between the two variables)
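One common way to run this test, assumed here rather than taken from the module, is to convert r into a t statistic with n - 2 degrees of freedom and compare it with a critical value from a t table. The values of r and n below are hypothetical.

```python
import math

def t_statistic(r, n):
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

r = 0.70  # hypothetical sample correlation
n = 10    # hypothetical sample size, so df = 8
t = t_statistic(r, n)

# Two-tailed critical t for df = 8 at alpha = .05, read from a standard t table
T_CRIT_DF8 = 2.306
reject_h0 = abs(t) > T_CRIT_DF8
print(round(t, 3), reject_h0)
```

With these made-up numbers the t statistic exceeds the critical value, so H0 (rho = 0) would be rejected at the .05 level.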
What are the rough rules of the size of correlation, and are they universal?
.1 - .3 (small)
.3 - .5 (medium)
.5 - .7 (large)
However, they are not universal as the size of the correlation also depends on the context
What are the three things to be careful about when looking at correlation?
correlation does not necessarily mean causation
extreme scores (outliers) can inflate or deflate the effect size
regression towards the mean (the tendency for an extreme score to be followed by a score closer to the mean)
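The effect of a single outlier can be seen in a short sketch. All data here are invented, and pearson_r is a hypothetical helper: one extra point pushes an r of about 0 up to a large value, and another drags a perfect r of 1 down to a small one.

```python
import math

def pearson_r(x, y):
    """Pearson's r computed from deviations about the means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]

# Case 1: no linear trend, then one extreme point at (20, 20) is added
flat_y = [3, 1, 3, 1, 3]
r_flat = pearson_r(x, flat_y)
r_inflated = pearson_r(x + [20], flat_y + [20])

# Case 2: perfect line, then one extreme point at (6, 0) is added
line_y = [2, 4, 6, 8, 10]
r_line = pearson_r(x, line_y)
r_deflated = pearson_r(x + [6], line_y + [0])

print(r_flat, r_inflated)  # near 0 before, large and positive after
print(r_line, r_deflated)  # 1 before, much smaller after
```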
What are variability and covariability?
variability = how much each variable varies from observation to observation
covariability = how much the two variables vary together
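In numbers, variability is usually measured by the variance and covariability by the covariance, and Pearson's r is the covariance scaled by the two standard deviations. A minimal sketch with invented data; sample_covariance is a hypothetical helper written for this card.

```python
import math
import statistics

def sample_covariance(x, y):
    """Sample covariance: average cross-product of deviations, using n - 1."""
    n = len(x)
    mean_x = statistics.mean(x)
    mean_y = statistics.mean(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

# Hypothetical data, invented for illustration only
x = [2, 4, 6, 8]
y = [1, 3, 2, 6]

var_x = statistics.variance(x)    # variability of x (sample variance)
var_y = statistics.variance(y)    # variability of y (sample variance)
cov_xy = sample_covariance(x, y)  # covariability of x and y

# r = covariance divided by the product of the standard deviations
r = cov_xy / math.sqrt(var_x * var_y)
print(var_x, var_y, cov_xy, r)
```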
What are the three things to bear in mind when looking at correlation?
correlation doesn’t mean causation
extreme scores (outliers) can affect the accuracy of the model
regression towards the mean
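What are the three characteristics that residuals should have?
independence: the residuals are independent of one another
normality: the residuals (errors) are normally distributed with a mean of 0
homoscedasticity: the variance of the residuals is the same at every value of x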
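Residuals are the gaps between the observed y values and the values predicted by a fitted line. The sketch below assumes a simple least-squares line, which the cards do not spell out, and uses invented data; fit_line is a hypothetical helper. It shows only that the residuals of such a fit average to 0. Normality and homoscedasticity are usually judged by plotting the residuals.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for a straight line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data, invented for illustration only
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

slope, intercept = fit_line(x, y)

# Residual = observed y minus the y predicted by the fitted line
residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
mean_residual = sum(residuals) / len(residuals)
print(slope, intercept, mean_residual)  # mean residual is 0 up to rounding error
```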