Correlation Redux Flashcards
what is Pearson’s r?
tells you the relationship between two variables: its strength (“size of effect”) and direction [+ or -]
what does the p value tell us?
whether r is significantly different from 0
what is a “small effect” size?
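A minimal sketch of how r could be computed by hand, using only the Python standard library. The function name pearson_r and the scores are made up for illustration; the sign of the result gives the direction and its distance from zero gives the strength.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r: how much x and y vary together, scaled to lie in [-1, +1]."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations (co-variation of x and y).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (variation within x and within y).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores, invented for this example.
hours = [1, 2, 3, 4, 5]
marks_up = [2, 4, 5, 4, 5]     # tends to rise with hours
marks_down = [5, 4, 5, 4, 2]   # tends to fall with hours

r_positive = pearson_r(hours, marks_up)
r_negative = pearson_r(hours, marks_down)
print(round(r_positive, 3))   # 0.775
print(round(r_negative, 3))   # -0.775
```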
± 0.1
what is a “medium effect” size?
± 0.3
what is a “large effect” size?
± 0.5
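The three benchmarks above can be sketched as a small lookup. This assumes each value is treated as a lower bound on the absolute size of r; the function name and the "negligible" label for anything under 0.1 are assumptions, not part of the cards.

```python
def effect_size_label(r):
    """Label the size of r using the 0.1 / 0.3 / 0.5 benchmarks.

    Assumption: each benchmark is a lower bound on |r|, and the sign
    (direction) is ignored when judging size.
    """
    size = abs(r)
    if size >= 0.5:
        return "large"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "small"
    return "negligible"  # label chosen for this sketch only


print(effect_size_label(0.12))   # small
print(effect_size_label(-0.35))  # medium
print(effect_size_label(0.62))   # large
```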
what is significance?
testing the null hypothesis that r = 0 (correlation value is zero)
when can we reject the null hypothesis? / suggest there’s a significant difference
when p < 0.05 we can reject the null hypothesis
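A rough sketch of testing the null hypothesis that r = 0. Statistics packages usually get the p value for r from a t distribution; to stay within the Python standard library, this sketch swaps in a permutation test instead: shuffle one variable many times and count how often a shuffled r is at least as far from zero as the observed r. All names, the data, and the number of shuffles are assumptions made for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(xs, ys, n_permutations=2000, seed=0):
    """Two-sided permutation p value for the null hypothesis r = 0."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # breaks any real pairing between x and y
        # Small tolerance so ties are not lost to floating-point rounding.
        if abs(pearson_r(xs, shuffled)) >= observed - 1e-12:
            hits += 1
    # Add 1 to both counts so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical data: one strongly related pair, one with r = 0.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys_related = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1, 14.2, 15.9]
p_related = permutation_p_value(xs, ys_related)

p_unrelated = permutation_p_value([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])

for p in (p_related, p_unrelated):
    decision = "reject null" if p < 0.05 else "do not reject null"
    print(round(p, 4), decision)
```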
what is a positive correlation?
both variables increase together
the interpretation of correlation and causality:
- correlation = not sufficient evidence for causality between variables
- does not imply causation
- gives no indication of the direction of causality
what is a negative correlation?
one goes up, the other goes down
what does a correlation coefficient give you?
- How much the two variables vary together
- The further the score from zero, the stronger the relationship
Why can’t we infer causation from a correlation?
The relation between the two variables is often due to each variable’s relation to a third variable
what can happen if we take account of this relationship with the third variable (“control for”)?
the original relationship disappears as it was spurious (not genuine)
what may the third variable be?
a confounding variable
how can we examine third variables?
partial correlations
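A minimal sketch of a first-order partial correlation, using the standard formula r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)). The data are invented so that x and y are related only through z; function and variable names are assumptions for illustration.

```python
import math


def pearson_r(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y after controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical data: x and y each equal z plus their own unrelated noise.
z = [1, 2, 3, 4, 5, 6, 7, 8]
x = [2, 1, 2, 5, 6, 5, 6, 9]
y = [2, 1, 2, 5, 4, 7, 8, 7]

r_xy = pearson_r(x, y)
r_xy_given_z = partial_r(r_xy, pearson_r(x, z), pearson_r(y, z))

print(round(r_xy, 2))               # 0.84, looks like a large effect
print(round(abs(r_xy_given_z), 2))  # 0.0, the relationship was spurious
```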
what are some issues with correlation?
- shape of the relationship
- outliers
- restricted ranges
- sample size
- reliability of measures
what should relationships be like?
linear relationship (similar to regression)
what does Pearson’s Correlation Coefficient measure?
linear relationships
what doesn’t Pearson’s Correlation Coefficient measure?
non-linear relationships
* relationships can be non-linear, but Pearson’s r cannot pick this up/do anything about it
non-linear relationships
will reduce the correlation (Pearson’s r)
* still a relationship, just a different shape
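A small sketch of this point, with made-up data: y is completely determined by x in both cases, but only the straight-line relationship is picked up by r. The U-shaped relationship gives an r of zero.

```python
import math


def pearson_r(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [-3, -2, -1, 0, 1, 2, 3]
ys_line = [2 * x + 1 for x in xs]   # straight line
ys_curve = [x ** 2 for x in xs]     # U shape: still a relationship

r_line = pearson_r(xs, ys_line)
r_curve = pearson_r(xs, ys_curve)

print(round(r_line, 3))        # 1.0
print(round(abs(r_curve), 3))  # 0.0
```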
Outliers
Individual scores can reduce or enhance the “r”
* e.g. if you add a pair of scores -> “no relationship” is turned into a small/medium relationship by one data point
* e.g. if you add a different pair of scores to the data set -> a weak but significant relationship may be removed by one data point
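A sketch of the first case with invented scores: five pairs with no linear relationship, then the same five pairs plus one extreme pair. In this example the single extra point inflates r well beyond a small/medium effect; how far r moves depends on how extreme the added point is.

```python
import math


def pearson_r(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores chosen so that r is zero.
xs = [1, 2, 3, 4, 5]
ys = [2, 3, 1, 3, 2]
r_before = pearson_r(xs, ys)

# Add one outlying pair of scores (15, 15).
r_after = pearson_r(xs + [15], ys + [15])

print(round(abs(r_before), 3))  # 0.0
print(round(r_after, 3))        # 0.951
```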
outliers can cut both ways, what does this mean?
can artificially enhance the relationship or reduce it
what do we do about outliers?
we tend to remove them from the data set
how can we fix a stunted relationship?
by having a wide range of scores on all variables to get a clear view of the relationship between your variables
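A sketch of how a restricted range can stunt r, using made-up data: the same scores are correlated once over the full range of x and once using only the middle of the range. The noise pattern (plus or minus 3) and the cut-offs are assumptions chosen for illustration.

```python
import math


def pearson_r(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores: y follows x, with alternating noise of +3 / -3.
xs = list(range(1, 21))
ys = [x + (3 if x % 2 else -3) for x in xs]
r_full = pearson_r(xs, ys)

# Keep only the pairs whose x score falls in a narrow band (8 to 13).
narrow = [(x, y) for x, y in zip(xs, ys) if 8 <= x <= 13]
narrow_xs = [x for x, _ in narrow]
narrow_ys = [y for _, y in narrow]
r_restricted = pearson_r(narrow_xs, narrow_ys)

print(round(r_full, 2))        # 0.88
print(round(r_restricted, 2))  # 0.67
```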