Correlation Flashcards
What is a scattergram?
Values on one variable are plotted against the values of another variable
Regression line: the straight line of best fit drawn through the plotted points
What is a contingency or a crosstabulation table?
Frequency tables for nominal categories
Too many categories can make the table too big, so focus on a few categories
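A crosstabulation table can be sketched in a few lines of Python. The data below (gender and preferred drink) are invented purely for illustration and do not come from any study; the variable names are assumptions too.

```python
from collections import Counter

# Hypothetical data: each pair is (gender, preferred drink) for one participant.
observations = [
    ("female", "tea"), ("female", "coffee"), ("male", "tea"),
    ("male", "coffee"), ("female", "tea"), ("male", "coffee"),
]

# Count how often each combination of categories occurs.
counts = Counter(observations)

# Row and column labels are the distinct categories of each variable.
rows = sorted({row for row, _ in observations})
cols = sorted({col for _, col in observations})

# Print the table: one row per category of the first variable.
print("".ljust(8) + "".join(c.ljust(8) for c in cols))
for r in rows:
    print(r.ljust(8) + "".join(str(counts[(r, c)]).ljust(8) for c in cols))
```

Each cell is the frequency of one combination of categories, which is why the table grows quickly when either variable has many categories.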
What if scattergram values overlap?
If several points overlap, a number may be shown next to the point stating how many values overlap there
Sunflowers: lines are drawn around the point, and the number of ‘petals’ indicates the number of overlapping values
What is a compound bar chart?
Can be used instead of crosstabulation tables or contingency tables
Frequencies of nominal data
Again, focus on a few categories: too many make the chart unmanageable and difficult to decipher
What is a compound histogram?
Used when one variable is nominal and the other is a numerical score
Few categories
Small ranges
What do you need for a correlational study?
Many participants
At least 2 variables
Both variables are usually continuous
A variable can also be dichotomous (only two categories)
Why are correlational studies used?
Correlations show direction and strength
Test for statistical significance
Show us the relationship, not the cause and effect
How do you create a scatterplot? What are they used for?
Scatterplots give a quick way to judge whether there is a correlation between two variables
Participants (Ps) are measured on both variables
Every dot is one participant
X is one variable
Y is the other
What are positive and negative correlations?
Positive: variables change in the same direction, so the points slope upwards from left to right
Negative: variables change in opposite directions, so the points slope downwards from left to right
No correlation: No systematic relationship
What are weak and strong correlations?
Weak correlation: only a slight relationship between the variables; points are widely scattered around the line
Can have medium/moderate correlations
Strong correlation: almost perfect relationship; points lie close to the line
Why are correlation coefficients used?
Express correlations numerically
Quantifies the strength and direction of the correlation
The sign of the correlation coefficient tells us the direction (negative number: negative correlation, positive number: positive correlation)
The size of the value tells us the strength (closer to 0 = weaker correlation, closer to +1 or -1 = stronger correlation)
Pearson’s r is mostly used for this
Lower case r= symbol for correlation coefficient
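As a rough sketch of how the sign and size of r behave, Pearson's r can be computed by hand in Python. The function name pearson_r and the revision, gaming and exam figures are made up for illustration only.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r: covariation of x and y divided by the product of their spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much x and y vary together around their means.
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable varies on its own.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / math.sqrt(ss_x * ss_y)

# Hypothetical scores for five participants.
hours_revised = [1, 2, 3, 4, 5]
exam_score = [52, 55, 61, 64, 70]    # rises as revision rises
hours_gaming = [9, 8, 6, 5, 3]       # falls as revision rises

r_positive = pearson_r(hours_revised, exam_score)
r_negative = pearson_r(hours_revised, hours_gaming)
print(round(r_positive, 2), round(r_negative, 2))
```

With these invented numbers the first coefficient is close to +1 and the second is close to -1: the sign gives the direction and the distance from 0 gives the strength.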
What’s another way to find the strength of correlations?
Effect size
Small: around ±.1
Medium: around ±.3
Large: around ±.5
Large effect sizes are rare in psychology
Small sizes are still important though
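The effect size guidelines above can be turned into a small lookup. This is only a sketch: the cards say "around", so treating .1, .3 and .5 as hard cut-offs is an assumption, and both the function name and the "negligible" label are invented here.

```python
def effect_size_label(r):
    """Label a correlation coefficient using the rough .1 / .3 / .5 guidelines."""
    size = abs(r)  # the sign gives direction only, so it is ignored for strength
    if size >= 0.5:
        return "large"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "small"
    return "negligible"  # assumed label for values below the "small" guideline

for r in (0.12, -0.35, 0.60):
    print(r, effect_size_label(r))
```

Note that -0.35 and +0.35 get the same label, because effect size is about strength, not direction.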
What about non linear relationships?
Correlation coefficients such as Pearson’s r only measure linear (straight-line) relationships
For a non-linear (e.g. curved) relationship, r can be close to 0 even when the variables are strongly related, so it is misleading
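A quick sketch of why r is a poor summary of a curved relationship: below, y is completely determined by x (y is x squared), yet r comes out at essentially zero. The data and the function name pearson_r are invented for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r computed from deviations around the two means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / math.sqrt(ss_x * ss_y)

# A U-shaped relationship: y falls, then rises, as x increases.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_curved = pearson_r(xs, ys)
print(r_curved)
```

The falling half and the rising half of the curve cancel out, which is why checking the scatterplot before trusting r is worthwhile.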
What about outliers on a scatterplot?
An outlier that lies on or near the regression line makes the correlation stronger
An unusual outlier that lies far from the regression line weakens the correlation; removing it gives a stronger correlation
Spearman’s rho can be used to reduce the influence of extreme values, because it is calculated on ranks rather than raw scores
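To illustrate the point about extreme values, the sketch below compares Pearson's r with a rank-based coefficient (Pearson's r computed on the ranks, which is how Spearman's coefficient is defined). The data are made up, with one deliberately extreme score, and the helper names are assumptions.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r computed from deviations around the two means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / math.sqrt(ss_x * ss_y)

def ranks(values):
    """Rank values from 1 (smallest) upwards; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover every value tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Rank-based correlation: Pearson's r applied to the ranks of the scores."""
    return pearson_r(ranks(xs), ranks(ys))

# Hypothetical scores: y rises steadily with x, except for one extreme value (60).
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 3, 4, 5, 6, 60]

r_raw = pearson_r(xs, ys)
r_ranked = spearman(xs, ys)
print(round(r_raw, 2), round(r_ranked, 2))
```

Once the scores are ranked, the extreme value is simply "the highest", so it no longer pulls the coefficient down the way it does for the raw scores.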
What are clusters of data?
Some participants (Ps) can form clusters of data
Within each cluster, the correlation can be weak and negative
If the clusters are ignored, the overall correlation can come out large and positive. This is known as Simpson’s Paradox. To deal with it, report a correlation coefficient for each cluster as well as one overall
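Simpson's Paradox can be reproduced with a tiny invented data set: two clusters that each show a negative correlation, but a positive correlation when pooled. The numbers and names below are hypothetical and chosen only to make the pattern visible.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r computed from deviations around the two means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / math.sqrt(ss_x * ss_y)

# Cluster A: low scores on both variables, negative trend inside the cluster.
a_x, a_y = [1, 2, 3, 4], [3, 4, 1, 2]
# Cluster B: high scores on both variables, negative trend inside the cluster.
b_x, b_y = [7, 8, 9, 10], [9, 10, 7, 8]

r_a = pearson_r(a_x, a_y)
r_b = pearson_r(b_x, b_y)
# Pool the clusters and ignore which cluster each participant belongs to.
r_all = pearson_r(a_x + b_x, a_y + b_y)

print(round(r_a, 2), round(r_b, 2), round(r_all, 2))
```

The overall coefficient mostly reflects the gap between the two clusters rather than the trend inside either one, which is why reporting both the per-cluster and the overall coefficients gives a fairer picture.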