Chapter 8: Bivariate Correlational Research Flashcards
Bivariate correlation
An association that involves exactly two variables. AKA bivariate association.
Categorical variables
Values fall into categories (qualitative/nominal).
Quantitative variables
Range of values (ordinal, interval, ratio).
Scatterplot
Best way to represent the correlation/association between two quantitative variables. Shows the strength and direction (+/- or 0) of the relationship, which is summarized by the correlation coefficient “r”.
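As a minimal sketch of what “r” summarizes, the snippet below computes Pearson's r from its definition (covariation divided by the product of the two spreads). The function name `pearson_r` and the `hours`/`score` data are made up for illustration only.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much the two variables vary together.
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable varies on its own.
    dev_x = sum((x - mean_x) ** 2 for x in xs)
    dev_y = sum((y - mean_y) ** 2 for y in ys)
    return co_dev / math.sqrt(dev_x * dev_y)

# Hypothetical data: hours studied and exam score for five students.
hours = [1, 2, 3, 4, 5]
score = [52, 60, 61, 70, 75]

r = pearson_r(hours, score)                   # strong positive, close to +1
r_perfect_negative = pearson_r([1, 2, 3], [3, 2, 1])  # exactly -1
print(round(r, 2), round(r_perfect_negative, 2))
```

The sign of r gives the direction of the cloud of points in the scatterplot; its absolute size gives the strength.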
Bar graph
Best way to represent the correlation/association when one variable is categorical and the other is quantitative. Bars represent group averages that allow you to examine the difference between groups.
Correlation/association
The study involves measuring both variables; neither variable is manipulated.
Examine relationship between categorical and quantitative variables
Can be examined using a t-test (two categories) or ANOVA (three or more categories).
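A rough sketch of the two-category case: an independent-samples t statistic compares two group means relative to their pooled variability. The helper `pooled_t` and the two groups of ratings are hypothetical; the p value would come from a t table or from software such as SPSS.

```python
import math
from statistics import mean, variance

def pooled_t(group_a, group_b):
    """Independent-samples t statistic (equal variances assumed) and its df."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_error, df

# Hypothetical satisfaction ratings for two categories of people.
married = [7, 8, 6, 9, 8]
single = [5, 6, 7, 5, 6]

t, df = pooled_t(married, single)
print(round(t, 2), df)
```

With three or more categories, the same idea extends to ANOVA, which compares between-group to within-group variability.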
Tip for reading correlation statistics
In a correlation matrix, find the diagonal and read either the upper right-hand triangle or the lower left-hand triangle - the two contain the same values.
Examine relationship between two categorical variables
Use Chi-square. Will yield cross-tabulation and Chi-square tests in SPSS.
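For intuition about what the Chi-square test does with a cross-tabulation, here is a minimal sketch that compares observed counts to the counts expected if the two variables were unrelated. The function `chi_square` and the 2 x 2 table are hypothetical, and this is not a reproduction of SPSS output.

```python
def chi_square(table):
    """Pearson chi-square statistic and df for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Hypothetical cross-tabulation: rows = group A / group B, columns = yes / no.
crosstab = [[30, 10],
            [20, 40]]

stat, df = chi_square(crosstab)
print(round(stat, 2), df)
```

The larger the statistic for a given df, the less plausible it is that the two categorical variables are unrelated.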
Conclusion basis
Study design, not statistical analysis!!
Primary validities for association claims
Construct validity: How well was each variable measured?
Statistical validity: How well do the data support the conclusions?
Also, might ask about external validity: Who do the results apply to?
Construct validity questions
- Operationalization: How was it measured?
- Reliability questions: Test-retest reliability, internal reliability, inter-rater reliability.
- Measurement validity questions: Face validity/content validity, predictive/concurrent validity, convergent validity, discriminant validity.
Statistical validity questions
- How strong is the relationship?
- How precise is the estimate?
- Has it been replicated?
- Could outliers be affecting the association?
- Is there restriction of range?
- Is the association curvilinear?
Examining relationship strength
Use Cohen’s guidelines for r.
Two associations may both be statistically significant but differ in the strength of the relationship:
r = .12, p = .04 - SMALL
r = .35, p = .03 - MEDIUM
r = .67, p = .01 - LARGE
Larger effect sizes, if everything else is equal, are usually more important. But it depends on the context.
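A small sketch of the labeling rule, assuming Cohen's conventional benchmarks of about .10 (small), .30 (medium), and .50 (large). The function name `cohen_label` and the "negligible" label for values under .10 are illustrative choices, not part of the guidelines.

```python
def cohen_label(r):
    """Label the strength of a correlation using Cohen's benchmarks."""
    size = abs(r)  # strength ignores the direction of the association
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "medium"
    if size >= 0.10:
        return "small"
    return "negligible"  # illustrative label for values below .10

for r in (0.12, 0.35, 0.67, -0.67):
    print(r, cohen_label(r))
```

Note that the label depends only on r, not on p: a negative correlation of the same size is just as strong.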
Effect size
The strength of a relationship between two or more variables.
How strong is the relationship?
- All else being equal, large effect sizes are more important.
- Small effect sizes can compound over many observations.
- Effect sizes also allow you to compare associations to each other with benchmarks.
How precise is the estimate?
A correlation coefficient is a point estimate of the true correlation in the population.
We use confidence intervals and p values to communicate precision.
p: Probability of getting a correlation at least that large just by chance, assuming there is no correlation in the real world.
A larger effect size is more likely to be statistically significant (but doesn’t guarantee it).
A small sample is more easily affected by chance events.
A small correlation (r = .08) in a large sample (n = 1,000) may be statistically significant (p ≈ .01).
A large correlation (r = .50) in a small sample (n = 10) may be statistically non-significant (p ≈ .14).
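To see how sample size drives significance, the sketch below approximates a two-tailed p value from r and n using the Fisher z transformation. This is an approximation chosen because it needs only the standard library; statistical software typically uses an exact t-test, so values will differ slightly. The function name and the inputs (r = .08 with n = 1,000; r = .50 with n = 10) are illustrative.

```python
import math
from statistics import NormalDist

def approx_p_value(r, n):
    """Approximate two-tailed p for H0: true correlation = 0 (Fisher z)."""
    # atanh(r) is roughly normal with standard error 1 / sqrt(n - 3).
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))

p_small_r_big_n = approx_p_value(0.08, 1000)   # falls below .05
p_large_r_small_n = approx_p_value(0.50, 10)   # stays above .05

print(round(p_small_r_big_n, 3), round(p_large_r_small_n, 3))
```

The same r becomes easier to distinguish from zero as n grows, which is why significance alone says little about strength.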
95% CI does not contain zero
Statistically significant - the correlation is unlikely to have come from a population in which the association is zero.
95% CI that does contain zero
Not statistically significant - we can’t rule out that the true association is zero.
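A hedged sketch of how a 95% CI for r can be approximated with the Fisher z transformation, and how checking whether it contains zero matches the significance decision. The function name and the example values of r and n are assumptions for illustration.

```python
import math
from statistics import NormalDist

def r_confidence_interval(r, n, level=0.95):
    """Approximate confidence interval for a correlation (Fisher z method)."""
    z = math.atanh(r)                       # transform r to the z scale
    std_error = 1 / math.sqrt(n - 3)
    crit = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for 95%
    # Build the interval on the z scale, then transform back to r.
    return math.tanh(z - crit * std_error), math.tanh(z + crit * std_error)

low_big, high_big = r_confidence_interval(0.35, 100)
low_small, high_small = r_confidence_interval(0.35, 20)

print("n = 100:", round(low_big, 2), round(high_big, 2))      # excludes zero
print("n = 20: ", round(low_small, 2), round(high_small, 2))  # contains zero
```

The same r = .35 gives a narrow interval that excludes zero with n = 100 and a wide interval that contains zero with n = 20.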
Replication
The process of conducting a study again to test whether the result is consistent. Assists in supporting a study’s statistical validity.
Outlier
An extreme score that can have a very strong impact on correlation coefficients. Particularly problematic in small samples.
Look at scatterplot to identify outlier(s).
Restriction of range
The sample lacks the full range of scores on one of the variables. Makes the correlation appear smaller by underestimating the true correlation (decreases statistical validity). Can apply when a variable has very little variance.
Curvilinear association
The relationship between two variables is not a straight line; it might be positive up to a point and then become negative (or vice versa). Sometimes when the results suggest there is no relationship, there is in fact a curvilinear relationship.
Example: r = .01, even though the scatterplot shows a clear curved pattern.
Internal validity
This type of validity is less relevant for association claims. However, it is still worth checking so that you don’t wrongly assume there is a causal relationship.
When is the 3rd variable a problem?
In a correlational study, a third variable is a problem when it offers a plausible alternative explanation for the association between the two variables: the bivariate relationship appears in the data only because both variables are related to the third variable. Look at the pattern of data to determine if the relationship still exists within each group separately!
External Validity
How did they get their sample? Matters less than construct and statistical validity. It’s not about the size of the sample; it’s about how the sample was recruited. Moderator variables address external validity.
How important is external validity?
If a bivariate correlational study fails to use random sampling, it should not lead us to automatically reject the findings.
Many associations generalize to samples that are very different from the original sample. For example, older adults (Os) might score lower on both variables than college students (Ys), but the same association might still exist within each age group. A study of young people would then yield the same finding as a study of older people, i.e., the association generalizes to other populations.
Moderation
The relationship between two variables changes depending on the level of another variable.
Difference between moderators and third variables
When there is a moderator, the relationship between A and B is simply different at the levels of the new variable (e.g., the relationship between attendance and wins differs depending on residential mobility).
When there is a third variable problem, the relationship between A and B is only there because both A and B are related to a third variable (e.g., both well-being and substantive conversations are related to education). It’s a spurious relationship.
Directionality problem
In a correlational study, both variables are measured around the same time, so it is unclear which variable in the association came first (an issue of temporal precedence).
Spurious relationship
A bivariate association that is not present within the subgroups; it appears overall only because of a third variable (third variable problem).