Week 2- lecture notes (correlations pt2) Flashcards
Why do we need to have correlation assumptions?
- To conduct a correlation analysis, data must meet pre-specified assumptions
- If any of these are violated, Pearson’s r may give misleading information about the relationship between the 2 variables of interest
Correlation assumptions- types of variables
- Correlation describes the relationship between equal-interval numeric variables
- Therefore, variation in X and Y should reflect differences in the magnitude of each variable, not differences between categories
- e.g. you should not try to correlate a continuous variable (mean smelliness) with a categorical variable (type of cheese)
Correlation assumptions- missing data
- Is there a data point for each participant on both variables?
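A minimal sketch of the check, assuming missing scores are stored as None and that participants with a missing score on either variable are simply dropped (listwise deletion, which is one common choice and not something these notes prescribe). The scores are made up for illustration.

```python
# Hypothetical scores; None marks a missing data point for that participant.
scores_x = [4.0, 5.5, None, 7.0, 6.5]
scores_y = [2.0, None, 3.5, 4.0, 3.0]

# Keep only participants who have a score on BOTH variables.
complete_pairs = [
    (x, y)
    for x, y in zip(scores_x, scores_y)
    if x is not None and y is not None
]

# Number of participants lost because of missing data.
n_dropped = len(scores_x) - len(complete_pairs)

print(complete_pairs)
print(n_dropped)
```

Reporting how many participants were dropped makes it easier to judge whether the remaining sample is still adequate.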
Correlation assumptions- normality
- Data should be normally distributed
> If either of the distributions (X or Y) is not normal, the correlation may be distorted
> You can do a visual check of this by looking at a histogram
> Can also use a Q-Q plot: it plots the quantiles of your data against the quantiles expected from a normal distribution. If the data are normally distributed, the points should fall close to the line. As long as they fall between the striped lines around it, you can normally say that the data are normally distributed
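To make the Q-Q plot idea concrete, here is a rough sketch that computes the points such a plot would show, without drawing anything. It assumes the reference normal distribution uses the sample's own mean and SD, and uses the plotting positions (i - 0.5) / n, which is only one of several conventions. The function name qq_points and the scores are hypothetical. It does not compute the striped confidence band.

```python
from statistics import NormalDist, mean, stdev

def qq_points(sample):
    """Return (expected, observed) quantile pairs for a normal Q-Q plot."""
    ordered = sorted(sample)
    n = len(ordered)
    # Reference normal distribution with the sample's own mean and SD.
    reference = NormalDist(mean(ordered), stdev(ordered))
    # Expected value at each plotting position (i - 0.5) / n.
    expected = [reference.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(expected, ordered))

sample = [48, 52, 50, 47, 53, 49, 51, 55, 45, 50]  # hypothetical scores
for expected, observed in qq_points(sample):
    # For roughly normal data the two columns stay close to each other.
    print(f"{expected:6.2f}  {observed}")
```

If the observed values drift far from the expected ones at either end, that is the same pattern as points bending away from the line on the plot.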
Correlation assumptions- linearity
- Correlation analysis assumes that the relationship between X and Y is linear
> This doesn’t mean that all points need to fall on a straight line; rather, the general trend should be described by a straight line (e.g. a positive or negative relationship)
> However, correlation analysis cannot provide a full picture of curvilinear relationships (e.g. the relationship varies in different aspects of language)
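A small illustration of why curvilinear relationships are a problem, using made-up numbers. The helper pearson_r is a plain textbook implementation written for this sketch, not a library function. Y is completely determined by X in both cases, yet Pearson's r only picks up the straight-line one.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r: covariation of X and Y divided by their joint spread."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)

x = [-3, -2, -1, 0, 1, 2, 3]
linear_y = [2 * v + 1 for v in x]   # straight-line relationship
curved_y = [v ** 2 for v in x]      # U-shaped (curvilinear) relationship

print(pearson_r(x, linear_y))  # r of 1: perfect straight line
print(pearson_r(x, curved_y))  # r of 0 despite a perfect curved relationship
```

Missing a real relationship like the U-shaped one is how a curvilinear pattern can lead to a Type 2 error.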
Correlation assumptions- homoscedasticity
homoscedasticity- no discernible pattern
heteroscedasticity- bow tie shape
heteroscedasticity- fan shape
Correlation issues- dealing with normal distribution problems
- When data are not normally distributed, a non-parametric correlation coefficient can be used (Spearman’s rho)
- It’s based on the ranking of scores (lowest -> highest)
> Scores on X are ranked from lowest to highest, and so are scores on Y; a participant’s rank on X might differ from their rank on Y, but the two are normally pretty similar
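The ranking idea can be sketched as follows: rank each variable from lowest to highest, then compute Pearson's r on the ranks. The helper names (average_ranks, pearson_r, spearman_rho) and the data are hypothetical, and giving tied scores the average of their ranks is an assumption (it is the usual convention).

```python
from math import sqrt

def average_ranks(values):
    """Rank from lowest (1) to highest; tied scores share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next score is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions in the run
        for i in order[pos:end + 1]:
            ranks[i] = shared
        pos = end + 1
    return ranks

def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)

def spearman_rho(xs, ys):
    """Spearman's rho computed as Pearson's r on the ranked scores."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

x = [10, 20, 30, 40, 50]
y = [1, 4, 9, 16, 1000]  # always increasing, but with one extreme score

print(pearson_r(x, y))     # pulled below 1 by the extreme score
print(spearman_rho(x, y))  # the ranks agree perfectly
```

Because only the order of the scores matters, the extreme value 1000 has no more influence on rho than any other "highest" score would.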
Correlation issues- outliers
- As with non-normally distributed data, correlation may also be distorted by outliers with extreme scores (usually more than 3 SDs away from the mean)
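A quick sketch of the 3 SD rule of thumb, assuming the sample SD is used and the cut-off is applied to one variable at a time. The function name flag_outliers and the scores are made up for illustration.

```python
from statistics import mean, stdev

def flag_outliers(scores, threshold=3.0):
    """Return the scores lying more than `threshold` SDs from the mean."""
    m = mean(scores)
    sd = stdev(scores)  # sample standard deviation
    return [s for s in scores if abs(s - m) / sd > threshold]

# Hypothetical scores: 14 typical values and one extreme score.
scores = [48, 50, 52, 49, 51, 50, 47, 53, 50, 49, 51, 50, 52, 48, 120]

print(flag_outliers(scores))
```

Note that the outlier itself inflates the mean and SD, so in samples of about 10 or fewer no score can reach 3 sample SDs; a scatterplot is a useful extra check.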
Correlation issues- range restrictions
- The sample used may not represent the true variation in the two variables that is present in the population
> Also, if the range on one variable is unusually large (and there are two very distinct clusters), it is sometimes more beneficial to create a new variable
Intercorrelations
- What if you are interested in the association between more than just two variables?
o Might also be interested in how each of those 3 variables is interlinked with the others, so there are actually 6 correlation coefficients that you are really interested in
o Can calculate Pearson’s r for each of those relationships by constructing an APA-style correlation matrix
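A sketch of where the 6 coefficients come from, assuming they refer to 4 variables in total (for example the 3 variables plus one more), since k variables give k(k - 1) / 2 distinct pairs. The variable names and scores are invented, and pearson_r is a plain helper written for this example.

```python
from itertools import combinations
from math import sqrt

def pearson_r(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical scores for 6 participants on 4 variables.
data = {
    "reading": [12, 15, 11, 18, 14, 16],
    "vocabulary": [30, 34, 28, 40, 33, 37],
    "memory": [7, 6, 8, 9, 5, 8],
    "attention": [22, 25, 20, 27, 24, 23],
}

# Every distinct pair of variables is one cell below the matrix diagonal.
pairs = list(combinations(data, 2))
for a, b in pairs:
    print(f"{a} x {b}: r = {pearson_r(data[a], data[b]):.2f}")

print(len(pairs))  # number of coefficients in the matrix
```

Each printed line corresponds to one cell of the lower triangle of the correlation matrix.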
Intercorrelation- type one error
- Conducting many correlations at once increases the chance of a Type 1 error. We therefore need to apply a Bonferroni adjustment, which changes the significance level that p is compared against
> Bonferroni adjustment- the significance level is adjusted by dividing the normal significance level by the number of tests performed (e.g. .05 / 6 ≈ .008)
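The adjustment is simple arithmetic, sketched here with a conventional significance level of .05 and 6 tests. The p values are invented for illustration and bonferroni_alpha is a hypothetical helper name.

```python
def bonferroni_alpha(alpha, n_tests):
    """Adjusted significance level: the usual level divided by the number of tests."""
    return alpha / n_tests

alpha_adjusted = bonferroni_alpha(0.05, 6)

# Hypothetical p values from 6 correlations.
p_values = [0.001, 0.020, 0.004, 0.300, 0.049, 0.008]

# A correlation counts as significant only if p is below the adjusted level.
significant = [p < alpha_adjusted for p in p_values]

print(round(alpha_adjusted, 4))
print(significant)
```

Results such as p = .020 or p = .049 would pass the unadjusted .05 level but not the adjusted one.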
Which of the statements below are true of correlation assumptions? (Pearson’s)- quiz
Variables are continuous and numerical in nature
variables are at the interval level
A curvilinear relationship can be problematic for correlation analysis. This relationship increases the likelihood of…
Type 2 error
Which of the below statements is true of Spearman’s rho?
A test of correlation for non-parametric data
A test of correlation on categorical data
A researcher is interested in the relationship between type of alcohol consumed (beer, wine, spirits) and mean score on WBA’s. What type of correlation analysis would be best suited to this?
Spearman’s Rho