Week 2- lecture notes (correlations pt2) Flashcards

1
Q

Why do we need correlation assumptions?

A
  • To conduct a correlation analysis, data must meet pre-specified assumptions
  • If any of these are violated, Pearson’s r may give misleading information about the relationship between the two variables of interest
2
Q

Correlation assumptions- types of variables

A
  • Correlation describes the relationship between numeric variables measured on an equal-interval scale
  • Therefore, variation in X and Y should reflect differences in the magnitude of each variable, not differences between categories
    -e.g. you should not try to correlate a continuous variable (mean smelliness) with a categorical variable (type of cheese)
3
Q

Correlation assumptions- missing data

A
  • Is there a data point for each participant on both variables?
4
Q

Correlation assumptions- normality

A
  • Data should be normally distributed
    > If either of the distributions (X or Y) is not normal, the correlation may be distorted
    > You can do a visual check of this by looking at a histogram
    > You can also use a Q-Q plot, which plots the quantiles of your data against the quantiles of a reference distribution. If the data come from a normal distribution, the points should fall close to the line. As long as they fall between the striped lines around it, you can normally treat the data as normally distributed
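The points behind a normal Q-Q plot can be computed by hand. The sketch below is a minimal illustration in Python using only the standard library; the function name `qq_points` and the scores are invented for the example, and it does not draw the plot or the striped band.

```python
from statistics import NormalDist, mean, stdev

def qq_points(scores):
    """Return (theoretical, observed) quantile pairs for a normal Q-Q plot."""
    ordered = sorted(scores)
    n = len(ordered)
    # Reference normal distribution with the sample's own mean and SD.
    reference = NormalDist(mean(ordered), stdev(ordered))
    # Plotting positions (i - 0.5) / n keep the probabilities inside (0, 1).
    return [(reference.inv_cdf((i - 0.5) / n), observed)
            for i, observed in enumerate(ordered, start=1)]

# Hypothetical scores, invented for illustration.
scores = [12, 15, 14, 10, 18, 16, 13, 15, 14, 17]
points = qq_points(scores)
for theoretical, observed in points:
    print(f"{theoretical:6.2f}  {observed:6.2f}")
```

If the two columns rise together at roughly the same rate, the points would sit close to the diagonal line on the plot.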
5
Q

Correlation assumptions- linearity

A
  • Correlation analysis assumes that the relationship between X and Y is linear
    > This doesn’t mean that all points need to fall on the straight line; rather, the general trend should be described by a straight line (e.g. a positive or negative relationship)
  • However, correlation analysis cannot provide a full picture of curvilinear relationships (e.g. the relationship varies in different aspects of language)
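A small worked example of why a curvilinear relationship is a problem, sketched in Python with made-up data: Y is completely determined by X (Y = X squared), yet Pearson's r is zero because the trend is not a straight line. The helper `pearson_r` is written out here for illustration only.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's r computed from sums of squares and cross-products."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Invented U-shaped data: y depends perfectly on x, but not linearly.
x = list(range(-5, 6))
y = [value ** 2 for value in x]
r = pearson_r(x, y)
print(round(r, 3))
```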
6
Q

Correlation assumptions- homoscedasticity

A

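homoscedasticity- no discernible pattern (the spread of Y is similar across all values of X)
heteroscedasticity- bow tie shape (the spread of Y changes across values of X)
heteroscedasticity- fan shape (the spread of Y changes across values of X)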

7
Q

Correlation issues- dealing with normal distribution problems

A
  • When dealing with data which are not normally distributed, a non-parametric correlation coefficient can be used (Spearman’s Rho)
    -It’s based on the ranking of scores (lowest -> highest)
    > All scores are ranked from lowest to highest, separately on variable X and on variable Y. A participant might rank differently on X than on Y, but normally the two ranks are pretty similar
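The ranking idea can be sketched as follows: rank each variable from lowest to highest, then compute Pearson's r on the ranks. This is a minimal standard-library Python illustration; the helper names and data are invented, and tied scores are given their average rank, which is a common convention rather than something stated in these notes.

```python
from math import sqrt

def average_ranks(scores):
    """Rank scores from lowest (1) to highest; tied scores share their mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next score ties with the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks

def pearson_r(x, y):
    """Pearson's r computed from sums of squares and cross-products."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman's rho as Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Invented data: y always increases with x, but not in a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 200]
rho = spearman_rho(x, y)
print(rho)
```

Because only the ordering matters, the extreme final score on Y does not pull rho away from a perfect positive association.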
8
Q

Correlation issues- outliers

A
  • As with non-normal data, correlation may also be distorted by outliers with extreme scores (usually more than 3 SDs away from the mean)
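A minimal sketch of the 3 SD rule of thumb in Python, assuming outliers are screened one variable at a time; the function name `flag_outliers` and the scores are invented for illustration.

```python
from statistics import mean, stdev

def flag_outliers(scores, threshold=3.0):
    """Return the scores lying more than `threshold` SDs from the mean."""
    m, sd = mean(scores), stdev(scores)
    return [s for s in scores if abs(s - m) / sd > threshold]

# Hypothetical scores: nineteen values near 50 and one extreme value.
scores = [48, 52, 50, 49, 51, 47, 53, 50, 49, 51,
          50, 52, 48, 50, 51, 49, 50, 52, 48, 120]
outliers = flag_outliers(scores)
print(outliers)
```

Note that the extreme score itself inflates the mean and SD, so in very small samples no score can reach 3 SDs however extreme it is.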
9
Q

Correlation issues- range restrictions

A
  • The sample used may not represent the true variation in the two variables that exists in the population
    > Also… if the range on one variable is unusually large (and there are two very distinct clusters) it is sometimes more beneficial to create a new variable
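The effect of sampling only part of the range can be sketched with made-up numbers. In this Python illustration the same X-Y relationship gives a strong r over the full range of X and a much weaker r when only the middle of the range is sampled; the data and the helper `pearson_r` are invented for the example.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson's r computed from sums of squares and cross-products."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Invented "population": y follows x with a fixed +/-15 wobble as noise.
x = list(range(100))
y = [value + (15 if value % 2 == 0 else -15) for value in x]
full_r = pearson_r(x, y)

# Restricted sample: only cases with x between 40 and 59.
restricted = [(a, b) for a, b in zip(x, y) if 40 <= a <= 59]
restricted_r = pearson_r([a for a, _ in restricted], [b for _, b in restricted])

print(round(full_r, 2), round(restricted_r, 2))
```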
10
Q

Intercorrelations

A

-What if you are interested in the association between more than just two variables?
o You might also be interested in how each of those variables is interlinked with the others. With k variables there are k(k-1)/2 distinct correlation coefficients: 3 for three variables, 6 for four
o Can calculate Pearson’s r for each of those relationships by constructing an APA correlation matrix
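The number of distinct correlations among k variables is k(k-1)/2, since each pair is counted once and the diagonal of the matrix is excluded. A short Python sketch with hypothetical variable names lists the pairs that would fill the lower triangle of the matrix.

```python
from itertools import combinations

def pairwise_count(k):
    """Number of distinct pairs among k variables."""
    return k * (k - 1) // 2

# Hypothetical variable names, invented for illustration.
variables = ["anxiety", "sleep", "mood"]
pairs = list(combinations(variables, 2))

print(pairwise_count(len(variables)))
for first, second in pairs:
    print(first, "x", second)
```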

11
Q

Intercorrelation- type one error

A
  • Conducting many correlations at once increases the chance of a type 1 error. We therefore need to apply a Bonferroni adjustment, which changes the significance level that p is compared against
    > Bonferroni adjustment- the significance level is adjusted by dividing the normal significance level (e.g. .05) by the number of tests performed
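The adjustment is simple arithmetic and can be sketched in a few lines of Python. The p values and pair labels below are invented for illustration, and .05 is assumed as the normal significance level.

```python
def bonferroni_alpha(alpha, n_tests):
    """Adjusted significance level: the usual alpha divided by the number of tests."""
    return alpha / n_tests

# Hypothetical p values from three correlations.
p_values = {"A-B": 0.004, "A-C": 0.03, "B-C": 0.2}

adjusted = bonferroni_alpha(0.05, len(p_values))
significant = [pair for pair, p in p_values.items() if p < adjusted]

print(round(adjusted, 4))
print(significant)
```

Here the A-C correlation would pass the unadjusted .05 threshold but not the adjusted one.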
12
Q

Which of the statements below are true of correlation assumptions? (Pearson’s)- quiz

A

Variables are continuous and numerical in nature
Variables are at the interval level

13
Q

A curvilinear relationship can be problematic for correlation analysis. This relationship increases the likelihood of…

A

Type 2 error

14
Q

Which of the statements below is true of Spearman’s rho?

A

A test of correlation for non-parametric data

A test of correlation on ordinal (rank-ordered categorical) data

15
Q

A researcher is interested in the relationship between type of alcohol consumed (beer, wine, spirits) and mean score on WBA’s. What type of correlation analysis would be best suited to this?

A

Spearman’s Rho

16
Q

In R, which function would you use if you want to include only certain variables (columns) present in your data?

A

select()

17
Q

In R, which function would you use if you want to create a new variable (column) in your data?

A

mutate()