quiz 3 Flashcards
what is correlation
- Each individual is measured on two variables (X, Y)
- We are interested in exploring the relationship between scores on X and scores on Y
what is a bivariate scatterplot
a graph that plots each individual as one point, with X on one axis and Y on the other; it shows the raw data before any correlation has been calculated
explain 0 correlation, positive and negative
- If there is no relationship between X and Y, the correlation is 0
- If higher scores on X are associated with higher scores on Y, the correlation is positive
- If higher scores on X are associated with lower scores on Y, the correlation is negative
what is Pearson Correlation
- Statistic that allows us to express the relationship between X and Y (r)
- May take on values ONLY between -1 and +1
- If there is no correlation between X and Y, then r = 0
- If there is a positive correlation between X and Y, then r will be between 0 and +1
- If there is a negative correlation between X and Y, then r will be between -1 and 0.
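A minimal sketch of how r can be computed from paired scores, assuming the standard formula (sum of cross-products of deviations divided by the square root of the product of the two sums of squares). The function name pearson_r and the small data set are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the two means
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sum of squared deviations for each variable
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical scores for five individuals
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 60, 57, 70, 74]  # tends to rise with hours
errors_made = [9, 7, 8, 4, 3]      # tends to fall with hours

r_pos = pearson_r(hours_studied, exam_score)   # between 0 and +1
r_neg = pearson_r(hours_studied, errors_made)  # between -1 and 0
print(round(r_pos, 2), round(r_neg, 2))
```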
what are the two parts of a Pearson Correlation and what do they tell us
- The sign (+/-) tells us whether the correlation is positive or negative
- The magnitude (absolute value) tells us the strength of the relationship
what does a magnitude closer to 1 mean
the stronger the relationship.
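A small sketch of reading the two parts of r separately. The helper name describe and the two r values are hypothetical; the point is that strength is compared on the absolute value, so r = -.80 is a stronger relationship than r = +.30.

```python
def describe(r):
    """Split r into its direction (sign) and its strength (absolute value)."""
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return direction, abs(r)

a = describe(-0.80)
b = describe(0.30)

# Compare strength on magnitude only, ignoring the sign
stronger = "a" if a[1] > b[1] else "b"
print(a, b, stronger)
```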
explain Perfect Positive and Perfect Negative Correlation
- Perfect Positive Correlation: r = +1.00
- Perfect Negative Correlation: r = -1.00
- Every data point falls exactly on a straight line; this essentially never happens with real data
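As an illustration, r reaches +1.00 or -1.00 only when every point sits exactly on a straight line. The sketch below builds such artificial data on purpose; pearson_r is a hypothetical helper using the standard formula.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation for two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

x = [1, 2, 3, 4, 5]
y_up = [2 * v + 1 for v in x]      # exact rising line
y_down = [-3 * v + 10 for v in x]  # exact falling line

r_up = pearson_r(x, y_up)      # perfect positive correlation
r_down = pearson_r(x, y_down)  # perfect negative correlation
print(r_up, r_down)
```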
what is significance testing looking at
- We want to make inferences to the whole population based on a sample selected from the population.
- Sampling error will always be involved.
- We might find a positive correlation in our sample, but how do we know that the variables are actually correlated in the population?
- How likely is it that I will make an error by claiming that the two variables are correlated in the population?
what does significance testing use
p-value
what does p-value tell us
- Tells us how likely it is that sampling error alone would produce a correlation at least as strong as the one in our sample, if the two variables were NOT correlated in the population
- p = .04 means that, if there were no correlation in the population, a sample correlation this strong would turn up only 4% of the time
- Convention: p ≤ .05 is considered “statistically significant”
- Loosely: a result this strong would be unlikely (5% chance or less) if there were really no correlation, so we conclude the variables are correlated in the population
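One way to see what a p-value measures is a permutation test. This is a swapped-in illustration, not necessarily the procedure used in the course: shuffle the Y scores many times so that any real X-Y pairing is destroyed, and count how often sampling error alone gives a correlation at least as strong as the observed one. The function names, the data, and the choice of 2000 shuffles are all assumptions.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation for two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

def permutation_p_value(xs, ys, n_shuffles=2000, seed=0):
    """Share of shuffles whose |r| is at least as large as the observed |r|."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # breaks any real link between X and Y
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    return hits / n_shuffles

# Hypothetical sample of 10 individuals with a strong positive pattern
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]

p_value = permutation_p_value(x, y)
significant = p_value <= 0.05  # the conventional cutoff
print(p_value, significant)
```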
what are three important considerations for correlations
- Shape of the relationship
- Homoscedasticity
- Restriction of range
what does the shape of the relationship mean
- Pearson r applies only if the relationship between the variables is presumed to be linear.
- Linear means a straight-line pattern: Y changes at a constant rate as X changes (it does not mean that one variable causes the other)
- Curvilinear relationships cannot be described by Pearson r
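A short sketch of why the linearity assumption matters, using made-up data in which Y is completely determined by X through a curve (Y = X squared). Pearson r still comes out as 0 because there is no straight-line trend. pearson_r is a hypothetical helper using the standard formula.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation for two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

x = [-3, -2, -1, 0, 1, 2, 3]
y = [v ** 2 for v in x]  # U-shaped: Y falls, then rises

r_curve = pearson_r(x, y)
print(r_curve)  # no linear trend, although Y depends entirely on X
```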
explain homoscedasticity and heteroscedasticity
Homoscedasticity = all data points fall within a (more or less) elliptical/oval shape; the range of Y values is about the same at each value of X
Heteroscedasticity = the shape of the data points deviates from an ellipse (e.g., fan shaped); the range of Y values is NOT the same at each value of X
- Pearson's r can be misleading in this case: it may understate the relationship (or even come out near 0) although a relationship exists (the same is true for curvilinear relationships)
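The "range of Y at each value of X" idea can be checked directly. The sketch below is only an illustration with two tiny made-up data sets; the helper name y_range_by_x is hypothetical.

```python
def y_range_by_x(points):
    """Map each X value to the range (max - min) of the Y values seen there."""
    groups = {}
    for x, y in points:
        groups.setdefault(x, []).append(y)
    return {x: max(ys) - min(ys) for x, ys in sorted(groups.items())}

# Same spread of Y at every X (oval-like cloud)
homo = [(1, 4), (1, 6), (2, 5), (2, 7), (3, 6), (3, 8)]
# Spread of Y grows as X grows (fan shape)
hetero = [(1, 5), (1, 6), (2, 4), (2, 8), (3, 2), (3, 11)]

homo_ranges = y_range_by_x(homo)      # {1: 2, 2: 2, 3: 2}
hetero_ranges = y_range_by_x(hetero)  # {1: 1, 2: 4, 3: 9}
print(homo_ranges, hetero_ranges)
```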
visual difference between homo and heteroscedasticity
see ppt slides
explain restricted range
- Common reason why population correlation coefficients can be underestimated by sample r’s
- If your sample covers only a restricted range of scores, you may conclude there is no relationship when there actually might be one
- The effect of a restricted range is to reduce the magnitude of the calculated r.
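A rough simulation of restriction of range, assuming simulated (not real) data: X is spread over 0 to 100 and Y equals X plus random noise. Computing r on only the middle slice of X (40 to 60, an arbitrary choice) should give a noticeably smaller magnitude than r on the full sample. All names and parameter values here are assumptions.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation for two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Simulated "population-like" sample covering the full range of X
x_all = [rng.uniform(0, 100) for _ in range(2000)]
y_all = [x + rng.gauss(0, 20) for x in x_all]

# Restricted sample: keep only individuals with X between 40 and 60
restricted = [(x, y) for x, y in zip(x_all, y_all) if 40 <= x <= 60]
x_res = [x for x, _ in restricted]
y_res = [y for _, y in restricted]

r_full = pearson_r(x_all, y_all)
r_restricted = pearson_r(x_res, y_res)
print(round(r_full, 2), round(r_restricted, 2))
```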
what two variables does Pearson r measure
variable 1: interval or ratio
variable 2: interval or ratio
what two variables does Spearman rho measure
variable 1: ordinal (ranks)
variable 2: ordinal (ranks)
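A sketch of Spearman rho, assuming the common definition of rho as Pearson r computed on the ranks of the scores (tied scores share the average rank). The helper names ranks and spearman_rho and the data are made up. The example uses scores that always rise together but not in a straight line, so rho is 1 while Pearson r on the raw scores is lower.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation for two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

def ranks(values):
    """Rank values from 1 (smallest); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i..j
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman_rho(xs, ys):
    """Spearman rho as Pearson r on the ranked scores."""
    return pearson_r(ranks(xs), ranks(ys))

x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # always increasing, but curved

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, round(r, 2))
```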