Lecture 8 Flashcards
What is correlation?
A measure of the strength of the relationship or association between two variables.
How do we know if a statistical association between X and Y exists?
Variability in one variable leads to (affects, causes, or overlaps with) variability in the second variable.
Therefore, one variable can be used to explain SOME portion of the variability of the other variable.
In other words, having information on one variable decreases the unexplained variability of the other variable.
What is a less efficient way of predicting scores?
The predicted score for the ith person is the mean.
There is a lot of uncertainty with this estimation.
If a relationship between X and Y exists…
we can use the information about X to decrease the uncertainty in our prediction, Ŷi.
How do we use the information about X to decrease the uncertainty in our prediction, Ŷi?
First, we group the Y scores according to the X values.
Next, we could use the mean of the raw scores (Y) in each treatment group (X) to establish Ŷia, the predicted score for person i in group a.
i.e. Use Ȳa (the mean of group a) as the estimate of the raw score of a person in that group.
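A minimal Python sketch of this idea, using made-up scores and group labels (not data from the lecture): predicting every score with the grand mean leaves more squared error than predicting each score with its own group mean.

```python
from statistics import mean

# Hypothetical data: Y scores grouped by treatment level X (illustrative only).
groups = {
    "a1": [2, 4, 6],
    "a2": [10, 12, 14],
    "a3": [18, 20, 22],
}

all_scores = [y for ys in groups.values() for y in ys]
grand_mean = mean(all_scores)


def sum_sq_error(scores, prediction):
    """Sum of squared errors when every score gets the same prediction."""
    return sum((y - prediction) ** 2 for y in scores)


# Ignoring X: everyone is predicted to score the grand mean.
error_without_x = sum_sq_error(all_scores, grand_mean)

# Using X: each person is predicted to score the mean of their own group.
error_with_x = sum(sum_sq_error(ys, mean(ys)) for ys in groups.values())

print(error_without_x, error_with_x)  # 408 24
```

With these invented numbers, knowing the group cuts the squared prediction error from 408 to 24.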
Example
If the average range (variability) in Y given X equals 6, how many points of Y’s variability are NOT attributable to X? Why?
6!
i.e. it’s residual variability
- if 6 of the 16 points of variability in Y are not due to X, then 10 points of the variability in Y MUST be due to X.
i.e. 16-6=10
i.e. variability in Y that is attributable to X = total variability in Y − variability in Y not attributable to X.
Thus, when trying to predict Yi (an individual raw score), if we use our knowledge of X we reduce variability by 10 points.
i.e. we reduce our uncertainty
Give the correlation ratio
Relate this to the example
η² (this is the greek letter eta) = variability in Y common to X / total variability in Y
The numerator is the between-groups variability in Y (sum of squares between); “common to” means “attributable to”.
In our example:
10/16 = .63. Had we used variance instead of range as our measure of variability, η² = .74
Actually, η² = SSbetween/SStotal
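A rough Python sketch of η² = SSbetween/SStotal. The scores are invented for illustration and are not the lecture's example; `eta_squared` is a hypothetical helper name.

```python
from statistics import mean


def eta_squared(groups):
    """Correlation ratio SS_between / SS_total for {level of X: list of Y scores}."""
    all_scores = [y for ys in groups.values() for y in ys]
    grand_mean = mean(all_scores)
    # Total variability in Y around the grand mean.
    ss_total = sum((y - grand_mean) ** 2 for y in all_scores)
    # Variability of the group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(ys) * (mean(ys) - grand_mean) ** 2 for ys in groups.values()
    )
    return ss_between / ss_total


# Hypothetical data (illustrative only).
groups = {"a1": [2, 4, 6], "a2": [10, 12, 14], "a3": [18, 20, 22]}
eta_sq = eta_squared(groups)
print(round(eta_sq, 3))  # 0.941
```

Here SSbetween = 384 and SStotal = 408, so about 94% of the variability in these made-up Y scores is attributable to group membership.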
Describe!
- Is a measure of the strength of the relationship between X & Y.
- Often used after F tests to determine practical significance –> i.e. is a measure of effect size
Give the limitations of η² as a measure of effect size:
- Reliability of variables restricts the magnitude of η²;
- The more homogeneous the population, the smaller η² –> restriction of range;
- The magnitude of η² is affected by the number of levels of X;
- Does not indicate the form of the relationship;
- It is UNSTABLE –> it varies a lot from sample to sample –> therefore descriptive stat only, not inferential.
Due to the limitations of η², what do we focus on?
The Pearson Product-Moment Correlation Coefficient (r).
What is the Pearson Product-Moment Correlation Coefficient (r)?
Measures the degree and direction of a linear relationship or association between two variables.
Give the conceptual formula for r
r = degree to which X & Y vary together / degree to which X & Y vary separately
i.e.
r = covariability of X & Y / variability of X & Y separately
What does it mean if r (correlation) is big?
It means that most of the variability of X & Y is due to how X & Y co-vary.
- If we have a perfect linear relationship, every change in X is accompanied by a corresponding change in Y (and vice versa)
- If r = 0 (i.e. no relationship), a change in X does not correspond to a predictable change in Y.
i.e. they don’t co-vary
Again: rxy = COVxy / (σ̂x σ̂y)
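A small Python sketch of this formula, assuming the n − 1 estimates of the covariance and standard deviations. The data and the `pearson_r` helper are made up for illustration.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """r = COVxy / (sd_x * sd_y), using n - 1 in every denominator."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of cross-products of the deviation scores (SP).
    sp = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    cov_xy = sp / (n - 1)
    sd_x = sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    sd_y = sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    return cov_xy / (sd_x * sd_y)


xs = [1, 2, 3, 4, 5]
perfect = pearson_r(xs, [3, 5, 7, 9, 11])   # Y = 2X + 1, a perfect linear relationship
unrelated = pearson_r(xs, [2, 1, 3, 1, 2])  # cross-products cancel, so SP = 0
print(round(perfect, 6), round(abs(unrelated), 6))
```

In the first case every change in X comes with a proportional change in Y, so r comes out as 1 (up to floating-point rounding). In the second the positive and negative cross-products cancel and r is 0.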
Close-up of the numerator of r, COVxy (also written σ̂xy):
Give the equation in words.
Sum of the cross-products of the deviation scores (SP, the sum of products of deviations), divided by n − 1.
In other words, COVxy = the average cross-product of the deviation scores of the 2 variables (averaging over n − 1).
Give the conceptual formula for Sum of products
SP = Σ(Xi − X̄)(Yi − Ȳ) = ΣXY − (ΣX)(ΣY)/N
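A quick Python check, on made-up numbers, that the deviation-score form and the raw-score form of SP agree, and that dividing SP by n − 1 gives the covariance. The function names are hypothetical.

```python
def sp_definitional(xs, ys):
    """SP as the sum of cross-products of deviation scores."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))


def sp_computational(xs, ys):
    """SP from raw scores: sum(XY) - sum(X) * sum(Y) / N."""
    n = len(xs)
    return sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n


# Hypothetical scores (illustrative only).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

sp_def = sp_definitional(xs, ys)
sp_comp = sp_computational(xs, ys)
cov_xy = sp_def / (len(xs) - 1)
print(sp_def, sp_comp, cov_xy)  # 6.0 6.0 1.5
```

Here ΣXY = 66 and (ΣX)(ΣY)/N = 15 × 20 / 5 = 60, so both forms give SP = 6 and COVxy = 6/4 = 1.5.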