Correlation and Partial Correlation Flashcards
What does Bivariate Linear Correlation examine?
Examines the relationship between two variables
Examines the relationship between two variables
Bivariate Linear Correlation
Relationships between two variables may vary in…?
List 3 points
- Form
- Direction
- Magnitude/strength
How can relationships between two variables vary in form?
List 2 ways
- Linear
- Curvilinear
How can relationships between two variables vary in direction?
List 2 ways
- Positive
- Negative
How can relationships between two variables vary in magnitude/strength?
List 3 ways
- r = -1 (perfect negative relationship)
- r = +1 (perfect positive relationship)
- r = 0 (no relationship)
What is considered a perfect correlation?
+/- 1
What is considered a strong correlation?
+/- 0.9, 0.8, 0.7
What is considered a moderate correlation?
+/- 0.6, 0.5, 0.4
What is considered a weak correlation?
+/- 0.3, 0.2, 0.1
What is considered zero correlation?
0
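The strength bands in the cards above can be sketched as a small lookup. The cards only list values at one decimal place, so the cut-offs used for in-between values (for example .65) are an assumption, as is the function name `describe_strength`.

```python
def describe_strength(r):
    """Label a correlation coefficient using the bands from the cards."""
    size = abs(r)  # the sign gives direction, not strength
    if size > 1:
        raise ValueError("r must lie between -1 and +1")
    if size == 1:
        return "perfect"
    if size >= 0.7:
        return "strong"
    if size >= 0.4:
        return "moderate"
    if size > 0:
        # Assumption: any non-zero value below .4 is treated as weak.
        return "weak"
    return "zero"


for value in (-1, 0.8, -0.5, 0.2, 0):
    print(value, describe_strength(value))
```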
Linear correlation involves measuring …?
The relationship between two variables measured in a sample
Measuring the relationship between two variables measured in a sample
This is known as…?
Linear correlation
We use ______ to estimate the population parameters
Sample statistics
We use sample statistics to estimate the _______
Population parameters
True or False?
In hypothesis testing for correlation, we always start by assuming the null hypothesis is false
False
We always start by assuming the null hypothesis is true: there is no relationship between the population variables
Once we’ve determined the relationship in our sample, inferential analyses allow us to determine _________ when the null hypothesis is true
The probability of measuring a relationship of that magnitude
What is the chance of measuring a relationship of that magnitude when the null hypothesis is true?
How do we measure this?
p-value
The probability of measuring a relationship of that magnitude when the null hypothesis is true
This is known as…?
p-value
What is a p-value?
The probability of measuring a relationship of that magnitude when the null hypothesis is true
If the probability of measuring a relationship of the obtained magnitude is less than our threshold (0.05), we are prepared to …?
a. reject the null hypothesis
b. fail to reject the null hypothesis
a. reject the null hypothesis
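One way to see what the p-value means is a permutation sketch: shuffling y breaks any link with x, which mimics the null hypothesis, and the p-value is then the proportion of shuffles that give a relationship at least as large as the one observed. This is a swapped-in illustration rather than the usual t-based test, and the data, function names and number of shuffles are all made up for the example.

```python
import random


def pearson_r(x, y):
    """Pearson's r from sums of squares and cross-products."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(x, y, n_shuffles=5000, seed=0):
    """Two-tailed p-value: share of shuffles with |r| >= observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # pairing is now random, as under H0
        if abs(pearson_r(x, shuffled)) >= observed:
            as_extreme += 1
    return as_extreme / n_shuffles


# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]
p = permutation_p_value(x, y)
print("reject H0" if p < 0.05 else "fail to reject H0")
```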
What are the 5 parametric assumptions of correlations?
- Both variables should be continuous (level of measurement)
- Related pairs: Each participant (or observation) should have a pair of values
- Absence of outliers
- Linearity – points in the scatterplot should be best explained with a straight line
- No restriction of range (r is sensitive to range restrictions)
If one or both variables are ordinal, do we use parametric (Pearson’s r) or non-parametric (Spearman’s rho) correlation?
We use a non-parametric alternative (Spearman’s rho)
What do outliers do?
They skew the results of the correlation
True or False?
Points in the scatterplot should be best explained with a curve
False
Points in the scatterplot should be best explained with a straight line
What is the non-parametric equivalent to Pearson’s r correlation?
Spearman’s rho
Spearman’s rho is the non-parametric equivalent to…?
Pearson’s r correlation
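Spearman's rho can be sketched as Pearson's r calculated on ranks instead of raw scores. The helper names and the example data below are assumptions for illustration; tied scores are given the average of their ranks.

```python
def average_ranks(values):
    """Rank values from 1 upwards, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover every value tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Pearson's r applied to the ranks of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# A curved but always-increasing relationship: the ranks agree perfectly.
print(spearman_rho([1, 2, 3, 4, 5], [1, 4, 9, 16, 25]))
```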
What is the non-parametric equivalent to Pearson’s r correlation when there are fewer than 20 cases?
Kendall’s Tau
Kendall’s Tau is the non-parametric equivalent to…?
Pearson’s r correlation when there are fewer than 20 cases
Pearson’s correlation coefficient investigates the relationship between…?
2 quantitative, continuous variables
The resulting correlation coefficient (r) is a measure of …?
The strength of association between the two variables
The strength of association between the two variables is measured by
Pearson’s correlation coefficient (r)
How do we calculate covariance?
List 4 points
- For each datapoint, calculate the difference from the mean of x, and the difference from the mean of y
- Multiply the differences
- Sum the multiplied differences
- Divide by N – 1
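The four steps above can be sketched in a few lines. The function name and the example scores are hypothetical.

```python
def covariance(x, y):
    """Sample covariance, following the four steps on the card."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Step 1: difference of each datapoint from the mean of x and of y.
    diffs = [(a - mean_x, b - mean_y) for a, b in zip(x, y)]
    # Step 2: multiply the differences.
    products = [dx * dy for dx, dy in diffs]
    # Step 3: sum the multiplied differences.
    total = sum(products)
    # Step 4: divide by N - 1.
    return total / (n - 1)


# Hypothetical scores: products are 8, 2, 0, 2, 8, so the sum is 20.
print(covariance([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))  # 20 / 4 = 5.0
```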
What provides a measure of the variance shared between our x and y variables?
Covariance
Covariance provides a measure of the variance shared between …?
x and y variables
What is a ratio of covariance (shared variance) to separate variances?
Correlation coefficient (r)
What is the correlation coefficient (r)?
A ratio of covariance (shared variance) to separate variances
How can we obtain a measure of separate variances?
Multiplying the standard deviation for x and y
What is the formula for correlation coefficient (r)?
r = covariance(x, y) / (SD of x × SD of y)
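As a quick check of this ratio, r can be computed by dividing the covariance by the product of the two standard deviations. This is a minimal sketch with made-up scores; it uses `statistics.stdev` from the Python standard library, which divides by N - 1 like the covariance steps.

```python
import statistics


def covariance(x, y):
    """Sample covariance: summed cross-products divided by N - 1."""
    mean_x = statistics.mean(x)
    mean_y = statistics.mean(y)
    total = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return total / (len(x) - 1)


def correlation(x, y):
    """r = covariance(x, y) / (SD of x * SD of y)."""
    separate_variances = statistics.stdev(x) * statistics.stdev(y)
    return covariance(x, y) / separate_variances


# Hypothetical scores: covariance = 2.0, SD of x * SD of y = 2.5.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
print(round(correlation(x, y), 3))  # 0.8
```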
If the covariance is large relative to the separate variances, r will be
a. Exactly 0
b. Closer to 0
c. Further from 0
d. Negative
c. Further from 0
If the covariance is small relative to the separate variances, r will be
a. Exactly 0
b. Closer to 0
c. Further from 0
d. Negative
b. Closer to 0
When r is further from 0, this means that
a. The covariance is large relative to the separate variances
b. The covariance is moderate relative to the separate variances
c. The covariance is small relative to the separate variances
d. The covariance is negative relative to the separate variances
a. The covariance is large relative to the separate variances
When r is closer to 0, this means that
a. The covariance is large relative to the separate variances
b. The covariance is moderate relative to the separate variances
c. The covariance is small relative to the separate variances
d. The covariance is negative relative to the separate variances
c. The covariance is small relative to the separate variances
r reflects how well a straight line fits the data points
What does this mean?
r reflects the strength of the correlation
If datapoints cluster closely around the line, r will be …?
a. Further from 0
b. Exactly 0
c. Exactly 1
d. Closer to 0
a. Further from 0
If datapoints are scattered some distance from the line, r will be…?
a. Further from 0
b. Exactly 0
c. Exactly 1
d. Closer to 0
d. Closer to 0
What is the df for r?
df = N - 2
Why is the df for r = N - 2?
Because we estimate two population parameters (mean of x and mean of y, in order to calculate differences from those values)
Should we report N or df when reporting r?
df
How do we report r?
r(df) = r value, p = p-value
True or False?
If we were to calculate the correlation using data from another sample from the same population, the r-value we would obtain is likely to be the same
False
If we were to calculate the correlation using data from another sample from the same population, the r-value we would obtain is likely to be different
If we were to calculate the correlation using data from another sample from the same population, the r-value we would obtain is likely to be different
What does this difference reflect?
Sampling error
Imagine if we obtained r for all possible samples drawn from the population of interest
What would the mean of the resulting distribution look like?
The mean of the resulting distribution would be equivalent to the true population correlation coefficient
The null hypothesis (H0) states that there is no relationship between the population variables (i.e. r = 0)
So, under the null hypothesis, the sampling distribution of correlation coefficients will have a mean of …?
Zero
The r-distribution has a mean of …?
Zero
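A small simulation can illustrate this. The sketch below repeatedly draws two unrelated variables (so the null hypothesis is true by construction), computes r for each sample, and averages the results. The sample size, number of samples and seed are arbitrary choices for illustration.

```python
import random


def pearson_r(x, y):
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def null_sampling_distribution(n_samples=2000, sample_size=20, seed=1):
    """r-values from many samples in which x and y are truly unrelated."""
    rng = random.Random(seed)
    r_values = []
    for _ in range(n_samples):
        x = [rng.gauss(0, 1) for _ in range(sample_size)]
        y = [rng.gauss(0, 1) for _ in range(sample_size)]
        r_values.append(pearson_r(x, y))
    return r_values


r_values = null_sampling_distribution()
mean_r = sum(r_values) / len(r_values)
# Individual r-values vary through sampling error, but their mean sits near 0.
print(round(mean_r, 2), round(min(r_values), 2), round(max(r_values), 2))
```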
The extent to which an individual sampled correlation coefficient (r) deviates from 0 can be expressed in…?
Standard error units
Standard error units are used to express…?
The extent to which an individual sampled correlation coefficient (r) deviates from 0
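The cards do not give a formula for this, but the standard conversion divides r by its standard error, sqrt((1 - r^2) / (N - 2)), which gives a t-value with N - 2 degrees of freedom. A minimal sketch, with a hypothetical function name and example values:

```python
import math


def r_in_standard_error_units(r, n):
    """Express r as a t-value: its distance from 0 in standard error units."""
    df = n - 2
    standard_error = math.sqrt((1 - r ** 2) / df)
    return r / standard_error


# Hypothetical example: r = .5 measured in a sample of 27 (df = 25).
print(round(r_in_standard_error_units(0.5, 27), 3))  # 2.887
```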
True or False?
Our obtained r-value is just a point estimate of the underlying population r-value
True
True or False?
Our obtained r-value is not subject to sampling error
False
Our obtained r-value is subject to sampling error
What is the formula for shared variance?
Shared variance = r^2
Expresses the proportion of the separate variances that are shared
This is known as…?
Shared variance = r^2
Calculate the shared variance between the two variables when:
r = .8
r = .8
r^2 = .64
Variables share 64% of variance
True or False?
r = .8 is twice as strong as r = .4
False
In terms of shared variance, r = .8 is 4 times as strong as r = .4 (r^2 = .64 versus r^2 = .16)
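The arithmetic behind these two cards can be checked directly. A minimal sketch (the function name is hypothetical):

```python
def shared_variance(r):
    """Proportion of variance shared by the two variables."""
    return r ** 2


strong = shared_variance(0.8)    # about .64, i.e. 64% shared variance
moderate = shared_variance(0.4)  # about .16, i.e. 16% shared variance
# Doubling r from .4 to .8 quadruples the shared variance.
print(round(strong, 2), round(moderate, 2), round(strong / moderate, 2))
```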
Expresses how much variance in our DV could be explained by our manipulation of the IV
This is known as…?
Partial eta squared (partial η^2)
Tells us how much of the variance in y can be ‘explained by’ x
This is known as…?
r^2
Allows us to examine the relationship between two variables, while removing the influence of a third variable
This is known as…?
Partial correlation
What is partial correlation?
Allows us to examine the relationship between two variables, while removing the influence of a third variable
In our example, looking at the relationship between IQ and grade, we might want to remove the influence of test motivation
i.e. we want to ‘control for’ the effect of motivation
Would we use correlation or partial correlation for this?
Partial correlation
Zero-order correlation between Grade and IQ (without partialling out motivation)
r = .522, p = .007
Correlation between Grade and IQ, with motivation partialled out
r = .353, p = .091
What does this result suggest?
The relationship measured between IQ and grade may be explained by the influence of motivation on both IQ and grade
If the correlation between Grade and IQ, with motivation partialled out, had decreased but remained significant, what does this suggest?
The relationship was partially explained by motivation
If the correlation between Grade and IQ, with motivation partialled out, had not decreased, what does this suggest?
The relationship was not influenced by motivation
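The cards do not show how a partial correlation is calculated. The standard first-order formula uses the three zero-order correlations: r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)). The sketch below applies it to made-up values; they are not the Grade, IQ and Motivation figures from the example above, which does not report all three zero-order correlations.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y with the influence of z removed."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator


# Hypothetical zero-order correlations, invented for illustration.
# z is related to both x and y, so controlling for it shrinks r_xy.
print(round(partial_correlation(0.5, 0.6, 0.7), 3))  # 0.14

# If z is unrelated to both x and y, the correlation does not change.
print(partial_correlation(0.5, 0.0, 0.0))  # 0.5
```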
How many d.p. do we report r in?
3 d.p.
Do we include a 0 in front of the decimal point when reporting r?
No
How many d.p. do we report p in?
3 d.p.
Do we include a 0 in front of the decimal point when reporting p?
No
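The reporting rules in these cards (report df, use 3 d.p., no 0 in front of the decimal point) can be sketched as a small formatter. The function names are hypothetical, and the df of 23 in the example is invented because the cards do not state the sample size.

```python
def drop_leading_zero(value):
    """Format to 3 d.p. without the 0 in front of the decimal point."""
    text = f"{value:.3f}"
    if text.startswith("0."):
        return text[1:]
    if text.startswith("-0."):
        return "-" + text[2:]
    return text


def report_correlation(r, df, p):
    """Build a report string in the form r(df) = r value, p = p-value."""
    return f"r({df}) = {drop_leading_zero(r)}, p = {drop_leading_zero(p)}"


# df = 23 is a made-up value for illustration.
print(report_correlation(0.522, 23, 0.007))  # r(23) = .522, p = .007
```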