Correlation and Linear Regression Flashcards
Which correlation and regression methods were discussed in the lecture?
- Pearson’s correlation/Spearman rank correlation
- Linear regression
When do you use Pearson’s correlation and Spearman rank correlation?
- Pearson = normally distributed variables
- Spearman = non-normally distributed variables
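A minimal Python sketch of the difference between the two coefficients, using only the standard library. The helper names (`pearson_r`, `ranks`, `spearman_rho`) and the data are made up for illustration and are not from the lecture. Spearman is computed here as Pearson's r applied to the ranks of the data, so a relationship that always rises but is not a straight line gets a Spearman value of 1 and a Pearson value below 1.

```python
import math

def pearson_r(x, y):
    """Pearson's product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Hypothetical data: y always rises with x, but not in a straight line.
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]

print("Pearson r:   ", round(pearson_r(x, y), 3))
print("Spearman rho:", round(spearman_rho(x, y), 3))
```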
What does correlation quantify?
Correlation quantifies the strength and direction of the association between two numerical variables.
Should correlations be interpreted with caution?
Yes, correlations should be interpreted with great care because they do not necessarily indicate causation.
What are some possible reasons for correlations between variables?
Correlations between variables can result from:
- a causal relationship
- shared dependency on some third unmeasured variable
- coincidence
Why are correlated time series unreliable indicators of causal relationships?
Correlated time series are unreliable indicators of causal relationships because, over time, a variable can only follow four possible trajectories (steady state, increase, decrease, or fluctuation), so many unrelated series are bound to match by coincidence.
What is Pearson’s product-moment correlation (r)?
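Pearson’s product-moment correlation (r) is a statistic that measures the strength and direction of the linear association between two continuous numerical variables. It ranges from -1 (perfect negative correlation) through 0 (no linear correlation) to +1 (perfect positive correlation).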
What rules must be followed when using Pearson’s product-moment correlation?
- the first action should be to draw a scatterplot
- both variables must be continuous & normally distributed (check for normality)
- if these assumptions are not met, a Spearman’s rank-order correlation (non-parametric correlation) should be used
What is H0 in Pearson’s product-moment correlation (r)?
H0 in Pearson’s product-moment correlation (r) states that the two variables are not correlated.
What is H1 in Pearson’s product-moment correlation (r)?
H1 in Pearson’s product-moment correlation (r) states that the two variables are correlated.
What are the two ways to calculate Pearson’s product-moment correlation (r) in Excel?
The two ways to calculate Pearson’s product-moment correlation (r) in Excel are:
- through the Analysis Toolpak (“Correlation”)
- by using the worksheet function (“=CORREL”).
What statement is included if a Pearson correlation test is carried out?
Significant result (H0 rejected): “There was a significant correlation between ‘variable 1’ and ‘variable 2’ (r = ___ , df = __, p < 0.05).”
Non-significant result (H0 not rejected): “There was no significant correlation between ‘variable 1’ and ‘variable 2’ (r = ___ , df = __, p > 0.05).”
NOTE: do not write “reject H0” or “accept H0” in the statement itself.
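A rough Python sketch of where the numbers in the statement come from. It assumes the usual test for Pearson's r, in which df = n - 2 and the test statistic is t = r * sqrt(df / (1 - r^2)). The function name `pearson_summary` and the data are hypothetical. The p-value itself is not computed here; it would be looked up from the t distribution with df degrees of freedom, by software or from a table.

```python
import math

def pearson_summary(x, y):
    """Return (r, df, t) for two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    df = n - 2                            # degrees of freedom for the test
    t = r * math.sqrt(df / (1 - r ** 2))  # undefined if r is exactly +1 or -1
    return r, df, t

# Hypothetical measurements, for illustration only.
height_cm = [150, 160, 165, 170, 180, 185]
weight_kg = [52, 58, 63, 61, 75, 80]

r, df, t = pearson_summary(height_cm, weight_kg)
print(f"r = {r:.2f}, df = {df}, t = {t:.2f}")
```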
What are two ways to carry out a Pearson correlation in RStudio?
- cor() (returns the correlation coefficient only)
- cor.test() (also returns the test statistic, df and p-value)
(slide 7)
What does the correlation coefficient tell us when it is expressed as r^2?
r^2 is the proportion of the variance in one variable that can be explained by the other variable. Multiplied by 100, it gives a percentage (for example, r = 0.5 gives r^2 = 0.25, or 25%).
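One way to see why r^2 reads as "variance explained" is to fit a least-squares straight line and compare the two quantities. The Python sketch below does this with made-up data; the function name `r_squared_two_ways` is hypothetical. For a simple straight-line fit, the squared correlation and the share of variation accounted for by the line should agree up to rounding error.

```python
import math

def r_squared_two_ways(x, y):
    """Return (r squared, share of variation in y explained by a fitted line)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)  # total variation in y

    r = sxy / math.sqrt(sxx * syy)

    # Least-squares straight line: y = intercept + slope * x
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variation in y left over after fitting the line.
    ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    explained = 1 - ss_residual / syy

    return r ** 2, explained

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r2, explained = r_squared_two_ways(x, y)
print(f"r^2 = {r2:.4f}, explained by fitted line = {explained:.4f}")
```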
What caution should be taken when interpreting a significant correlation coefficient with a big sample size?
With a big sample size, even a weak correlation can be highly significant while explaining only a very small percentage of the variation (a low r^2). It is therefore important to evaluate the practical significance of the relationship between the variables, not just the p-value.
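A small Python illustration of this caution, assuming the usual t statistic for Pearson's r, t = r * sqrt(df / (1 - r^2)) with df = n - 2. The function name `t_statistic` and the values r = 0.1, n = 20 and n = 1000 are chosen only for illustration. The same weak correlation produces a much larger t as n grows, and a |t| above roughly 2 is significant at the 5% level for large samples, yet r^2 stays at 1%.

```python
import math

def t_statistic(r, n):
    """t statistic for testing H0 'no correlation', given r and sample size n."""
    df = n - 2
    return r * math.sqrt(df / (1 - r ** 2))

r = 0.1  # a weak correlation (hypothetical value)
for n in (20, 1000):
    t = t_statistic(r, n)
    percent_explained = 100 * r ** 2
    print(f"n = {n}: t = {t:.2f}, variation explained = {percent_explained:.0f}%")
```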