Correlation, Regression and Hypothesis Testing Flashcards
What is the Product Moment Correlation Coefficient (PMCC) and what is its symbol?
A measure of the strength of a linear correlation and its type (positive / negative).
Its symbol is r
What value of r would describe a perfect negative correlation?
-1
What value of r would describe a perfect positive correlation?
1
Between what range would a correlation generally be considered ‘weak’?
Between -0.2 and 0.2
Between what range would a correlation generally be considered ‘strong’?
Between 0.75 and 1 (strong positive) or between -1 and -0.75 (strong negative)
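To see where r comes from, here is a minimal Python sketch of the PMCC calculation. The function name pmcc and the data are made up for illustration; it assumes paired lists of equal length in which neither variable is constant.

```python
from math import sqrt

def pmcc(xs, ys):
    """Product moment correlation coefficient r for paired data (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Made-up data: points lying exactly on a rising line, then on a falling line
xs = [1, 2, 3, 4, 5]
r_pos = pmcc(xs, [2, 4, 6, 8, 10])
r_neg = pmcc(xs, [10, 8, 6, 4, 2])
```

Here r_pos comes out as a perfect positive correlation and r_neg as a perfect negative one, up to floating-point rounding.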
How do you use your calculator to find the regression line and PMCC (r)?
- Turn frequency off (shift setup statistics)
- Select statistics mode (6)
- Select y = a + bx
- Enter data
- Press OPTN and select Regression Calc
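The calculator returns a and b for the line y = a + bx. As a rough check away from the calculator, the same least-squares values can be sketched in Python. The function name regression_line and the data are assumptions for illustration, not part of the calculator procedure.

```python
def regression_line(xs, ys):
    """Least-squares values (a, b) for the line y = a + bx (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b = s_xy / s_xx            # gradient
    a = mean_y - b * mean_x    # intercept: the line passes through the mean point
    return a, b

# Made-up data lying exactly on y = 1 + 2x
a, b = regression_line([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])
```

For this data the sketch recovers a = 1 and b = 2, up to floating-point rounding.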
What is interpolation?
Using the line of best fit to make predictions for data that lies within the range of observed data.
What is extrapolation?
Using the line of best fit to make predictions for data that lies outside the range of observed data (extending the line of best fit beyond the data).
Why is extrapolation often not useful for predicting values?
It relies on the assumption that the trend continues outside the observed range, which may not hold true in all scenarios.
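The difference between the two can be sketched in Python by checking whether the x value used for a prediction lies inside the observed range. The function name predict and the numbers are hypothetical; the line is assumed to be in the form y = a + bx.

```python
def predict(a, b, x, observed_xs):
    """Predict y from y = a + bx and label the prediction (illustrative helper)."""
    y_hat = a + b * x
    if min(observed_xs) <= x <= max(observed_xs):
        kind = "interpolation"   # x is within the observed data, usually reliable
    else:
        kind = "extrapolation"   # x is beyond the data, so the trend is only assumed
    return y_hat, kind

observed = [1, 2, 3, 4, 5]            # made-up observed x values
inside = predict(1, 2, 3.5, observed)
outside = predict(1, 2, 10, observed)
```

The prediction at x = 3.5 is labelled interpolation and the one at x = 10 extrapolation, so the second should be treated with more caution.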
What is a causal correlation?
When a change in one variable directly causes a change in the other. Two variables can be correlated without one causing the other: for example, rates of diabetes and annual income are correlated in some groups, but this is because both relate to dietary intake.
What are correlations without a causal connection known as?
Spurious correlations
What is a regression line?
A line of best fit for a correlation
(in the form y = mx + c)
What is the constant term c in a regression line?
The predicted value of y when x is zero (the y-intercept).
What is the gradient term m in a regression line?
The predicted change in y for every 1-unit increase in x: an increase of m units if m is positive, a decrease if m is negative.
How do you create a straight line from a relationship in the form y = ax^n?
- log(y) = log(ax^n)
- log(y) = log(a) + log(x^n)
- log(y) = log(a) + n log(x)
Compare to Y = mX + c
Y = log(y), X = log(x), m = n, c = log(a)
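The log steps above can be sketched in Python: take logs of both variables, fit a straight line, then read n from the gradient and a from the intercept. The function name fit_power_law and the data are made up for illustration, and the sketch assumes all x and y values are positive so the logs exist.

```python
from math import log10

def fit_power_law(xs, ys):
    """Estimate a and n in y = a * x**n by fitting a line to (log x, log y)."""
    X = [log10(x) for x in xs]
    Y = [log10(y) for y in ys]
    count = len(X)
    mean_X = sum(X) / count
    mean_Y = sum(Y) / count
    s_XY = sum((p - mean_X) * (q - mean_Y) for p, q in zip(X, Y))
    s_XX = sum((p - mean_X) ** 2 for p in X)
    n = s_XY / s_XX                  # gradient of the log-log line is n
    c = mean_Y - n * mean_X          # intercept of the log-log line is log(a)
    a = 10 ** c                      # undo the log to recover a
    return a, n

# Made-up data generated from y = 3x^2
xs = [1, 2, 4, 8]
ys = [3 * x ** 2 for x in xs]
a_est, n_est = fit_power_law(xs, ys)
```

Because the data follows y = 3x^2 exactly, the sketch recovers a close to 3 and n close to 2, up to floating-point rounding.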