Correlation & Linear regression Flashcards
Correlation definition
the strength of association between two variables
How can correlation be summarised?
Graphically - scatterplots
Numerically - correlation coefficients
What is a correlation coefficient (r)?
a number between -1 and 1 that quantifies the strength and direction of association between two variables
What do different ‘r’ values show?
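r > 0 - variables increase together (positive association)
r < 0 - as one variable increases, the other decreases (negative association)
r = 0 - no linear (Pearson) or monotonic (Spearman) association
r = 1 or r = -1 - perfect positive or perfect negative association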
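As a rough illustration of these r values, here is a minimal sketch that computes the Pearson coefficient by hand. The function name pearson_r and the small data sets are hypothetical, chosen only so that the three cases are easy to see.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the two means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up data for illustration only
x = [1, 2, 3, 4, 5]
r_up = pearson_r(x, [2, 4, 6, 8, 10])    # points on a rising line: r = 1
r_down = pearson_r(x, [10, 8, 6, 4, 2])  # points on a falling line: r = -1
r_none = pearson_r(x, [1, 3, 2, 3, 1])   # no linear trend: r = 0

print(r_up, r_down, r_none)
```

Real data almost never gives exactly 1, -1 or 0; values in between indicate weaker or stronger association.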
When is Pearson correlation coefficient used?
for linear association
When is Spearman correlation coefficient used?
for monotonic association that is not linear
What does monotonic mean?
the relationship never changes direction: as one variable increases, the other either never decreases or never increases
When can neither Pearson nor Spearman be used?
when the relationship is non-monotonic
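The difference between the two coefficients, and why neither helps with a non-monotonic relationship, can be sketched as follows. This assumes the usual definition of Spearman as the Pearson coefficient of the ranks; the helper names and the data are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_r(xs, ys):
    """Spearman coefficient: Pearson applied to the ranks of the data."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Made-up data for illustration only
x = [1, 2, 3, 4, 5]
y_curve = [v ** 3 for v in x]    # monotonic but not linear
y_u = [(v - 3) ** 2 for v in x]  # U-shaped, so non-monotonic

pearson_curve = pearson_r(x, y_curve)    # below 1: the points are not on a line
spearman_curve = spearman_r(x, y_curve)  # 1: the ordering is perfectly preserved
pearson_u = pearson_r(x, y_u)            # 0, although y depends entirely on x
spearman_u = spearman_r(x, y_u)          # 0 as well

print(pearson_curve, spearman_curve, pearson_u, spearman_u)
```

In the U-shaped case both coefficients are 0 even though the outcome is completely determined by x, which is why a scatterplot should be checked before choosing a coefficient.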
What is the coefficient of determination (R squared)?
the proportion of the variation in one variable that is explained by another variable
How is R squared calculated?
by squaring the Pearson correlation coefficient (R squared = r × r)
What do R squared values show?
0 = no variation is explained
1 = all variation is explained
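A short worked example of going from r to R squared, using a small made-up data set (the numbers are hypothetical and chosen so the arithmetic is easy to follow):

```python
from math import sqrt

# Made-up data for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n  # 3.0
mean_y = sum(y) / n  # 4.0

# Sums of cross-products and squared deviations from the means
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))  # 6.0
sxx = sum((a - mean_x) ** 2 for a in x)                       # 10.0
syy = sum((b - mean_y) ** 2 for b in y)                       # 6.0

r = sxy / sqrt(sxx * syy)  # Pearson correlation, about 0.77
r_squared = r * r          # coefficient of determination, about 0.6

print(round(r, 3), round(r_squared, 3))
```

Here R squared is about 0.6, so roughly 60% of the variation in y is explained by x.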
What is linear regression used to do?
estimate a mathematical equation that describes the linear relationship between a quantitative outcome and a quantitative predictor
Which axis is the outcome plotted on?
y axis (vertical)
Which axis is the predictor plotted on?
x axis (horizontal)
What is the linear regression equation?
Outcome = a + b × predictor
a = intercept (mean expected outcome when predictor is 0)
b = slope (mean change in outcome for a one-unit increase in the predictor)
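To show how a and b might be estimated in practice, here is a minimal sketch using ordinary least squares, the usual method for fitting this equation (the flashcards themselves do not name a method). The function name fit_line and the data are hypothetical.

```python
def fit_line(predictor, outcome):
    """Least-squares estimates (a, b) for outcome = a + b * predictor."""
    n = len(predictor)
    mean_x = sum(predictor) / n
    mean_y = sum(outcome) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predictor, outcome))
    sxx = sum((x - mean_x) ** 2 for x in predictor)
    b = sxy / sxx            # slope: mean change in outcome per unit of predictor
    a = mean_y - b * mean_x  # intercept: expected outcome when predictor is 0
    return a, b

# Made-up data for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = fit_line(x, y)   # a is about 2.2, b is about 0.6
predicted = a + b * 6   # expected outcome when the predictor is 6, about 5.8

print(round(a, 2), round(b, 2), round(predicted, 2))
```

With these numbers the fitted equation is roughly Outcome = 2.2 + 0.6 × predictor, so each one-unit increase in the predictor goes with a mean increase of about 0.6 in the outcome.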