Regression, GLMs and beyond Flashcards
What test do we use if we want to consider the relationship between continuous predictor and response variables?
Regression
(Or correlation)
What does the line of best fit in least squares regression do?
Minimises the sum of the squared vertical deviations (residuals) of the datapoints from the line
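A minimal sketch of what "minimising the squared deviations" means in practice, using only the Python standard library. The data values and the function names (`least_squares_line`, `sum_squared_deviations`) are made up for illustration. It fits the line with the usual closed-form formulas, then shows that nudging the fitted line gives a larger sum of squares.

```python
def least_squares_line(xs, ys):
    """Return (intercept, gradient) of the least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares about the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    return intercept, gradient


def sum_squared_deviations(xs, ys, intercept, gradient):
    """Sum of squared vertical distances from each point to the line."""
    return sum((y - (intercept + gradient * x)) ** 2 for x, y in zip(xs, ys))


# Made-up illustrative data
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, gradient = least_squares_line(xs, ys)
best = sum_squared_deviations(xs, ys, intercept, gradient)

# Any other line, e.g. one shifted up or tilted slightly, does worse
shifted = sum_squared_deviations(xs, ys, intercept + 0.1, gradient)
tilted = sum_squared_deviations(xs, ys, intercept, gradient + 0.1)

print(intercept, gradient, best, shifted, tilted)
```

The printed `shifted` and `tilted` values are both larger than `best`, which is the sense in which the fitted line is the line of best fit.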
Should you just calculate the line of best fit without looking at the data?
No. Many different patterns of data return the same line of best fit, so there is no substitute for plotting the data
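As a small illustration of why plotting matters, the sketch below builds two made-up datasets, one perfectly straight and one clearly curved, that return the same least squares line. The data and the helper name `least_squares_line` are hypothetical. The curved set is built by adding a centred quadratic term to x values that are symmetric about zero, which leaves the fitted intercept and gradient unchanged.

```python
def least_squares_line(xs, ys):
    """Return (intercept, gradient) of the least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    gradient = sxy / sxx
    return mean_y - gradient * mean_x, gradient


# x values symmetric about zero (made up for illustration)
xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
mean_x_squared = sum(x ** 2 for x in xs) / len(xs)

# Dataset 1: points exactly on the line y = 3 + 0.5x
straight = [3.0 + 0.5 * x for x in xs]

# Dataset 2: the same line plus a U-shaped bend centred on zero
curved = [3.0 + 0.5 * x + 0.4 * (x ** 2 - mean_x_squared) for x in xs]

fit_straight = least_squares_line(xs, straight)
fit_curved = least_squares_line(xs, curved)

print(fit_straight)
print(fit_curved)
```

Both fits come out with an intercept of 3 and a gradient of 0.5 (up to rounding), even though only the first dataset is actually linear. A scatterplot would show the difference immediately.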
What does a correlation coefficient tell us?
The strength and direction of the linear relationship between two variables
What is the correlation coefficient symbol?
Rho (ρ) for the population value; the sample estimate is usually written r
What values can the correlation coefficient take?
Between -1 and 1
What does a correlation coefficient of 1 tell us?
Our data lies along a perfect straight line with a positive gradient
What does a correlation coefficient of -1 tell us?
Our data lies along a perfect straight line with a negative gradient
Does the correlation coefficient tell us anything about the gradient of the line?
Only its sign (positive or negative). It says nothing about how steep the line is, just how closely the datapoints lie along it
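The sketch below illustrates the last few cards with made-up data: a steep line and a shallow line both give a correlation coefficient of 1, and a falling line gives -1. The function name `pearson_r` is hypothetical; it implements the standard Pearson formula by hand so that nothing beyond the standard library is needed.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [1.0, 2.0, 3.0, 4.0, 5.0]

steep = [10.0 * x for x in xs]         # gradient 10
shallow = [0.1 * x for x in xs]        # gradient 0.1
falling = [7.0 - 2.0 * x for x in xs]  # gradient -2

r_steep = pearson_r(xs, steep)
r_shallow = pearson_r(xs, shallow)
r_falling = pearson_r(xs, falling)

print(r_steep, r_shallow, r_falling)
```

The gradients differ by a factor of 100 between the first two datasets, yet the correlation is the same because every point lies exactly on its line.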
What is Pearson’s correlation for?
Linear relationships between two continuous variables
Non-parametric equivalent of Pearson’s correlation
Spearman’s rank correlation
Can be used when the relationship is monotonic (consistently increasing or decreasing) but not linear
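A rough sketch of the difference, assuming Spearman's rank correlation is computed as Pearson's correlation applied to the ranks of the data. The helper names (`pearson_r`, `ranks`, `spearman_rho`) and the data are made up, and the simple ranking here does not handle tied values.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Rank from 1 (smallest) upwards. Assumes there are no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = float(rank)
    return result


def spearman_rho(xs, ys):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(ranks(xs), ranks(ys))


# A relationship that always increases but is clearly not a straight line
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [x ** 3 for x in xs]

r = pearson_r(xs, ys)
rho = spearman_rho(xs, ys)

print(r, rho)
```

Here Spearman's coefficient is 1 because the ordering of the points is perfectly preserved, while Pearson's coefficient is below 1 because the points do not lie on a straight line.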
General Linear Model for a categorical predictor
Y = A0 + (B1, B2, ..., Bk) + e, where k is the number of levels of the predictor
Y = variable you’re predicting
A0 = constant
B terms = effect of categorical predictor variable
e = error (normally distributed)
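One way to read this card is sketched below with made-up data. It assumes a baseline ("treatment") coding, where A0 is the mean of a chosen reference level and each B term is the difference between a level's mean and that reference. Other codings exist, and the level names and values here are hypothetical.

```python
# Made-up response values for three levels of one categorical predictor
groups = {
    "control": [4.0, 5.0, 6.0],
    "low": [6.0, 7.0, 8.0],
    "high": [9.0, 10.0, 11.0],
}

level_means = {level: sum(values) / len(values) for level, values in groups.items()}

# Assumption: "control" is the reference level, so A0 is its mean
baseline = "control"
a0 = level_means[baseline]

# B terms: how far each level's mean sits from the reference level
b = {level: level_means[level] - a0 for level in groups}


def predict(level):
    """Fitted value for any observation in the given level: A0 + B(level)."""
    return a0 + b[level]


# e terms: what is left over for each observation after removing A0 + B
residuals = {
    level: [y - predict(level) for y in values] for level, values in groups.items()
}

print(a0, b)
print(residuals)
```

With this coding the B term for the reference level is zero, and the residuals within each level sum to zero.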
General Linear Model for continuous predictors
Y = A0 + A1x1 + A2x2 + e
Y = variable you’re predicting
A0 = constant
A1 = gradient of relationship with predictor variable x1
A2 = gradient of relationship with predictor variable x2
e = error (normally distributed)
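A minimal sketch of fitting this model by least squares with the standard library only. It solves the normal equations with a small hand-written Gaussian elimination. The function names (`solve_linear_system`, `fit_two_predictors`) and the data are made up, and the response is generated without noise so that the known constant and gradients are recovered.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * coeffs = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        # Swap in the row with the largest entry in this column for stability
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (aug[r][n] - known) / aug[r][r]
    return coeffs


def fit_two_predictors(x1, x2, y):
    """Return [A0, A1, A2] for the least squares fit of Y = A0 + A1x1 + A2x2 + e."""
    design = [[1.0, a, b] for a, b in zip(x1, x2)]
    # Normal equations: (X'X) coeffs = X'y
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(3)] for i in range(3)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Made-up predictors; response built from known values A0=1, A1=2, A2=-0.5
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [1.0 + 2.0 * a - 0.5 * b for a, b in zip(x1, x2)]

a0, a1, a2 = fit_two_predictors(x1, x2, y)
print(a0, a1, a2)
```

With real data the response would include error, so the estimates would only approximate the underlying constant and gradients.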
General Linear Model for both categorical and continuous predictors
Y = A0 + A1x1 + (B1, B2, ...) + e
Y = variable you’re predicting
A0 = constant
A1 = gradient of relationship with predictor variable x1
B terms = effect of categorical predictor variable
e = error (normally distributed)
What is the test statistic for a GLM?
F ratio
F = treatment mean square / error mean square
(explained variation (signal) / unexplained variation (noise))
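A worked sketch of the F ratio for the simplest case, a single categorical predictor (a one-way design), using made-up data. The variable names are hypothetical. Each mean square is a sum of squares divided by its degrees of freedom.

```python
# Made-up response values for three levels of one categorical predictor
groups = {
    "control": [4.0, 5.0, 6.0],
    "low": [6.0, 7.0, 8.0],
    "high": [9.0, 10.0, 11.0],
}

all_values = [y for values in groups.values() for y in values]
grand_mean = sum(all_values) / len(all_values)
level_means = {level: sum(values) / len(values) for level, values in groups.items()}

# Explained variation: how far the level means are from the grand mean
ss_treatment = sum(
    len(values) * (level_means[level] - grand_mean) ** 2
    for level, values in groups.items()
)

# Unexplained variation: how far observations are from their own level mean
ss_error = sum(
    (y - level_means[level]) ** 2 for level, values in groups.items() for y in values
)

df_treatment = len(groups) - 1             # number of levels minus 1
df_error = len(all_values) - len(groups)   # observations minus number of levels

treatment_mean_square = ss_treatment / df_treatment
error_mean_square = ss_error / df_error

f_ratio = treatment_mean_square / error_mean_square
print(f_ratio)
```

For these values the treatment mean square is 19 and the error mean square is 1, so F is 19 on 2 and 6 degrees of freedom. A large F means the signal is large relative to the noise.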