10th Feb Flashcards
General linear modelling (GLM)
Helps to indicate whether, and how strongly, an outcome variable is associated with one or more explanatory variables
Nuisance variables
Confounders and competing exposures can undermine the interpretation of these associations; we can adjust for them by including them in the model
Stata command regress
Used to build linear models
eg regress weight height (predict weight from height)
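A minimal sketch of the same pattern, assuming Stata's bundled auto dataset (not the course data); price and mpg are variables in that dataset, chosen only for illustration. The second command adds weight as an extra predictor, which is how a nuisance variable would be adjusted for.

```stata
* Load Stata's built-in example dataset (an assumption, not the course data)
sysuse auto, clear

* Outcome first, then predictor: predict price from mpg
regress price mpg

* Same model, adjusted for weight by adding it as a further predictor
regress price mpg weight
```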
Stata command regress with categorical variables
Put xi: before regress (from Stata 11 onwards the i. prefix also works without xi:)
Indicate which variables are categorical by putting i. in front of the variable name
eg
xi: regress weight i.sex
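A short sketch of both forms, again assuming the bundled auto dataset; rep78 (repair record, 5 categories) stands in for a categorical predictor such as sex. Both commands should give the same estimates; xi: creates the indicator variables in the dataset, while the newer form handles them internally.

```stata
* Built-in example dataset (an assumption, not the course data)
sysuse auto, clear

* Older syntax: xi: expands i.rep78 into indicator (dummy) variables
xi: regress price i.rep78

* Stata 11 and later: i. works directly, no xi: prefix needed
regress price i.rep78
```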
Logistic regression
When the outcome is binary (two categories), eg attended / did not attend
eg logistic complynot age (models the binary outcome complynot from age)
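A hedged sketch using the bundled auto dataset, where foreign (0 = domestic, 1 = foreign) plays the role of the binary outcome. logistic reports odds ratios; logit fits the same model but reports coefficients on the log-odds scale.

```stata
* Built-in example dataset (an assumption, not the course data)
sysuse auto, clear

* Binary outcome first, then predictor; output is in odds ratios
logistic foreign mpg

* Same model, reported as log-odds coefficients
logit foreign mpg
```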
R2
R2 is the proportion of variation in the outcome that is explained by the linear model
R2 is a value between 0 and 1
Higher R2 values indicate that more of the variation in the outcome is explained by your model
1 is a perfect fit (your model perfectly fits the outcome)
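As an illustration of "proportion of variation explained", this sketch (assuming the bundled auto dataset) recomputes R2 from the sums of squares that regress stores: model sum of squares divided by total sum of squares. The two displayed values should agree.

```stata
* Built-in example dataset (an assumption, not the course data)
sysuse auto, clear
regress price mpg

* R2 as stored by regress
display e(r2)

* R2 by hand: explained variation / total variation
display e(mss) / (e(mss) + e(rss))
```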
R2 and logistic regression
Fit is reported by pseudo-R2, as standard R2 cannot be calculated for logistic regression
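A sketch of where the pseudo-R2 comes from, assuming the bundled auto dataset. The value Stata prints is McFadden's pseudo-R2, which compares the log likelihood of the fitted model with that of a model containing only a constant; it is not a proportion of variance and should not be read like the linear-model R2.

```stata
* Built-in example dataset (an assumption, not the course data)
sysuse auto, clear
logistic foreign mpg

* Pseudo-R2 as stored by logistic
display e(r2_p)

* By hand: 1 - (log likelihood of model / log likelihood of constant-only model)
display 1 - e(ll) / e(ll_0)
```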