Multivariable data analysis part one Flashcards
define linear model
helps to indicate whether there is a relationship between 2 variables
2 types of nuisance variables which can undermine the association between 2 variables (clue: types of covariates)
confounders
competing exposures
is the outcome the independent or dependent variable
dependent.
is the exposure the dependent or independent variable
independent.
how do you find the m in y = mx + c (gradient)
m = change in y / change in x.
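A minimal sketch of the gradient calculation, assuming two known points on the line. The function name `gradient` and the example points are made up for illustration.

```python
def gradient(x1, y1, x2, y2):
    """Slope m of the straight line through (x1, y1) and (x2, y2)."""
    if x2 == x1:
        raise ValueError("gradient is undefined for a vertical line")
    # m = change in y / change in x
    return (y2 - y1) / (x2 - x1)

# Hypothetical points: y rises by 6 while x rises by 3.
m = gradient(1.0, 2.0, 4.0, 8.0)
print(m)  # 2.0
```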
what is the equation for a simple line graph
y = mx + b + e
mean part: y = mx + b
residual part: e ~ N(0, σ^2)
what does the “e” represent
residual (the difference between the observed outcome and the value predicted by the line)
what affects “e”
“noise”: the noisier your data, the worse the model is at predicting the outcome and the larger the e (and thus σ^2)
what does N(0, σ^2) mean
normal distribution of the residual with a mean of 0 and a variance of σ^2
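A rough simulation of what N(0, σ^2) means for the residual, assuming data generated as y = mx + b + e. The values m = 2, b = 1, σ = 3 and the helper name `simulate` are assumptions chosen for illustration, not values from the course.

```python
import random
import statistics

def simulate(m, b, sigma, xs, seed=0):
    """Return y values generated as y = m*x + b + e, with e ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    return [m * x + b + rng.gauss(0.0, sigma) for x in xs]

# Hypothetical true values, chosen only for this example.
m_true, b_true, sigma_true = 2.0, 1.0, 3.0
xs = [i / 10 for i in range(2000)]
ys = simulate(m_true, b_true, sigma_true, xs)

# Residual = observed y minus the mean part (m*x + b).
residuals = [y - (m_true * x + b_true) for x, y in zip(xs, ys)]

print(statistics.mean(residuals))       # should be close to 0
print(statistics.pvariance(residuals))  # should be close to sigma^2 = 9
```

Raising `sigma_true` in this sketch spreads the points further from the line, which is the "more noise" case.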
what factors can produce noise/ variation in a model
- random variation (most likely)
- error
- more covariates are needed to model the relationship
what does a larger σ mean
more noise
what does the line of best fit show
the estimated relationship between 2 variables; it can be used to predict the outcome from the exposure.
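One common way to get a line of best fit is ordinary least squares, which picks the m and b that minimise the sum of squared residuals. The sketch below assumes that method; the function name `fit_line` and the five data points are invented for the example.

```python
def fit_line(xs, ys):
    """Least-squares estimates of slope m and intercept b for y = m*x + b."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    m = sxy / sxx          # slope
    b = y_bar - m * x_bar  # intercept: the line passes through (x_bar, y_bar)
    return m, b

# Hypothetical data, roughly following y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]
m, b = fit_line(xs, ys)
print(round(m, 2), round(b, 2))  # 1.97 1.09
```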
is more or less residual better, and why
LESS
- The less residual variation
- The better the fit of the model, so the more confident you can be in it
- The more likely the association is to be statistically ‘significant’
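A small sketch comparing residual variation for a quiet and a noisy version of the same straight-line relationship, assuming a least-squares fit. All names (`fit_line`, `residual_variance`, `noisy_line`) and the values σ = 1 and σ = 5 are assumptions for illustration.

```python
import random

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = m*x + b."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    m = sxy / sxx
    return m, y_bar - m * x_bar

def residual_variance(xs, ys):
    """Estimate sigma^2 from the residuals of the fitted line (n - 2 divisor)."""
    m, b = fit_line(xs, ys)
    residuals = [y - (m * x + b) for x, y in zip(xs, ys)]
    return sum(r * r for r in residuals) / (len(xs) - 2)

def noisy_line(sigma, seed):
    """Hypothetical data: y = 2x + 1 + e, with e ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    xs = [i / 10 for i in range(200)]
    ys = [2.0 * x + 1.0 + rng.gauss(0.0, sigma) for x in xs]
    return xs, ys

quiet = residual_variance(*noisy_line(sigma=1.0, seed=1))
noisy = residual_variance(*noisy_line(sigma=5.0, seed=1))
print(quiet < noisy)  # True: less noise gives less residual variation
```

The underlying line is identical in both cases; only the noise differs, so the smaller residual variance reflects a tighter fit around the same relationship.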
What Stata command is used to fit linear models
regress
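if you want to regress 2 variables (e.g. age + weight), what command would be used
regress age weight
(Stata treats the first variable as the outcome, so this models age as a function of weight; to model weight as a function of age, use: regress weight age)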