Regression Flashcards
what are regressions
- Regression goes a step beyond correlation: it quantifies how an outcome changes with one or more predictors
- Regression can be used for making future predictions/prognoses: it can look at multiple factors (age, sex, etc.) and say what mean life expectancy would be
- It can also be used for understanding how a treatment or disease works (or whether it is influenced by different risk factors)
regression equations
y = a + b x
what does the ‘b’ in the equation stand for
b (coefficient, or slope): the change in y when we increase x by 1 unit; in most scenarios b is the main result of interest
what does the ‘y’ stand for
y variable is the outcome (the variable we are estimating/predicting)
in most scenarios it is the aspect we are interested in, i.e. what can ‘change’ (pain score, BP reading, number of children)
the outcome variable is sometimes referred to as the dependent or response variable
what does the ‘x’ stand for
x variable is the predictor (the variable we are using to estimate)
in many scenarios it is an aspect that cannot change (age, deprivation level, family history of illness)
the predictor is sometimes called the independent or explanatory variable
what does the ‘a’ stand for
a (intercept), the value of y when x=0 (often no real use/interpretation)
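As a rough illustration of y = a + b x, the sketch below estimates a and b by ordinary least squares. The data values and the helper name fit_line are made up for illustration, and least squares is assumed as the fitting method (these notes do not name one).

```python
def fit_line(xs, ys):
    """Estimate intercept a and slope b of y = a + b*x by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: how x and y vary together, divided by how much x varies.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept: the fitted line passes through (mean_x, mean_y).
    a = mean_y - b * mean_x
    return a, b


# Made-up data (hypothetical): x is the predictor, y is the outcome.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

a, b = fit_line(xs, ys)
print(f"a = {a}, b = {b}")  # a = 1.0, b = 2.0
```

Here b = 2 means y rises by 2 for each 1-unit increase in x, and a = 1 is the value of y when x = 0.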
what is a linear regression?
where the outcome is a single continuous variable
what is logistic regression
has a binary outcome (pass/fail)
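Because the outcome is binary, logistic regression models the probability of one outcome (e.g. pass) rather than the outcome value itself. A minimal sketch, assuming the usual logistic (sigmoid) link, which these notes do not spell out. The coefficients and the idea of x being "hours studied" are invented for illustration.

```python
import math


def predicted_probability(a, b, x):
    """Probability of the outcome (e.g. 'pass') for predictor value x."""
    linear_part = a + b * x  # same a + b*x form as linear regression
    # The logistic function squeezes any number into the range 0 to 1.
    return 1 / (1 + math.exp(-linear_part))


# Hypothetical coefficients, not estimated from real data.
a, b = -4.0, 0.1

p_40 = predicted_probability(a, b, 40)  # linear part is 0 here
p_60 = predicted_probability(a, b, 60)  # linear part is 2 here
print(p_40, p_60)
```

With these assumed values, x = 40 gives a probability of 0.5, and larger x gives a higher probability because b is positive.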
what are the advantages of multiple regression
- one of the main benefits of regression is being able to incorporate additional variables
- we can incorporate these different factors into a regression model as additional predictors
- accounts for background factors
- accounts for confounding variables
- lessens bias
multiple regression equation
y = a + b1 x1 + b2 x2 + b3 x3, etc.
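A small sketch of how the multiple regression equation is used once the coefficients are known. The coefficients, the predictors (age and BMI) and the helper name predict are hypothetical, chosen only to show that each b is the change in y for a 1-unit increase in its own x while the other predictors stay fixed.

```python
def predict(a, coefficients, values):
    """Return y = a + b1*x1 + b2*x2 + ... for one individual."""
    return a + sum(b * x for b, x in zip(coefficients, values))


# Hypothetical model of systolic BP: x1 = age (years), x2 = BMI.
a = 90.0
coefficients = [0.5, 1.0]  # b1 for age, b2 for BMI (made-up values)

bp_age_40 = predict(a, coefficients, [40, 25])
bp_age_41 = predict(a, coefficients, [41, 25])  # one year older, same BMI

print(bp_age_40)              # 135.0
print(bp_age_41 - bp_age_40)  # 0.5, i.e. b1
```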
Categorical variables as predictors in regression models
Regression is not restricted to continuous predictors
Categorical variables always have a ‘reference’ category
Each coefficient estimate is in comparison to that reference category
the intercept can have more relevance here, as it is the expected outcome for the reference group (with any other predictors at 0)
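The reference-category idea can be sketched with 0/1 indicator ("dummy") coding. This example assumes a model with a single categorical predictor, where the least-squares intercept equals the reference group's mean and each coefficient equals that group's mean minus the reference mean. The pain scores, group names and helper names are made up.

```python
def dummy_code(category, levels, reference):
    """Return 0/1 indicators for every level except the reference."""
    return [1 if category == level else 0
            for level in levels if level != reference]


def mean(values):
    return sum(values) / len(values)


# Made-up pain scores by treatment group (hypothetical data).
scores = {
    "placebo": [6, 7, 8],
    "drug_a": [4, 5, 6],
    "drug_b": [3, 3, 3],
}
reference = "placebo"
levels = list(scores)

# Single categorical predictor only:
# intercept = reference mean; coefficient = group mean - reference mean.
intercept = mean(scores[reference])
coefficients = {
    level: mean(scores[level]) - intercept
    for level in levels if level != reference
}

print(dummy_code("drug_a", levels, reference))   # [1, 0]
print(dummy_code("placebo", levels, reference))  # [0, 0]
print(intercept, coefficients)
```

Here the intercept (7.0) is the mean pain score on placebo, and the coefficient for drug_a (-2.0) says scores are 2 points lower than in the reference group.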