EBM Flashcards
odds
One way of expressing the likelihood of an event or outcome (such as pancreatitis) occurring is with odds. The odds of a person having an outcome is the number of individuals with the outcome divided by the number of individuals without the outcome.
If you see 10 people in your clinic one morning and one of them has flu, then the odds would be 1 to 9. This can also be expressed as 1 / 9 = 0.11. If 64 out of 256 people in a treatment group had the event (e.g., an outcome such as a disease), the odds would be 64 / (256 - 64) = 64 / 192 = 0.33.
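The two calculations above can be sketched in a few lines of Python (counts taken from the examples in the text):

```python
# Odds = number with the outcome / number without the outcome.
with_flu, without_flu = 1, 9
print(round(with_flu / without_flu, 2))   # 0.11

# Treatment-group example: 64 events among 256 people.
events, group_size = 64, 256
odds = events / (group_size - events)     # 64 / 192
print(round(odds, 2))                     # 0.33
```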
expressing odds in words
There are various ways of describing odds. For example, if the odds are calculated as 0.33, this is the same as saying:
- odds of 1 in 3, or one person had the outcome for every three that didn’t
- the chances of the outcome were one third of the chances of not getting the outcome
- in betting terms: the chances of the outcome were 3 to 1 against, or 0.33 to 1
If the odds are greater than 1, then the event is more likely to occur than not. For example, if the odds are 3, this is the same as odds of 3 to 1 (in betting terminology). If the odds are 1.1, this is equivalent to 11 to 10, i.e. the event is 10% more likely to occur than not.
odds ratio
Odds ratios (also known as relative odds) are used to compare whether the likelihood of a certain event occurring is the same for two groups – e.g., smokers versus non-smokers, or a treatment group versus a control group. The odds ratio is the odds of the outcome in one group divided by the odds of the outcome in another group. If the OR = 1 there is no difference between the two groups (i.e., the event is equally likely in both groups).
odds ratio when treatment is compared to control
The odds ratio is the odds of the outcome in a patient in the treatment group divided by the odds of the outcome in the control group. If the OR = 1 there is no difference between the two groups in terms of the likelihood of the outcome. With an OR > 1, there is a greater likelihood of events in the treatment group. With an OR < 1, there is a lower likelihood of events in the treatment group.
odds ratio when exposure is compared to no exposure
The odds ratio is the odds of being exposed in subjects with the target disorder divided by the odds of being exposed in control subjects (without the target disorder). An OR of 1.0 implies no association between the exposure and the outcome of interest (e.g., disease). An OR > 1 implies that the outcome is associated with exposure, and increases as the exposure increases.
example of odds ratio
In a randomised controlled trial, if 64 out of 256 people in the treatment group had the outcome, and 45 out of 180 in the control group also had the outcome, the odds of someone in the treatment group having the outcome will be 64/(256-64) and the odds for the control group will be 45/(180-45). In both cases the odds are 1 in 3, or 0.33. So the odds ratio is:
odds ratio = (64/192) / (45/135) = 0.33 / 0.33 = 1
chi square
NB: with a Chi-square table that’s 2x3 (e.g., drinking coded 0, 1, 2; pancreatitis coded 0, 1), you work out ORs separately (e.g., for no alcohol compared to low alcohol, and for no alcohol compared to high alcohol), essentially pretending you are working with 2 x 2 tables.
You can then perform a Chi-square analysis to determine whether the relationship shown in the table is significant. The Chi-square test measures the fit of the observed values to ‘expected’ values.
The Chi-square test (or Pearson’s Chi-square) tests a null hypothesis stating that the frequency distribution of certain events observed in a sample is consistent with a particular theoretical distribution. It’s used with categorical variables – e.g., things like whether or not someone has a disease, or has had a particular treatment, and also categories such as blood type.
chi square test of goodness of fit
• A Chi-square test of goodness of fit establishes whether or not an observed frequency distribution differs from a theoretical distribution. A simple application is to test the hypothesis that, in the general population, values would simply occur with equal frequency. But you might also want to test whether a sample from a population resembles that population. For example, researchers would use a Chi-square goodness of fit test to check whether the frequencies of blood groups in a sample match the frequencies seen in the population as a whole.
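As a sketch, the blood-group example might look like this (the sample counts and population frequencies below are made up for illustration; the statistic is compared against 7.815, the 5% critical value for 3 degrees of freedom):

```python
# Chi-square goodness of fit: do sample blood-group counts match
# assumed population frequencies? (All numbers are hypothetical.)
observed = [95, 80, 18, 7]                  # O, A, B, AB counts (n = 200)
population_freq = [0.44, 0.42, 0.10, 0.04]  # assumed population proportions

n = sum(observed)
expected = [f * n for f in population_freq]  # 88, 84, 20, 8

# Sum of (observed - expected)^2 / expected over all categories.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi2, 2))  # about 1.07, well below the 5% critical value of
                       # 7.815 for df = 3: no evidence of a difference
```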
chi square test of independence
• A Chi-square test of independence assesses whether paired observations on two variables, expressed in a contingency table, are independent of each other. For example, a contingency table might cross-tabulate stroke cases vs. controls against hypertension vs. no hypertension.
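A minimal sketch with SciPy, using hypothetical case-control counts (60 of 100 stroke cases hypertensive vs. 30 of 100 controls; the numbers are invented for illustration):

```python
from scipy.stats import chi2_contingency

# Rows: stroke cases, controls; columns: hypertension, no hypertension.
# (Counts are hypothetical.)
table = [[60, 40],
         [30, 70]]

chi2, p, dof, expected = chi2_contingency(table)  # Yates' correction by default
print(round(chi2, 2), dof)
print(p < 0.05)   # True here: hypertension and stroke are not independent

# The crude odds ratio from the same 2 x 2 table:
odds_ratio = (60 * 70) / (40 * 30)
print(odds_ratio)  # 3.5
```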
crude odds ratio
ORs calculated in this way (from a single table, as above) only consider the effect of one variable on the outcome, essentially ignoring the potential influence of other factors. For this reason, they are known as ‘crude’ odds ratios: they don’t take into account the other variables that may be having an effect on the outcome.
confounding variables
A confounding variable, or confounder (e.g., alcohol consumption) has an effect on the outcome (e.g., disease), and is also correlated to the exposure (e.g., smoking) – e.g., people who smoke also tend to drink more.
Common confounders include age, socioeconomic status, and gender. One example that’s often given of a confounding variable comes from the observation that children born later in the birth order (born second, last, etc.) are more likely to have Down’s syndrome. But this doesn’t mean we should conclude that birth order causes Down’s syndrome. The relationship between birth order and Down’s is confounded by the mother’s age. Older women are more likely to have children with Down’s. Older women are also more likely to be having children who are late in the birth order. So, mother’s age confounds the association between birth order and Down’s syndrome: it looks like there is an association when there is not.
Multiple (or multivariate) logistic regression (see below) controls for many potential confounders at one time.
regression
In simplest terms, regression is a statistical procedure which attempts to predict the values of a given variable (termed the dependent, outcome, or response variable), based on the values of one or more other variables (called independent variables, predictors, or covariates). The result of a regression is usually an equation (or model) which summarises the relationship between the dependent and independent variable(s).
The type of regression used will be dictated by the type of response variable being analysed and by your eventual analytic goal. Linear regression is used to predict the values of a continuous outcome variable (such as height, weight, systolic blood pressure), based on the values of one or more independent predictor variables (and we have encountered simple linear regression in EBM Session 6). Logistic regression is intended for the modelling of dichotomous categorical outcomes (e.g., dead vs. alive, cancer vs. none, pain free vs. in pain).
logistic regression
Logistic regression is used to analyse relationships between a binary/dichotomous dependent variable and numerical or categorical independent variables. Logistic regression combines the independent variables to estimate the probability that a particular event will occur.
In general, logistic regression calculates the odds of someone getting a disease (e.g., pancreatitis) based on a set of covariates (e.g., based on how much someone drinks, smokes, etc.).
simple/bivariate logistic regression
Simple logistic regression is used to explore associations between one (dichotomous) outcome and one (continuous, ordinal, or categorical) exposure variable. Simple logistic regression lets you answer questions like, “how does smoking affect the likelihood of having pancreatitis?” This approach is equivalent to that used above, using 2 x 2 tables to calculate an odds ratio, and Chi-square analysis to test the significance of this ‘crude’ OR.
Essentially: Bivariate logistic regression gives you an odds ratio showing the effect of a variable on the outcome, ignoring the effects of other variables
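For a single binary exposure the logistic model is saturated, so its coefficients can be written straight from the 2 x 2 table: the slope is the log of the crude OR, and the intercept is the log-odds of the outcome in the control group. A sketch using the trial counts from the example above:

```python
import math

a, b = 64, 192   # treatment group: outcome / no outcome
c, d = 45, 135   # control group:   outcome / no outcome

crude_or = (a / b) / (c / d)   # = 1.0 in this example
b1 = math.log(crude_or)        # logistic slope = log(crude OR)
b0 = math.log(c / d)           # intercept = log-odds in the control group

sigmoid = lambda t: 1 / (1 + math.exp(-t))
print(sigmoid(b0))        # 0.25 = 45/180, observed risk in controls
print(sigmoid(b0 + b1))   # 0.25 = 64/256, observed risk on treatment
```

Because the model reproduces the observed group risks exactly, testing b1 against zero is equivalent to the Chi-square test on the crude OR.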
multiple/multivariate logistic regression
Multiple logistic regression is used to explore associations between one (dichotomous) outcome and two or more exposure variables (which may be continuous, ordinal or categorical). The purpose of multiple logistic regression is to let you isolate the relationship between the exposure variable and the outcome variable from the effects of one or more other variables (covariates or confounders). Multiple logistic regression lets you answer the question, “how does smoking affect the likelihood of having pancreatitis, after accounting for (or ‘unconfounded by’ or ‘independent of’) alcohol consumption, BMI, etc.?” This process of accounting for covariates or confounders is also called adjustment.
Comparing the results of simple and multiple logistic regression can help to answer the question “how much did the covariates in the model alter the relationship between exposure and outcome (i.e., how much confounding was there)?”
Essentially: For each variable, multivariate logistic regression gives you an odds ratio showing the effect of a variable on the outcome, after controlling for the effects of the covariates. You can see what the effect of smoking is, regardless of whether a person drinks or not, or has a low or high BMI.
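A sketch of the idea in plain NumPy (Newton-Raphson fitting written out by hand for transparency; in practice you would use a statistics package). The counts are hypothetical and constructed so that drinking confounds the smoking-disease association: within each drinking stratum smokers and non-smokers have the same risk, but smoking and drinking are correlated, so the crude OR for smoking is inflated while the adjusted OR is 1.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Fit logistic regression by Newton-Raphson; returns [intercept, betas...]."""
    X = np.column_stack([np.ones(len(X)), X])     # add intercept column
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1 / (1 + np.exp(-X @ beta))           # predicted probabilities
        grad = X.T @ (y - p)                      # gradient of log-likelihood
        hess = X.T @ (X * (p * (1 - p))[:, None]) # observed information
        beta += np.linalg.solve(hess, grad)
    return beta

# Hypothetical counts: (smoker, heavy drinker, disease, n).
counts = [(0, 0, 0, 144), (0, 0, 1, 16), (1, 0, 0, 36), (1, 0, 1, 4),
          (0, 1, 0, 24),  (0, 1, 1, 16), (1, 1, 0, 96), (1, 1, 1, 64)]
X = np.array([[s, z] for s, z, d, n in counts for _ in range(n)], float)
y = np.array([d for s, z, d, n in counts for _ in range(n)], float)

# Crude OR for smoking, ignoring drinking: 68/132 smokers vs 32/168 non-smokers.
crude_or = (68 / 132) / (32 / 168)
print(round(crude_or, 2))          # about 2.7: smoking *looks* harmful

# Adjusted OR = exp(smoking coefficient) from multiple logistic regression.
beta = fit_logistic(X, y)
print(round(np.exp(beta[1]), 2))   # 1.0: no smoking effect after adjustment
```

The gap between the crude OR (about 2.7) and the adjusted OR (1.0) is exactly the "how much confounding was there?" comparison described above.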