EBM Flashcards
what are the odds of a person having an outcome?
number of individuals with the outcome divided by the number of individuals without the outcome
what are odds ratios used for?
odds ratios (relative odds) are used to compare whether the likelihood of a certain event occurring is the same for two groups
how do you work out OR?
odds of the outcome in one group divided by the odds of the outcome in another group
what do different OR values mean?
- If the OR = 1 there is no difference between the two groups
- With an OR < 1, events are more likely in the control group than in the treatment group; the lower the odds ratio, the more the treatment appears to reduce the risk of events
- In exposure terms, an OR < 1 means that the exposure is associated with a reduced likelihood of the outcome, and an OR > 1 means that the exposure is associated with an increased likelihood of the outcome. The further the OR is from 1, the stronger the association.
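The odds and odds ratio definitions above can be sketched in a few lines. This is a minimal illustration only: the function names and the counts are made up, not taken from any study.

```python
def odds(with_outcome: int, without_outcome: int) -> float:
    """Odds = number with the outcome / number without the outcome."""
    return with_outcome / without_outcome


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio for a 2x2 table.

    a = exposed with outcome,   b = exposed without outcome
    c = unexposed with outcome, d = unexposed without outcome
    """
    return odds(a, b) / odds(c, d)


# Hypothetical counts: 20 of 100 exposed and 10 of 100 unexposed have the outcome.
or_value = odds_ratio(20, 80, 10, 90)
print(round(or_value, 2))  # 2.25, i.e. OR > 1: outcome more likely in the exposed group
```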
what does the chi-square test measure?
the fit of the observed values to ‘expected’ values
what kind of data is the chi-square test used with?
categorical variables
what are the two types of chi-square test?
tests of goodness of fit and tests of independence
chi-square test of goodness of fit
- what does it establish
- what used for
- Establishes whether or not an observed frequency distribution differs from a theoretical distribution
- A simple application is to test the hypothesis that, in the general population, values would simply occur with equal frequency
- But you might also want to test whether a sample from a population would resemble the population
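A rough sketch of a goodness-of-fit calculation, assuming made-up die-roll counts and the "equal frequency" null hypothesis. The statistic is the usual sum of (observed - expected)^2 / expected, compared here against the tabulated 5% critical value for 5 degrees of freedom.

```python
def chi_square_gof(observed, expected):
    """Chi-square statistic: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Made-up counts from 120 rolls of a six-sided die.
observed = [25, 17, 15, 23, 24, 16]
# Null hypothesis: every face is equally likely, so 120 / 6 = 20 expected per face.
expected = [sum(observed) / len(observed)] * len(observed)

statistic = chi_square_gof(observed, expected)
# Tabulated critical value of chi-square for 5 degrees of freedom at p = 0.05.
CRITICAL_5DF_005 = 11.070
print(round(statistic, 2), statistic > CRITICAL_5DF_005)  # 5.0 False
```

Here the statistic falls below the critical value, so these illustrative counts give no reason to reject the hypothesis of equal frequencies.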
what is the chi-square test of independence?
- what does it assess
- what does value of less than 0.05 tell you
- what do ‘crude’ odds ratios do?
- Assesses whether paired observations on two variables, expressed in a contingency table, are independent of each other
- So, you might perform a Chi-square test (of independence), testing whether the proportion of people with hypertension is significantly different in a group of people who have had a stroke, compared with a control group who haven’t. The null hypothesis in this example is that the proportions in the two groups are no different from each other.
- A chi-square probability (p value) of less than 0.05 is commonly interpreted as justification for rejecting the null hypothesis
- This suggests that there is an association or relationship between the variables, but the test doesn’t tell us what the structure or nature of that relationship is.
- ‘crude’ odds ratios – don’t take into account the other variables that may be having an effect on the outcome
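A minimal sketch of the stroke/hypertension example as a 2x2 test of independence. The counts are invented for illustration, no continuity correction is applied, and the p-value uses the fact that with 1 degree of freedom P(chi-square > x) = erfc(sqrt(x / 2)).

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table.

    table = [[a, b], [c, d]]; returns (statistic, p_value).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (table[i][j] - expected) ** 2 / expected
    # Valid for 1 degree of freedom only.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Made-up counts. Rows: stroke group, control group.
# Columns: hypertension, no hypertension.
table = [[60, 40],
         [40, 60]]
statistic, p_value = chi_square_2x2(table)
crude_or = (60 / 40) / (40 / 60)  # crude (unadjusted) odds ratio
print(round(statistic, 2), p_value < 0.05, round(crude_or, 2))  # 8.0 True 2.25
```

The test only says the proportions differ; the crude odds ratio gives the direction and size of the association, still without accounting for other variables.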
what is a confounding variable?
- A confounding variable, or confounder, has an effect on the outcome and is also correlated with the exposure, e.g. people who smoke also tend to drink more, so alcohol can confound the relationship between smoking and an outcome
- Common confounders include age, socioeconomic status and gender
what controls for any potential confounders at one time?
Multiple (or multivariate) logistic regression controls for any potential confounders at one time
what is regression? what are the two types? what each used for?
- Statistical procedure which attempts to predict the values of a given variable (dependent, outcome) based on the values of one or more other variables (independent, predictors, or covariates)
- The result of a regression is usually an equation which summarises the relationship between the dependent and independent variable
- Linear regression is used to predict the values of a continuous outcome variable (such as height, weight, systolic blood pressure), based on the values of one or more independent predictor variables
- Logistic regression is intended for the modelling of dichotomous categorical outcomes
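To make "the result of a regression is usually an equation" concrete, here is a minimal least-squares sketch for one predictor. The ages and blood pressures are made up, and `fit_line` is an illustrative helper, not a library function.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up data: age (years) and systolic blood pressure (mmHg).
ages = [30, 40, 50, 60, 70]
sbp = [118, 124, 130, 136, 142]

intercept, slope = fit_line(ages, sbp)
# Regression equation: predicted SBP = intercept + slope * age
print(round(intercept, 1), round(slope, 2))  # 100.0 0.6
```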
logistic regression
- what used to do
- it combines what to estimate what
- what are the two types of logistic regression?
- Used to analyse relationships between a binary/dichotomous dependent variable and numerical or categorical independent variables
- It combines the independent variables to estimate the probability that a particular event will occur
- In general, it calculates the odds of someone getting a disease based on a set of covariates
- simple and multiple
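A small sketch of how a logistic model combines the independent variables into a probability. The intercept and coefficients below are hypothetical, chosen only for illustration and not estimated from any data.

```python
import math


def predicted_probability(intercept, coefficients, values):
    """Logistic model: p = 1 / (1 + exp(-(b0 + b1*x1 + ... + bk*xk)))."""
    log_odds = intercept + sum(b * x for b, x in zip(coefficients, values))
    return 1 / (1 + math.exp(-log_odds))


# Hypothetical model: the covariates are smoker (0/1) and BMI.
intercept = -3.0
coefficients = [0.8, 0.05]
person = [1, 30]  # a smoker with a BMI of 30

p = predicted_probability(intercept, coefficients, person)
odds = p / (1 - p)  # the model's predicted odds of disease for this person
print(round(p, 3), round(odds, 3))
```

The linear combination of covariates is the log odds; exponentiating it gives the odds, and the logistic function turns it into a probability between 0 and 1.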
simple (or bivariate) logistic regression
- used for what
- give example
- Used to explore associations between one (dichotomous) outcome and one (continuous, ordinal, or categorical) exposure variable
- How does smoking affect the likelihood of having pancreatitis?
multiple (or multivariate) logistic regression
- what used to explore
- what is purpose
- give example
- Used to explore associations between one (dichotomous) outcome and two or more exposure variables (which may be continuous, ordinal or categorical)
- Purpose is to let you isolate the relationship between the exposure variable and the outcome from the effects of one or more other variables (covariates or confounders)
- How does smoking affect the likelihood of having pancreatitis, after accounting for (unconfounded by) alcohol consumption, BMI etc.?
- This is called adjustment: accounting for covariates or confounders
- Essentially, for each variable, multivariate logistic regression gives you an odds ratio showing the effect of that variable on the outcome, after controlling for the effects of the other covariates
what can comparing the results of simple and multiple logistic regression help with?
to answer the question ‘how much did the covariates in the model alter the relationship between exposure and outcome?’ (i.e. how much confounding was there?)
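One way to sketch this comparison: in logistic regression the odds ratio for a variable is the exponential of its coefficient, so the crude and adjusted odds ratios can be put side by side. The two coefficients below are hypothetical and the percentage change is only a rough indicator of confounding, not a formal test.

```python
import math

# Hypothetical coefficients for smoking from two fitted models (not real results).
crude_coefficient = 0.95     # simple model: smoking only
adjusted_coefficient = 0.60  # multiple model: smoking + alcohol + BMI

crude_or = math.exp(crude_coefficient)
adjusted_or = math.exp(adjusted_coefficient)

# Relative change in the odds ratio after adjustment.
percent_change = (crude_or - adjusted_or) / crude_or * 100
print(round(crude_or, 2), round(adjusted_or, 2), round(percent_change, 1))
```

If the adjusted odds ratio moves noticeably towards 1, part of the crude association was explained by the covariates.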
where on graph are false positives and false negatives?
they make up the area of overlap between the distributions of test results for people with and without the disease, where the test can’t distinguish normal from disease
how can the relative number of false positives and false negatives be changed?
by shifting the position of the cutoff point (so, as one goes down, the other will go up)
- But the numbers of people correctly identified as having the disease or not having the disease will also change
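The trade-off can be seen with a toy example. The scores are made up, and a score at or above the cutoff is assumed to count as a positive test.

```python
def count_errors(healthy_scores, diseased_scores, cutoff):
    """Count (false positives, false negatives) when score >= cutoff is 'positive'."""
    false_positives = sum(score >= cutoff for score in healthy_scores)
    false_negatives = sum(score < cutoff for score in diseased_scores)
    return false_positives, false_negatives


# Made-up test scores; the two groups overlap between 4 and 6.
healthy = [1, 2, 3, 4, 5, 6]
diseased = [4, 5, 6, 7, 8, 9]

for cutoff in (4, 5, 6, 7):
    fp, fn = count_errors(healthy, diseased, cutoff)
    print(cutoff, fp, fn)
# 4 3 0
# 5 2 1
# 6 1 2
# 7 0 3
```

Raising the cutoff removes false positives but adds false negatives, and vice versa.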
what are sensitivity and specificity?
- Measures for assessing the performance of diagnostic and screening tests
- Sensitivity is a measure of the probability of correctly identifying a person who has the condition, whilst specificity is a measure of the probability of correctly identifying a non-diseased person
- Sensitivity is the proportion of people with the disease correctly identified by the test: the probability that a test result will be positive when the disease is present (referred to as the true positive rate) (only concerned with those that have the disease, even if not diagnosed)
- Specificity is the proportion of people without the disease correctly identified by the test – the probability that a test result will be negative when the disease is not present (referred to as the true negative rate) (only concerned with those that don’t have the disease)
what is the positive predictive value? how calculate?
PPV = probability that the disease is present when the test is positive (only concerned with positive test results, those to the right of the cutoff point)
- True positives / (true positives + false positives)
what is the negative predictive value? how calculate?
NPV = probability that the disease is not present when the test is negative (only concerned with negative test results, those to the left of the cutoff point)
- True negatives / (false negatives + true negatives)
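A minimal sketch of both predictive values computed from the four cells of a diagnostic 2x2 table. The counts are invented for illustration.

```python
def predictive_values(tp, fp, fn, tn):
    """Return (PPV, NPV) from true/false positive and negative counts."""
    ppv = tp / (tp + fp)  # of all positive tests, the fraction with the disease
    npv = tn / (fn + tn)  # of all negative tests, the fraction without the disease
    return ppv, npv


# Made-up counts: 90 true positives, 30 false positives,
# 10 false negatives, 870 true negatives.
ppv, npv = predictive_values(tp=90, fp=30, fn=10, tn=870)
print(round(ppv, 3), round(npv, 3))  # 0.75 0.989
```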
what is the equation for sensitivity?
true positives / (true positives + false negatives)
Or true positives / those with the disease
what is the false negative rate?
false negatives / (true positives + false negatives)
what is the equation for specificity?
true negatives / (false positives + true negatives)
Or true negatives / those without the disease
what is the false positive rate?
false positives / (false positives + true negatives)
or
1 - specificity
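The four rates above can be checked against each other with a short sketch. The counts are made up and `diagnostic_rates` is just an illustrative helper name.

```python
def diagnostic_rates(tp, fp, fn, tn):
    """Sensitivity, specificity and their complementary error rates."""
    return {
        "sensitivity": tp / (tp + fn),          # true positive rate
        "false_negative_rate": fn / (tp + fn),  # 1 - sensitivity
        "specificity": tn / (fp + tn),          # true negative rate
        "false_positive_rate": fp / (fp + tn),  # 1 - specificity
    }


# Made-up counts: 100 people with the disease, 900 without.
rates = diagnostic_rates(tp=90, fp=30, fn=10, tn=870)
for name, value in rates.items():
    print(name, round(value, 3))
```

Sensitivity and the false negative rate share a denominator (those with the disease), so they sum to 1; the same holds for specificity and the false positive rate (those without the disease).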
what are ROC curves? what is a perfect test? what is a worthless test?
- These values of sensitivity and specificity can be presented graphically
- This type of graph = Receiver Operating Characteristic curve (ROC curve)
- Plot of the true positive rate (i.e. sensitivity) against the false positive rate (i.e. 1 - specificity) for the different possible criteria (or cutoff points) of a diagnostic test
- Shows the trade-off between sensitivity and specificity: as the cutoff point is moved, an increase in sensitivity will generally be accompanied by a decrease in specificity
- The area under the curve is a measure of test accuracy
- Accuracy of test depends on how well the test separates the group being tested into those with and without the disease in question
- An area of 1 represents a perfect test – where sensitivity and specificity are both 100%
- An area of 0.5 represents a worthless test, no better than chance: the curve is a diagonal line, where sensitivity equals 1 - specificity at every cutoff (the true positive rate equals the false positive rate). The closer the curve follows the left-hand border and then the top border, the more accurate the test
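A rough sketch of how an ROC curve and its area could be computed by hand. The scores are made up, a score at or above the cutoff is assumed to be a positive test, and the area uses the trapezoidal rule, which is one common choice rather than the only one.

```python
def roc_points(healthy_scores, diseased_scores):
    """Return (false positive rate, true positive rate) pairs, one per cutoff."""
    cutoffs = sorted(set(healthy_scores) | set(diseased_scores), reverse=True)
    points = [(0.0, 0.0)]  # cutoff above every score: nothing is called positive
    for cutoff in cutoffs:
        tpr = sum(s >= cutoff for s in diseased_scores) / len(diseased_scores)
        fpr = sum(s >= cutoff for s in healthy_scores) / len(healthy_scores)
        points.append((fpr, tpr))
    return points


def area_under_curve(points):
    """Trapezoidal area under ROC points ordered by increasing FPR."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up test scores with partial overlap between the groups.
healthy = [1, 2, 3, 4, 5, 6]
diseased = [4, 5, 6, 7, 8, 9]
auc = area_under_curve(roc_points(healthy, diseased))
print(round(auc, 3))  # 0.875

# No overlap gives an area of 1; identical groups give 0.5.
print(area_under_curve(roc_points([1, 2], [3, 4])))  # 1.0
print(area_under_curve(roc_points([1, 2], [1, 2])))  # 0.5
```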