chi-square test of goodness of fit - what does it establish - what used for

- Establishes whether or not an observed frequency distribution differs from a theoretical distribution - A simple application is to test the hypothesis that, in the general population, values would simply occur with equal frequency - But you might also want to test whether a sample from a population would resemble the population

what is the chi-square test of independence? - what does it assess - what does value of less than 0.05 tell you - what do 'crude' odd ratios do?

- Assesses whether paired observation on two variables, expressed in a contingency table, are independent of each other - So, you might perform a Chi-square test (of independence), testing whether the proportion of people with hypertension is significantly different in a group of people who have had a stroke, compared with a control group who haven’t. The null hypothesis in this example is that the proportions in the two groups are no different from each other. - A chi-square probability (p value) of less than 0.05 is commonly interpreted as justification for rejecting the null hypothesis - This suggests that there is an association or relationship between the variables, but the test doesn’t tell us what the structure or nature of that relationship is. - ‘crude’ odds ratios – don’t take into account the other variables that may be having an effect on the outcome

logistic regression - what used to do - it combines what to estimate what - what are the two types of logistic regression?

- Used to analyse relationships between a binary/dichotomous dependent variable and numerical or categorical independent variables - It combines the independent variables to estimate the probability that a particular event will occur - In general, it calculates the odds of someone getting a disease based on a set of covariates - simple and multiple

simple (of bivariate) logistic regression - used for what - give example

- Used to explore associations between one (dichotomous) outcome and one (continuous, ordinal, or categorical) exposure variable - How does smoking affect the likelihood of having pancreatitis?

multiple (or multivariate) logistic regression - what used to explore - what is purpose - give example

- Used to explore associations between one (dichotomous) outcome and two or more exposure variables (which may be continuous, ordinal or categorical) - Purpose is to let you isolate the relationship between the exposure variable from the effects of one or more other variables (covariates or confounders) - How does smoking affect the likelihood of having pancreatitis, after accounting for (unconfounded by) alcohol consumption, BMI etc. - = adjustment = accounting for covariates or confounders - Essentially, for each variable, multivariate logistic regression gives you an odds ratio showing the effect of a variable on the outcome, after controlling for the effects of the covariates

EBM Flashcards by Ella Fazakerley

what is the odds of a person having an outcome?

number of individuals with the outcome divided by the number of individuals without the outcome

How well did you know this?

Not at all

Perfectly

what are odds ratios used for?

(relative odds) are used to compare whether the likelihood of a certain event occurring is the same for two groups

How well did you know this?

Not at all

Perfectly

how do you work out OR?

odds of the outcome in one group divided by the odds of the outcome in another group

How well did you know this?

Not at all

Perfectly

what do different OR values mean?

If the OR = 1 there is no difference between the two groups
With an OR < 1, there is a greater likelihood of events in the control group – and the lower the odds ratio, the more likely that the treatment reduces the risk of events
An OR < 1 means that the exposure is associated with a reduced likelihood of the outcome. An OR > 1 implies that the outcome is associated with exposure, and increases as the exposure increases.

How well did you know this?

Not at all

Perfectly

what does chi-square test measure?

the fit of the observed values to ‘expected’ values

How well did you know this?

Not at all

Perfectly

what kind of data is chi-squared used with?

categorical variables

How well did you know this?

Not at all

Perfectly

what are the two types of chi-square test?

tests of goodness of fit and tests of independence

How well did you know this?

Not at all

Perfectly

chi-square test of goodness of fit

what does it establish
what used for

Establishes whether or not an observed frequency distribution differs from a theoretical distribution
A simple application is to test the hypothesis that, in the general population, values would simply occur with equal frequency
But you might also want to test whether a sample from a population would resemble the population

How well did you know this?

Not at all

Perfectly

what is the chi-square test of independence?

what does it assess
what does value of less than 0.05 tell you
what do ‘crude’ odd ratios do?

Assesses whether paired observation on two variables, expressed in a contingency table, are independent of each other
So, you might perform a Chi-square test (of independence), testing whether the proportion of people with hypertension is significantly different in a group of people who have had a stroke, compared with a control group who haven’t. The null hypothesis in this example is that the proportions in the two groups are no different from each other.
A chi-square probability (p value) of less than 0.05 is commonly interpreted as justification for rejecting the null hypothesis
This suggests that there is an association or relationship between the variables, but the test doesn’t tell us what the structure or nature of that relationship is.
‘crude’ odds ratios – don’t take into account the other variables that may be having an effect on the outcome

How well did you know this?

Not at all

Perfectly

what is a confounding variable?

A confounding variable, or confounder, has an effect on the outcome and is also correlated to the exposure e.g. people who smoke also tend to drink more
Common confounders include age, socioeconomic status and gender

How well did you know this?

Not at all

Perfectly

what controls for any potential confounders at one time?

Multiple (or multivariate) logistic regression controls for any potential confounders at one time

How well did you know this?

Not at all

Perfectly

what is regression? what are the two types? what each used for?

Statistical procedure which attempts to predict the values of a given variable (dependent, outcome) based on the values of one or more other variables (independent, predictors, or covariates)
The result of a regression is usually an equation which summarises the relationship between the dependent and independent variable
Linear regression is used to predict the values of a continuous outcome variable (such as height, weight, systolic blood pressure), based on the values of one or more independent predictor variables
Logistic regression is intended for the modelling of dichotomous categorical outcomes

How well did you know this?

Not at all

Perfectly

logistic regression

what used to do
it combines what to estimate what
what are the two types of logistic regression?

Used to analyse relationships between a binary/dichotomous dependent variable and numerical or categorical independent variables
It combines the independent variables to estimate the probability that a particular event will occur
In general, it calculates the odds of someone getting a disease based on a set of covariates
simple and multiple

How well did you know this?

Not at all

Perfectly

simple (of bivariate) logistic regression

used for what
give example

Used to explore associations between one (dichotomous) outcome and one (continuous, ordinal, or categorical) exposure variable
How does smoking affect the likelihood of having pancreatitis?

How well did you know this?

Not at all

Perfectly

multiple (or multivariate) logistic regression

what used to explore
what is purpose
give example

Used to explore associations between one (dichotomous) outcome and two or more exposure variables (which may be continuous, ordinal or categorical)
Purpose is to let you isolate the relationship between the exposure variable from the effects of one or more other variables (covariates or confounders)
How does smoking affect the likelihood of having pancreatitis, after accounting for (unconfounded by) alcohol consumption, BMI etc.
= adjustment = accounting for covariates or confounders
Essentially, for each variable, multivariate logistic regression gives you an odds ratio showing the effect of a variable on the outcome, after controlling for the effects of the covariates

How well did you know this?

Not at all

Perfectly

what can comparing the results of simple and multiple logistic regression help with?

Study These Flashcards

to answer the question ‘how much did the covariates in the model alter the relationship between exposure and outcome? (i.e. how much confounding was there)?

where on graph are false positives and false negatives?

Study These Flashcards

make up the area of overlap where the test can’t distinguish normal from disease

how can the relative number of false positives and false negatives be changed?

Study These Flashcards

by shifting the position of the cutoff point (so, as one goes down, the other will go up)
- But the numbers of people correctly identified as having the disease or not having the disease will also change

what are sensitivity and specificity?

Study These Flashcards

Measures for assessing the performance of diagnostic and screening tests
Sensitivity is a measure of the probability of correctly diagnosing a condition, whilst specificity is a measure of the probability of correctly identifying a non-diseased person (only concerned with those that have the disease – even if not diagnosed)
Sensitivity is the proportion of people with the disease correctly identified by the test – the probability that a test results will be positive when the disease is present (referred to as the true positive rate)
Specificity is the proportion of people without the disease correctly identified by the test – the probability that a test result will be negative when the disease is not present (referred to as the true negative rate) (only concerned with those that don’t have the disease)

what is the positive predictive value? how calculate?

Study These Flashcards

probability that the disease is present when the test is positive (only concerned with positive test results – those to the right of the cutoff point)

PPV = probability that the disease is present when the test is positive
True positives / (true positives + false positives)

what is the negative predictive value? how calculate?

Study These Flashcards

probability that the disease is not present when the test is negative (only concerned with negative test results – those to the left of the cutoff point)

Probability that the disease is not present when the test is negative
True negatives / (false negatives + true negatives)

what is the equation for sensitivity?

Study These Flashcards

true positives / (true positives + false negatives)

Or true positives / those with the disease

what is the false negatives rate?

Study These Flashcards

false negatives / (true positives + false negatives)

what is the equation for specificity?

Study These Flashcards

true negatives / (false positives + true negatives)

Or true negatives / those without the disease

what is the false positives rate?

false positives / (false positives + true negatives) or 1 - specificity

what are ROC curves? what is perfect test? what is worthless test?

- These values of sensitivity and specificity can be presented graphically - This type of graph = Receiver Operating Characteristics curve (ROC curve) - Plot of the true positive rate (i.e. sensitivity) against the false positive rate (i.e. 1 - specificity) for the different possible criteria (or cutoff points) of a diagnostic test - Shows trade-off between sensitivity and specificity; any increase in sensitivity will be accompanied by a decrease in specificity - The area under the curve is a measure of test accuracy - Accuracy of test depends on how well the test separates the group being tested into those with and without the disease in question - An area of 1 represents a perfect test – where sensitivity and specificity are both 100% - An area of 0.5 represents a worthless test – where sensitivity and specificity are both 50% - found with a diagonal line, whereas the close the curve follows the left-hand border and then the top border, the more accurate the test

EBM Flashcards

(26 cards)