WEEK 4 Flashcards
What is relative risk (RR)?
Calculated as the ratio of “the risk of developing the disease among patients in the exposed group” to “the risk of developing the disease among patients in the unexposed group”.
The outcome variable must be binary (have only two categories), but the exposure variable could have two or more categories.
TIP: An easy way to remember RR: Relative Risk is about the risk of getting a disease.
What does an RR of 1 indicate?
If the Relative Risk is:
= 1 Patients in both groups have the same risk.
> 1 Patients in the exposed group are at increased risk compared to those in the unexposed group.
< 1 Patients in the exposed group are at lower risk than those in the unexposed group,
i.e. the exposure is “protective”.
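As a minimal sketch of the definition and interpretation above: the cell labels a, b, c, d, the function names and the counts are assumptions made for illustration, not part of the notes.

```python
def relative_risk(a, b, c, d):
    """Relative risk from a 2x2 table.

    Assumed layout (rows = exposure, columns = outcome):
        a = exposed with disease,   b = exposed without disease
        c = unexposed with disease, d = unexposed without disease
    """
    risk_exposed = a / (a + b)      # risk of disease in the exposed group
    risk_unexposed = c / (c + d)    # risk of disease in the unexposed group
    return risk_exposed / risk_unexposed


def interpret_rr(rr):
    """Map an RR value to the wording used in the notes."""
    if rr > 1:
        return "increased risk in the exposed group"
    if rr < 1:
        return "lower risk in the exposed group (exposure is protective)"
    return "same risk in both groups"


# Made-up counts, for illustration only.
rr = relative_risk(a=20, b=80, c=10, d=90)
print(round(rr, 2), "->", interpret_rr(rr))
```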
What is the odds ratio?
Odds Ratio (OR)
The Odds Ratio (OR) is generally appropriate for case-control studies.
(i.e. Start with cases and controls, and ask both groups about numerous exposures.)
The formula for the Odds Ratio is:
OR = (odds of being exposed for cases) / (odds of being exposed for controls) = (a / c) / (b / d)
where a = exposed cases, b = exposed controls, c = unexposed cases and d = unexposed controls.
i.e. The ratio of the “odds of exposure of cases” to the “odds of exposure of controls”.
TIP: An easy way to remember OR: Odds Ratio is about the odds of exposure.
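The formula above can be sketched as follows. The function name and the counts are hypothetical; the cell labels follow the formula (a and c are cases, b and d are controls).

```python
def odds_ratio(a, b, c, d):
    """Odds ratio of exposure, cases versus controls.

    Cell labels as in the formula:
        a = exposed cases,   b = exposed controls
        c = unexposed cases, d = unexposed controls
    """
    odds_cases = a / c       # odds of being exposed for cases
    odds_controls = b / d    # odds of being exposed for controls
    return odds_cases / odds_controls


# Made-up case-control counts, for illustration only.
or_value = odds_ratio(a=40, b=20, c=60, d=80)
print(f"The odds of exposure in cases are {or_value:.2f} times those in controls.")
```

Algebraically this is the same as the cross-product (a × d) / (b × c).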
Interpreting the Odds Ratio in words
The correct way to word the Odds Ratio is:
The odds of exposure in cases are ___ times those in controls.
i.e. Phrase as odds of exposure in cases/non-cases, as this technically matches the study design better.
However, given that the Relative Risk is written in terms of outcome (i.e. outcome in exposed vs unexposed groups), the Odds Ratio is often written in this way as well because it is more intuitive.
Odds ratio results
If the OR is:
= 1 The odds of exposure are the same in cases and controls.
> 1 The odds of exposure are higher in the cases than in the controls.
< 1 The odds of exposure are lower in the cases than in the controls.
(i.e. More exposure in controls than cases)
Hypothesis test for quantifying risk: Odds ratio
95% CI and p-value for OR
We can evaluate the significance of OR by calculating 95% confidence intervals and the p-value by performing a hypothesis test.
Similar to RR, the sampling distribution of OR is not normal; however, the natural logarithm of OR (ln(OR)) is approximately normally distributed.
Below are all the formulas you will need:
Steps for calculating 95% CI for OR
If the confidence interval excludes the value of 1 (a value of OR different from 1 is an indication of higher/lower odds), we say that the odds of exposure in the cases and controls are significantly different.
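The formulas themselves are not reproduced in these notes. As a sketch, assuming the standard large-sample (log method) formulas SE(ln(OR)) = sqrt(1/a + 1/b + 1/c + 1/d) and 95% CI = exp(ln(OR) ± 1.96 × SE), the steps could look like this. The function name and counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z_crit=1.96):
    """OR with an approximate 95% CI, computed on the log scale.

    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    Assumes the usual large-sample SE of ln(OR); all cells must be > 0.
    """
    or_value = (a * d) / (b * c)
    # Step 1: standard error of ln(OR)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    # Step 2: confidence limits on the log scale
    log_or = math.log(or_value)
    log_lower = log_or - z_crit * se
    log_upper = log_or + z_crit * se
    # Step 3: back-transform to the OR scale
    return or_value, math.exp(log_lower), math.exp(log_upper)


# Made-up counts, for illustration only.
or_value, lower, upper = odds_ratio_ci(a=40, b=20, c=60, d=80)
excludes_one = lower > 1 or upper < 1
print(f"OR = {or_value:.2f}, 95% CI {lower:.2f} to {upper:.2f}, excludes 1: {excludes_one}")
```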
Calculating the p-value
Odds Ratio
Null and alternative hypothesis
We can also calculate the p-value by performing a hypothesis test.
The null and alternative hypotheses are as follows:
Null hypothesis: The odds of exposure in cases and controls in the study population are the same, that is, the population OR = 1.
Alternative hypothesis: The odds of exposure in cases and controls in the study population are different, that is, the population OR ≠ 1.
The test statistic for testing these hypotheses is the Z-statistic, which follows the standard normal distribution under the null hypothesis and is given by:
Z-statistic = ln(OR)/SE, where SE is the standard error of ln(OR).
The p-value is obtained using Table A1: Normal distribution probability table.
Remember: Table A1 gives the area in one tail.
We need two tails to obtain the p-value.
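A minimal sketch of this test, using Python's standard normal distribution in place of Table A1. The SE formula sqrt(1/a + 1/b + 1/c + 1/d) is assumed (the usual large-sample formula), and the function name and counts are hypothetical.

```python
import math
from statistics import NormalDist


def or_z_test(a, b, c, d):
    """Z-statistic and two-tailed p-value for H0: population OR = 1."""
    or_value = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # SE of ln(OR), assumed formula
    z = math.log(or_value) / se
    # Area in ONE tail beyond |z| (what a one-tail table would give) ...
    one_tail = 1 - NormalDist().cdf(abs(z))
    # ... doubled, because the alternative hypothesis is two-sided.
    p_value = 2 * one_tail
    return z, p_value


# Made-up counts, for illustration only.
z, p = or_z_test(a=40, b=20, c=60, d=80)
print(f"Z = {z:.2f}, two-tailed p = {p:.4f}")
```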
Calculating 95% CI for RR
Interpretation of 95% CI:
If the confidence interval excludes the value of 1 (a value of RR different from 1 is an indication of higher/lower risk), we say that the risks of developing disease in the exposed and unexposed groups are significantly different.
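The calculation is not written out in these notes. As a sketch, assuming the standard large-sample formula SE(ln(RR)) = sqrt(1/a − 1/(a+b) + 1/c − 1/(c+d)) and 95% CI = exp(ln(RR) ± 1.96 × SE), it could be done as below. The cell labels, function name and counts are hypothetical.

```python
import math


def relative_risk_ci(a, b, c, d, z_crit=1.96):
    """RR with an approximate 95% CI, computed on the log scale.

    Assumed layout: a = exposed with disease, b = exposed without disease,
    c = unexposed with disease, d = unexposed without disease.
    """
    rr = (a / (a + b)) / (c / (c + d))
    # Assumed large-sample standard error of ln(RR)
    se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(rr) - z_crit * se)
    upper = math.exp(math.log(rr) + z_crit * se)
    return rr, lower, upper


# Made-up counts, for illustration only.
rr, lower, upper = relative_risk_ci(a=20, b=80, c=10, d=90)
includes_one = lower <= 1 <= upper
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}, includes 1: {includes_one}")
```

With these made-up counts the interval includes 1, so an RR well above 1 is still not statistically significant at the 5% level.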
Hypothesis test & p-value (RR)
We can also calculate the p-value by performing a hypothesis test. The null and alternative hypotheses are as follows:
Null hypothesis: The risk of disease in the exposed and unexposed groups in the study population is the same, that is, the population RR = 1.
Alternative hypothesis: The risk of disease in the exposed and unexposed groups in the study population is different, that is, the population RR ≠ 1.
The test statistic for testing these hypotheses is the Z-statistic, which follows the standard normal distribution under the null hypothesis and is given by:
Z-statistic = ln(RR)/SE, where SE is the standard error of ln(RR).
The p-value is obtained using Table A1: Normal distribution probability table.
Remember: Table A1 gives the area in one tail.
We need two tails to obtain the p-value.
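The same test can be sketched in code, again assuming SE(ln(RR)) = sqrt(1/a − 1/(a+b) + 1/c − 1/(c+d)) and using the standard normal distribution in place of the table. The function name and counts are hypothetical.

```python
import math
from statistics import NormalDist


def rr_z_test(a, b, c, d):
    """Z-statistic and two-tailed p-value for H0: population RR = 1.

    Assumed layout: a = exposed with disease, b = exposed without disease,
    c = unexposed with disease, d = unexposed without disease.
    """
    rr = (a / (a + b)) / (c / (c + d))
    se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    z = math.log(rr) / se
    # Double the one-tail area to get the two-tailed p-value.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Made-up counts, for illustration only.
z, p = rr_z_test(a=20, b=80, c=10, d=90)
print(f"Z = {z:.2f}, two-tailed p = {p:.3f}")
```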
Chi-square analysis
Chi-square (χ²) tests whether two categorical variables are associated, i.e. whether the variation in the data is due to chance or due to the variables being tested.
It is a statistical test commonly used to compare the observed frequencies with the frequencies we would expect if the null hypothesis were true.
Thus the “expected frequencies” are the frequencies that would occur if the proportion with the event were the same in each group.
Chi-square tests are commonly used to evaluate contingency tables.
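As a sketch of how expected frequencies are usually obtained, each cell's expected count is (row total × column total) / grand total. The function name and the counts are made up for illustration.

```python
def expected_frequencies(observed):
    """Expected counts under H0 (no association) for a table of observed counts.

    `observed` is a list of rows, e.g. [[40, 20], [60, 80]].
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [
        [r * c / grand_total for c in col_totals]
        for r in row_totals
    ]


# Made-up observed counts (rows = exposure, columns = outcome).
observed = [[40, 20], [60, 80]]
expected = expected_frequencies(observed)
print(expected)
```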
Contingency tables
When data has been grouped into categories, we often arrange the counts (frequencies) in a tabular format known as a contingency table or two-way table.
In the simplest case, two dichotomous random variables are involved; the rows of the table represent the categories of one variable (e.g., exposure), and the columns represent the categories of the other variable (e.g., outcome).
The entries in the table are the frequencies (also known as observed frequencies) that correspond to a particular combination of categories.
In a contingency table, the outcome is usually presented in the columns and the exposure in the rows.
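A small sketch of building such a table from raw records, with exposure in the rows and outcome in the columns. The category labels, function name and records are hypothetical.

```python
from collections import Counter


def contingency_table(records, row_levels, col_levels):
    """Count (exposure, outcome) pairs into a two-way table.

    Rows follow `row_levels` (exposure); columns follow `col_levels` (outcome).
    """
    counts = Counter(records)
    return [[counts[(r, c)] for c in col_levels] for r in row_levels]


# Made-up records: one (exposure, outcome) pair per participant.
records = (
    [("exposed", "disease")] * 3
    + [("exposed", "no disease")] * 2
    + [("unexposed", "disease")] * 1
    + [("unexposed", "no disease")] * 4
)
table = contingency_table(
    records,
    row_levels=["exposed", "unexposed"],
    col_levels=["disease", "no disease"],
)
print(table)
```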
Hypothesis testing steps for χ² test
Step 1: Establish study design
A common research question could be “is there any association between the two categorical variables (outcome and exposure)?”
Step 2: Set up hypotheses and determine level of significance
This time, the hypotheses are about an “association” between the two categorical variables.
H0: There is no association between the two variables.
Ha: There is an association between the two variables.
Level of significance (α): 0.05 unless otherwise specified.
Step 3: Select the appropriate test statistic
Step 4: Compute statistic
Step 5: Conclusions
The χ² test of independence is used to test whether the distribution of the outcome variable is similar across the comparison groups.
Either “do not reject” (if p ≥ 0.05) or “reject” (if p < 0.05) the null hypothesis.
The test assesses whether there is a statistically significant difference in the distribution of the outcome across exposure groups.
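Steps 4 and 5 can be sketched for a 2×2 table as follows, assuming the Pearson statistic χ² = Σ (observed − expected)² / expected without a continuity correction. The p-value shortcut used here only holds for 1 degree of freedom (a 2×2 table); the function name and counts are hypothetical.

```python
import math


def chi_square_2x2(observed, alpha=0.05):
    """Pearson chi-square test of independence for a 2x2 table.

    Returns (chi2, p_value, decision). No continuity correction is applied.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n   # expected count under H0
            chi2 += (obs - exp) ** 2 / exp

    # With 1 degree of freedom, chi-square is the square of a standard normal,
    # so the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    decision = "reject H0" if p_value < alpha else "do not reject H0"
    return chi2, p_value, decision


# Made-up counts (rows = exposure, columns = outcome).
chi2, p_value, decision = chi_square_2x2([[40, 20], [60, 80]])
print(f"chi-square = {chi2:.2f}, p = {p_value:.4f}: {decision}")
```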
A chi-squared test is not valid if
A chi-squared test is not valid if more than 20% of the cells have an expected frequency smaller than 5.
In the above example, 0% of the cells have an expected frequency smaller than 5.
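The validity rule can be checked mechanically. This is a sketch only: the function name, its parameters and the expected counts below are made up for illustration.

```python
def chi_square_is_valid(expected, min_expected=5, max_fraction=0.20):
    """Apply the rule: not valid if more than 20% of cells have expected < 5.

    `expected` is a table (list of rows) of expected frequencies.
    """
    cells = [e for row in expected for e in row]
    small_cells = sum(1 for e in cells if e < min_expected)
    return small_cells / len(cells) <= max_fraction


# Made-up expected frequencies.
print(chi_square_is_valid([[30.0, 30.0], [70.0, 70.0]]))   # no small cells
print(chi_square_is_valid([[2.0, 8.0], [3.0, 87.0]]))      # 2 of 4 cells below 5
```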