2021 Epi/Biostats Flashcards
Name 4 ways of dealing with confounding?
1. randomization 2. stratification 3. matching 4. regression modelling
Define Per-Protocol Analysis (PP) and Intention-to-Treat Analysis (ITT)
Per-Protocol Analysis (PP): Strategy of analysis in which only patients who complete the entire study are counted towards the results.
Intention-to-Treat Analysis (ITT): Groups are analyzed exactly as they existed upon randomization (i.e. using data from all patients, including those who did not complete the study).
What kind of study design is this? A study that examines the distribution of BMI by age in Ontario at a particular point in time.
cross-sectional
What’s the difference between Pearson and Spearman correlation?
Different types of correlation are used for different levels of measurement. Pearson is most appropriate for continuous, approximately Normal measurements taken from an interval scale; Spearman is more appropriate for ordinal or non-Normal data. The Spearman correlation is the Pearson correlation computed on the ranks of the data. Examples of interval scales include “temperature in Fahrenheit” and “length in inches”, in which the individual units (1 deg F, 1 in) are meaningful. Things like “satisfaction scores” tend to be of the ordinal type: while it is clear that “5 happiness” is happier than “3 happiness”, it is not clear whether you could give a meaningful interpretation of “1 unit of happiness”. When many measurements of the ordinal type are added up, the resulting score is really neither ordinal nor interval, and is difficult to interpret.
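A minimal sketch of the two coefficients, using only the Python standard library. The helper names (pearson, ranks, spearman) and the data are hypothetical and chosen for illustration; Spearman is computed here as Pearson applied to average ranks.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical data: y rises monotonically, but not linearly, with x.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
print("Pearson :", round(pearson(x, y), 4))
print("Spearman:", round(spearman(x, y), 4))
```

With a monotonic but curved relationship like this one, Spearman reaches 1 because the ranks agree perfectly, while Pearson stays below 1 because the points do not lie on a straight line.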
What statistical test will you use to compare the mean values of an outcome variable between two groups (e.g. difference in average BP between men and women)
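Two-sample t-test (a two-sample Z-test is appropriate when the samples are large or the population variances are known)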
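A minimal sketch of a two-sample comparison of means, assuming small samples with possibly unequal variances. It uses Welch's t statistic, which is a swapped-in variant and not necessarily the test the card intends. The function name welch_t and the blood pressure values are hypothetical. The p-value is omitted because the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    # Squared standard error contributed by each group.
    se2_a = variance(group_a) / len(group_a)
    se2_b = variance(group_b) / len(group_b)
    t = (mean(group_a) - mean(group_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(group_a) - 1) + se2_b ** 2 / (len(group_b) - 1)
    )
    return t, df


# Hypothetical systolic BP readings (mmHg) for two groups.
men = [120, 130, 125, 135, 140]
women = [110, 115, 120, 118, 112]
t_stat, dof = welch_t(men, women)
print("t =", round(t_stat, 3), "df =", round(dof, 2))
```

The statistic would then be compared against a t distribution with the computed degrees of freedom, for example using a statistics package or a printed table.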
What statistical test will you use to test the correspondence between a theoretical frequency distribution and an observed frequency distribution (e.g. if one sample of 20 patients is 30% hypertensive and another comparison group of 25 patients is 60% hypertensive)
A chi-squared test determines if this variation is more than expected or due to chance alone
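The example in the question can be worked through as a 2x2 table: 30% of 20 patients is 6 hypertensive, and 60% of 25 patients is 15 hypertensive. The sketch below assumes a Pearson chi-squared test without continuity correction. The function name chi2_2x2 is hypothetical, and the p-value shortcut is valid only for 1 degree of freedom.

```python
from math import erfc, sqrt


def chi2_2x2(table):
    """Pearson chi-squared statistic and p-value for a 2x2 table (1 df)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if group and outcome were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # For 1 df, P(chi2 > x) equals P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Rows: sample 1 (n=20), sample 2 (n=25); columns: hypertensive, not hypertensive.
table = [[6, 14], [15, 10]]
chi2, p = chi2_2x2(table)
print("chi-squared =", round(chi2, 3), "p =", round(p, 3))
```

All expected counts here exceed 5, so the chi-squared approximation is reasonable; with smaller expected counts, Fisher's exact test is the usual alternative.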
Define a Secondary Attack Rate and show how to calculate it?
• The proportion of individuals who develop disease as a result of exposure to primary contacts during the incubation period
• A measure of infectiousness, which reflects the ease of disease transmission
• = [(total # of cases - initial # of cases) / (# of susceptible individuals - initial # of cases)] * 100%
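The formula above can be sketched as a small function. The function name, parameter names, and the outbreak numbers are hypothetical and used only for illustration.

```python
def secondary_attack_rate(total_cases, initial_cases, susceptible):
    """Secondary attack rate as a percentage.

    total_cases:   all cases in the group, including the initial cases
    initial_cases: primary (index) cases
    susceptible:   susceptible individuals in the group, including the initial cases
    """
    at_risk = susceptible - initial_cases
    if at_risk <= 0:
        raise ValueError("no susceptible contacts remain after removing initial cases")
    secondary_cases = total_cases - initial_cases
    return 100 * secondary_cases / at_risk


# Hypothetical outbreak: 1 index case, 4 cases in total, 11 susceptible people.
print(secondary_attack_rate(total_cases=4, initial_cases=1, susceptible=11))
```

In this hypothetical outbreak, 3 of the 10 exposed susceptible contacts became ill, giving a secondary attack rate of 30%.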
Investigators intend to study the causes of a rare cancer using a case-control study.
They plan to recruit cases from a national cancer registry, including those diagnosed within the last 3 years.
To avoid bias associated with interviewing proxies about case exposure histories, they decide to exclude all deceased cases, about 40% of the cases in the registry.
Discuss the advantages and disadvantages of this approach.
● Advantages: Avoids differential recall between proxies and cases. Both may be affected by recall bias, but proxies may be unaware of exposure histories during certain time periods (e.g. childhood, young adulthood). Reduces ethical issues that might occur if partners of the deceased needed to be contacted.
● Disadvantages: Likely excludes the most severe disease, meaning the included cases are less representative of cases in the population. Exposure may impact the risk of severe disease differently than that of less severe disease. Loses study power.
To evaluate a new back school, patients with lower back pain were randomly allocated to either the new school or to conventional occupational therapy. After 3 months they were questioned about their back pain, and observed lifting a weight by independent monitors.
What kind of study design is this?
Randomized controlled trial
To investigate the relationship between certain solvents and cancer, all employees at a factory were questioned about their exposure to an industrial solvent, and the amount and length of exposure measured. These subjects were regularly monitored, and after 10 years a copy of the death certificate for all those who had died was obtained.
What kind of study design is this?
Cohort study
You have received a request to develop a surveillance program focused on COVID-19. List and briefly describe six attributes of a surveillance system as they would pertain to COVID-19.
- Simplicity: the flow of information is simple (few providers, few systems, same IT structure, easy operation)
- Flexibility: the system is able to adapt to changing information needs or operating conditions – new providers of information, new data requirements, new case definition etc
- Data quality: complete and valid
- Acceptability: willingness of persons and organizations to participate in the surveillance system
- Sensitivity: proportion of cases detected by the system & ability to detect outbreaks and monitor trends
- Positive predictive value (sometimes listed as specificity): proportion of cases reported that actually have the disease/event of interest
- Timeliness: speed between steps in the surveillance system
- Stability: reliable (able to collect info without downtime) and available (accessible to users when they need to know)
- Representativeness: accurately represents health information trends by person, place, and time
Identify 5 mechanisms by which residual confounding can occur after attempts to control for confounding have been made in both the study design and analysis
- Randomisation - too small of a sample or errors in randomising
- Restriction & matching - not tight enough e.g. large age range where comparison groups end up with different age structures
- Not all confounders were accounted for in the analysis because data on them was not collected
- There were misclassification errors of confounders
- Categorisation of confounders was not tight enough e.g. too large of age bands
What do you understand by active and passive surveillance? Provide an example of each.
Active Surveillance
- Outreach such as visits or phone calls by the public health/surveillance authority to detect unreported cases
- e.g. an infection control nurse goes to the ward and reviews temperature charts to see if any patient has a nosocomial infection
Passive Surveillance
- A surveillance system where the public health/surveillance authority depends on others to submit standardized forms or other means of reporting cases
- e.g. ward staff notify infection control when new cases of nosocomial infections are discovered
Define standardization.
When do you use Direct Standardization and Indirect Standardization?
Definition: A statistical method used to calculate summary rates of health outcomes that are adjusted to take into account confounders (e.g. age). The standardized rates allow for a less distorted comparison between 2+ populations, showing how overall rates of disease/mortality would compare if the 2+ populations hypothetically had the same distribution of the confounder (e.g. the same age distribution).
- Direct Standardization: Used when age-specific rates of disease/mortality are known for the populations being compared.
- Indirect Standardization: Used when it is difficult to obtain reliable estimates of age-specific rates due to small number of observations (therefore unstable rates, or random error)
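A minimal sketch of both approaches, assuming three age bands. All rates, population counts, and function names are hypothetical. Direct standardization applies the study population's age-specific rates to a standard population; indirect standardization applies reference rates to the study population and reports a standardized mortality ratio (SMR).

```python
def direct_standardized_rate(age_specific_rates, standard_population):
    """Weighted average of the study rates, weighted by the standard population."""
    expected_events = sum(
        rate * count for rate, count in zip(age_specific_rates, standard_population)
    )
    return expected_events / sum(standard_population)


def standardized_mortality_ratio(observed_deaths, study_population, reference_rates):
    """Observed deaths divided by the deaths expected under reference rates."""
    expected_deaths = sum(
        count * rate for count, rate in zip(study_population, reference_rates)
    )
    return observed_deaths / expected_deaths


# Hypothetical age bands: young, middle-aged, older.
study_rates = [0.001, 0.005, 0.02]       # deaths per person-year in the study population
standard_pop = [5000, 3000, 2000]        # standard population counts
print("Directly standardized rate:", direct_standardized_rate(study_rates, standard_pop))

study_pop = [1000, 1000, 500]            # study population counts
reference_rates = [0.002, 0.01, 0.02]    # reference population rates
print("SMR:", round(standardized_mortality_ratio(30, study_pop, reference_rates), 3))
```

An SMR above 1 suggests more deaths were observed than expected from the reference rates; the indirect method needs only the total observed deaths, which is why it suits small populations with unstable age-specific rates.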
List 3 ways to reduce interviewer bias
- Use standardized questionnaires consisting of closed-end, easy to understand questions with appropriate response options.
- Train all interviewers to adhere to the question and answer format strictly, with the same degree of questioning for both cases and controls.
- Obtain data or verify data by examining pre-existing records (e.g., medical records or employment records) or assessing biomarkers.