Module 3 Flashcards
What does QuODDID stand for?
Qu: Question of scientific interest
O: Outcome measures
D: Design
D: Data analysis
I: Interpretation
D: Dissemination
What is the manual of operations?
A protocol/statistical analysis plan in which you write down exactly what you plan to do before beginning the study. It is also where you document when and why you deviated from the plan.
Secondary questions involve _____(1) and can help paint a more complete picture of what’s going on.
(1) stratified subgroup analysis
What are the three desirable characteristics of a research question?
- Relevant: Is the question related to the topic of interest?
- Precise: Is the population of interest specified? Is relationship of interest specified?
- Testable by the study: Can we actually assess and answer the question within our study?
Remember: Can you Really Print That?
location: brody
A fuzzy research question can leave one open to critiques about:
- Research methods not matching the question you intended to ask
- Methods selected to try and increase your chance of a significant finding after the fact
What are the three desirable characteristics of outcome measures?
- Relevant
- Precise and accurate
- “Movable” : likely to be influenced by the interventions
Our ability to achieve accuracy and precision is impacted by _____(1).
(1) the level of granularity we use to measure things
What is the difference between accuracy and precision?
Accuracy: Dots are scattered around the centre of the bullseye (correct on average), but they are not concentrated in one area.
Precision: Dots are concentrated in one area, but that area is not near the centre of the bullseye.
When we talk about precision and accuracy, what are the three methods we can use? Order them from best to worst.
Numerical measurement > ordered categories > binary categories
If we wanted to measure how fast someone could run:
1. Running speed
2. Is the subject fast/moderate/slow
3. Is the subject fast or not fast
What are the two types of study designs?
- Experiment: Assignment to treatment groups is controlled by the investigator
- treatment assignment may be done using a randomized mechanism or not
- treatment may be masked/blinded (single, double, triple) or not
- Observational study: Assignment to treatment groups is not controlled by the investigator. Participants usually self-select
Who are the four stakeholders in experimental design?
- Investigator: Individual asking the question and designing the study to answer it
- Participant: Individual being observed to understand the impact of the intervention on the outcome measure
- Assessor: Individual evaluating the outcome measure for each of the participants
- Data analyst: Individual analyzing the data
What is single, double, or triple blinded?
Single: The participant doesn’t know which treatment group they are in. The assessor and data analyst know.
Double: The participant and the assessor of the outcome don’t know which treatment group the participant was assigned to. Only the data analyst knows. This protects against the assessor evaluating measures in a biased way.
Triple: The assessor, participant, and data analyst don’t know the treatment groups. This prevents bias from the data analyst.
What are the two subcategories of experimental study design?
- Parallel group study: Subjects are allocated to receive only one “level” of the treatment being compared. Comparison of outcomes is done ACROSS experimental units (people). Comparison groups are made similar through randomization.
- Cross-over study: Each subject gets multiple “levels” of the treatment. Comparison of outcomes is WITHIN experimental units (people). Each unit acts as its “own control/comparator”. Randomization here determines which treatment each subject receives first.
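The randomization step in a cross-over study can be sketched roughly as below. This is only an illustration: the function name, the treatment labels "A" and "B", and the participant IDs are made up, and a real trial would usually also balance how many people get each order.

```python
import random

def assign_crossover_order(participants, treatments=("A", "B"), seed=0):
    """Randomly pick the order in which each participant receives the treatments.

    Hypothetical helper: every participant still receives every treatment,
    only the order is randomized.
    """
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    orders = {}
    for person in participants:
        order = list(treatments)
        rng.shuffle(order)     # random order of treatments for this person
        orders[person] = order
    return orders

# Hypothetical participant IDs
for person, order in assign_crossover_order(["P1", "P2", "P3", "P4"]).items():
    print(person, "receives", " then ".join(order))
```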
In a cross-over study design, you no longer need to worry about comparing groups to groups. Comparison is within each participant. This minimizes _____(1) and there is thus less of a chance of _____(2).
(1) the external characteristics that can influence the primary relationship we care about
(2) confounders
What is the challenge with cross-over studies?
WASH OUT
How do we ensure that the impact of the first exposure/intervention doesn’t linger and affect the outcome measure of their second exposure/intervention?
What two outcome measures did we use for the chocolate trial?
- Individual ratings for two chocolates
- Direct comparison of two chocolates
What is a DAG?
A directed acyclic graph that illustrates the relationships between variables
In a DAG, variables in rectangles are ___(1) while variables in ovals are ____(2).
(1) observed/measured
(2) not observed/measured
In a DAG, lines (arrows) represent a _____(1) between variables.
(1) causal relationship
Absence of lines in a DAG means _____(1) given the other variables.
(1) conditional independence
Randomization can ____(1) from observed and measured variables, from unobserved and unmeasured variables, and even unknown variables
(1) mitigate confounding
What is the importance of removing confounding variables in an experiment?
It allows us to be more confident that the associations we observe reflect the true relationship between the two variables we want to study
In our chocolate trial, what were possible confounders and how did we remove them from our experiment?
- Tasting order
- Chocoholic gene
When the outcome variable of interest is _______(1) instead of continuous, we talk about ______(2) instead of means
(1) dichotomous(binary)
(2) proportions
How do you calculate the confidence interval for a single proportion?
Refer to the Module 3 cheatsheet in the notes
Just know how to calculate p-hat, and be familiar with all the terms in the formula for the SE
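A rough sketch of the calculation, assuming the cheatsheet uses the standard large-sample (Wald) interval, p-hat ± z × sqrt(p-hat × (1 − p-hat) / n). These notes do not spell the formula out, so treat that as an assumption. The function name and the counts are made up for illustration.

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Approximate confidence interval for a single proportion (Wald interval).

    Assumption: z = 1.96 corresponds to an approximate 95% interval.
    """
    p_hat = successes / n                    # sample proportion
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error of p_hat
    return p_hat - z * se, p_hat + z * se

# Hypothetical data: 30 of 50 tasters prefer one chocolate
low, high = proportion_ci(30, 50)
print(f"p_hat = {30 / 50:.2f}, 95% CI = ({low:.3f}, {high:.3f})")
```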
How do you do a hypothesis test for a single proportion?
Calculate the confidence interval as you would for a single proportion, then check whether the null value is contained in the CI. If it is, we fail to reject the null hypothesis; if it is not, we reject it.
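The CI-based test can be sketched like this, again assuming the standard Wald interval (an assumption, since the cheatsheet formula is not reproduced in these notes). The function name, the null value of 0.5, and the counts are hypothetical.

```python
import math

def null_value_rejected(successes, n, null_value, z=1.96):
    """Return True if the null value lies outside the approximate 95% CI."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    low, high = p_hat - z * se, p_hat + z * se
    # Reject only when the null value is NOT contained in the interval
    return not (low <= null_value <= high)

# Hypothetical null: no preference, so the true proportion is 0.5
print(null_value_rejected(30, 50, 0.5))  # 30 of 50 prefer chocolate A
print(null_value_rejected(40, 50, 0.5))  # 40 of 50 prefer chocolate A
```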
The CI for a difference in proportions is used for questions that seek to compare groups and populations based on their proportions. How do you calculate the CI for a difference in proportions?
Refer to module 3 cheatsheet
How do you do a hypothesis test for a difference in proportions?
Determine the 95% CI for the difference in proportions and see if the null value (0, meaning no difference) is contained within that interval.
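Both difference-in-proportions cards can be sketched together. This assumes the usual unpooled standard error, sqrt(p1(1 − p1)/n1 + p2(1 − p2)/n2), which may or may not be the exact form on the cheatsheet. The function name and the group counts are invented for illustration.

```python
import math

def diff_proportions_ci(x1, n1, x2, n2, z=1.96):
    """Approximate 95% CI for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

# Hypothetical data: 30 of 50 in group 1 vs 20 of 50 in group 2
low, high = diff_proportions_ci(30, 50, 20, 50)
print(f"95% CI for the difference: ({low:.3f}, {high:.3f})")

# Hypothesis test: reject the null of no difference if 0 is outside the CI
if low <= 0 <= high:
    print("0 is inside the CI: fail to reject the null")
else:
    print("0 is outside the CI: reject the null")
```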
What is the three-step process involved in hypothesis testing?
- Consider your research question
- Define null hypothesis
- Test your null hypothesis using your observed data
What is the Texas sharpshooter fallacy?
The name epidemiologists have given to the tendency to assign unwarranted significance to random data by viewing it post hoc in an unduly narrow context.
What is p hacking?
The misuse of data analysis to find patterns in data that can be presented as statistically significant
What is the key idea behind multiplicity?
Events that are rare on an individual level are still likely to happen across a large group of individuals
A type 1 error is rare on an individual hypothesis test, but ____(1) to happen at least once across a large group of tests.
(1) likely
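To put a number on this: if each test has a type 1 error rate of alpha and the tests are independent (an assumption made here for simplicity), the chance of at least one type 1 error across m tests is 1 − (1 − alpha)^m. The function name below is made up.

```python
def prob_at_least_one_type1_error(alpha, num_tests):
    """Chance of one or more false positives across independent tests."""
    return 1 - (1 - alpha) ** num_tests

# With alpha = 0.05, the chance grows quickly as the number of tests grows
for m in (1, 5, 20, 100):
    print(m, "tests:", round(prob_at_least_one_type1_error(0.05, m), 3))
```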
Compare hypothesis generating vs hypothesis testing.
Hypothesis generating: Lots of questions, lots of tests
Hypothesis testing: few questions, few tests
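What is the Bonferroni correction?
- Divide the significance level (e.g. 0.05) by the number of comparisons, and compare each p-value to that stricter threshold
- Or make a confidence interval for each comparison at the
(100 - 5/(number of comparisons))% level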
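A small sketch of the arithmetic, assuming an overall significance level of 0.05. The function name and the list of p-values are hypothetical.

```python
def bonferroni(alpha, num_comparisons):
    """Return the per-test threshold and the matching CI level in percent."""
    per_test_alpha = alpha / num_comparisons
    ci_level_percent = 100 * (1 - per_test_alpha)
    return per_test_alpha, ci_level_percent

per_test_alpha, ci_level = bonferroni(0.05, 10)
print(f"Compare each p-value to {per_test_alpha:.4f}")
print(f"Or use {ci_level:.1f}% confidence intervals")

# Hypothetical p-values from 10 comparisons (only 4 shown)
p_values = [0.001, 0.004, 0.03, 0.20]
significant = [p < per_test_alpha for p in p_values]
print(significant)
```

With 10 comparisons, a p-value of 0.03 would pass the usual 0.05 cutoff but not the corrected one.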
How would you do multiplicity problems?
Calculate the probability of the event happening on a single independent trial, then apply a binomial distribution using that probability
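As an illustration of the binomial step, the sketch below treats each of n independent tests as a trial with probability p of a type 1 error. The function names and the numbers (20 tests, p = 0.05) are chosen only as an example.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k events in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_at_least(k, n, p):
    """Probability of k or more events in n independent trials."""
    return sum(binomial_pmf(i, n, p) for i in range(k, n + 1))

# Hypothetical setup: 20 independent tests, each with a 5% false positive rate
print("exactly 0 false positives:", round(binomial_pmf(0, 20, 0.05), 4))
print("at least 1 false positive:", round(prob_at_least(1, 20, 0.05), 4))
print("at least 2 false positives:", round(prob_at_least(2, 20, 0.05), 4))
```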
What is a confounding variable? Draw a diagram, and identify confounding, predictor, and outcome variables.
Look at diagram in the notes sheet
Factor Z would be a confounder if:
- It differs between levels/categories of X
AND
- It has a causal relationship with Y
What does it mean if something is stratified?
Creating subsets of the data and looking at data on those groups
What is Simpson’s paradox?
Simpson’s Paradox refers to a phenomenon in which a trend appears in different groups of data but disappears or reverses when these groups are combined. In other words, the overall percentages in two groups (the treatment and control groups) can be misleading because of a confounder.
Berkeley’s admission data is an example of Simpson’s paradox
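A toy example of the reversal. The counts below are invented for illustration and are NOT the real Berkeley figures; the department names and the function name are also made up. Women have the higher admission rate within each department, yet the lower rate overall, because most women applied to the department that admits fewer people.

```python
# Invented counts, NOT real Berkeley data: (admitted, applicants)
data = {
    "Dept A": {"men": (80, 100), "women": (18, 20)},
    "Dept B": {"men": (4, 20), "women": (30, 100)},
}

def admission_rates(data):
    """Return admission rates within each department and pooled overall."""
    by_dept = {
        dept: {sex: adm / app for sex, (adm, app) in groups.items()}
        for dept, groups in data.items()
    }
    overall = {}
    for sex in ("men", "women"):
        admitted = sum(groups[sex][0] for groups in data.values())
        applicants = sum(groups[sex][1] for groups in data.values())
        overall[sex] = admitted / applicants
    return by_dept, overall

by_dept, overall = admission_rates(data)
for dept, rates in by_dept.items():
    print(dept, {sex: round(r, 2) for sex, r in rates.items()})
print("Overall", {sex: round(r, 2) for sex, r in overall.items()})
```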
What are two ways we control for confounders?
- “Control for” potential confounding variables: conduct our comparison of interest within confounding groups, like within a department for the Berkeley admissions data or within a poverty group for our MSCD and medical expenditure question
- Design an experiment where the treatment and control groups are “otherwise similar” by design.
What is effect modification?
Effect modification occurs when the effect of a single exposure on an outcome depends on the values of another variable
Ex. in the chocolate trial, it would be called effect modification if the effect of cocoa content on preference depended on a third variable
Why is stratifying useful?
- Reduce variance rather than bias
- Answer a secondary question of interest: stratify to look at the difference in preference by a third variable