Lecture 9: Randomized Controlled Trials and Statistical Power Flashcards
What are RCTs?
- experimental investigation that randomly assigns participants to an intervention or control group
- ensures a balanced distribution of confounders between study groups
- able to determine causality rather than only association
What are the aspects of RCT that are difficult to achieve in nutrition research?
- blinding
- placebo-control
RCTs vs observational studies in terms of randomization
RCTs can determine causality because there is an actual intervention and, if the randomization is done appropriately, all other factors are held constant. An observational study is different: you do not do anything to your participants, you observe relationships, so you cannot comment on causality. Variables might be related, but you cannot say that one caused the other, and you often cannot establish directionality (which came first, the exposure or the outcome).
CONSORT
Consolidated Standards of Reporting Trials
What do you have to report when you are registering your RCT?
Protocol
Outcome measures of interest
- Primary outcome
- Secondary outcome
Why report trials beforehand?
- transparency
- track changes
- prevent fishing expeditions where you look at lots of variables and measurements trying to find something interesting.
- keep scientific integrity more robust
What are the commonly used randomization methods?
simple and stratified
Explain simple randomization
each participant is randomly assigned to treatment or control group
Explain stratified randomization
participants are first placed into strata, in order to control for a particular variable, and then participants within strata are randomly assigned to treatment or control group.
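A minimal sketch of both methods in Python (the participant IDs, the `sex` stratification variable, and the even 50/50 split within each stratum are illustrative assumptions, not part of the lecture):

```python
import random

def simple_randomize(participants, seed=0):
    """Simple randomization: each participant is independently
    assigned to the treatment or control group at random."""
    rng = random.Random(seed)
    return {p: rng.choice(["treatment", "control"]) for p in participants}

def stratified_randomize(participants, stratum_of, seed=0):
    """Stratified randomization: group participants by a stratum
    variable (e.g. sex), then randomize within each stratum so the
    two arms stay balanced on that variable."""
    rng = random.Random(seed)
    assignment = {}
    strata = {}
    for p in participants:
        strata.setdefault(stratum_of[p], []).append(p)
    for members in strata.values():
        rng.shuffle(members)
        half = len(members) // 2
        for p in members[:half]:
            assignment[p] = "treatment"
        for p in members[half:]:
            assignment[p] = "control"
    return assignment
```

Note how the stratified version guarantees an even split within every stratum, whereas simple randomization can by chance leave a confounder unbalanced in small samples.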
What are the potential risks for bias in RCTs?
- loss to follow-up (attrition): when participants withdraw from the study or are unreachable
- non-compliance: when participants in the intervention group do not comply with the intervention (or when participants in the control group adopt the intervention)
- these can lead to missing data
True/False
RCTs eliminate confounders and risk of biases.
False
RCTs eliminate the potential for confounding, but they do not eliminate the risk of bias. Randomizing your participants removes confounding; it does not stop other errors (e.g. attrition or non-compliance) from getting into your analysis.
What is bias?
Bias is a systematic error that skews the validity of your results (in contrast to random error, which affects precision).
Why is loss to follow-up a problem?
- missing data
- if a disproportionate number of participants finish the intervention arm compared with the control arm, there may be something distinctive about those who dropped out, and their data would have been important to have.
- journal and peer reviewers pay attention to these numbers.
What are the ways of dealing with missing data in RCTs?
- per protocol analysis
- as treated analysis
- intention-to-treat analysis
Explain the per protocol analysis
includes only participants who have completed the originally allocated treatment (i.e. removes drop-outs and non-compliant participants)
- reduces the sample size.
Explain the as treated analysis
groups participants according to the treatment they actually followed, rather than what they were intended to follow (i.e. attempts to mitigate non-compliance)
Explain the intention-to-treat analysis
includes all participants in the groups to which they were randomly assigned, regardless of compliance or study completion (“once randomized, always analyzed”)
How to manage missing data in ITT?
- last value carried forward: use the last available measured value in the final analysis
- imputation: use statistical modelling to estimate what the missing value(s) would have been had the participant completed the study
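Both approaches can be sketched in a few lines of Python; here `None` marks a missing visit, and mean imputation stands in (as a deliberately simple assumption) for the fuller statistical models the notes mention:

```python
def locf(values):
    """Last observation carried forward: replace each missing (None)
    value with the most recent observed value."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled

def mean_impute(values):
    """Single mean imputation: replace each missing value with the
    mean of the observed values (a toy stand-in for model-based
    imputation)."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]
```

For example, a weight trajectory `[70, 68, None, None]` becomes `[70, 68, 68, 68]` under LOCF.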
What is type 1 error?
alpha
it is rejecting a true null hypothesis (false positive)
you say you have found statistically significant results when you actually have not.
What is type 2 error?
Beta
it is accepting (failing to reject) a false null hypothesis (false negative), meaning you have incorrectly concluded that the results are not statistically significant.
What is statistical power?
statistical power is the likelihood that a study will detect an effect when there is an effect there to be detected. If statistical power is high, the probability of making a Type II error, or concluding there is no effect when, in fact, there is one, goes down.
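One way to make this definition concrete is a Monte Carlo sketch: simulate many trials with a true effect and count how often a two-sample test detects it. The effect size (d = 0.5), group size (64), and the normal-approximation critical value 1.96 are illustrative choices, not from the lecture:

```python
import math
import random
import statistics

def simulated_power(d=0.5, n=64, sims=2000, seed=1):
    """Monte Carlo estimate of power: the fraction of simulated
    trials in which a two-sample test detects a true effect of
    standardized size d."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(sims):
        control = [rng.gauss(0.0, 1.0) for _ in range(n)]
        treated = [rng.gauss(d, 1.0) for _ in range(n)]
        se = math.sqrt(statistics.variance(control) / n
                       + statistics.variance(treated) / n)
        t = (statistics.mean(treated) - statistics.mean(control)) / se
        if abs(t) > 1.96:  # normal approximation to the critical value
            hits += 1
    return hits / sims
```

With d = 0.5 and 64 participants per group, the estimate lands near the conventional 80% power level.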
Why calculate sample size?
- to determine the minimum sample size needed to detect a statistically significant result
- done a priori
Why calculate statistical power?
- given a specified sample size and effect size, what is the statistical power to detect a statistically significant result?
- done after the study
Why is power calculation done after the study?
For example, surveys such as NHANES and CCHS collect data continuously, so if you want to run a study on those data, the sample size is fixed for you and you cannot control it. You therefore run a power calculation to see whether you have enough power to detect a significant result. You need to know your effect size.
What information do we need to know to calculate sample size?
- the statistical test (study design, variable types)
- p-value that will be used as significance cut-off
- power level of analysis (generally set at 0.8)
- measure of effect size (how large is the effect of interest?)
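These four inputs plug into a standard closed-form approximation. A sketch using the normal-approximation formula n per group = 2((z₁₋α/₂ + z₁₋β)/d)², with the defaults (α = 0.05, power = 0.80) taken from the list above:

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Normal-approximation sample size per group for a two-sided,
    two-sample comparison with standardized effect size d."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # 1.96 for alpha = 0.05
    z_beta = z.inv_cdf(power)            # 0.84 for power = 0.80
    return math.ceil(2 * ((z_alpha + z_beta) / d) ** 2)
```

For a medium effect (d = 0.5) this gives 63 per group (an exact t-test correction nudges it to about 64); note how halving the effect size roughly quadruples the required n.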
How to obtain effect size?
Can be estimated by pilot data or previous research. If these are not available, must make an educated guess based on knowledge of the discipline.
What are the categories of effect sizes?
small - 0.2
medium - 0.5
large effect - 0.8
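These benchmarks are Cohen's d values; as a reminder of where the number comes from, here is a minimal sketch of the standard pooled-SD formula (the example data are made up):

```python
import statistics

def cohens_d(group1, group2):
    """Cohen's d: the difference in group means divided by the
    pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    pooled_sd = (((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)) ** 0.5
    return (statistics.mean(group1) - statistics.mean(group2)) / pooled_sd
```

For instance, two groups whose means differ by 2 units with a pooled SD of about 1.58 give |d| ≈ 1.26, well into "large effect" territory.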
Let’s say you have a highly significant p-value, but your correlation (r) value is really small. What does it mean to have a really small correlation that is statistically significant?
if you have a large sample size, you will have high statistical power, so you may pick up statistically significant results even for very small effects.
Just because something is statistically significant doesn’t mean that it has practical relevance.
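A quick numerical illustration of this point, using the usual t statistic for a correlation with a large-sample normal approximation (the example values r = 0.1 and the two sample sizes are illustrative):

```python
import math
from statistics import NormalDist

def corr_p_value(r, n):
    """Approximate two-sided p-value for a sample correlation r at
    sample size n: t = r * sqrt((n-2)/(1-r^2)), with a normal
    approximation to the t distribution (reasonable for large n)."""
    t = r * math.sqrt((n - 2) / (1 - r ** 2))
    return 2 * (1 - NormalDist().cdf(abs(t)))
```

The same tiny correlation r = 0.1 is non-significant at n = 100 (p ≈ 0.32) but highly "significant" at n = 2000 (p < 0.001), even though its practical relevance is unchanged.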
What is the difference between statistical significance and clinical significance?
You might find results that are statistically significant but might not have any relevance practically.
How to guess effect size based on previous data?
look at meta analyses or individual studies
example from class: the meta-analysis showed a small-to-medium effect while an individual study showed a large effect. It is better to be conservative, so assume a small-to-medium effect size for this topic.
What were the inputs of the sample size calculation that we did in class?
- independent two-tailed t-test
- effect size d
- alpha level
- power level
- allocation ratio
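The same calculation can be reproduced in code. A sketch assuming the third-party statsmodels library is available (the specific input values d = 0.5, α = 0.05, power = 0.80, ratio = 1.0 are plausible defaults, not necessarily the ones used in class):

```python
# A priori sample size for an independent two-tailed t-test,
# assuming statsmodels is installed (mirrors a G*Power-style setup).
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n1 = analysis.solve_power(effect_size=0.5,        # Cohen's d
                          alpha=0.05,             # significance cut-off
                          power=0.80,             # desired power
                          ratio=1.0,              # allocation ratio n2/n1
                          alternative="two-sided")
print(round(n1))  # group-1 sample size, about 64 per group for d = 0.5
```

The `ratio` argument corresponds to the allocation ratio input: a 2:1 allocation (`ratio=2.0`) returns a smaller n1 but a larger total sample.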
What were the inputs of the power calculation that we did in class?
- ANOVA, repeated measures, between factors
- post hoc power analysis
- effect size cohen’s f
- alpha level
- total sample size
- number of groups
- number of measurements
- correlation among repeated measures 0.5 (default value)
What are the type of variables for ANCOVA?
main predictor is categorical, outcome is continuous, with one or more covariates adjusted for
What are the type of variables for binary logistic regression?
outcome variable is categorical and dichotomous
What are the type of variables for linear regression?
can use both continuous and categorical variables but the outcome variable is always continuous
How to use categorical variables in regression analyses?
in order to utilize categorical variables in regression analyses, the variables must be dichotomous (dummy variable: typically coded 1 and 0)
What are dummy variables?
binary indicator variables (coded 0/1) used to represent the categories of a categorical variable
How many dummy variables are needed for a variable with 5 groups?
4
it is always number of groups -1
the base case is always 0, 0, …, 0
rest is 1 and 0
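A minimal Python sketch of this coding scheme (the group labels and choice of base category are illustrative):

```python
def dummy_code(values, base):
    """Represent a categorical variable with k levels as k-1 dummy
    (0/1) variables; the base (reference) category is all zeros."""
    levels = [lvl for lvl in sorted(set(values)) if lvl != base]
    return [[1 if v == lvl else 0 for lvl in levels] for v in values]
```

For example, `dummy_code(["A", "B", "C"], base="A")` yields `[[0, 0], [1, 0], [0, 1]]`: the base case "A" is all zeros, and each other level gets its own indicator. A 5-group variable produces 4 dummies, matching the k-1 rule.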
When do you not have to dummy code in SPSS?
the dummy coding of binary logistic regression is done automatically for categorical variables that have >2 categories as long as the variable is specified as being categorical in the set-up of the model
What is odds ratio?
- Compares the presence of an exposure to the absence of an exposure, given that we already know about the outcome.
- Appropriate for case-control studies, cross-sectional studies, prospective cohort studies that measure prevalence
What is the equation for odds ratio?
OR = odds that an exposed has outcome/odds that an unexposed has outcome
(a/b)/(c/d) = (ad)/(bc)
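Worked directly from the 2x2 table cells, that equation is just one line of code (the cell counts in the example are made up):

```python
def odds_ratio(a, b, c, d):
    """2x2 table: a = exposed with outcome, b = exposed without,
    c = unexposed with outcome, d = unexposed without.
    OR = (a/b) / (c/d) = (a*d) / (b*c)."""
    return (a * d) / (b * c)
```

For example, a table with a = 20, b = 80, c = 10, d = 90 gives OR = (20x90)/(80x10) = 2.25: the odds of the outcome are 2.25 times higher in the exposed group.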
When is it appropriate to report odds ratio?
appropriate for case-control studies, cross-sectional studies, prospective cohort studies that measure prevalence
When do you get an odds ratio?
When you do logistic regression, you get an odds ratio rather than a beta value. An odds ratio of 1.25 is interpreted as 25% higher odds of the outcome.
1 = null value meaning there is no difference between groups.
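The link between the two is that logistic regression estimates coefficients on the log-odds scale, and exponentiating a coefficient gives the odds ratio. A sketch with a hypothetical coefficient (the value 0.2231 ≈ ln(1.25) is chosen purely for illustration):

```python
import math

beta = 0.2231            # hypothetical logistic-regression coefficient (log odds)
or_estimate = math.exp(beta)
print(round(or_estimate, 2))  # 1.25 -> 25% higher odds per unit increase
```

A beta of 0 exponentiates to OR = 1, the null value of no difference between groups.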
What are the two approaches for regression modelling for covariate assessment?
- some researchers include only significant independent (predictor) variables in the final model, arguing that only these covariates are relevant to the relationship between the main predictor variable and the outcome.
- others decide on covariates a priori (based on the scientific literature) and include the pre-specified covariates in the model regardless of whether they are significant predictors, arguing that this limits the number of hypotheses tested, yields findings that are more likely to be reproducible, and keeps the interpretation of results consistent across a set of multiple dependent variables (when applicable).