Ensuring completeness of data Flashcards
What things can cause missing data?
discontinuation of the study intervention
refusal to take a measure
attrition
missing composite outcomes
death
What problem does missing data pose?
potentially biasing the results
what are sensitivity analyses?
Involves testing the robustness of the results by using different methods to handle the missing data and comparing the results from each method. This shows how much the results are affected by the method used to handle missing data.
For a trial that includes missing values, what needs to occur for the trial to be considered valid?
when sensible methods for dealing with missing values are used.
especially if these methods are pre-defined in the protocol
when the sensitivity of the results to the method of handling missing values is investigated. This is done by performing sensitivity analyses.
When using statistical methods to account for missing data, we need to understand the underlying reason why the data is missing. What could these reasons be?
o Missing completely at random
o Missing at random
o Missing not at random
Explain missing completely at random (MCAR) data
Missing data is not related to any observed or unobserved variables in the dataset. Missingness is randomly distributed across the dataset, therefore the remaining data is still representative of the population of interest.
Explain missing at random (MAR)
Missing data is related to an observed variable in the dataset. E.g., participants with a high score on test A are more likely to drop out. Missingness is systematically related to the observed data but not to the missing values themselves.
Explain Missing not at random (MNAR) data
Missing data is related to the missing values themselves or to an unobserved variable. Missingness is not random and can't be explained by any observed variable. E.g., participants feeling depressed might drop out. This results in MNAR data as the missingness cannot be explained by any variable in the dataset. This is bad because the missing data might contain information that is systematically different from the observed data, which can bias the results if the missing data is not handled properly.
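The three mechanisms can be compared with a small simulation. This is only an illustrative sketch: the covariate `x`, the outcome `y`, and the missingness probabilities are made-up assumptions, not values from any trial.

```python
import random
import statistics


def simulate_mechanisms(n=20000, seed=0):
    """Compare the mean outcome under MCAR, MAR and MNAR deletion (illustrative only)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0, 1)        # observed baseline covariate (hypothetical)
        y = x + rng.gauss(0, 1)    # outcome, correlated with x
        rows.append((x, y))

    def observed(p_missing):
        # Keep the (x, y) pairs whose outcome was not deleted
        return [(x, y) for x, y in rows if rng.random() >= p_missing(x, y)]

    def mean_y(sample):
        return statistics.fmean(y for _, y in sample)

    mcar = observed(lambda x, y: 0.3)                    # same chance for everyone
    mar = observed(lambda x, y: 0.6 if x > 0 else 0.1)   # depends on observed x only
    mnar = observed(lambda x, y: 0.6 if y > 0 else 0.1)  # depends on the outcome itself

    # Under MAR, x is known for everyone, so average y within strata of x
    # and weight the strata by their share of the full sample.
    share_high = sum(x > 0 for x, _ in rows) / n
    mar_adjusted = (share_high * mean_y([r for r in mar if r[0] > 0])
                    + (1 - share_high) * mean_y([r for r in mar if r[0] <= 0]))

    return {
        "full": mean_y(rows),
        "mcar": mean_y(mcar),
        "mar_unadjusted": mean_y(mar),
        "mar_adjusted": mar_adjusted,
        "mnar": mean_y(mnar),
    }


result = simulate_mechanisms()
```

In this sketch the MCAR mean stays close to the full-data mean, the MAR mean is off until the observed covariate is accounted for, and the MNAR mean stays off because the cause of missingness is not available to adjust for.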
Why do we need to know the mechanism of missing data (the underlying reason for it)?
it can guide the choice of statistical methods to handle missing data.
What different statistical methods can be used to handle missing data?
- Available case analysis
- Baseline observation carried forward
- Last observation carried forward
- Mean imputation
- Regression imputation
- Multiple imputation
- Mixed-effects model
- Tipping point analysis
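Three of the simpler single-imputation methods in the list can be sketched as below. The function names and the example values are hypothetical, and `None` is assumed to mark a missing measurement.

```python
from statistics import fmean


def locf(visits):
    """Last observation carried forward: fill each gap with the most recent observed value."""
    filled, last = [], None
    for value in visits:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def bocf(visits):
    """Baseline observation carried forward: fill every gap with the first (baseline) value."""
    baseline = visits[0]
    return [baseline if value is None else value for value in visits]


def mean_impute(values):
    """Mean imputation: fill every gap with the mean of the observed values."""
    mean = fmean(v for v in values if v is not None)
    return [mean if v is None else v for v in values]


# One participant's scores over five visits (made-up numbers)
visits = [10.0, 12.0, None, 15.0, None]
print(locf(visits))   # [10.0, 12.0, 12.0, 15.0, 15.0]
print(bocf(visits))   # [10.0, 12.0, 10.0, 15.0, 10.0]

# One variable measured across four participants (made-up numbers)
print(mean_impute([4.0, None, 6.0, 8.0]))  # [4.0, 6.0, 6.0, 8.0]
```

Each of these fills a gap with a single value, so none of them reflects the uncertainty about what the missing value really was.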
multiple imputation
creates multiple datasets with plausible values for the missing data
performs standard analyses on each of the imputed datasets
combines the results from the analyses using Rubin's rules
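The combining step can be sketched as follows, assuming each imputed dataset has already produced a point estimate and its variance. The function name `pool_rubin` and the input numbers are hypothetical.

```python
from statistics import fmean, variance


def pool_rubin(estimates, variances):
    """Pool per-dataset estimates and variances using Rubin's rules."""
    m = len(estimates)
    pooled_estimate = fmean(estimates)    # average of the m estimates
    within = fmean(variances)             # average within-imputation variance
    between = variance(estimates)         # between-imputation variance (divides by m - 1)
    total = within + (1 + 1 / m) * between
    return pooled_estimate, total


# Made-up results from three imputed datasets
estimate, total_variance = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term is what carries the extra uncertainty due to the missing data into the final standard error.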
tipping point analyses
changes parameter values until a result is tipped from statistical significance to non-significance or vice versa
used as a sensitivity analysis to test the robustness of the primary results
can be used to explore the impact of MNAR
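A heavily simplified sketch of the idea: missing treatment-arm outcomes are imputed as the observed treatment mean minus a shift `delta`, and `delta` is increased until the comparison with control stops being significant. The data are simulated, the z-test is a normal approximation, and single imputation is used for brevity; all of these are assumptions of the sketch, and a real analysis would more likely combine the shift with multiple imputation.

```python
import random
from statistics import NormalDist, fmean, variance


def p_value_after_shift(treat_obs, n_missing, control, delta):
    """Two-sided z-test p-value after imputing the missing treatment outcomes
    as (observed treatment mean - delta)."""
    imputed = [fmean(treat_obs) - delta] * n_missing
    treat = treat_obs + imputed
    se = (variance(treat) / len(treat) + variance(control) / len(control)) ** 0.5
    z = (fmean(treat) - fmean(control)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def tipping_point(treat_obs, n_missing, control, deltas, alpha=0.05):
    """Return the first delta at which the result is no longer significant, or None."""
    for delta in deltas:
        if p_value_after_shift(treat_obs, n_missing, control, delta) >= alpha:
            return delta
    return None


rng = random.Random(1)
treat_obs = [rng.gauss(3.0, 1.0) for _ in range(40)]  # hypothetical observed treatment outcomes
control = [rng.gauss(1.0, 1.0) for _ in range(50)]    # hypothetical complete control arm
deltas = [0.5 * k for k in range(31)]                 # 0.0, 0.5, ..., 15.0
tip = tipping_point(treat_obs, 10, control, deltas)   # 10 treatment outcomes assumed missing
```

The size of the tipping `delta` is then judged clinically: if the missing outcomes would have to be implausibly bad to overturn the result, the primary conclusion is considered robust.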
What is a way of handling missing data affected by a pandemic?
- clarify the treatment estimand of interest with respect to the occurrence of the pandemic
- establish what data is missing for the chosen estimand
- perform the primary analysis under the most plausible missing data assumptions
- perform sensitivity analyses under alternative plausible assumptions
How can logistic regression be used to perform a sensitivity analysis?
Logistic regression models were used to identify any baseline characteristics that were predictive of missingness or non-response. These covariates were then included in the primary analysis as a sensitivity test. This means that the primary analysis was conducted both with and without the identified predictive covariates to assess the robustness of the results.
By including the predictive covariates in the analysis, the authors were able to determine whether the associations between other variables and the availability of primary outcomes were affected by the identified baseline characteristics. This approach is a form of sensitivity analysis, as it assesses the impact of including or excluding certain covariates on the results of the primary analysis.
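The first step, checking whether a baseline characteristic predicts missingness, might look like the sketch below. It is not the authors' code: the data are simulated, the single covariate `baseline` is hypothetical, and the model is fitted with a plain gradient-ascent loop so that the example needs only the standard library.

```python
import math
import random


def fit_logistic(xs, ys, lr=0.5, steps=1000):
    """Fit P(y = 1) = sigmoid(b0 + b1 * x) by gradient ascent on the mean log-likelihood."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += y - p
            g1 += (y - p) * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


rng = random.Random(0)
baseline = [rng.gauss(0, 1) for _ in range(400)]  # hypothetical standardized baseline score
# Simulated indicator: 1 = primary outcome missing; a higher baseline score raises the chance
missing = [1 if rng.random() < 1 / (1 + math.exp(-(-1.0 + 1.5 * x))) else 0
           for x in baseline]

b0, b1 = fit_logistic(baseline, missing)
odds_ratio = math.exp(b1)  # odds of missingness per one-unit increase in the baseline score
```

A covariate with an odds ratio clearly different from 1 would be a candidate to add to the primary analysis model in the sensitivity analysis.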