Lecture 4: Data Sources and study planning Flashcards
Causal Contrast
Compares the disease risk in a group of exposed individuals with the disease risk that would have occurred if these individuals had not been exposed
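One way to write this contrast formally, as a sketch in potential-outcome notation. The symbols are assumptions introduced here and are not from the lecture: $A=1$ marks the exposed group and $Y^{a}$ is the disease outcome that would occur under exposure level $a$.

```latex
\[
\text{Causal risk difference}
  = \Pr\left(Y^{a=1}=1 \mid A=1\right) - \Pr\left(Y^{a=0}=1 \mid A=1\right)
\]
\[
\text{Causal risk ratio}
  = \frac{\Pr\left(Y^{a=1}=1 \mid A=1\right)}{\Pr\left(Y^{a=0}=1 \mid A=1\right)}
\]
```

The second probability in each expression is counterfactual: it cannot be observed in the exposed group itself, so a study substitutes the risk in a comparison group.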
Problems with doing RCT in pharmacoepi
1) Unethical to study a drug highly suspicious for a serious adverse event
2) Lose real-world setting
3) Expensive and difficult to study rare outcomes or outcomes that require lengthy FU
Types of Data sources used in pharmacoepi
Primary data sources and secondary data sources
Primary Data source
data directly collected from study participants for the purposes of the study (informed consent required)
Secondary Data source
data collected from existing healthcare databases or medical records where all the events have already occurred before the data is queried [not collected with intention to do a scientific study] (informed consent varies)
Types of primary data sources
1) Protocol-required assessments (clinical and/or lab measurements–blood pressure, depression scale AND interview/questionnaire)
These assessments should be part of routine clinical care
Advantages of primary data sources
1) Data collection is tailored to study objectives, so you can focus on measuring confounders
2) Lab data are available
3) Indication for medication use is more explicit
4) Can obtain information from assessments that are needed for valid measurement but are not universally performed as standard of care
5) Can randomize subjects to treatment (e.g. in a large simple trial, LST)
Disadvantages of primary data sources
Expensive and time-intensive
Infeasible for studies requiring large sample sizes or long FU
Many operational considerations (getting informed consent, identification, initiation and management of study sites, data monitoring)
Types of Secondary data sources (unstructured data)
Data do not already exist in a structured database
Information from individual patient medical records must be abstracted and converted into structured data for study purposes
Types of Secondary data sources (structured data)
Data already exist in a structured (coded) database with de-identified patients
Some sources are:
1) Administrative claims and non-claims databases for insurance companies, programs and health plans
2) EMR, health care registries and record linkage systems [population-based registries] (many in EU)
3) National health surveys, existing cohorts like Framingham
Types of Secondary data sources (hybrid data)
Data exist in structured (coded) database, but are supplemented by unstructured data.
Ex. text fields (physician notes are reviewed, coded and added to the structured db)
Some examples of administrative databases
Kaiser Permanente, Veterans Affairs, PharMetrics, UnitedHealthcare
Some examples of EMR databases and registries
General practitioner db (THIN, GPRD), Healthcare registries (Sweden & Denmark)
Examples of coding systems
International Classification of Diseases (ICD) [ICD-9: coding of diagnoses and procedures; ICD-10: cause of death], CPT, ATC…
Advantages of secondary data sources
1) Study can be done rapidly and inexpensively
2) Can be used for studies with large sample size or long FU
3) Reduced operational issues
4) pharmacy information more accurate than self-report and medical record
5) data linkage with other dbs to obtain additional information (like death, cancer, etc)
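The linkage idea in the last item can be sketched as follows. This is a minimal illustration only: the patient IDs, drug names, dates, and the `link_deaths` helper are all hypothetical, and real linkage usually relies on a shared identifier or probabilistic matching.

```python
# Hypothetical pharmacy claims keyed by a de-identified patient ID.
pharmacy_claims = [
    {"patient_id": "P001", "drug": "drug_A", "fill_date": "2010-03-01"},
    {"patient_id": "P002", "drug": "drug_B", "fill_date": "2010-04-15"},
    {"patient_id": "P003", "drug": "drug_A", "fill_date": "2010-05-20"},
]

# Hypothetical death registry for the same population: patient ID -> date of death.
death_registry = {
    "P002": "2011-01-10",
}


def link_deaths(claims, deaths):
    """Return a copy of each claim with a death_date field (None if no match)."""
    linked = []
    for claim in claims:
        record = dict(claim)  # copy so the source data stay unchanged
        record["death_date"] = deaths.get(claim["patient_id"])
        linked.append(record)
    return linked


linked_records = link_deaths(pharmacy_claims, death_registry)
for record in linked_records:
    print(record["patient_id"], record["drug"], record["death_date"])
```

A `death_date` of `None` is ambiguous: the patient may be alive, or the records may simply have failed to link, which is the incomplete-linkage problem listed among the sources of bias.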
Disadvantages of secondary data sources
1) Diagnoses may not be valid (ex. recording of rule-out diagnoses)
2) data on important confounders are unavailable
3) data on over-the-counter and inpatient drug use are not available
4) Information is truncated if the db has high turnover
Potential sources of bias with administrative db
1) Low SES people may not seek coverage and won’t be represented in db.
2) Incomplete documentation of clinical status
3) miscoding of drug, strength, dose
4) incomplete record keeping
5) Miscoding of primary and secondary diagnoses
6) Incomplete linkage
Feasibility assessments
1) ensure scientific and operational integrity of study
2) ideal study is not wholly feasible
3) purpose is to characterize circumstances in which it is feasible to address research question and identify trade-offs between scientific and operational considerations
4) study objectives drive all aspects of feasibility assessment
Identifying key data elements
1) Identify exposure and outcome variables of interest as well as potential confounders and interactions
2) Identify data elements that can measure these variables
3) List atypical and typical medications of interest
4) List cardiovascular outcomes of interest
5) List variables that contribute to confounding by indication
6) List variables that may modify the effect
Choosing between primary and secondary data collection
Rank data sources for capturing required data elements (sufficient number of patients, recording of lab data for valid measurement of outcome, routine conduct of clinical assessments for valid measurement of confounding diagnoses)
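A ranking like this can be sketched as a simple scoring exercise. Everything below is assumed for illustration: the source names, the true/false capture values, and the equal weighting of elements are hypothetical, and a real feasibility assessment would likely weight elements by their importance to the study objectives.

```python
# Required data elements for a hypothetical study.
required_elements = ["large_sample", "lab_data", "clinical_assessments"]

# Hypothetical candidate sources and which required elements each captures.
candidate_sources = {
    "claims_db": {
        "large_sample": True, "lab_data": False, "clinical_assessments": False,
    },
    "emr_db": {
        "large_sample": True, "lab_data": True, "clinical_assessments": False,
    },
    "primary_collection": {
        "large_sample": False, "lab_data": True, "clinical_assessments": True,
    },
}


def rank_sources(sources, required):
    """Rank sources by how many required elements they capture (ties by name)."""
    scored = []
    for name, captured in sources.items():
        score = sum(1 for element in required if captured.get(element, False))
        scored.append((name, score))
    # Highest score first; alphabetical order breaks ties deterministically.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


ranking = rank_sources(candidate_sources, required_elements)
for name, score in ranking:
    print(f"{name}: captures {score} of {len(required_elements)} required elements")
```

With these made-up inputs no source captures every element, which mirrors the trade-off between scientific and operational considerations described under feasibility assessments.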
Study Design (Subjects selected according to exposure)
Cohort study
Study Design (Subjects selected according to outcome)
Case-crossover
Study Design (Subjects selected according to neither exposure nor outcome)
Cross-sectional
Study Design (Subjects randomized to exposure with observational FU)
Large simple trial (LST)
Descriptive studies
Drug utilization study
Safety surveillance study
Analytic studies
Groups are statistically compared to address a pre-specified hypothesis
Sample size estimation
Analytic studies: depends on design, prevalence of exposure/outcome (E/O), effect size, etc.
Descriptive studies: formal sample size estimation is not relevant, but an estimate of the sample size required for a given level of precision may be appropriate
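Both kinds of estimate can be sketched with standard normal-approximation formulas. These formulas are swapped in here for illustration and are not taken from the lecture; the function names, default alpha and power, and the example risks are assumptions. The analytic case compares two risks in a cohort design with equal group sizes; the descriptive case targets a confidence-interval half-width.

```python
import math
from statistics import NormalDist


def n_per_group_two_proportions(p_unexposed, p_exposed, alpha=0.05, power=0.80):
    """Analytic study: subjects per group needed to detect a difference in risk.

    Uses the unpooled normal approximation for comparing two proportions
    with a two-sided test and equal group sizes.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance_sum = p_unexposed * (1 - p_unexposed) + p_exposed * (1 - p_exposed)
    n = (z_alpha + z_beta) ** 2 * variance_sum / (p_exposed - p_unexposed) ** 2
    return math.ceil(n)


def n_for_precision(p_expected, half_width, confidence=0.95):
    """Descriptive study: subjects needed so the CI half-width is about half_width."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p_expected * (1 - p_expected) / half_width ** 2)


# Hypothetical inputs: outcome risk of 1% in unexposed vs 2% in exposed.
analytic_n = n_per_group_two_proportions(0.01, 0.02)
# Hypothetical inputs: expected prevalence of use 10%, estimated to within +/- 2%.
descriptive_n = n_for_precision(0.10, 0.02)

print("Per-group n (analytic):", analytic_n)
print("Total n (descriptive):", descriptive_n)
```

Rarer outcomes and smaller effect sizes drive the analytic estimate up quickly, which is one reason large secondary databases are attractive for studying rare adverse events.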
Other operational considerations (secondary data sources)
Requirements of IRB (informed consent not required)
Time/funding needed for validation and pilot studies
Are medical records accessible, can sites identify eligible patients, will data abstraction be done by site or study staff?
Operational considerations (primary data collection)
1) Protocol-level assessment: identify target countries, sites (academic vs community), data sources, and IRBs. This informs the scientific aspects of the study protocol
2) Site-level assessment: evaluate potential sites' interest in and capability for study participation (generate a list of potential sites and survey them)