Working with Administrative Data Flashcards
Which of these is an example of administrative data?
• Text from a tweet
• Information from a birth certificate
• Demographic information collected during a baseline survey
• Income information from tax records
• None of the above
Both the information from a birth certificate and income information from tax records are data that are collected during the normal operations of a program and not primarily collected for specific research.
Compared to survey data, administrative data are less susceptible to _______ bias because _______
recall; the data are collected at the time of occurrence
T/F: In the context of randomized evaluations, researchers can obtain administrative data from both public (i.e., governmental) and private institutions.
True
Both public and private institutions have provided individual-level administrative data to researchers for the purposes of randomized evaluations.
With regard to administrative data, the _______ identified and sensitive the data that you are asking to be released, the _______ challenging it will be to get those data outside of the agency for research.
more, more
When choosing identifiers for matching study data to administrative data, which of the following identifiers would be preferable to using an individual’s street address?
A government-issued, unique identification number
Date of birth
Both are preferable because they are numeric identifiers, as opposed to identifiers composed of letters and numbers.
The exact/deterministic matching strategy may lead to more ___________, while the fuzzy/probabilistic matching strategy may lead to more ___________.
False negatives, false positives
Fuzzy and probabilistic matching strategies can account for minor discrepancies, but may lead to more false positives. Exact and deterministic matching strategies, on the other hand, do not account for minor discrepancies and may lead to more false negatives.
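The trade-off between the two strategies can be illustrated with a minimal sketch. It is not a production record-linkage method: the function names, the example names, and the 0.8 threshold are all assumptions chosen for illustration, and the similarity score comes from Python's standard-library `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher


def exact_match(a: str, b: str) -> bool:
    """Deterministic rule: the two strings must be identical."""
    return a == b


def similarity(a: str, b: str) -> float:
    """Similarity score between 0 and 1, ignoring letter case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_match(a: str, b: str, threshold: float = 0.8) -> bool:
    """Fuzzy rule: accept the pair if similarity reaches the threshold.

    The 0.8 threshold is an illustrative assumption, not a recommended value.
    """
    return similarity(a, b) >= threshold


# Same person, with a typo in one file: exact matching misses the pair
# (a false negative), while fuzzy matching accepts it.
print(exact_match("Jon Smith", "John Smith"))
print(fuzzy_match("Jon Smith", "John Smith"))

# Possibly two different people with similar names: fuzzy matching still
# accepts the pair, which would be a false positive if they are distinct.
print(fuzzy_match("Maria Lopez", "Mario Lopez"))
```

Raising the threshold moves the fuzzy rule closer to exact matching: fewer false positives, more false negatives.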
During the data matching process, the _______ file and the _______ file are combined to create the _______ file.
identified finder, administrative data, de-identified analysis
identified finder file
contains the identifiers of the study sample and a study ID. The study ID is a numeric ID that uniquely identifies each person in the study.
administrative data file
The data provider has the administrative data file that contains identifiers and the outcome variables of interest to the research team.
de-identified analysis file
The data flow process will determine how the identified finder file and the administrative data file are combined to create the de-identified analysis file.
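A minimal sketch of how the three files relate, assuming an exact match on a single identifier. Every field name (`study_id`, `gov_id`, `date_of_birth`, `earnings`), every value, and the helper `build_analysis_file` are hypothetical. In practice the data flow agreed with the data provider determines who performs this step and where.

```python
# Identified finder file: identifiers of the study sample plus a study ID.
finder_file = [
    {"study_id": 1, "gov_id": "A100", "date_of_birth": "1990-01-15"},
    {"study_id": 2, "gov_id": "B200", "date_of_birth": "1985-07-02"},
    {"study_id": 3, "gov_id": "C300", "date_of_birth": "1978-11-30"},
]

# Administrative data file: identifiers plus the outcome of interest.
admin_file = [
    {"gov_id": "A100", "date_of_birth": "1990-01-15", "earnings": 32000},
    {"gov_id": "B200", "date_of_birth": "1985-07-02", "earnings": 41000},
    {"gov_id": "Z999", "date_of_birth": "2001-03-09", "earnings": 18000},
]


def build_analysis_file(finder, admin, key="gov_id", outcomes=("earnings",)):
    """Match the two files on `key` and keep only study ID and outcomes."""
    admin_by_key = {row[key]: row for row in admin}
    analysis = []
    for person in finder:
        match = admin_by_key.get(person[key])
        if match is None:
            continue  # no administrative record found for this person
        record = {"study_id": person["study_id"]}
        record.update({name: match[name] for name in outcomes})
        analysis.append(record)
    return analysis


analysis_file = build_analysis_file(finder_file, admin_file)
for row in analysis_file:
    print(row)  # rows contain study_id and earnings only, no identifiers
```

Only the study ID carries over to the analysis file, so the research team can link outcomes to other study data without holding the identifiers alongside them.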
In addition to the data provider, who should sign the Data Use Agreement (DUA)?
An official institutional representative, rather than an individual PI or staff member.
T/F: If the research team never comes into contact with individuals in the study, they do not need to get IRB approval to use the individuals’ administrative data.
False
Even though the research team may never come into contact with the individuals whose information is included in the administrative data, it may still be necessary to complete the IRB process, even if just to confirm “exempt” status.
Reporting bias
occurs when people have an incentive to under- or over-report information.
Why are administrative data useful?
The outcomes and metrics required for a study may already be tracked by a government or organization
• Available retrospectively
• Enable long-term follow-up
• Reduce logistical burden
• Include a near-census of the relevant population
• Often cheaper than surveys
How do administrative data minimize recall bias?
Data are recorded at the time of occurrence, so no memory is needed (e.g., banking records).