Data Acquisition Approach II Flashcards
Primary data collection method (3)
- Experimental method
- lab experiment
- controlled trials
- Observation method
- Survey method
- in-depth interview
- focus group discussion
- questionnaire
Issues regarding questionnaire constructions (5)
- Structural
- Content
- Language
- Pre-testing
- Reliability & validity
Steps in constructing questionnaire (4)
- Determine what information is needed
- Drafting of questions
- Pre-testing of questionnaire
- Revision of questionnaire, if needed
Structural issues (2)
- mode of delivery of questionnaire
- open-ended or close-ended
General format
- Introductory statement
- Demographics questions
- can be last section also
- Factual questions
- Opinion questions
- Closing statements & return instructions
Question context (2)
- administration
- delivery of questionnaire
- Administration of questionnaires
- self administered
- interviewer-administered
- Delivery of questionnaire
- mails / emails
- telephone
- in person
- determines how questions & response options are constructed
Types of question formats (2)
- Open-ended
- Close-ended
- checklists
- rankings
- rating scales
Examples of rating scales (5)
- Unipolar scale
- Bipolar scale
- Likert scale
- Visual Analogue Scale (VAS)
- Wong-Baker face scale
Advantages of open-ended response (4)
- more detailed answers
- can study respondents’ interpretations expressed in their own words
- minimise successful guessing (knowledge question)
- useful to wrap up entire survey at the end
Disadvantages of open-ended response (7)
- Less structured
- Difficult to ensure systematic recording of response (if interviewer administered)
- Difficult to remain unbiased while probing adequately for a more complete/understandable answer
- More difficult for respondents to answer compared to close-ended response
- More time required to complete such question
- ~75% of respondents will leave such questions blank (if self-administered)
- Difficult to code information for data analysis
- wider variety of response
Advantages of close-ended response (4)
- Tightly structured
- only options for respondents to choose from
- Ensures standardisation of response
- Easily encoded & analysed
- Less time taken to collect response
- respondents take a shorter time to answer
Disadvantages of close-ended response (4)
- Less depth in answers
- Imposes researcher's priorities on respondent
- May bias responses if range of options is not exhaustive
- Presentation formats may affect responses
How to write good questions? (4)
- be clear
- be concise
- provide complete options in close-ended questions (avoid gaps or overlapping categories)
- avoid bias & leading questions
Clarity in questions by avoiding __ (4)
- big words
- technical jargon (use layman terms)
- double negatives
- double-barreled questions
eg questions joined with "or" / "and"
Language issues (3)
- translate to other languages
- cross cultural adaptation of survey instrument (translate to appropriate language)
- back translate to ensure meaning of questions is retained
Pre-testing (4)
- pilot test with a group of respondents
- ensure all questions are understood as intended, otherwise rephrase to capture intended meaning
- check length of questionnaire
- ensure questionnaire is able to obtain adequate information to answer research question
Strengths of self-administered questionnaire (3)
- Cheap to administer
- Less susceptible to interviewer bias
- Can be administered via mail / email
Limitations of self-administered questionnaire (3)
- Lower response rate
- leads to non-response bias
- Difficult to elicit detailed response
- Less control over how questionnaire is filled out
Strengths of interviewer-administered questionnaire (3)
- Higher response rate
- More detailed responses can be elicited
- Greater control over how questionnaire is filled out
Limitations of interviewer-administered questionnaire (3)
- Expensive to administer
- More susceptible to interviewer bias
- More time consuming
- interviewer has to be present to collect response
Response rate
= (no. of participants who completed the questionnaire)/(total no. of eligible persons who were asked to participate)
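A minimal sketch of the response rate formula above. The function name and the figures (120 completed, 400 eligible persons asked) are made up for illustration.

```python
def response_rate(completed: int, eligible_asked: int) -> float:
    """Completed questionnaires divided by eligible persons asked to participate."""
    if eligible_asked <= 0:
        raise ValueError("eligible_asked must be positive")
    return completed / eligible_asked


# Hypothetical survey: 120 completed questionnaires, 400 eligible persons asked.
rate = response_rate(120, 400)
print(f"{rate:.0%}")  # prints 30%
```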
Concerns over low response rate (3)
- leads to non-response bias as respondents may differ from non-respondents in their characteristics, hence answers of respondents may differ from the potential answer of non-respondents
- compromises internal validity
- weakens external validity & generalisability of survey results
When to use mail/email mode of delivery of questionnaire? (6)
- Geographically dispersed sample groups
- Directed to specific groups
- can obtain specific emails
- Limited research budget & manpower
- Provide time for respondents to think before answering
- Provide privacy for respondents
- Questions should be closed-ended, simple & clear
When to use telephone interviews? (2)
- Open-ended questionnaire
- Complex interviews
eg req skipping of questions depending on respondent’s previous answer
CATI
Computer Assisted Telephone Interviews
- directly key in respondent’s answers into computer system
- ensure quality & speed
When to use in-person interviews? (3)
- Open-ended questionnaire
- Complex interviews
eg req skipping of questions depending on respondent’s previous answer
- Knowledge-based questions
CAPI
Computer Assisted Personal Interviews
- directly key in respondent’s answers into computer
- ensure quality & speed
Practicality considerations when using mail/email to deliver questionnaires (3)
- Accessibility of sample
- req specific emails/mailing address
- Low response rate (5-30%)
- Think of ways to increase response rate
eg incentives, provide return envelope with postage, reminders, deadlines
Practicality considerations when using telephone interviews (2)
- Accessibility of sample
- req contact details
- Good response rate (65-75%)
Practicality considerations when using in-person interviews (2)
- Accessibility of sample
- req locations of sample
- Good response rate (70-80%)
Turn Around Time (TAT)
= (no. of surveys to be collected x duration of each interview) / (no. of interviewers x no. of hours each interviewer works per day)
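A rough sketch of the TAT formula, assuming interview duration is measured in hours so the result comes out in working days. The function name and all figures are hypothetical.

```python
def turn_around_time_days(n_surveys: int, hours_per_interview: float,
                          n_interviewers: int, hours_per_day: float) -> float:
    """Total interview hours divided by interviewer hours available per day."""
    return (n_surveys * hours_per_interview) / (n_interviewers * hours_per_day)


# Hypothetical plan: 300 surveys of 0.5 h each, 5 interviewers working 6 h per day.
tat = turn_around_time_days(300, 0.5, 5, 6)
print(tat)  # prints 5.0
```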
Census
Entire population
Sampling
- a subset of the population
Sample size calculations for Questionnaires (2)
- different from the sample size calculators used for clinical trials
- Cochran’s formula
- large population
- small population (no/N > 5%)
- Yamane’s formula
- fixed 95% CI
- p=0.5
Cochran’s formula (large population)
no = [(Z^2)(pq)] / (e^2)
no = sample size calculated
Z = Z score of normal distribution = 1.96 at 95% CI
p = estimated proportion of an attribute that is present in the population
= 0.5 (if unknown)
q = 1-p
e = desired level of precision
= 0.05
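A worked sketch of Cochran's large-population formula using the default values on this card (Z = 1.96, p = 0.5, e = 0.05). The function name is an assumption for illustration; rounding up to a whole respondent is a common convention, not something stated on the card.

```python
import math


def cochran_n0(z: float = 1.96, p: float = 0.5, e: float = 0.05) -> float:
    """no = (Z^2 * p * q) / e^2, with q = 1 - p."""
    q = 1 - p
    return (z ** 2) * p * q / (e ** 2)


n0 = cochran_n0()
print(math.ceil(n0))  # about 384.16 before rounding, so prints 385
```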
Cochran’s formula (small population)
- when no/N > 5%
n = no / [1+[(no-1)/N]]
n = sample size for small population
no = sample size calculated for large population
N = population size
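A sketch of the small-population correction, assuming no = 384.16 (the large-population result at Z = 1.96, p = 0.5, e = 0.05) and a hypothetical population of N = 2000. The function name is illustrative; the 5% check mirrors the condition on this card.

```python
import math


def cochran_small_population(n0: float, population: int) -> float:
    """Apply n = no / (1 + (no - 1) / N) when no / N exceeds 5%."""
    if n0 / population <= 0.05:
        return n0  # correction not needed for a large population
    return n0 / (1 + (n0 - 1) / population)


# Hypothetical population of 2000: no / N is about 19%, so the correction applies.
n = cochran_small_population(384.16, 2000)
print(math.ceil(n))  # prints 323
```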
Yamane’s formula
- assuming 95% CI and p=0.5
n = N / [1+N(e^2)]
n = sample size calculated
N = population size
e = desired level of precision
= 0.05
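A worked sketch of Yamane's formula at e = 0.05 for a hypothetical population of N = 2000. The function name is an assumption for illustration.

```python
import math


def yamane(population: int, e: float = 0.05) -> float:
    """n = N / (1 + N * e^2), assuming 95% CI and p = 0.5."""
    return population / (1 + population * e ** 2)


n = yamane(2000)
print(math.ceil(n))  # 2000 / 6 is about 333.33, so prints 334
```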
Types of sampling (2)
- Random / Probability sampling
- Non-random / Non-probability sampling
Types of Random / Probability sampling methods (4)
- Simple random sampling
- Systematic random sampling
- Stratified random sampling
- Cluster sampling
Types of Non-random / Non-probability sampling (3)
- Convenience sampling
- Quota sampling
- Snow-ball sampling
Simple random sampling (2)
- every subject has an equal probability of being selected
- req a full list
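A minimal sketch of simple random sampling from a full list (the sampling frame), using Python's standard `random` module. The frame of 100 numbered subjects and the sample size of 10 are made up.

```python
import random

# Hypothetical sampling frame: a full list of 100 subject IDs.
frame = list(range(1, 101))

# random.sample draws without replacement, each subject equally likely.
rng = random.Random(42)  # fixed seed only so the draw can be repeated
sample = rng.sample(frame, 10)
print(sorted(sample))
```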
Systematic random sampling (4)
- req a full list
- generate a random number to determine the _th item to select
- generate another random number from 1 to _ to determine the starting number
- _ is determined by dividing population size by desired sample size
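One way to sketch the systematic steps above, assuming the interval is the population size divided by the desired sample size (rounded down) and the start is a random number from 1 to that interval. Function and variable names are illustrative.

```python
import random


def systematic_sample(frame, sample_size, rng=None):
    """Pick a random start within the first interval, then every k-th item."""
    rng = rng or random.Random()
    k = len(frame) // sample_size  # sampling interval
    if k < 1:
        raise ValueError("sample_size cannot exceed the size of the frame")
    start = rng.randint(1, k)  # random start from 1 to k (inclusive)
    return frame[start - 1::k][:sample_size]


# Hypothetical frame of 100 subjects, desired sample of 10, so k = 10.
frame = list(range(1, 101))
picked = systematic_sample(frame, 10, random.Random(0))
print(picked)
```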
Stratified random sampling (2)
- divide population into relevant strata
- random sampling from each stratum
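A sketch of stratified random sampling, assuming the same sampling fraction is applied to every stratum (proportional allocation, which the card does not specify). The strata and sizes are hypothetical.

```python
import random


def stratified_sample(strata, fraction, rng=None):
    """Draw a simple random sample of the given fraction from each stratum."""
    rng = rng or random.Random()
    sample = {}
    for name, members in strata.items():
        k = min(len(members), max(1, round(len(members) * fraction)))
        sample[name] = rng.sample(members, k)
    return sample


# Hypothetical population split by gender: 60 female and 40 male subjects.
strata = {
    "female": [f"F{i}" for i in range(60)],
    "male": [f"M{i}" for i in range(40)],
}
chosen = stratified_sample(strata, 0.1, random.Random(1))
print({name: len(members) for name, members in chosen.items()})
```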
Cluster sampling (3)
- divide populations into clusters
- randomly select a subset of clusters
- all units, or a random sample of units, within the selected clusters are surveyed
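A sketch of cluster sampling covering both options on this card: survey every unit in the selected clusters, or draw a random sample within each. The clinic names and sizes are made up, and the function name is illustrative.

```python
import random


def cluster_sample(clusters, n_clusters, units_per_cluster=None, rng=None):
    """Randomly select clusters, then take all units or a random sample of units."""
    rng = rng or random.Random()
    selected = rng.sample(sorted(clusters), n_clusters)
    sample = {}
    for name in selected:
        units = clusters[name]
        if units_per_cluster is None:
            sample[name] = list(units)  # survey every unit in the cluster
        else:
            k = min(units_per_cluster, len(units))
            sample[name] = rng.sample(units, k)  # random sample within the cluster
    return sample


# Hypothetical population: 5 clinics (clusters) with 20 patients each.
clusters = {f"clinic_{i}": [f"c{i}_p{j}" for j in range(20)] for i in range(5)}
all_units = cluster_sample(clusters, 2, rng=random.Random(1))
sub_sample = cluster_sample(clusters, 2, units_per_cluster=5, rng=random.Random(1))
print(sorted(all_units), [len(v) for v in sub_sample.values()])
```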
Convenience sampling
- select participants to whom access is easy
Quota sampling
- reserve a certain proportion of participants for particular types of people
eg gender
Snow-ball sampling (2)
- selected participants nominate others whom they know could be interviewed
- good to use if target population is difficult to identify / access
eg drug addicts
Secondary data sources (6)
- Prescription/medical & dispensing records
- Registry data
- Claims databases
- Cross-sectional survey data
- Large prospective cohort data
- Spontaneous reporting / surveillance data
Data sources with either exposure data or outcomes data only (2)
- req record linkage
eg via IC
- combining the information from both data belonging to same individual into one record so that the same person is counted only once
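A small sketch of record linkage, assuming one source holds exposure data and another holds outcome data, both keyed by an identifier field named `ic` here. The field names, IDs and values are all made up.

```python
def link_records(exposure_rows, outcome_rows, key="ic"):
    """Merge rows sharing the same identifier so each person appears once."""
    linked = {}
    for row in exposure_rows + outcome_rows:
        linked.setdefault(row[key], {}).update(row)
    return linked


# Hypothetical exposure (dispensing) and outcome (registry) records.
exposure = [{"ic": "S001", "drug": "A"}, {"ic": "S002", "drug": "B"}]
outcome = [{"ic": "S001", "outcome": "MI"}, {"ic": "S003", "outcome": "none"}]

linked = link_records(exposure, outcome)
print(len(linked))     # prints 3: three distinct persons
print(linked["S001"])  # exposure and outcome combined into one record
```

Persons found in only one source (S002 and S003 here) still appear once, with whichever fields that source supplied.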
Strengths of medical/prescription/dispensing records (7)
- Clinical data available
eg labs
- Best source for disease outcome
- Efficient
- data already collected, large sample size & long follow up data
- Prospectively collected data
- Detailed information on medical records
- Can study many drugs in relation to many outcomes
- Can conduct nested case-control studies
Limitations of medical/prescription/dispensing records (5)
- Compliance with medication unknown
- Non-prescription drugs missed
eg OTC & GSL
- Diagnosis based on codes / texts
- Incomplete information about habits, past history & other potential confounders
- Uncertain completeness of data from other physicians & sites of care
- incomplete data
Types of registry data (2)
- National Registry of Disease Office (NRDO)
- cancer
- renal
- stroke
- acute myocardial infarction
- donor care (liver & renal)
- Drug use registry
eg pregnancy registries for anti-epileptic drug use
Strengths of registry data (4)
- Detailed & systematically collected clinical data
- Representative of patients / drug users
- Inexpensive if data already available
- High reliability of data
- periodic audits
Limitations of registry data (4)
- Minimal information on drug use
- No appropriate control or comparison groups
- can use other drug users as a comparison or other disease as a control
- Expensive if data not readily available & have to create one
- Missing data
Considerations in collecting/using secondary data (4)
- Reliability
- data collection process
- reproducibility
- Suitability
- can the data collected answer the research question
- Adequacy of data
- completeness of data
- Ethical considerations of data use
- may involve IRB
- PDPA
- de-identification of data