Module 1 Part 1 Research Design Flashcards
Different levels of analysis
Univariate
Bivariate
Multivariable
Multivariate
Univariate analysis
Data summary - single variable
Bivariate analysis
Two variables
Multivariable analysis
Multiple independent variables, single dependent (outcome) variable
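Worked example - univariate vs bivariate
A minimal sketch of the first two levels of analysis, using only the Python standard library. The variable names and values (age, systolic_bp) are made up for illustration and do not come from any study. Pearson's r is computed by hand so the arithmetic is visible; multivariable analysis (e.g. regression with several predictors) is not shown.

```python
import math
import statistics

# Hypothetical data: one value per participant for each variable.
age = [23, 35, 41, 29, 52]
systolic_bp = [110, 122, 130, 118, 141]

# Univariate analysis: summarise a single variable.
age_summary = {
    "n": len(age),
    "mean": statistics.mean(age),
    "median": statistics.median(age),
    "sd": statistics.stdev(age),  # sample standard deviation
}

# Bivariate analysis: describe the relationship between two variables,
# here with Pearson's correlation coefficient.
def pearson_r(x, y):
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

r = pearson_r(age, systolic_bp)
print(age_summary)
print(round(r, 3))
```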
Practical steps for data analysis
- Data needs to be prepared for computer entry
- Data needs to be entered into a computer, stored and managed
- Data needs to be summarised and analysed to inform the research question(s) and draw conclusions for the study
Data management stages
- Coding (pre-coding or coding after collection)
- Database or spreadsheet design
- Data entry and/or transfer
- Data cleaning - invalid response codes, missing data, duplicate data
- Security and subject confidentiality
Initial checking of questionnaires
Invalid responses
Check for missing data
Illegible writing
Data coding
Coding converts collected information into a suitable format for easy data entry and statistical analysis
Coding mostly converts text responses to numerical responses
Coding manual defines the coding system used
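Worked example - coding text responses
A minimal sketch of coding, assuming a hypothetical smoking-status question. The code values (0, 1, 2) and the missing code (9) are illustrative choices; in a real study they would be whatever the coding manual specifies.

```python
# Hypothetical coding scheme for one questionnaire item.
SMOKING_CODES = {"never": 0, "former": 1, "current": 2}
MISSING_CODE = 9  # assumed code for a blank or unrecognised response

def code_response(text):
    """Convert a text response to its numeric code."""
    cleaned = text.strip().lower()  # ignore case and stray spaces
    return SMOKING_CODES.get(cleaned, MISSING_CODE)

responses = ["Never", " current", "FORMER", "", "sometimes"]
coded = [code_response(r) for r in responses]
print(coded)
```

In this sketch blank and unrecognised answers share one code for brevity. A coding manual might assign them separate codes so that invalid responses can be followed up.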
Coding manual
Integral part of data collection and management
Especially important in large studies involving different people
What should a coding manual contain
Definition of each variable
Range of expected responses
Format and instructions for recording data
Instructions for removing uncertainty
Recording of decisions regarding coding
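Worked example - coding manual entry
One possible way to hold a coding manual in machine-readable form, sketched as a Python dictionary. The field names (definition, valid_range, format, if_uncertain), the variable "age", its range and the code 999 are all assumptions made for illustration.

```python
# Hypothetical coding manual: one entry per variable.
CODING_MANUAL = {
    "age": {
        "definition": "Age in completed years at interview",
        "valid_range": (18, 99),         # range of expected responses
        "format": "whole number, no units",
        "if_uncertain": "Enter 999 and note the questionnaire ID",
    },
}

# Running record of coding decisions made during the study.
coding_decisions = []

def in_expected_range(variable, value):
    """Check a value against the range given in the coding manual."""
    low, high = CODING_MANUAL[variable]["valid_range"]
    return low <= value <= high

def record_decision(variable, decision):
    """Log a coding decision so that later coders apply the same rule."""
    coding_decisions.append({"variable": variable, "decision": decision})

record_decision("age", "Ages given as a range are coded as the midpoint")
print(in_expected_range("age", 45), in_expected_range("age", 130))
```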
Data entry
Can be into a spreadsheet, database or statistical software
1 data cell = 1 value
Statistical software
Usually looks like a spreadsheet
Has single row for each participant
Each column represents one variable
Some packages have rules about length and style of column names
Dataset
Name given to a database in statistical software
Vulnerable to human error at several points - coding (researcher), data entry (researcher or data entry person), data consistency (respondent)
Data cleaning
Screening process to minimise occurrence of mistakes and corresponding impact on results
Accurate data - building blocks of analysis
Errors threaten validity of measures and impact on data analysis and results
Checking for coding errors and outliers
Verify data entry
Check data consistency
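Worked example - data cleaning checks
A minimal sketch of a cleaning screen covering invalid codes, missing data, inconsistent answers and duplicate records. The records, the field names and the coding of sex (1 = male, 2 = female) are hypothetical.

```python
from collections import Counter

# Hypothetical dataset: one row (dict) per participant.
records = [
    {"id": 1, "sex": 1, "pregnant": 0, "age": 34},
    {"id": 2, "sex": 2, "pregnant": 1, "age": 29},
    {"id": 3, "sex": 3, "pregnant": 0, "age": 41},    # invalid code for sex
    {"id": 4, "sex": 1, "pregnant": 1, "age": None},  # inconsistent, age missing
    {"id": 2, "sex": 2, "pregnant": 1, "age": 29},    # duplicate record
]
VALID_SEX = {1, 2}  # assumed coding: 1 = male, 2 = female

def find_problems(rows):
    """Return a list of (participant id, problem description) pairs."""
    problems = []
    for row in rows:
        if row["sex"] not in VALID_SEX:
            problems.append((row["id"], "invalid code for sex"))
        if row["age"] is None:
            problems.append((row["id"], "missing age"))
        if row["sex"] == 1 and row["pregnant"] == 1:
            problems.append((row["id"], "inconsistent: male and pregnant"))
    # Duplicate check: the same id should not appear more than once.
    id_counts = Counter(row["id"] for row in rows)
    for rec_id, count in id_counts.items():
        if count > 1:
            problems.append((rec_id, "duplicate id"))
    return problems

problems = find_problems(records)
for rec_id, problem in problems:
    print(rec_id, problem)
```

Each flagged record would then be checked against the original questionnaire rather than corrected by guesswork.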
Strategies for increasing accuracy of data entry
Double entry - 2 independent entries of the original data, discrepancies are checked and corrected
Random checking - subset of records is checked against original data
Create data entry template/form that does not accept data that doesn’t fit acceptable values
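Worked example - double entry and entry validation
A minimal sketch of two of the strategies above: comparing two independent entries of the same data, and rejecting values outside an acceptable set at entry time. The field names and acceptable values are hypothetical, and the comparison assumes both entries list the same participants in the same order.

```python
def compare_entries(first, second):
    """Return (row number, field, value 1, value 2) for each discrepancy."""
    discrepancies = []
    for row_no, (a, b) in enumerate(zip(first, second), start=1):
        for field in a:
            if a[field] != b.get(field):
                discrepancies.append((row_no, field, a[field], b.get(field)))
    return discrepancies

# Hypothetical acceptable values for each field on the entry form.
ACCEPTABLE = {"sex": {1, 2}, "smoker": {0, 1, 2, 9}}

def accept_value(field, value):
    """Entry-form rule: accept a value only if it is an allowed code."""
    return value in ACCEPTABLE[field]

# Two independent entries of the same two questionnaires.
entry_1 = [{"id": 1, "sex": 1, "smoker": 0}, {"id": 2, "sex": 2, "smoker": 2}]
entry_2 = [{"id": 1, "sex": 1, "smoker": 0}, {"id": 2, "sex": 2, "smoker": 1}]

discrepancies = compare_entries(entry_1, entry_2)
print(discrepancies)           # rows to check against the original forms
print(accept_value("sex", 3))  # an invalid code is rejected
```

The comparison only shows that the two entries disagree. Which value is correct still has to be settled from the original questionnaire.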