INTRO TO BIOEPI Flashcards
An art of summarizing data
Statistics
Tool in decision making: used for the formulation of judgment
Statistics
Uses of Biostatistics:
Data reduction ____
Tool for _____ research projects and clinical trials
Tool for _____ appraisal and evaluation of programs
Tool in ______ process and policy making
technique
analyzing
objective
decision-making
Life + Science dealing w/ the collection, organization, analysis, and interpretation of numerical data
Biostatistics
deals w/ quantitative and qualitative aspects of vital phenomena
Biostatistics
Application of statistical methods to the life sciences: biology, medicine and public health
Biostatistics
Application of Biostatistics:
study of distribution and determinants of health related states and events in the specified population
Epidemiology
study of the human population: size, structure, composition, and distribution in space
Demography
study of the functioning of the health care system and health-affecting behaviors
Health Economics
study of heredity and gene function
Genetics and Genomics
2 Branches of Biostats:
Different methods of summarizing and presenting data for easy analyzing and interpreting
Descriptive Statistics
2 Branches of Biostats:
-Computation of measures of central tendency (location) and variability (dispersion)
Descriptive Statistics
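A minimal sketch of computing measures of location and dispersion with Python's standard library. The ages below are hypothetical values made up for illustration, not data from these notes.

```python
import statistics

# Hypothetical ages (in years) of seven respondents.
ages = [19, 20, 21, 21, 22, 23, 21]

# Measures of central tendency (location)
mean_age = statistics.mean(ages)
median_age = statistics.median(ages)
mode_age = statistics.mode(ages)

# Measures of variability (dispersion)
age_range = max(ages) - min(ages)
sample_sd = statistics.stdev(ages)  # sample standard deviation (n - 1 in the denominator)

print(mean_age, median_age, mode_age, age_range, round(sample_sd, 2))
```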
2 Branches of Biostats:
-Tabulation and graphical presentation
Descriptive Statistics
2 Branches of Biostats:
-Facilitate understanding, analysis, and interpretation of data
Descriptive Statistics
2 Branches of Biostats:
Ex: Constructing a statistical table to show the number of OLFU students according to the degree program.
Descriptive Statistics
2 Branches of Biostats:
methods of arriving at conclusions and generalizations about a target population based on info from a sample
Inferential Statistics
2 Branches of Biostats:
Estimation (point (exact value) & interval (range value)) of parameters and hypotheses testing
Inferential Statistics
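A rough sketch of point vs. interval estimation for a proportion. The survey counts are hypothetical, and the interval uses the normal-approximation formula with z = 1.96 for about 95% confidence, which is one common choice and an assumption here rather than a method stated in these notes.

```python
import math

# Hypothetical survey result: 60 smokers among 400 sampled students.
smokers = 60
n = 400

# Point estimate: a single value for the population proportion.
p_hat = smokers / n

# Interval estimate: a range of values (approximate 95% confidence interval).
z = 1.96
standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
lower = p_hat - z * standard_error
upper = p_hat + z * standard_error

print(f"point estimate: {p_hat:.3f}")
print(f"interval estimate: {lower:.3f} to {upper:.3f}")
```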
2 Branches of Biostats:
A sample is tested and the results are used to generalize to the target population
Inferential Statistics
2 Branches of Biostats:
Ex: Determining if there is a difference between prevalence of smoking among students in public and private high schools based on results from a school survey
Inferential Statistics
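One possible way to test the smoking example is a two-proportion z-test, sketched below. The test choice and all counts are assumptions for illustration; the notes do not say which test is used.

```python
import math

# Hypothetical school survey counts.
smokers_public, n_public = 45, 300
smokers_private, n_private = 30, 300

p_public = smokers_public / n_public
p_private = smokers_private / n_private

# Pooled proportion under the null hypothesis of no difference.
p_pooled = (smokers_public + smokers_private) / (n_public + n_private)
standard_error = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_public + 1 / n_private))

z = (p_public - p_private) / standard_error

# Two-sided p-value from the standard normal distribution.
p_value = math.erfc(abs(z) / math.sqrt(2))

print(f"z = {z:.2f}, p-value = {p_value:.3f}")
```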
all members of a specified group
Population
subset of population
Sample
measure of characteristic of a population
Parameter
cannot change; value of a characteristic that remains the same
Constant
can change; a characteristic that takes on different values and cannot be predicted w/ certainty
Variable
Research Process : PORRSDDW
Problem Identification/ Hypothesis
Objective Formulation
Review of Related Literature
Research Design
Sampling Design and Estimation
Data Collection and Processing
Data Analysis
Writing the Report
Dissemination of Results
Types of Data According to Source:
obtained first-hand by the investigator, who conducted the survey personally
Primary Data
Types of Data According to Source:
already existing; obtained earlier by someone else and not collected for the primary purpose of the current study
Secondary Data
Types of Data:
Categories are simply descriptions or labels to distinguish one group from another
Qualitative
Types of Data According to Functional Relationship:
Dependent
Independent
Types of Variable:
Categories can be measured and ordered according to quantity or amount and can be expressed numerically.
Quantitative
Types of Variable:
Can assume an infinite (continuous) or countable (discrete) number of possible values
Quantitative
Scale of Measurement of Variables:
Simply used as names or identifiers of a category
Categories are simply labels and cannot be used for meaningful rankings
Nominal (Always Qualitative)
Scale of Measurement of Variables:
Represents an ordered series of relationships
It has inherent or implied ranking system or order
Ordinal (May be Qualitative or Quantitative)
Scale of Measurement of Variables:
Does not have a true-zero value starting point
Categories can be measured but 0 point is arbitrary
Interval (Always Quantitative)
Scale of Measurement of Variables:
Modified interval level w/c includes zero as a starting point
Has a fixed (true) 0 point: 0 means none of the quantity is present
Ratio (Always Quantitative)
Systematic procedure to ensure that the info/ data gathered are complete, consistent and suitable for analysis
Data Processing (Necessary step before data analysis)
Flowchart: (Which is the correct order)
a. Data Collection → Data Processing (coding, encoding, editing) → Analysis
b. Data Processing → Data Collection (coding, encoding, editing) → Analysis
a
Conversion of verbal/ written info into numbers w/c can be more easily encoded, counted and tabulated
Data Coding
permits rapid storage of data, keeps data organized, helps avoid errors, and lets statistical software perform various analyses on the data
Data Coding
Types of Code:
Actual value or info given by the respondent, as is
Cannot assign any numerical values (1 response only)
Field Code
Types of Code:
Recorded as range of values rather than actual values
Bracket Code
Types of Code:
Codes are assigned to a list of categories of a given variable
Factual Code
Types of Code:
Applicable for questions w/ multiple responses
Pattern Code
TRUE OR FALSE:
The number of codes must be kept to a minimum (preferably fewer than 8)
TRUE
TRUE OR FALSE:
Codes should be exhaustive and mutually exclusive
TRUE
Codes should be:
Fully comprehensive ______ and do not overlap _____
exhaustive, mutually exclusive
Document w/c contains a record of all codes assigned to the responses to all questions in the data collection forms
Coding Manual
Minimum info that must be included in a coding manual
Variable name: must be kept as short as possible
Variable description: description of the variable in the coding
Coding instructions: actual codes to be used
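A small sketch of how a coding manual might be represented and applied. The variable names, descriptions, and code values are hypothetical; the age groups show a bracket code and the sex categories show a factual code, with categories kept exhaustive and mutually exclusive.

```python
# Hypothetical coding manual: variable name -> description and coding instructions.
coding_manual = {
    "sex": {
        "description": "Sex of respondent",
        "codes": {"male": 1, "female": 2},  # factual code: one code per listed category
    },
    "agegrp": {
        "description": "Age group of respondent in completed years",
        "codes": {"below 20": 1, "20-29": 2, "30 and above": 3},  # bracket code
    },
}

def code_sex(answer):
    """Convert a written answer into its numeric code."""
    return coding_manual["sex"]["codes"][answer.strip().lower()]

def code_age(age_in_years):
    """Return the bracket code for an age; every age falls in exactly one bracket."""
    codes = coding_manual["agegrp"]["codes"]
    if age_in_years < 20:
        return codes["below 20"]
    if age_in_years < 30:
        return codes["20-29"]
    return codes["30 and above"]

# Hypothetical raw responses from two questionnaires.
responses = [{"sex": "Female", "age": 19}, {"sex": "male", "age": 34}]
coded = [{"sex": code_sex(r["sex"]), "agegrp": code_age(r["age"])} for r in responses]
print(coded)
```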
Entering the data/responses in a spreadsheet or database program: MS Excel, MS Access, Epi Info
Data Encoding
Inspection and correction of any errors or inconsistencies in the info collected
Data Editing
Types of Editing:
Done as soon as the data has been gathered while still in the field
Field Editing
Types of Editing:
Checking of inconsistencies and incorrect entries after receiving the questionnaire from the field
Central Editing
TRUE OR FALSE:
Data Editing makes corrections as early as possible
TRUE
TRUE OR FALSE:
Data Editing reduces non-response or incomplete answers: don’t leave it blank
TRUE
TRUE OR FALSE:
Data Editing eliminates inconsistencies, incorrect information
TRUE
TRUE OR FALSE:
Data Editing makes the entries clear, legible and comprehensive
TRUE
TRUE OR FALSE:
Data Editing prepares data for analysis
TRUE
Method of summarizing, organizing, and communicating info using a variety of tools
Data Presentation
Methods of Presenting Data:
Describing data by the use of statements w/ few numbers
Narrative or Textual
to stress or emphasize significant info
Methods of Presenting Data:
Convey info that has been converted into words or numbers in rows and columns
Tabular Presentation
Less appealing than graphs
Methods of Presenting Data:
Useful for summarizing and comparing quantitative info of different variables and info w/ different units can be presented together
Tabular Presentation
Components of a Tabular Presentation
Table number
Title
Column/ box headings/ caption
Row headings/ stubs
Body of the table
Source note
Footnote
TRUE OR FALSE:
A table should be self-explanatory. All sources are specified
TRUE
TRUE OR FALSE:
Figures in the table should be aligned by decimal point, with a consistent number of decimal places
TRUE
Types of Table:
table listing all classes and their frequencies
Frequency Distribution
For nominal and ordinal data; can also display discrete or continuous data
Types of Table:
Break down the range of values of the observations into a series of distinct, non-overlapping intervals.
Frequency Distribution
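A minimal sketch of building a frequency distribution with distinct, non-overlapping class intervals. The scores, the starting value of 50, and the class width of 10 are hypothetical choices for illustration.

```python
from collections import Counter

# Hypothetical whole-number exam scores.
scores = [52, 67, 71, 58, 85, 90, 77, 63, 69, 74, 88, 95]
lowest = 50        # lower limit of the first class
class_width = 10   # every class covers 10 score values, e.g. 50-59

def class_interval(value):
    """Return the (lower, upper) limits of the class that contains value."""
    lower = lowest + ((value - lowest) // class_width) * class_width
    return (lower, lower + class_width - 1)

frequency = Counter(class_interval(score) for score in scores)

for lower, upper in sorted(frequency):
    print(f"{lower}-{upper}: {frequency[(lower, upper)]}")
```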
Types of Table:
Single table which allows the distribution of observations across many variables of interest in a given study
Master Table (Contains all variables used in the study)
Types of Table:
Complete except for data; doesn't contain figures
Dummy Table/ Skeleton Table
Types of Table:
For proposals to show what will happen in the study
Dummy Table/ Skeleton Table
Types of Table According to Number of Variables:
___
___
___
One-way Table: single variable
Two-way Table/ Contingency Table/ Cross Tabulation: 2 variables
Multi-way Table: more than 2 variables
Cell frequency expressed as a % of the total of its row category
Row %
(cell frequency ÷ row total) x 100
Cell frequency expressed as a % of the total of its column category
Column %
(cell frequency ÷ column total) x 100
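A short worked sketch of row % and column % on a two-way table. The counts and category names are hypothetical.

```python
# Hypothetical two-way table: smoking status (rows) by school type (columns).
table = {
    "smoker": {"public": 30, "private": 20},
    "non-smoker": {"public": 70, "private": 80},
}
columns = ["public", "private"]

row_totals = {row: sum(cells.values()) for row, cells in table.items()}
column_totals = {col: sum(table[row][col] for row in table) for col in columns}

# Row %: each cell divided by the total of its row, times 100.
row_percent = {
    row: {col: 100 * table[row][col] / row_totals[row] for col in columns}
    for row in table
}

# Column %: each cell divided by the total of its column, times 100.
column_percent = {
    row: {col: 100 * table[row][col] / column_totals[col] for col in columns}
    for row in table
}

print(row_percent["smoker"]["public"])     # share of smokers who attend public school
print(column_percent["smoker"]["public"])  # share of public school students who smoke
```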
Methods of Presenting Data:
Pictorial representations of certain quantities plotted w/ reference to a set of axes
Graphical Presentation
Useful for summarizing, explaining, or exploring quantitative data
TRUE OR FALSE:
Graphical Presentation visually summarizes the variables (data set is large)
TRUE
TRUE OR FALSE:
Graphical Presentation emphasizes particular statement about data set
TRUE
TRUE OR FALSE:
Graphical Presentation enhances readability
TRUE
TRUE OR FALSE:
Graphical Presentation appeals to visual memory
TRUE
Types of Graphical Presentation:
Circle subdivided into a number of slices: the area of each slice represents the relative proportion of data points falling into a given category
Pie chart
Types of Graphical Presentation:
Consists of bars of the same width
Bar Graph aka One-Dimensional Diagram
With gap: quantitative discrete
Without gap: quantitative continuous
Types of Bar Graph
Simple Bar Graph
Multiple Bar Graph
Kinds of Bar Graph
Horizontal Bar Graph: for qualitative variables
(presenting towns, proportions, rates of categories)
Vertical Bar Graph: for discrete quantitative variables
(Comparing numerical measurements)
Types of Graphical Presentation:
Each bar is divided into smaller rectangles representing the parts
Component Bar Graph/ Stacked-Bar Graph
Generally used for nominal data
Types of Graphical Presentation:
Plot of dots joined w/ lines over some period of time in sequential series
Line Graph/ Time Series Charts
Horizontal axis: time series
Vertical axis: variable values
Types of Graphical Presentation:
Presentation of frequency distribution of a continuous quantitative variable
Histogram (Preferred for grouped interval data)
Horizontal axis: continuous quantitative
Vertical axis: number or relative frequency of observations
Bar Graph : ___ gap ; Histogram : ___ gap
Bar Graph : ___ data ; Histogram : ___ data
with ; without
categorical ; continuous
Types of Graphical Presentation:
Frequencies are plotted against the corresponding midpoints of the classes
Frequency Polygon (continuous quantitative variable)
Types of Graphical Presentation:
Provides a rank-ordered list of the data and makes it easy to recover the original value of each observation
Stem-and-leaf Plot (Primarily for small set of data)
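A minimal sketch of a stem-and-leaf plot for two-digit values, using the tens digit as the stem and the units digit as the leaf. The values are hypothetical.

```python
from collections import defaultdict

# Hypothetical small data set of two-digit values.
values = [38, 23, 47, 31, 25, 51, 38, 42]

stems = defaultdict(list)
for value in sorted(values):            # sorting gives the rank-ordered list
    stems[value // 10].append(value % 10)

lines = [
    f"{stem} | {' '.join(str(leaf) for leaf in leaves)}"
    for stem, leaves in sorted(stems.items())
]
print("\n".join(lines))
```

Each original value can be read back by joining a stem with one of its leaves, e.g. stem 3 and leaf 8 give 38.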
Types of Graphical Presentation:
Shows center, spread, shape, tail length, and outlying data points; can be presented horizontally or vertically
Box Plot (Describes a large set of quantitative data)
Types of Graphical Presentation:
Shows the relationship between two quantitative variables (ex: weight and height)
Scatter Plot
Plotted points in a line: there is a linear relationship
Ascending line: perfect + (increases from left to right)
Descending line: perfect - (decreases from left to right)
Scattered points: no relationship bet x and y
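The perfect + and perfect - patterns can be checked numerically with a correlation coefficient. The sketch below uses Pearson's r, which is an assumption here since the notes do not name a specific measure, and the heights and weights are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data where the points fall exactly on a straight line.
height_cm = [150, 160, 170, 180]
weight_ascending = [50, 55, 60, 65]    # rises from left to right
weight_descending = [65, 60, 55, 50]   # falls from left to right

r_positive = pearson_r(height_cm, weight_ascending)
r_negative = pearson_r(height_cm, weight_descending)
print(r_positive, r_negative)
```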
Act of studying or examining only a segment of the population to represent the whole; the basis of inferential biostatistics
Sampling
2 Key Features of Sampling
Representative of the population
Adequate sample size
group about w/c representative info is desired and inferences will be made
Target Population
population from w/c a sample will actually be taken
Sampling Population
units w/c are chosen in selecting the sample
Sampling Unit
unit on w/c a measurement/ observation is made (object or person)
Elementary Unit / Element
collection of all the sampling/ elementary units
Sampling Frame
deviation of a sample estimate from the true population value
Sampling Error
Basic Sampling Design:
Probability of each member of the population being selected as part of the sample is difficult to determine or cannot be specified.
Non-probability Sampling
Basic Sampling Design:
Each member of the population has a known non-zero chance of being selected as a sample
Probability Sampling
Non-Probability Sampling Designs :
based on expert’s subjective judgement
Judgmental/ Purposive
Non-Probability Sampling Designs :
those who are available, those who come to hand
Accidental / Haphazard
Non-Probability Sampling Designs :
samples of a fixed size
Quota
Non-Probability Sampling Designs :
individual to be included is identified by a member who was previously included (referral thru other samples); sample size increases as the study progresses
Snowball
Non-Probability Sampling Designs :
units are easily accessible
Convenience
Probability Sampling Designs :
In this technique, elements of the sample are selected using either the lottery method or random numbers generated by a calculator, Excel, Epi Info, etc.
Simple Random Sampling: SRS
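A minimal sketch of simple random sampling using Python's random number generator in place of the lottery method. The population size, sample size, and seed are hypothetical.

```python
import random

rng = random.Random(2024)  # fixed seed only so the sketch is repeatable

# Hypothetical sampling frame: N = 500 students numbered 1 to 500.
population = list(range(1, 501))

# Draw n = 50 students without replacement; each has the same chance of selection.
srs_sample = rng.sample(population, k=50)
print(sorted(srs_sample)[:10])
```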
Probability Sampling Designs :
Done by assigning numbers to the elements of the population and taking every kth element as part of the sample.
Systematic Sampling: SYS
-sampling interval (k=N/n)
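A short sketch of systematic sampling with the sampling interval k = N/n. The random start between 1 and k is a common convention assumed here, and N, n, and the seed are hypothetical.

```python
import random

rng = random.Random(7)  # fixed seed only so the sketch is repeatable

N = 500       # hypothetical population size (elements numbered 1 to N)
n = 50        # desired sample size
k = N // n    # sampling interval, k = N/n

# Assumed convention: pick a random start from 1 to k, then take every kth element.
start = rng.randint(1, k)
systematic_sample = list(range(start, N + 1, k))
print(k, start, systematic_sample[:5])
```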
Probability Sampling Designs :
The population is first divided into non-overlapping groups called strata (singular: stratum)
Stratified Random Sampling
p=n/N
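A sketch of stratified random sampling, reading p = n/N as a sampling fraction applied to every stratum (proportional allocation). That reading is an assumption, and the strata, sizes, and seed are hypothetical.

```python
import random

rng = random.Random(11)  # fixed seed only so the sketch is repeatable

# Hypothetical strata: year level -> list of student IDs.
strata = {
    "first_year": [f"Y1-{i}" for i in range(1, 201)],   # 200 students
    "second_year": [f"Y2-{i}" for i in range(1, 151)],  # 150 students
    "third_year": [f"Y3-{i}" for i in range(1, 101)],   # 100 students
    "fourth_year": [f"Y4-{i}" for i in range(1, 51)],   # 50 students
}

N = sum(len(members) for members in strata.values())
n = 50
p = n / N  # sampling fraction

# Simple random sample within each stratum, sized in proportion to the stratum.
stratified_sample = {
    name: rng.sample(members, round(p * len(members)))
    for name, members in strata.items()
}
stratum_sizes = {name: len(chosen) for name, chosen in stratified_sample.items()}
print(stratum_sizes)
```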
Probability Sampling Designs :
The selection of groups of study units (clusters) instead of the selection of study units individually.
Cluster Sampling (whole group is selected)
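A minimal sketch of cluster sampling: clusters are selected at random and every study unit in a selected cluster is included. The section names, members, and seed are hypothetical.

```python
import random

rng = random.Random(3)  # fixed seed only so the sketch is repeatable

# Hypothetical clusters: class sections and their students.
clusters = {
    "Section A": ["A1", "A2", "A3"],
    "Section B": ["B1", "B2", "B3", "B4"],
    "Section C": ["C1", "C2"],
    "Section D": ["D1", "D2", "D3"],
}

# Select 2 clusters at random, then take the whole group from each.
selected_clusters = rng.sample(sorted(clusters), k=2)
cluster_sample = [unit for name in selected_clusters for unit in clusters[name]]
print(selected_clusters, cluster_sample)
```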
Probability Sampling Designs :
A procedure carried out in phases and usually involves more than one sampling method.
Multi-Stage Sampling
Often used in community-based studies
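A rough two-stage sketch of multi-stage sampling: villages are sampled first, then households within each selected village. Using simple random sampling at both stages is an assumption, and the village and household labels, sizes, and seed are hypothetical.

```python
import random

rng = random.Random(5)  # fixed seed only so the sketch is repeatable

# Hypothetical community: 6 villages, each with 40 numbered households.
villages = {
    f"Village {v}": [f"V{v}-H{h}" for h in range(1, 41)]
    for v in range(1, 7)
}

# Stage 1: simple random sample of 3 villages.
stage1 = rng.sample(sorted(villages), k=3)

# Stage 2: simple random sample of 10 households within each selected village.
stage2 = {name: rng.sample(villages[name], k=10) for name in stage1}

total_households = sum(len(households) for households in stage2.values())
print(stage1, total_households)
```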