biostats Flashcards
a group of methods used to collect, analyze,
present, and interpret data and to make decisions.
statistics
Decisions made by using
statistical methods
Educated guesses
Decisions made without using
statistical or scientific methods
Pure guesses
Statistics has two aspects
theoretical and applied statistics.
deals with the development, derivation, and
proof of statistical theorems, formulas, rules, and laws.
Mathematical/Theoretical statistics
involves the application of those theorems, formulas, rules, and
laws to solve real-world problems (e.g. economics, psychology, public health).
Applied statistics
What are the types/branches of Statistics?
descriptive statistics and inferential statistics.
consists of methods for organizing,
displaying, and describing data by
using tables, graphs, and summary
measures
DESCRIPTIVE STATISTICS
consists of methods that use sample
results to help make decisions or
predictions about a population.
INFERENTIAL STATISTICS
is the branch of applied statistics
directed toward applications in the health sciences and biology.
Statistical Biology/Biostatistics
a specific subject or object (e.g. a
person, a company, a state, or a country) about which the information is collected.
element/member of a sample or population
a characteristic under study that assumes different values for different
elements.
variable
the value of a variable for an element.
An observation or measurement
a collection of observations on one or more variables
A data set
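The terms element, variable, observation, and data set can be seen together in a small sketch. The names and values below are made up purely for illustration.

```python
# A hypothetical data set: each row is one element (a person),
# each key is a variable, and each stored value is an observation.
data_set = [
    {"name": "Ana", "height_cm": 158, "civil_status": "Single"},
    {"name": "Ben", "height_cm": 172, "civil_status": "Married"},
    {"name": "Cal", "height_cm": 165, "civil_status": "Single"},
]

# The variables are the characteristics recorded for every element.
variables = [key for key in data_set[0] if key != "name"]

# One observation: the value of the variable "height_cm" for the element "Ben".
observation = data_set[1]["height_cm"]

print("variables:", variables)
print("observation for Ben:", observation)
```

Two variables are measured for each element here, so this would count as bivariate data.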
results when a single variable is measured.
Univariate data
results when two variables are measured.
Bivariate data
the collection of all elements–individuals, items, or objects–whose
characteristics are being studied.
population
results when more than two variables are measured.
Multivariate data
the collection of a number of elements selected from a population. It is a
subset selected from the target population.
sample
the collection of information that includes every
member of the population.
census
the collection of information from the
elements of a sample.
sample survey
a numerical measure that summarizes data for an entire population.
parameter
a numerical measure that summarizes data from a sample.
statistic
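The difference between a parameter and a statistic can be sketched by computing the same summary measure twice, once on a whole population and once on a sample. The heights below are invented for illustration, and the mean is used only as one possible summary measure.

```python
import random
import statistics

# Hypothetical population: made-up heights (cm) of N = 10 people.
population = [150, 152, 155, 158, 160, 163, 165, 168, 170, 175]

# Parameter: a summary computed from every element of the population.
population_mean = statistics.mean(population)

# Statistic: the same summary computed from a sample of n = 4 elements.
rng = random.Random(1)  # fixed seed so the sample is reproducible
sample = rng.sample(population, 4)
sample_mean = statistics.mean(sample)

print("parameter (population mean):", population_mean)
print("statistic (sample mean):", sample_mean)
```

The parameter is fixed for a given population, while the statistic generally changes from one sample to another.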
a method of
sampling in which each member of the
population has some chance of being
selected in the sample.
Random sampling
a method of
sampling in which some members of the
population may not have any chance of
being selected in the sample.
Nonrandom sampling
Two types of nonrandom sampling
convenience sampling and judgment sampling.
the most accessible members of the population are selected
to obtain the results quickly.
convenience sampling
the members are selected from the population based on the
judgment and prior knowledge of an expert.
judgment sampling
are used to
obtain a random sample that represents the
target population.
Random sampling techniques
is a sampling technique in
which any particular sample of a specific sample size has
the same chance of being selected as any other sample
of the same size.
Simple random sampling
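A minimal sketch of simple random sampling, assuming the population fits in a Python list. The helper name `simple_random_sample` and the numbered population are hypothetical; `random.sample` draws without replacement, so every subset of size n is equally likely.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct elements; every size-n subset is equally likely."""
    rng = random.Random(seed)  # a seed is optional, used here for reproducibility
    return rng.sample(population, n)

# Hypothetical population of N = 20 numbered elements.
population = list(range(1, 21))
sample = simple_random_sample(population, 5, seed=42)

print("N =", len(population), "n =", len(sample))
print("sample:", sample)
```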
is the number of elements in the sample,
denoted by n.
Sample size
denoted by N,
is the number of elements in the population.
population size
is a sampling technique in
which the sample consists of every kth element of the
population, arranged alphabetically or by some other
characteristic. Here, k = N/n.
Systematic random sampling
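A minimal sketch of systematic random sampling with k = N/n. Two choices here are assumptions, not part of the card: k is rounded down to a whole number, and the starting position is picked at random among the first k elements. The function name is hypothetical.

```python
import random

def systematic_sample(population, n, seed=None):
    """Take every kth element of an ordered population, where k = N // n."""
    N = len(population)
    k = N // n  # sampling interval, rounded down (assumption)
    if k < 1:
        raise ValueError("sample size n cannot exceed population size N")
    rng = random.Random(seed)
    start = rng.randrange(k)  # random start within the first k elements (assumption)
    return [population[start + i * k] for i in range(n)]

# Hypothetical population: names already arranged alphabetically.
population = sorted(f"person_{i:02d}" for i in range(20))  # N = 20
sample = systematic_sample(population, 5, seed=7)          # n = 5, so k = 4

print(sample)
```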
is a sampling
technique in which the entire population is divided
into smaller, non-overlapping groups (called strata;
singular: stratum) that together make up the entire
population; a random sample is then drawn from each stratum.
Stratified random sampling
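A minimal sketch of stratified random sampling. It assumes the strata are already known and draws the same number of elements from each one; other allocations, such as sizes proportional to each stratum, are also used in practice. The strata names and members are invented.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw a simple random sample of equal size from every stratum."""
    rng = random.Random(seed)
    return {
        name: rng.sample(members, n_per_stratum)
        for name, members in strata.items()
    }

# Hypothetical strata: non-overlapping groups that cover the whole population.
strata = {
    "first_year": ["A1", "A2", "A3", "A4"],
    "second_year": ["B1", "B2", "B3", "B4", "B5"],
    "third_year": ["C1", "C2", "C3"],
}
sample = stratified_sample(strata, 2, seed=3)

for name, members in sample.items():
    print(name, members)
```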
is a sampling technique in which the
entire population is divided into multiple groups (called
clusters), usually by geographical area; a random sample of
clusters is then selected.
Cluster sampling
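A minimal sketch of cluster sampling, assuming a one-stage design in which whole clusters are chosen at random and every member of a chosen cluster is included. The cluster names and members are invented, and the function name is hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters and keep all of their members."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # sample cluster names
    return {name: clusters[name] for name in chosen}

# Hypothetical clusters formed by geographical area.
clusters = {
    "north": ["N1", "N2", "N3"],
    "south": ["S1", "S2"],
    "east": ["E1", "E2", "E3", "E4"],
    "west": ["W1", "W2"],
}
sample = cluster_sample(clusters, 2, seed=5)

for area, members in sample.items():
    print(area, members)
```

In contrast to stratified sampling, only some of the groups contribute elements to the sample.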
are variables that can be measured numerically. Data collected on these
variables are called quantitative data. Examples include income, height, gross sales, price of a
home, number of cars owned, and number of accidents.
Quantitative or Numeric variables
is a variable whose values are countable with no possible
intermediate values between consecutive values.
discrete variable
is a variable that can assume any numerical value between
two numbers. Weight is an example of such a variable, since it can assume
any value within a range.
continuous variable
are variables that cannot be measured numerically but can
be divided into different categories. Data collected on these variables are called qualitative data. Civil
status is an example of a qualitative variable; it can take the values “Single”, “Married”,
“Widowed”, or “Separated”, all of which are nonnumeric.
Qualitative or categorical variables
is data
collected on different elements at
the same point or for the same
period of time.
Cross-section data
is data
collected on the same element for
the same variable at different
points in time or for different
periods of time.
Time-series data
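The contrast between cross-section and time-series data can be sketched with two tiny made-up tables. All company names, years, and sales figures below are hypothetical.

```python
# Cross-section data: different elements observed for the same period.
cross_section = {
    "Company A": 120,  # sales in 2023
    "Company B": 95,   # sales in 2023
    "Company C": 140,  # sales in 2023
}

# Time-series data: the same element and variable observed over different periods.
time_series = {
    2021: 100,  # Company A sales
    2022: 110,  # Company A sales
    2023: 120,  # Company A sales
}

print("elements in the cross-section:", list(cross_section))
print("periods in the time series:", list(time_series))
```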
(pronounced sigma) is used to denote the sum of all values.
The uppercase Greek letter Σ
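As a small illustration, Σx means "add up every value of x". The values below are made up, and Python's built-in `sum` plays the role of Σ.

```python
# Hypothetical values x1, x2, x3, x4 of a variable x.
x = [4, 7, 9, 10]

# Σx = x1 + x2 + x3 + x4
total = sum(x)

print("Σx =", total)  # prints: Σx = 30
```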
the average of the given numbers, calculated by dividing the sum of the numbers by how many numbers there are.
mean
the middle number in a list of numbers sorted in ascending or descending order; it can be more descriptive of a data set than the mean when the data contain extreme values.
median
the number in a set of numbers that appears most often.
mode
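The three measures can be checked on a small example using Python's standard `statistics` module. Both lists are made up; the second one includes a single extreme value to show how the median can describe the data better than the mean.

```python
import statistics

data = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(data)      # (2 + 3 + 3 + 5 + 7 + 10) / 6 = 5
median = statistics.median(data)  # middle two values are 3 and 5, so 4.0
mode = statistics.mode(data)      # 3 appears most often

print("mean:", mean, "median:", median, "mode:", mode)

# Hypothetical incomes with one extreme value.
incomes = [20, 22, 25, 27, 300]
income_mean = statistics.mean(incomes)      # 394 / 5 = 78.8
income_median = statistics.median(incomes)  # 25

print("income mean:", income_mean, "income median:", income_median)
```

The extreme value pulls the mean far above most incomes, while the median stays near the typical value.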