Descriptive Statistics Flashcards
simple random sample
every person in the population has an equal chance of being selected
stratified random sample
population is separated into similar strata and then random samples are drawn from each group
cluster sample
one or more “clusters” (naturally occurring groups) are selected at random and then a sample is chosen from within each selected cluster
systematic sample
selecting subjects at a fixed interval (every kth subject) from an ordered list
example of a cluster sample
randomly choosing 5 schools then choosing random students from each school
example of systematic sample
every 3rd baby born, etc
systematic samples may be prone to ___
selection bias
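The four sampling methods above can be sketched in a few lines of Python. This is only an illustration: the population of 30 subject IDs, the two strata "A" and "B", the clusters of 5, and the interval k = 3 are all made-up assumptions, not part of the cards.

```python
import random

# Hypothetical population: 30 subject IDs, each tagged with a made-up stratum.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
population = list(range(1, 31))
stratum_of = {pid: ("A" if pid <= 15 else "B") for pid in population}

# Simple random sample: every subject has an equal chance of being selected.
simple = rng.sample(population, 6)

# Stratified random sample: draw a random sample from within each stratum.
stratified = []
for label in ("A", "B"):
    members = [pid for pid in population if stratum_of[pid] == label]
    stratified.extend(rng.sample(members, 3))

# Cluster sample: pick whole clusters at random, then sample within them.
clusters = [population[i:i + 5] for i in range(0, 30, 5)]  # 6 clusters of 5
chosen_clusters = rng.sample(clusters, 2)
cluster_sample = [pid for c in chosen_clusters for pid in rng.sample(c, 2)]

# Systematic sample: every k-th subject, starting from a random offset.
k = 3
start = rng.randrange(k)
systematic = population[start::k]

print(simple, stratified, cluster_sample, systematic, sep="\n")
```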
probability: definition, range
chance that a particular event will occur
ranges from 0 (impossible) to 1 (certain)
addition rule, assumption
the probability of X or Y = probability of X + probability of Y
assumes X and Y are mutually exclusive
multiplication rule, assumption
the probability of X and Y = probability of X × probability of Y
assumes X and Y are independent (neither influences the other)
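A small worked example of both rules, assuming fair six-sided dice (the dice are an illustrative choice, not from the cards). Exact fractions are used so the results are not blurred by floating-point rounding.

```python
from fractions import Fraction

# Addition rule: on ONE roll, "rolling a 1" and "rolling a 2" are mutually
# exclusive, so their probabilities add.
p_one = Fraction(1, 6)
p_two = Fraction(1, 6)
p_one_or_two = p_one + p_two

# Multiplication rule: on TWO separate dice, the results are independent,
# so the probabilities multiply.
p_six_first = Fraction(1, 6)
p_six_second = Fraction(1, 6)
p_both_six = p_six_first * p_six_second

print(p_one_or_two)  # 1/3
print(p_both_six)    # 1/36
```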
Nominal variables
categories
race, gender, color
Ordinal variables
placed in meaningful order, but with no consistent interval between values
grades of cancer, pain scores
Interval variables
placed in meaningful order with equal intervals between values, but no absolute zero
temperature in degrees Celsius
ratio variables
placed in meaningful order with equal intervals between values and an absolute zero
relative frequency distribution shows:
the percentage of observations that fall in each category or interval
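As a rough sketch, a relative frequency distribution can be built by counting each category and dividing by the total. The list of tumor grades below is invented purely for illustration.

```python
from collections import Counter

# Made-up ordinal data: tumor grades for 8 hypothetical patients.
grades = ["I", "II", "II", "III", "II", "I", "IV", "II"]

counts = Counter(grades)
n = len(grades)

# Relative frequency = count in the category / total count, as a percentage.
relative = {grade: 100 * count / n for grade, count in counts.items()}

for grade in sorted(relative):
    print(grade, f"{relative[grade]:.1f}%")
```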
percentile definition
gives the percentage of observations that fall below the particular value (gives information about one score in relation to all of the other scores)
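Following the definition on this card (percentage of observations below a value), a percentile rank might be computed as below. The helper name `percentile_rank` and the scores are hypothetical; statistical packages often use slightly different conventions, for example counting ties differently.

```python
# Made-up exam scores for 10 hypothetical students.
scores = [55, 60, 62, 70, 71, 75, 80, 85, 90, 95]


def percentile_rank(values, x):
    """Percentage of observations in `values` that fall strictly below x."""
    below = sum(1 for v in values if v < x)
    return 100 * below / len(values)


rank_of_80 = percentile_rank(scores, 80)
print(rank_of_80)  # 6 of 10 scores are below 80 -> 60.0
```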
mean
arithmetic average (sum of the values divided by the number of values)
mean is very sensitive to:
extreme scores
median
middle value of the ordered data; divides the data into 2 equal parts
mode
the most frequently occurring value
bimodal
when 2 values tie for the highest frequency (the distribution has 2 modes)
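A quick illustration of the central tendency cards using Python's standard `statistics` module. The data are made up; the point is that a single extreme score pulls the mean much more than the median.

```python
import statistics

# Made-up scores, then the same scores with one extreme value added.
scores = [2, 3, 3, 4, 5]
with_outlier = scores + [50]

mean_plain = statistics.mean(scores)            # 3.4
median_plain = statistics.median(scores)        # 3
mean_outlier = statistics.mean(with_outlier)    # jumps to about 11.17
median_outlier = statistics.median(with_outlier)  # only moves to 3.5

mode_plain = statistics.mode(scores)            # 3 occurs most often

# Bimodal data: 1 and 2 tie for the highest frequency.
modes_bimodal = statistics.multimode([1, 1, 2, 2, 3])  # [1, 2]

print(mean_plain, median_plain, mean_outlier, median_outlier)
print(mode_plain, modes_bimodal)
```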
variance equation
$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
standard deviation
square root of variance (s)
coefficient of variation
standard deviation/mean
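The three spread measures above can be computed step by step as follows. The eight observations are invented for illustration; the results are cross-checked against `statistics.variance` and `statistics.stdev`, which also use the n - 1 denominator.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up observations
n = len(data)
mean = sum(data) / n

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
variance = sum((x - mean) ** 2 for x in data) / (n - 1)

# Standard deviation: square root of the variance.
sd = math.sqrt(variance)

# Coefficient of variation: standard deviation relative to the mean.
cv = sd / mean

# Cross-check against the standard library's sample statistics.
variance_check = statistics.variance(data)
sd_check = statistics.stdev(data)

print(round(variance, 3), round(sd, 3), round(cv, 3))
```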
mean, SD and AUC in standard normal distribution
mean = 0, SD = 1, AUC = 1
skewed distributions are named for their:
tail (long right tail = positive skew, long left tail = negative skew)
Z score indicates:
how many SDs away from the mean an element is (in a normal distribution)
Z =
(X - mean) / standard deviation
Z score is used to estimate:
the probability of an element being above or below a particular score
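A minimal sketch of using a Z score to estimate a probability, assuming the scores are normally distributed. The mean of 100, SD of 15, and score of 130 are hypothetical values chosen for easy arithmetic.

```python
from statistics import NormalDist

# Hypothetical normally distributed scores.
mean, sd = 100, 15
x = 130

# Z score: how many SDs the score lies from the mean.
z = (x - mean) / sd

# The standard normal CDF gives the probability of falling below z.
p_below = NormalDist().cdf(z)
p_above = 1 - p_below

print(z)                  # 2.0
print(round(p_below, 4))
print(round(p_above, 4))
```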
% of observations within 1 SD, 2 SDs, 3 SDs of the mean (normal distribution)
1 SD: about 68%
2 SDs: about 95%
3 SDs: about 99.7%
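These percentages can be checked against the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

# Area under the curve between -k and +k standard deviations.
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, area in within.items():
    print(f"within {k} SD: {100 * area:.1f}%")
```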