Lect 1 - Biostats Flashcards
Topics: sampling; types of data; describing data; measures of central tendency and variability.
Objectives: identify the role of statistics; distinguish different types of data and sampling types; calculate and interpret measures of central tendency and distribution.
why random sampling
evens out biases, because chance rather than the researcher decides who is included
why sample
ideally you would study everyone in the population
but that costs too much
therefore use a sample that you hope is representative of everyone
what is a probability sample
chosen so you can determine how likely it is that an individual from the population is included in the sample
what are the 4 different probability sampling methods
- simple random sampling - individuals are chosen at random, e.g. calling numbers picked at random from a phone book. every individual has an equal chance of being selected
- systematic sampling - e.g. every 3rd person. this spreads the sample more evenly and is easier to conduct, but it may not always be random
- stratified sampling - split the population into strata, e.g. by age or gender, so you can have equal numbers in each, then choose randomly within each stratum. this gives greater precision but is a bit more complex, and it can oversample some groups, e.g. older males
- cluster sampling - clusters are chosen from the population, then a random sample is taken from each chosen cluster, e.g. choose different streets, then randomly sample from each street. can introduce bias such as socioeconomic status
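The four methods above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 30 ID numbers, the odd/even strata, the three "streets" and all function names are hypothetical, and the random starting point in the systematic sample is an added assumption (the card only says "every 3rd person").

```python
import random

def simple_random_sample(population, n, rng):
    """Every individual has an equal chance of being selected."""
    return rng.sample(population, n)

def systematic_sample(population, step, rng):
    """Take every `step`-th individual after a random start (assumed)."""
    start = rng.randrange(step)
    return population[start::step]

def stratified_sample(population, stratum_of, n_per_stratum, rng):
    """Group individuals into strata, then sample randomly within each."""
    strata = {}
    for person in population:
        strata.setdefault(stratum_of(person), []).append(person)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, n_per_cluster, rng):
    """Choose clusters at random, then sample randomly within each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}

rng = random.Random(0)       # fixed seed so the sketch is repeatable
people = list(range(1, 31))  # hypothetical population: ID numbers 1..30

srs = simple_random_sample(people, 5, rng)
every_third = systematic_sample(people, 3, rng)
by_parity = stratified_sample(
    people, lambda p: "even" if p % 2 == 0 else "odd", 3, rng)
streets = {"A": people[:10], "B": people[10:20], "C": people[20:]}
from_streets = cluster_sample(streets, 2, 4, rng)
```

Note that the stratified sample returns the same number of individuals from every stratum, which is how equal group sizes are obtained, while the cluster sample only ever contains people from the streets that happened to be chosen.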
what's a non-probability sample
impossible to determine the likelihood that an individual from the population is included in the sample
what are some non-prob sampling methods
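- convenience sampling - use whoever is easy to reach, e.g. ask med students to participate
- quota sampling - recruit until a preset quota for each group is filled
- purposeful sampling - deliberately pick individuals who exemplify what you want to study
- snowball sampling - existing participants refer further participants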
2 types of data
- categorical
- measurement (continuous)
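4 scales of measurement
nominal - named categories with no order, e.g. gender
ordinal - ordered categories, e.g. self-rated health
interval - equal intervals but no true 0, e.g. temperature
ratio - equal intervals and a true 0, e.g. height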
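A short worked example of why the missing true 0 matters. This is a sketch that assumes the temperature is recorded in degrees Celsius (the card does not name a unit); the values and the helper name are hypothetical. Ratios of interval-scale values change when the scale is shifted, while ratios of ratio-scale values survive a change of unit.

```python
def celsius_to_kelvin(c):
    """Shift Celsius onto the Kelvin scale, which does have a true zero."""
    return c + 273.15

# Interval scale: 20 degrees C looks like "twice" 10 degrees C,
# but the ratio depends on where the arbitrary zero sits.
ratio_celsius = 20 / 10
ratio_kelvin = celsius_to_kelvin(20) / celsius_to_kelvin(10)  # about 1.035

# Ratio scale: 180 cm is twice 90 cm, whatever unit is used.
height_ratio_cm = 180 / 90
height_ratio_m = 1.80 / 0.90
```

The two height ratios agree, so "twice as tall" is a meaningful statement; the two temperature ratios do not, so "twice as hot" is not.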
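what are the differences between descriptive and inferential statistics
- descriptive - summarise and describe the data in the sample, e.g. central tendency and variability
- inferential - use the sample to draw conclusions about the population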
advantages/disadvantages of mode
Advantages:
- Represents the largest number of people (it is the most frequent score).
- Is a real score from the data set.
- Can be used with nominal data.
Disadvantages:
- May not be particularly representative of the entire data set.
- Cannot be manipulated algebraically.
- Not as stable across samples as the mean.
advantages/disadvantages of median
Advantages:
- Relatively unaffected by extreme scores (useful for skewed distributions).
- Can be used with ordinal data.
Disadvantages:
- Cannot be manipulated algebraically.
- Not as stable across samples as the mean.
advantages/disadvantages of mean
Advantages:
- Can be manipulated algebraically.
- Most stable estimate of central tendency across samples (i.e. less variance across multiple samples).
Disadvantages:
- Influenced by extreme scores.
- Its value might not exist in the data.
- Should only be used with interval or ratio data.
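The points about the mode, median and mean can be checked on a small made-up data set. This is a minimal sketch using Python's standard `statistics` module; the scores and blood groups are hypothetical values chosen so that one score is extreme.

```python
import statistics

scores = [2, 3, 3, 4, 5, 6, 40]  # hypothetical scores, 40 is an extreme score

mean_score = statistics.mean(scores)      # 63 / 7 = 9
median_score = statistics.median(scores)  # middle value = 4
mode_score = statistics.mode(scores)      # most frequent value = 3

# Drop the extreme score and see which measures move.
trimmed = scores[:-1]
mean_trimmed = statistics.mean(trimmed)      # 23 / 6, about 3.83
median_trimmed = statistics.median(trimmed)  # (3 + 4) / 2 = 3.5

# The mean can be manipulated algebraically: mean * n recovers the total.
total_from_mean = mean_score * len(scores)

# The mode is the only one of the three that works on nominal data.
blood_groups = ["A", "O", "O", "B"]
mode_group = statistics.mode(blood_groups)  # "O"
```

Removing the single extreme score pulls the mean from 9 down to about 3.83, while the median only moves from 4 to 3.5. The mean of 9 is also not a value that appears anywhere in the data.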