STATS (BIOL 243) FALL 2024 Flashcards
five Hierarchical scales
sampling unit
sample
observation unit
statistical population
population of interest
sampling unit
the unit selected at random; it may be the same as the observation unit or contain multiple observation units
sample
collection of the sampling units
observation unit
scale of data collection, subject of study
statistical population
collection of all sampling units that could have been in your sample; it represents the true scale at which your statistical conclusions are valid
population of interest
collection of sampling units that you hope to draw conclusions about
scope of the research question
ideally the same as your statistical population
measurement variable
what we want to know/measure about the observation unit
measurement unit
the scale (units) on which the measurement variable is recorded
descriptive stats
set of tools used to describe data
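A minimal sketch of what describing data can look like, using only Python's standard library. The wing-length values and the variable names are made up for illustration; they are not from the course.

```python
import statistics

# Hypothetical measurements (mm) taken on six observation units.
wing_lengths_mm = [10.2, 11.5, 9.8, 10.9, 11.1, 10.4]

# Descriptive statistics only summarize the sample in hand;
# they make no claim about the wider statistical population.
summary = {
    "n": len(wing_lengths_mm),
    "mean": statistics.mean(wing_lengths_mm),
    "median": statistics.median(wing_lengths_mm),
    "sd": statistics.stdev(wing_lengths_mm),  # sample standard deviation
    "min": min(wing_lengths_mm),
    "max": max(wing_lengths_mm),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```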
inferential statistics
uses information from the sample to make a probabilistic statement about the statistical population
what is the rule for descriptive and inferential stats when there are multiple groups in a statistical population?
descriptive stats are repeated for each group but inferential stats are only done once and can be used to make statements about the differences between groups
ideal sampling design
- all sampling units have some (nonzero) probability of being included
- selection of sampling units must be unbiased
- selections of sampling units are independent of one another
- each possible sample has an equal chance of being selected
observational studies
- researchers have no control over the variables
- it characterizes something about an existing statistical population
- a tool for discovering associations, but it cannot make statements about the involvement of the sampling unit (it cannot establish causation, because there is no way to know whether the factor is governed by something else)
response variables
variable the investigators are interested in
explanatory variable
variable that the investigator believes may explain the response variable
confounding variables
unobserved variables that affect the response variable
simple random
starts by identifying every sampling unit in the statistical population and then selecting a random subset of those to be in your sample. Each sampling unit has the same probability of being included in your sample.
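The two steps in this card (list every sampling unit, then pick a random subset) can be sketched as follows. The function name `simple_random_sample`, the 100 numbered plots, and the seed are illustrative assumptions, not part of the course material.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n sampling units without replacement.

    random.Random.sample gives every unit the same chance of inclusion.
    """
    rng = random.Random(seed)
    return rng.sample(population, n)

# Hypothetical statistical population: 100 numbered field plots.
plots = list(range(1, 101))

# Select 10 plots; the seed only makes the draw repeatable.
sample = simple_random_sample(plots, 10, seed=1)
print(sample)
```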
stratified
used when the statistical population has some grouping (strata)
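One common way to use the strata is to take a separate simple random sample inside each one, so that every group is represented. The sketch below assumes that approach and an equal sample size per stratum; the habitat names, unit labels, and the function `stratified_sample` are hypothetical.

```python
import random

def stratified_sample(units_by_stratum, n_per_stratum, seed=None):
    """Take a simple random sample of n_per_stratum units within each stratum."""
    rng = random.Random(seed)
    return {
        stratum: rng.sample(units, n_per_stratum)
        for stratum, units in units_by_stratum.items()
    }

# Hypothetical population grouped into two habitat strata.
strata = {
    "forest": [f"F{i}" for i in range(1, 31)],
    "meadow": [f"M{i}" for i in range(1, 21)],
}

picked = stratified_sample(strata, 5, seed=1)
for stratum, units in picked.items():
    print(stratum, units)
```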
clustered
observation units are contained within a larger group that we can randomly sample (geographical or organizational)
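A rough sketch of the idea: the random draw happens at the level of the larger group, not the individual observation unit. It assumes a one-stage design in which every observation unit inside a chosen cluster is measured; the ponds, frog labels, and the function `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and keep all observation units in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # sample cluster names
    return {name: clusters[name] for name in chosen}

# Hypothetical clusters: 8 ponds, each containing 4 frogs (observation units).
ponds = {
    f"pond_{i}": [f"pond_{i}_frog_{j}" for j in range(1, 5)]
    for i in range(1, 9)
}

selected = cluster_sample(ponds, 3, seed=1)
for pond, frogs in selected.items():
    print(pond, frogs)
```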
case control
when there is a known outcome we are trying to explain
cohort
select sampling units and follow them through time to see whether they develop the outcome of interest
retrospective
studies where the results are already known
e.g., case-control studies
prospective
outcome is not yet known
e.g., cohort studies