Module 2 Flashcards
What are the four goals that all sampling designs must achieve?
- all sampling units are selectable: every sampling unit in the statistical population must have some nonzero probability of being included in the sample
- selection is unbiased: selection of a sampling unit cannot depend on any attribute of that unit
- selection is independent: selection of a specific sampling unit must not increase or decrease the probability that any other sampling unit is selected
- all samples are possible: every possible combination of sampling units could be drawn as the sample
What is bias?
a systematic over- or underestimate of some value in the average sample compared with the true value in the statistical population
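A minimal simulation sketch of bias, under stated assumptions: the population of values 1 to 100, the sample size of 10, and the names average_sample_mean, unbiased_draw, and biased_draw are all hypothetical and chosen only for illustration. It compares the average of many sample means against the population mean for one design that ignores the unit's value and one where selection depends on it.

```python
import random

# Hypothetical statistical population: 100 sampling units with values 1..100.
population = list(range(1, 101))
true_mean = sum(population) / len(population)

def average_sample_mean(draw, n_samples=500, seed=0):
    """Average the sample mean over many repeated samples produced by `draw`."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_samples):
        sample = draw(rng)
        means.append(sum(sample) / len(sample))
    return sum(means) / len(means)

def unbiased_draw(rng):
    # Selection ignores the unit's value: every unit is equally likely.
    return rng.sample(population, 10)

def biased_draw(rng):
    # Selection depends on an attribute: larger values are more likely to be
    # chosen (random.choices draws with replacement, which is fine for this
    # illustration).
    return rng.choices(population, weights=population, k=10)

unbiased_avg = average_sample_mean(unbiased_draw)
biased_avg = average_sample_mean(biased_draw)
print(true_mean, unbiased_avg, biased_avg)
```

In this sketch the unbiased design should land close to the population mean, while the design that favours large values should overestimate it.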
What is sampling independence?
Sampling independence is when selection of one sampling unit does not influence the probability that any other sampling unit is selected.
How can errors in a sampling design be identified?
prevent problems at the beginning of a study by evaluating proposed sampling designs against the four criteria/goals
What is the primary goal of an observational study?
characterize something about an existing statistical population; collect data from an existing statistical population that allow us to investigate relationships among variables
What is an observational study?
a study using observations from a statistical population where the investigator has no control over the explanatory variables
What is a drawback of an observational study?
- provides a tool for discovering associations, but cannot make statements about whether a factor causes the response you are interested in
- while we can look at relationships among variables, these relationships might not be causal
What is the response variable?
Response variable: a variable that the investigator is interested in studying as a way to answer a research question
What is the explanatory variable?
Explanatory variable: a variable that an investigator believes may explain the response variable
What are confounding variables?
unobserved variables that affect a response variable
What are spurious relationships?
when the relationship between an explanatory variable and response variable is thought to be driven mostly by a confounding variable, the relationship is called spurious
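A small simulated sketch of a spurious relationship, not a real data set: the variable names, the noise levels, and the pearson helper are assumptions made for illustration. A confounder drives both the explanatory and the response variable, so the two correlate strongly even though neither affects the other. In a real study the confounder is unobserved; it is visible here only because the data are simulated.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(42)

# Hypothetical confounder that affects both variables.
confounder = [rng.gauss(0, 1) for _ in range(2000)]
explanatory = [z + rng.gauss(0, 0.5) for z in confounder]
response = [z + rng.gauss(0, 0.5) for z in confounder]

# Correlation as an investigator would see it, without the confounder.
raw_r = pearson(explanatory, response)

# Correlation after removing the confounder's contribution from both variables.
adjusted_r = pearson(
    [x - z for x, z in zip(explanatory, confounder)],
    [y - z for y, z in zip(response, confounder)],
)
print(raw_r, adjusted_r)
```

Once the confounder's contribution is removed, the remaining correlation should be close to zero, which is what marks the original relationship as spurious.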
What is a simple random survey?
- observational study design
- start by identifying every sampling unit in the statistical population, then select a random subset of those to be in your sample; each sampling unit has the same probability of being included in your sample
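The two steps of a simple random survey can be sketched as follows. The sampling frame of 100 labelled units and the function name simple_random_sample are hypothetical, used only for illustration.

```python
import random

# Step 1: a hypothetical sampling frame listing every sampling unit.
population = [f"unit_{i}" for i in range(1, 101)]

def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement; each unit has probability n / len(frame)."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Step 2: select a random subset of the frame.
sample = simple_random_sample(population, 10, seed=1)
print(sample)
```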
What is a stratified survey?
- observational study design
- used when there are subgroups within the statistical population that can influence the study results
- first break the statistical population into strata (subgroups) and then sample within each stratum (each stratum has equal weighting in the sample)
- the strata are defined ahead of time by the researcher
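A minimal sketch of a stratified survey with equal weighting per stratum. The "urban" and "rural" strata, the population of 90 units, and the function name stratified_sample are assumptions made for illustration only.

```python
import random
from collections import defaultdict

# Hypothetical population of (unit id, stratum label) pairs; the strata are
# defined ahead of time by the researcher.
population = [(i, "urban" if i % 3 else "rural") for i in range(90)]

def stratified_sample(units, n_per_stratum, seed=None):
    """Group units by stratum, then draw the same number of units from each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit_id, stratum in units:
        strata[stratum].append(unit_id)
    return {
        name: rng.sample(members, n_per_stratum)
        for name, members in strata.items()
    }

sample = stratified_sample(population, 5, seed=1)
print(sample)
```

Drawing the same number of units from every stratum gives each stratum equal weighting in the sample, even though the strata differ in size here.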
What is a cluster survey?
- observational study design
- used by researchers to remove diversity in the statistical population that is not relevant to the research question
- the idea is to create groups, called clusters, where the non-relevant diversity is contained within each group
- clusters are selected at random from all possible clusters
- a cluster is the sampling unit; the observation units are the individual members within each cluster
- one stage: data are collected from all observation units in a cluster
- two stage: a subset of observation units is randomly selected within each cluster
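One-stage and two-stage cluster surveys can be sketched with a single function. The six clusters of eight observation units each and the function name cluster_sample are hypothetical, chosen only for illustration.

```python
import random

# Hypothetical clusters: 6 clusters, each holding 8 observation units.
clusters = {
    f"cluster_{c}": [f"c{c}_obs{k}" for k in range(8)]
    for c in range(6)
}

def cluster_sample(clusters, n_clusters, n_per_cluster=None, seed=None):
    """Randomly select clusters, then take all units (one stage) or a random
    subset of units (two stage) from each selected cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    if n_per_cluster is None:
        # One stage: collect data from every observation unit in the cluster.
        return {c: list(clusters[c]) for c in chosen}
    # Two stage: randomly select a subset of observation units per cluster.
    return {c: rng.sample(clusters[c], n_per_cluster) for c in chosen}

one_stage = cluster_sample(clusters, 2, seed=3)
two_stage = cluster_sample(clusters, 2, n_per_cluster=3, seed=3)
print(one_stage)
print(two_stage)
```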
What is a case control survey?
- type of observational study design
- used to compare data between two groups
- case group: contains sampling units with a particular value of the response variable
- control group: contains sampling units without that value of the response variable
- this type of sampling is purposefully biased in that it aims to select sampling units for the case group based on a measured response variable and compare that to the control group
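A minimal sketch of case-control selection, assuming a hypothetical set of 200 records with a yes/no response stored under the made-up key "has_condition". The function name case_control_sample and the equal group sizes are also illustrative choices, not part of the definition.

```python
import random

# Hypothetical records: the response variable has already been measured.
records = [{"id": i, "has_condition": i % 10 == 0} for i in range(200)]

def case_control_sample(records, response_key, n_per_group, seed=None):
    """Split records on the measured response, then sample within each group."""
    rng = random.Random(seed)
    cases = [r for r in records if r[response_key]]
    controls = [r for r in records if not r[response_key]]
    # Selection is deliberately based on the response variable.
    return rng.sample(cases, n_per_group), rng.sample(controls, n_per_group)

case_group, control_group = case_control_sample(
    records, "has_condition", 10, seed=7
)
print([r["id"] for r in case_group])
print([r["id"] for r in control_group])
```

Only 1 in 10 of these records is a case, yet cases make up half of the combined sample, which shows the purposeful bias of the design.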