Week 6 - Sampling and Power Estimation Flashcards
what is population (P of PICO)?
the entire aggregation of cases in which a researcher is interested.
what is a sample?
a subset of a population participating in a research study.
what are the basic sampling concepts in quant research?
- Researchers use a sampling/sample plan to obtain an accessible sample
- Accessible sample is based on designated criteria
- Helps us to understand the broader target population for which to generalize
what is sampling?
the process of selecting cases to represent a desired population
what is a representative sample?
one whose characteristics closely approximate those of the population. Example: gender or age group
what is eligibility or inclusion criteria for sampling?
the defined attributes of a target population. Example: diagnosis, age group, practice constraints of convenience, people’s ability/interest to participate, research design considerations (i.e. placebo vs drug), presence of symptoms (i.e. migraine aura).
what are exclusion criteria in sampling?
the characteristics of a population that people must not possess. Example: you wish to study premature infants’ reactions to an intervention; you would want to exclude full-term infants by age and/or weight criteria.
explain the sampling plan
need to think about how subjects will be selected and how many to include
what are the two key goals in sampling?
- Representativeness
- Adequate size
what is strata in sampling?
subpopulations within the overall population. Strata are defined by 1 or more characteristics whose categories are mutually exclusive. Example: College degree or no college degree; Illicit drug use or no illicit drug use.
what is staged sampling?
sampling that is accomplished over multiple stages. Example: Stage 1=census records within indicated counties; Stage 2=Medicare enrollees; Stage 3=selecting individuals >65 with a diagnosis of COPD.
what is sampling bias?
the systematic overrepresentation or underrepresentation of a population subgroup on a characteristic relevant to the research question. In the short term, restrictions made for cost or practicality may have no big effects. However, if eligibility-inclusion criteria are continuously restricted over a longer period of time, entire segments of the population can be left out, which limits generalizability.
what is a sampling error?
differences between sample values (i.e. HDL cholesterol in a sample of Type 2 diabetics) and population values (overall HDL average in the U.S. population [norm referenced for age]).
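A minimal Python sketch of sampling error. The population here is synthetic (randomly generated HDL-like values, not real data), and the sample size of 100 is an assumption for illustration only.

```python
import random
import statistics

random.seed(42)  # fixed seed so the illustration is repeatable

# Hypothetical population: 10,000 synthetic HDL values (mg/dL); not real data.
population = [random.gauss(50, 12) for _ in range(10_000)]
population_mean = statistics.mean(population)

# Draw one random sample of 100 cases and compare its mean to the population mean.
sample = random.sample(population, 100)
sample_mean = statistics.mean(sample)

# Sampling error = sample value minus population value.
sampling_error = sample_mean - population_mean
print(f"population mean = {population_mean:.2f}")
print(f"sample mean     = {sample_mean:.2f}")
print(f"sampling error  = {sampling_error:+.2f}")
```

Drawing a different sample would give a different sampling error; larger samples tend to give smaller ones.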
what is probability sampling?
involves random selection of elements, which ensures greater confidence in representativeness
what is non-probability sampling?
uses selection by non-random methods
what are the key factors for probability sampling?
- Samples are randomly selected
- Everyone in the population has an equal chance of being selected
- Used to control sampling bias
- Useful when focus is on population diversity
- Used when researcher needs to ensure accuracy
- Finding correct target population is not simple
what are the key factors of non-probability sampling?
- Samples are selected based on researcher’s judgement
- Not everyone has an equal chance to participate
- Sampling bias is not a primary concern
- Useful in environment that shares similar traits
- Does not help representation of population accurately
- Finding target population is very simple
when do you use probability sampling?
- when you want to reduce the sampling bias
- when the population is usually diverse
- to create an accurate sample
what are the different types of probability sampling?
- simple random sampling
- stratified random sampling
- cluster sampling
- systematic sampling
what is simple random sampling?
an entirely random method of selecting the sample, and the most basic form of probability sampling. It assigns a number to every individual in the population (the sampling frame) and then randomly chooses from those numbers through an automated process. The individuals whose numbers are chosen are the members of the sample.
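A minimal Python sketch of simple random sampling. The frame of 500 numbered individuals and the sample size of 25 are assumptions for illustration.

```python
import random

random.seed(1)  # fixed seed so the illustration is repeatable

# Hypothetical sampling frame: every individual gets a number from 1 to 500.
sampling_frame = list(range(1, 501))

# random.sample draws without replacement, so each individual
# has the same chance of selection and nobody is picked twice.
selected = random.sample(sampling_frame, k=25)
print(sorted(selected))
```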
what is stratified random sampling?
involves a method where the researcher divides a more extensive population into smaller homogeneous (and often unequal) groups (strata) that usually don’t overlap but together represent the entire population. While sampling, subgroups are organized, and then a random sample is drawn from each group separately. Example: a school of 1000 students has the following breakdown: 20% AA, 20% Hispanic, 10% Asian, and 50% White students. A proportionally stratified sample of 100 students would draw 20, 20, 10 and 50 students from the respective strata.
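A minimal Python sketch of proportional stratified sampling using the school example. The overall sample size of 100 and the generated student IDs are assumptions for illustration.

```python
import random

random.seed(2)  # fixed seed so the illustration is repeatable

# Strata sizes from the school example (1000 students in total).
strata_sizes = {"AA": 200, "Hispanic": 200, "Asian": 100, "White": 500}
total_sample = 100  # assumed overall sample size

# Hypothetical rosters: student IDs labelled by stratum.
rosters = {name: [f"{name}-{i}" for i in range(size)]
           for name, size in strata_sizes.items()}
population_size = sum(strata_sizes.values())

stratified_sample = {}
for name, roster in rosters.items():
    # Proportional allocation: the stratum's share of the sample
    # equals its share of the population.
    n_stratum = round(total_sample * len(roster) / population_size)
    # A separate random draw is made inside each stratum.
    stratified_sample[name] = random.sample(roster, n_stratum)

print({name: len(chosen) for name, chosen in stratified_sample.items()})
# -> {'AA': 20, 'Hispanic': 20, 'Asian': 10, 'White': 50}
```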
what is cluster sampling?
selection of broad groups (clusters) rather than selecting individuals. Each cluster usually involves a population with its own elements. Example: hospitals, universities, country, state or region. Then the researcher creates smaller subunits: family, city, school programs and departments.
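A minimal Python sketch of cluster sampling, written as a two-stage variant (random clusters first, then random individuals within them). The 8 hospitals, 30 patients per hospital, and the draw sizes are all hypothetical.

```python
import random

random.seed(3)  # fixed seed so the illustration is repeatable

# Hypothetical clusters: 8 hospitals, each with its own list of patients.
clusters = {f"hospital_{h}": [f"h{h}_patient_{p}" for p in range(30)]
            for h in range(1, 9)}

# Stage 1: randomly select whole clusters rather than individuals.
chosen_hospitals = random.sample(sorted(clusters), k=3)

# Stage 2: randomly select individuals within each chosen cluster.
cluster_sample = []
for hospital in chosen_hospitals:
    cluster_sample.extend(random.sample(clusters[hospital], k=10))

print(chosen_hospitals)
print(len(cluster_sample), "patients selected")
```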
what is systematic sampling?
selecting every ‘nth’ individual or ‘k-th’ case on a list or roster. It is an extension of the simple random sampling method. There is an equal opportunity for every member of a population to be selected using this technique and it is time and cost efficient.
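A minimal Python sketch of selecting every k-th case from a roster. The roster of 200 people is hypothetical, and the random starting point is a common convention that the card itself does not mention.

```python
import random

random.seed(4)  # fixed seed so the illustration is repeatable

def systematic_sample(roster, n):
    """Select every k-th case from the roster after a random start."""
    k = len(roster) // n           # sampling interval
    start = random.randrange(k)    # random starting point in the first interval
    return roster[start::k][:n]

roster = [f"person_{i}" for i in range(1, 201)]  # hypothetical list of 200 people
sample = systematic_sample(roster, n=20)
print(sample[:3], "...", len(sample), "selected")
```

With 200 people and a sample of 20, the interval k is 10, so every 10th person after the starting point is chosen.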
what is multi-stage probability design?
uses random selections at various stages. For example, non-random oversampling of certain groups occurs to emphasize selection of vulnerable and marginalized population segments including: AAs, Mexican Americans, Low income White Americans, Adolescents aged 12-19 and Persons > 60 years old.
what are types of non-probability sampling?
- convenience sampling
- consecutive sampling
- judgmental or purposive sampling
- quota sampling
- snowball sampling
what is non-probability sampling?
selection of participants, groups, or procedures by non-random methods
what is convenience sampling?
recruit from the most conveniently available participants. Downside is that those who participate may be atypical of the population on the study outcomes. If the study is straightforward (e.g. surveys and vital signs) this is not too bad, but when evaluating management and treatment outcomes in complex diseases (e.g. older adult frailty with comorbidities in variable geographical niches), it is much more of a problem.
what is consecutive sampling?
recruiting all the people from an accessible population who meet the eligibility criteria over a specific time interval or for a specified sample size. Protects against temporal effects such as seasonal variation, or time of day fluctuations.
what is purposive sampling?
uses researchers’ knowledge about the population to make selections. In other words, researchers choose who they deem fit to participate in the research study. Downside is that the preconceived notions of a researcher can influence the results.
what is quota sampling?
sample is selected using quotas for certain subgroups based on population proportions to increase the representativeness of the sample. The subgroups are based on strata definitions (i.e. age, gender) to ensure minimum representation of at-risk or vulnerable groups are included. Convenience sampling would then occur until the quotas for each population strata are met.
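A minimal Python sketch of quota sampling: participants are taken in the order they become available (convenience) until each stratum's quota is filled. The participants, age groups, and quota numbers are hypothetical.

```python
def quota_sample(arrivals, quotas):
    """Accept participants in arrival order until each stratum's quota is full."""
    counts = {stratum: 0 for stratum in quotas}
    accepted = []
    for person, stratum in arrivals:
        if stratum in counts and counts[stratum] < quotas[stratum]:
            accepted.append(person)
            counts[stratum] += 1
        if counts == quotas:
            break  # every quota is met, so recruitment stops
    return accepted, counts

# Hypothetical convenience stream of (participant, age group) pairs.
arrivals = [("p1", "under_65"), ("p2", "under_65"), ("p3", "65_plus"),
            ("p4", "under_65"), ("p5", "under_65"), ("p6", "65_plus"),
            ("p7", "65_plus")]
quotas = {"under_65": 3, "65_plus": 2}

accepted, counts = quota_sample(arrivals, quotas)
print(accepted)  # -> ['p1', 'p2', 'p3', 'p4', 'p6']
print(counts)    # -> {'under_65': 3, '65_plus': 2}
```

Here p5 is turned away because the under-65 quota is already full, and p7 is never reached because both quotas are met after p6.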
what is snowball sampling?
a variant of convenience sampling that helps researchers find participants who are difficult to locate. Researchers use this technique for small samples that are not easily available. Early participants are asked for referrals to others in the population who are similar and also difficult to identify (e.g. those impacted by rare diseases or experiences).
sampling plans in NR should be examined with respect to…
- Approach used
- Study population and eligibility criteria
- Number of participants and rationale for sample size
- Inclusion of power analysis
- Description of main characteristics of the sample
- Number and characteristics of participants who declined participation or dropped out
what are threats to statistical conclusion validity?
- Low statistical power
- Effect size—small, moderate effects need a larger sample size
- Heterogeneity of the population
- Cooperation (Fidelity to the intervention)
- Attrition (Burden, time, costs, non-response rates)
what are the steps in sampling in quant research?
- Identification of the population
- Specification of the eligibility criteria
- Specify the sampling plan
- Recruit the sample: screening instrument(s)
- Generalizing from samples
what is a power analysis (A-Priori)?
a mathematical procedure used to estimate sample size requirements
why are sample size calc important?
- A small study can generate inconclusive or spurious results
- Studies with inadequate sample size are often unethical
- Sample size justification is a standard part of an interventional quantitative research study design, period.
- Required by FDA, grant funders, publishers, etc.
studies with inadequate sample size are often unethical because..
- too small and the information is not useful;
- too large and it wastes scarce $, time, staff, animals or exposes too many participants to an invasive treatment, drug, device.
when do you use a power analysis?
When you need to define an effect of an intervention on a specific group of people with a defined trait or condition that is under study.
what are the required elements for sample size calculations (general)?
- Significance level desired (α)
- Power level of test desired (1-β)
- Desired sample size (n)
- Effect size desired (d) or (Cohen’s d) - Magnitude of relationship between variables (Small effect = 0.20, Moderate-Medium effect = 0.50, Large effect = 0.8)
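A minimal Python sketch of how these elements combine, assuming a two-group comparison of means with a two-sided test and the normal approximation n = 2(z_(1-α/2) + z_(1-β))² / d² per group. The function name n_per_group is made up for illustration. Dedicated power software uses the t distribution and typically returns slightly larger numbers.

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-group comparison of means.

    Uses the normal approximation n = 2 * (z_(1-alpha/2) + z_(power))^2 / d^2.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_power = NormalDist().inv_cdf(power)          # value for the desired power
    return math.ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)

for label, d in [("small", 0.20), ("medium", 0.50), ("large", 0.80)]:
    print(f"{label:6s} d={d:.2f} -> about {n_per_group(d)} per group")
# small  d=0.20 -> about 393 per group
# medium d=0.50 -> about 63 per group
# large  d=0.80 -> about 25 per group
```

Fix any three of α, power, n and d, and the fourth follows; here n is the unknown.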
what is a null hypothesis?
H0 = there are no mean differences between groups
what is a Type I error?
- rejecting H0 when it is true
- Concluding a relationship exists when in fact it does not
- FALSE POSITIVE
what is a Type II error?
- failing to reject H0 when it is false (often loosely called "accepting" H0)
- Concluding no relationship exists when in fact it does
- FALSE NEGATIVE
what drives a Type I error?
Probability of committing a false positive
what does a 2-tailed test examine?
both sides of the distribution
what does a 1-tail test examine?
stipulates a specific direction
what is an alternative hypothesis?
there are mean differences between groups
how do researchers control for Type I errors?
- by setting α at a level they are comfortable with
- Usually set at .05 or .01
what do researchers do to control for Type II error?
- by setting power (1-β) at 80%, which accepts a 20% risk (β = .20) of committing a Type II error
- They set an effect size that has precedent in the literature (i.e. Std. Dev)
t/f - a smaller number of participants is required to detect a small effect, and has much less statistical power
false - a larger number of participants is required to detect a small effect; a study with too few participants has much less statistical power
What is a target population?
A) the aggregate of cases about which the researcher would like to generalize.
B) the aggregate of cases that conform to designated criteria and that are available for a study.
C) the characteristics of target individuals that meet the specific population characteristics.
D) the characteristics of a population that must not be included in the research sample.
A) the aggregate of cases about which the researcher would like to generalize.
Which of the following is an example of a probability sampling method? A) Convenience B) Cluster C) Purposive D) Quota
B) Cluster
What type of sampling divides the population into homogeneous subsets from which elements are selected at random? A) Cluster sampling B) Probability sampling C) Simple random sampling D) Stratified random sampling
D) Stratified random sampling
Which of the following best describes systematic sampling?
A) Dividing the population into homogeneous subsets to ensure representation
B) Multi-staged selection of random samples from larger units
C) Selecting every kth case from a list
D) Using strata to determine how many participants are needed
C) Selecting every kth case from a list
define significance criterion
alpha (α) = the “significance level”: the probability of rejecting a null hypothesis even though it is true (the probability of a Type I error)
what is the letter used to refer to sample size
n
what is effect size
the magnitude of the relationship between research variables (γ). The smaller the effect, the larger the sample size needed to detect the difference
define power in terms of a power analysis
the probability of rejecting the null hypothesis when the null is actually false (1-β), where β is the probability of a Type II error
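A minimal Python sketch of power as a function of sample size, assuming a two-group comparison of means, a two-sided test, and the normal approximation. The function name approx_power and the example group sizes are made up for illustration.

```python
from statistics import NormalDist

def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison of means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    # Expected size of the test statistic when the true effect is d.
    noncentrality = d * (n_per_group / 2) ** 0.5
    # Probability of landing beyond the critical value in the direction of the
    # effect (the tiny chance of rejecting in the wrong direction is ignored).
    return 1 - NormalDist().cdf(z_alpha - noncentrality)

for n in (20, 63, 200):
    print(f"n per group = {n:3d} -> power ~ {approx_power(0.5, n):.2f}")
```

For a medium effect (d = 0.50), roughly 63 participants per group gives power close to 80% under this approximation; fewer participants means lower power and a higher β.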