PU520: Principles of Epidemiology Unit 6 Epidemiology and Data Presentation Flashcards
What does the term population refer to?
Refers to a collection of people who share common observable characteristics.
What are other ways populations can be demarcated? Just some examples.
- All of the inhabitants of a country (e.g., China)
- All of the people who live in a city (e.g., New York)
- All students currently enrolled in a particular university
- All of the people diagnosed with a disease such as type 2 diabetes or lung cancer
What is a variable for describing a characteristic of a population and is defined as a measurable attribute of a population?
Parameter
An example of a parameter is the average age of the population, designated by the symbol μ.
Returning to the average age of a population (μ), the sample estimate of μ is denoted by $\bar{X}$ (the sample mean). Inferential statistics use sample-based data to draw conclusions about the population from which a sample has been selected; this process is known as estimation. Thus, $\bar{X}$ can be used as an estimate for μ, the population mean (a parameter).
What is the goal of statistical inference?
To characterize a population by using information from samples.
Thus samples must be representative of their parent population.
What does representativeness mean with regard to the characteristics of a sample of a population?
Representativeness means that the characteristics of the sample correspond to the characteristics of the population from which the sample was chosen.
What is a subgroup that has been selected, by using one of several methods, from the population (universe)?
Sample.
In the terminology of sampling, the universe describes the total set of elements from which a sample is selected.
What are numbers that describe a sample?
Statistics
What are two rationales for using samples to represent a population?
Improved parameter estimates and cost savings
Examples include reviewing income tax returns, verifying signatures on ballot initiatives, quality of manufacturing goods, and enumerating the U.S. population.
What are the two ways in which sampling occurs?
Random sampling and nonrandom sampling
What are the different ways of random sampling? (2)
Simple random sampling and stratified random sampling
What are the different ways of nonrandom sampling? (3)
Convenience sampling, systematic sampling, cluster sampling
What does it mean when nonrandom sampling is prone to sampling bias?
Sampling bias means that the individuals who have been selected are not representative of the population to which the epidemiologist would like to generalize the results of the research.
What are surveys on the internet and media-based polling examples of in sampling?
Nonrandom sampling
These two methods are likely to produce nonrepresentative samples. Increasingly, the Internet has been used for conducting surveys; the resulting sample of respondents is likely to be a biased sample because of self-selection: only people who are interested in the survey topic respond to the survey. We do not know about the nonrespondents and consequently have very little information about the target population (the population denominator, as it is called in epidemiology).
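The effect of self-selection can be sketched with a small, fully invented population. The counts below and the assumption that only interested people respond are hypothetical and chosen purely to illustrate how a self-selected sample can drift away from the population value.

```python
# Hypothetical population of 1,000 people (all counts invented for illustration).
# Each record is (interested_in_topic, favors_policy).
population = (
    [(True, True)] * 160 + [(True, False)] * 40
    + [(False, True)] * 240 + [(False, False)] * 560
)

def proportion_in_favor(people):
    """Share of the given people who favor the policy."""
    return sum(1 for _, favors in people if favors) / len(people)

# Assumption: only people interested in the topic answer the online survey.
respondents = [person for person in population if person[0]]

true_proportion = proportion_in_favor(population)     # 400 of 1,000
survey_proportion = proportion_in_favor(respondents)  # 160 of 200

print(f"population: {true_proportion:.2f}, self-selected survey: {survey_proportion:.2f}")
```

In this made-up case the survey proportion is twice the population proportion, and nothing in the respondents' data alone would reveal the gap.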
What refers to the use of a random process to select a sample?
Simple random sampling (SRS)
A simplistic example of SRS is drawing names from a hat. Random digit dialed (RDD) telephone surveys are a more elaborate method for selecting random samples. At one time, RDD surveys obtained high response rates from the large proportion of the U.S. homes with telephones. However, as more people transition from land lines to cellular phones, RDD surveys of land-based telephones have had declining population coverage and reduced response rates.
Another method of SRS is to draw respondents randomly from lists that contain large and diverse populations (e.g., licensed drivers). In simple random sampling, one chooses a sample of size n from a population of size N. Each member of a population has an equal chance of being chosen for the sample. In addition, all samples of size n out of a population of size N are equally possible. Considerable effort surrounds the determination of the size of n.
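A minimal sketch of simple random sampling, assuming a hypothetical population of N = 2,000 numbered members and a sample of size n = 100 (both values are illustrative only). The helper name `simple_random_sample` is made up for this example; it relies on `random.sample` from Python's standard library, which draws without replacement.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n members without replacement; every subset of size n is equally likely."""
    rng = random.Random(seed)  # the seed only makes the illustration reproducible
    return rng.sample(population, n)

# Hypothetical population: ID numbers 1..2000 (N = 2000).
population = list(range(1, 2001))
sample = simple_random_sample(population, n=100, seed=1)

print(len(sample), "members drawn;", len(set(sample)), "distinct")
```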
According to statistical theory, what produces unbiased estimates of parameters?
Random sampling.
In addition, random sampling permits the use of statistical methods to make inferences about population characteristics. In the context of sampling theory, the term unbiased means that the average of the sample estimates over all possible samples of a fixed size is equal to the population parameter.
For example, if we select all possible samples of size n from N and compute $\bar{X}$ for each sample, the mean of all of the $\bar{X}$s (symbol, $\mu_{\bar{X}}$) will be equal to μ ($\mu_{\bar{X}} = \mu$). However, any individual sample mean is likely to be slightly different from μ. This difference is from random error, which is defined as error due to chance.
Beware, therefore, that the unbiasedness property of random samples does not guarantee that any particular sample estimate will be close to the parameter value; also, a sample is not guaranteed to be representative of the population.
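The unbiasedness property can be checked by brute force on a tiny population. The sketch below assumes a hypothetical population of six ages (values invented for illustration), enumerates every possible sample of size n = 3, and compares the mean of the sample means with the population mean. Exact fractions are used so the comparison is not affected by floating-point rounding.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical population of N = 6 ages (invented values).
ages = [21, 25, 30, 34, 40, 48]
N = len(ages)
mu = Fraction(sum(ages), N)  # population mean (the parameter)

n = 3
# One sample mean for every possible sample of size n drawn without replacement.
sample_means = [Fraction(sum(s), n) for s in combinations(ages, n)]
mean_of_means = sum(sample_means) / len(sample_means)

print("possible samples:", len(sample_means))
print("population mean:", mu, "| mean of sample means:", mean_of_means)
print("smallest and largest sample mean:",
      float(min(sample_means)), float(max(sample_means)))
```

The mean of the sample means matches the population mean, while the smallest and largest individual sample means fall well to either side of it, which is the random error described above.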
When wanting to conduct random sampling of a subgroup of a population, what type of technique is used so the data represents the subgroup of interest versus the greater population in which it belongs?
Stratified random sampling
Returning to statistical terminology, we will designate N as the number in the population and n as the number in the sample. Suppose an epidemiologist wants to study the health characteristics of racial or ethnic subgroups that are uncommon in the general population. The size of n is limited by our available budget. If n is small (which is often the case) in comparison to N, then only a few individuals from the minority group will enter the sample.
What word do we use to define a subgroup of a greater population?
Stratum
For example, a population can be stratified by racial or ethnic group, age category, or socioeconomic status. Stratified random sampling uses oversampling of strata in order to ensure that a sufficient number of individuals from a particular stratum are included in the final sample.
Statisticians have demonstrated that stratified random sampling can improve parameter estimates for large, complex populations, especially when there is substantial variability among subgroups.
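A rough sketch of stratified random sampling with oversampling, assuming a hypothetical population of 10,000 in which stratum "B" makes up only 5%. The function name `stratified_sample`, the stratum labels, and the per-stratum sample sizes are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(members, stratum_of, n_per_stratum, seed=None):
    """Draw a simple random sample of the requested size within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in members:
        strata[stratum_of(member)].append(member)
    sample = {}
    for name, group in strata.items():
        k = min(n_per_stratum[name], len(group))  # cannot draw more than exist
        sample[name] = rng.sample(group, k)
    return sample

# Hypothetical population: 9,500 people in stratum "A" and 500 in stratum "B".
population = [("A", i) for i in range(9500)] + [("B", i) for i in range(500)]

# Oversample the small stratum: equal sample sizes despite unequal population shares.
sample = stratified_sample(
    population,
    stratum_of=lambda member: member[0],
    n_per_stratum={"A": 100, "B": 100},
    seed=1,
)

for name, drawn in sample.items():
    print(name, len(drawn))
```

Here stratum "B" is 5% of the population but half of the sample. Estimates for the whole population would then need to weight each stratum by its population share, a step this sketch does not show.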
What is nonrandom sampling that uses available groups selected by an arbitrary and easily performed method?
Convenience sampling.
Samples generated by this sampling method are sometimes called “grab bag” samples.
An example of a convenience sample is a group of patients who receive medical service from a physician who is treating them for a chronic disease. Convenience samples are highly likely to be biased and are not appropriate for application of inferential statistics. However, they can be helpful in descriptive studies and for suggesting additional research.
What nonrandom sampling uses a systematic procedure to select a sample of a fixed size from a sampling frame (a complete list of people who constitute the population)?
Systematic sampling
Systematic sampling is feasible when a sampling frame such as a list of names is available.
As a hypothetical example of systematic sampling, an epidemiologist wants to select a sample of 100 individuals from an alphabetical list that contains 2,000 names.
A way to determine the sample size is to select a desired percentage of cases (e.g., 5%). After specifying a sample size, a sampling interval must be created, say, every tenth name.
An arbitrary starting point on the list is identified (e.g., the top of the list or a randomly selected name in the list); then from that point every tenth name is chosen until the quota of 100 is reached.
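The procedure can be sketched as follows, assuming a hypothetical frame of 2,000 placeholder names and a start at the tenth entry. The function name `systematic_sample` is invented for this example. For comparison, the sketch also uses an interval of N / n = 20, which is a common way to set the interval but is an assumption here rather than part of the example above.

```python
def systematic_sample(frame, interval, quota, start=0):
    """Take every `interval`-th entry from index `start` until `quota` entries are chosen."""
    positions = list(range(start, len(frame), interval))[:quota]
    return [frame[i] for i in positions], positions

# Hypothetical sampling frame: 2,000 placeholder names in alphabetical order.
frame = [f"name_{i:04d}" for i in range(1, 2001)]

# Every tenth name, beginning with the tenth entry, stopping at the quota of 100.
sample_10, positions_10 = systematic_sample(frame, interval=10, quota=100, start=9)
last_position_10 = positions_10[-1] + 1  # 1-based position of the last name chosen

# Assumed alternative: interval = N / n = 2000 / 100 = 20.
sample_20, positions_20 = systematic_sample(frame, interval=20, quota=100, start=19)
last_position_20 = positions_20[-1] + 1

print("interval 10: last name chosen is at position", last_position_10)
print("interval 20: last name chosen is at position", last_position_20)
```

With an interval of 10 the quota is filled by position 1,000, so names in the second half of the list are never reached; with an interval of 20 the selections span the whole list.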
Why might a systematic sample drawn by choosing 5% of the names on a list of 2,000 (a quota of 100), starting at the top and taking every 10th name, fail to represent the population?
The quota of 100 names is filled by the 1,000th name, halfway down the list, so nobody in the second half can be selected. On an alphabetical list, this may exclude minorities whose names cluster toward the bottom of the list. This sample would be biased.
What does the sampling technique known as cluster sampling mean?
Cluster sampling refers to a method of sampling in which the element selected is a group (as distinguished from an individual) called a cluster.
An example of a selected element is a city block (block cluster). The U.S. Census Bureau employs cluster sampling procedures to conduct surveys in the decennial census. Because it is a more parsimonious design than random sampling, cluster sampling can produce cost savings; also, statistical theory demonstrates that cluster sampling is able to create unbiased estimates of parameters.
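A minimal sketch of one-stage cluster sampling, assuming a hypothetical city of 50 blocks with 20 residents each. The function name `cluster_sample` and all sizes are invented for illustration, and this is not a description of the Census Bureau's actual procedure. The unit drawn at random is the block, and everyone in a chosen block enters the sample.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters, then include every member of each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # the sampled element is the cluster
    members = [person for name in chosen for person in clusters[name]]
    return chosen, members

# Hypothetical city: 50 blocks, 20 residents per block.
clusters = {
    f"block_{b:02d}": [f"block_{b:02d}_resident_{r}" for r in range(20)]
    for b in range(50)
}

chosen, members = cluster_sample(clusters, n_clusters=5, seed=1)
print(len(chosen), "blocks chosen;", len(members), "residents in the sample")
```

Only five blocks need to be visited to reach 100 residents, which is where the cost savings mentioned above would come from.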