Section 23-24 Sampling Flashcards
Sampling
SAMPLING refers to the use of a random subset of an entire population being studied in order to make generalizations/predictions about the overall population.
- Because we use the Sample data to INFER (come to a conclusion based on available evidence) characteristics about the greater population, the mean, standard deviation, etc. from the sample data are known as INFERENTIAL STATISTICS.
- Remember: PARAMETERS refer to POPULATIONS, while STATISTICS refer to SAMPLES.
- SAMPLES are studied instead of entire populations because:
- Data from the entire population might not be available, or might be too difficult/expensive to access in its entirety.
- Sampling allows you to work with a smaller, more manageable set of data.
- Sampling can still provide highly reliable results within a margin of error or within a desired confidence level.
- Because Samples are only a PART of the population, any predictions about the population made by using the sample will be IMPERFECT and so, will come with a MARGIN of ERROR.
- The SMALLER the SAMPLE, the LARGER the MARGIN of ERROR. (For large populations, the absolute size of the sample matters more than its size relative to the POPULATION.)
The MOST IMPORTANT CHARACTERISTIC of a GOOD SAMPLE is that it be FREE FROM BIAS, which exists whenever some members of a population have a greater chance of being selected for inclusion in the sample than others.
- The selection MUST BE TOTALLY RANDOM, avoiding sampling bias errors like SAMPLES of CONVENIENCE (or ACCIDENTAL SAMPLES) and VOLUNTEERISM.
Sampling Bias
The MOST IMPORTANT CHARACTERISTIC of a GOOD SAMPLE is that it be FREE FROM BIAS, which exists whenever some members of a population have a greater chance of being selected for inclusion in the sample than others. The selection MUST BE TOTALLY RANDOM.
- These ERRORS in SAMPLE SELECTION can make the study UNRELIABLE:
- SAMPLES of CONVENIENCE (or ACCIDENTAL SAMPLES) occur when the sample is chosen as a matter of convenience.
- Ex: You want the opinion of gun owners in America, but you sample ONLY gun owners in your town because those are the ones you have access to.
- VOLUNTEERISM – refers to the unwillingness of potential sample members to participate in the study. This creates a bias because there are differences between those who tend to participate and those who don’t.
- In fact, studies have shown that PARTICIPANTS tend to be more highly educated and wealthier than those who refuse to participate.
- Efforts to reduce the effects of volunteerism include offering rewards, stressing to potential participants the importance of the study, and making it easy for people to respond, such as by providing them with a self-addressed, stamped envelope.
Random Sampling
To eliminate bias, some form of RANDOM SAMPLING is needed. A classic form of random sampling is SIMPLE RANDOM SAMPLING – essentially some equivalent of putting every name in the population into a hat and blindly choosing some from the hat to serve as the sample. Each member of the population is given an equal chance to be selected.
- If even one of those chosen names refuses to participate (which is very common), the sample is already biased due to VOLUNTEERISM.
- Suppose everyone participates, giving us an UNBIASED SAMPLE. Even then, we can NOT be certain that the results we obtain from the sample accurately reflect those we would have obtained by studying the population. This lingering UNCERTAINTY is due to RANDOM ERROR or SAMPLING ERROR, which is when, by chance, we may have selected a group that is NOT indicative of the overall population – perhaps they are disproportionately Democrats, males, low-SES-group members, and so on.
- SAMPLING ERROR is simply something you have to deal with. Luckily…
- Inferential statistics allow researchers to estimate the amount of error to allow for when they are interpreting the results from unbiased samples, and
- The amount of ERROR obtained from unbiased samples is SMALL when LARGE SAMPLES are used.
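The two points above (drawing names from a hat, and sampling error shrinking as samples grow) can be sketched in Python. This is a minimal illustration, not a prescribed procedure: the population of 10,000 scores is made up, and the helper names `simple_random_sample` and `average_sampling_error` are hypothetical.

```python
import random
import statistics

def simple_random_sample(population, n, seed=None):
    """Draw n members without replacement; each member has an equal chance."""
    rng = random.Random(seed)
    return rng.sample(population, n)

# Hypothetical population: 10,000 made-up scores (illustrative data only).
rng = random.Random(0)
population = [rng.gauss(100, 15) for _ in range(10_000)]
population_mean = statistics.mean(population)

def average_sampling_error(n, trials=200):
    """Average absolute gap between a sample mean and the population mean."""
    gaps = []
    for t in range(trials):
        sample = simple_random_sample(population, n, seed=t)
        gaps.append(abs(statistics.mean(sample) - population_mean))
    return statistics.mean(gaps)

small_n_error = average_sampling_error(25)
large_n_error = average_sampling_error(400)
print(f"average error with n=25:  {small_n_error:.2f}")
print(f"average error with n=400: {large_n_error:.2f}")
```

Every sample here is unbiased, yet each one still misses the population mean by some amount. That gap is the sampling error, and on average it is smaller for the larger samples.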
Sample size
HOW LARGE SHOULD THE SAMPLE SIZE BE?
That depends on THREE CRITERIA
- The larger the sample, the better, but increasing sample size produces diminishing returns.
- Ex: Increasing a sample from 100 to 200 will have a MUCH greater effect on reducing sampling errors than will an increase from 3,000 to 3,100. (For context, most national surveys use samples of only 1,500 to 2,000 participants).
- When there is little variability in a population, even a small sample may yield highly accurate results.
- Ex: if you take a random sample of eggs that have been graded as “extra-large” and weigh them, you will probably find only a small amount of variation among them. For this population, a small random sample should yield an accurate estimate.
- When there is much variability in a population, small samples may produce data with much error, so you’ll need a larger sample size.
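Both criteria above can be illustrated with the standard error of the mean, sd / sqrt(n). This formula is a standard result brought in here as an assumption (it is not stated in these notes), and it presumes a simple random sample from a large population. The standard deviations below are made-up values.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean: sd / sqrt(n)."""
    return sd / math.sqrt(n)

sd = 15.0  # assumed population standard deviation (illustrative only)

# Diminishing returns: the same +100 participants helps far less at n=3,000.
gain_100_to_200 = standard_error(sd, 100) - standard_error(sd, 200)
gain_3000_to_3100 = standard_error(sd, 3000) - standard_error(sd, 3100)
print(f"100 -> 200:   standard error drops by {gain_100_to_200:.4f}")
print(f"3000 -> 3100: standard error drops by {gain_3000_to_3100:.4f}")

# Variability: with the same n, less spread gives a smaller standard error.
low_variability_se = standard_error(2.0, 30)    # e.g. graded eggs (made-up sd)
high_variability_se = standard_error(20.0, 30)  # a much more varied population
print(f"low variability:  {low_variability_se:.2f}")
print(f"high variability: {high_variability_se:.2f}")
```

Because n sits under a square root, each additional participant buys less precision than the one before, and a population with little spread needs far fewer participants for the same precision.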
Often, it is impractical or impossible to use a random sample, yet information is needed from a sample in order for a decision to be made. In these cases, QUOTA SAMPLING may be used. This refers to the use of previously collected data to profile a TYPICAL user and then finding participants with that profile to take part in the study.
- Ex: Suppose Coke has a new soda it wants to test nationwide, but doing such a test would be prohibitively expensive and time-consuming. Instead, they might use QUOTA SAMPLING – using national statistics on what the TYPICAL soda drinker is like in terms of gender, ethnicity, and income, Coke would simply find a sample that fits that profile and test the product with them.
- Notice that this technique is not random and is subject to bias. Even if the manufacturer has the correct proportion of males, for instance, the males selected might tend to be from a particular region of the United States that has different tastes than the national population of male soda drinkers. Thus, the results of quota sampling should be viewed with skepticism.
IMPORTANT: Strictly speaking, the inferential statistics in the rest of this book should be applied only to data obtained from random samples. In practice, however, they are sometimes applied to data from non-random samples as well.
Also, sometimes biased samples MUST be used because there is no other choice.
Other Types of Sampling
The following are types of Random Sampling and the situation in which they would be used:
- SIMPLE RANDOM SAMPLING – (Small Population size) Blindly drawing a sample of names out of a hat filled with the names of everyone in the population.
- TABLE of RANDOM NUMBERS – (for larger populations) Number the participants from 01 to 90 (if there are 90 of them), then randomly point at a section of Table 2 at the back of the book. Read the first two digits, and the person with that number is selected; then read the next two digits, and so on, skipping any number outside 01 to 90.
- STRATIFIED RANDOM SAMPLING – (usually superior to simple random sampling) – the population is first divided into subgroups (strata) that are believed to be relevant to the variable(s) being studied. This should DECREASE SAMPLING ERROR.
- Ex: Suppose that you wanted to conduct a survey of opinions on women serving in political leadership roles. If you suspected that men and women might differ in their opinions on this issue, you would first stratify the population according to gender and then draw separately from each stratum at random.
- The same percentage should be drawn from each stratum (i.e., each subgroup). For instance, if you wanted to sample 10% of a population consisting of 1,600 men and 2,000 women, you would draw 160 men and 200 women.
- IMPORTANT: The purpose of stratifying is NOT to compare men with women. Rather, the purpose is to obtain a sample of the entire population that is representative in terms of gender.
- And this is ONLY relevant because you believe the outcomes of men as a group differ, in general, from those of women as a group.
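The 10% example above (1,600 men and 2,000 women) can be sketched as follows. The function name `stratified_sample` and the placeholder member names are hypothetical, used only for illustration.

```python
import random

def stratified_sample(strata, fraction, seed=None):
    """Draw the same fraction at random from each stratum.

    strata: dict mapping a stratum name to the list of its members.
    """
    rng = random.Random(seed)
    sample = {}
    for name, members in strata.items():
        k = round(len(members) * fraction)  # same percentage per stratum
        sample[name] = rng.sample(members, k)
    return sample

# Made-up population matching the sizes in the example above.
strata = {
    "men": [f"man_{i}" for i in range(1600)],
    "women": [f"woman_{i}" for i in range(2000)],
}
sample = stratified_sample(strata, fraction=0.10, seed=42)
print({name: len(drawn) for name, drawn in sample.items()})
# prints {'men': 160, 'women': 200}
```

The two draws are then combined into one sample that matches the population's gender split; they are not meant to be compared with each other.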
- MULTISTAGE RANDOM SAMPLING – (For large-scale studies) From multiple Broad Categories, choose a sampling of subgroups, and from those subgroups, choose even smaller subgroups to reach a representative sample with a workable sample size.
- Ex: You might draw a sample of counties at random from all counties in the country, then draw voting precincts at random from all precincts in the counties selected, and finally draw individual voters at random from all precincts that were sampled.
- Ex. 2: Alternatively, you could first stratify the counties into rural, suburban, and urban, and then separately draw counties at random from these three types of counties, ensuring that all three types of counties are included.
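The first example above (counties, then precincts, then voters) can be sketched as three nested random draws. The nested dictionary of counties, the function name `multistage_sample`, and the stage sizes are assumptions made up for illustration. The stratified variant from Ex. 2 is not shown.

```python
import random

def multistage_sample(counties, n_counties, n_precincts, n_voters, seed=None):
    """counties: dict county -> dict precinct -> list of voters."""
    rng = random.Random(seed)
    chosen = []
    # Stage 1: draw counties at random.
    for county in rng.sample(sorted(counties), n_counties):
        precincts = counties[county]
        # Stage 2: draw precincts at random within each selected county.
        for precinct in rng.sample(sorted(precincts), n_precincts):
            # Stage 3: draw voters at random within each selected precinct.
            chosen.extend(rng.sample(precincts[precinct], n_voters))
    return chosen

# Made-up sampling frame: 20 counties x 10 precincts x 50 voters.
counties = {
    f"county_{c}": {
        f"precinct_{c}_{p}": [f"voter_{c}_{p}_{v}" for v in range(50)]
        for p in range(10)
    }
    for c in range(20)
}
voters = multistage_sample(counties, n_counties=4, n_precincts=3,
                           n_voters=5, seed=1)
print(len(voters))  # 4 counties * 3 precincts * 5 voters = 60
```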
- CLUSTER SAMPLING – For cluster sampling to be used, all members must belong to a cluster. (Ex: all Boy Scouts belong to a troop, all students belong to a homeroom, and so on).
- Unlike simple random sampling, in which individuals are drawn, in cluster sampling, clusters (i.e., groups) are drawn.
- Ex: To conduct a survey of Boy Scouts, for instance, you could draw a random sample of troops, contact the leaders of the selected troops, and ask them to administer the questionnaires.
- Advantages are:
- fewer people to contact
- degree of cooperation is likely to be greater if a leader asks the scouts to participate.
- Disadvantage:
- Clusters often are homogeneous in some way. For instance, suppose that you drew 10 troops (i.e., clusters) at random, and 9 of them, by chance, were in major urban areas. (Note that scouts in major urban areas may have different attitudes and skills from those in rural areas.)
- Thus, when using cluster sampling, it is desirable to use a large number of clusters to overcome this disadvantage.
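Cluster sampling, as in the Boy Scout example, can be sketched by drawing whole troops instead of individual scouts. The troop data and the function name `cluster_sample` are hypothetical, and the sketch assumes every member of a drawn troop takes part.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Draw whole clusters at random; all members of a drawn cluster are kept."""
    rng = random.Random(seed)
    drawn = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in drawn}

# Made-up frame: 50 troops of 12 scouts each.
troops = {
    f"troop_{t}": [f"scout_{t}_{s}" for s in range(12)]
    for t in range(50)
}
selected = cluster_sample(troops, n_clusters=10, seed=7)
scouts = [s for members in selected.values() for s in members]
print(len(selected), len(scouts))  # 10 troops, 120 scouts
```

Only 10 troop leaders need to be contacted to reach 120 scouts. The trade-off noted above still applies: drawing more clusters is the usual guard against getting troops that happen to resemble one another.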