Chapter 4 Vocab Flashcards
Population
In a statistical study, the population is the entire group of individuals about which we want information
Sample
The part of the population from which we actually collect information. We use information from a sample to draw conclusions about the entire population
Census
A study that attempts to collect data from every individual in the population
Sample Design
The method used to choose a sample from the population
Voluntary Response Sample
People decide whether to join a sample by responding to an open invitation; particularly prone to large bias
Convenience Sample
A sample selected by taking the members of the population that are easiest to reach; particularly prone to large bias
Bias
The design of a statistical study shows bias if it systematically favors certain outcomes
Simple Random Sample (SRS)
The basic random sampling method. An SRS gives every possible sample of a given size the same chance to be chosen. We often choose an SRS by labeling the members of the population and using random digits to select the sample.
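As a rough illustration of the labeling-and-selecting idea, the sketch below draws an SRS with Python's standard `random` module instead of a printed table of digits. The population of 30 labeled students and the function name `simple_random_sample` are hypothetical, chosen only for this example.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Return an SRS of size n: every set of n individuals is equally likely."""
    rng = random.Random(seed)          # seed is optional; it only makes the draw repeatable
    return rng.sample(population, n)   # sampling without replacement

# Hypothetical labeled population of 30 students.
students = [f"student_{i:02d}" for i in range(1, 31)]
srs = simple_random_sample(students, 5, seed=1)
print(srs)
```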
Stratified Random Sample
To select a stratified random sample, first classify the population into groups of similar individuals, called strata. Then choose a separate SRS from each stratum to form the full sample
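The two steps above (classify into strata, then take a separate SRS from each stratum) can be sketched as follows. The school population, the use of grade level as the stratum, and the name `stratified_random_sample` are assumptions made for illustration.

```python
import random

def stratified_random_sample(population, stratum_of, n_per_stratum, seed=None):
    """Group individuals into strata, then draw an SRS from each stratum."""
    rng = random.Random(seed)
    strata = {}
    for individual in population:
        strata.setdefault(stratum_of(individual), []).append(individual)
    sample = []
    for label in sorted(strata):                       # fixed order keeps runs repeatable
        sample.extend(rng.sample(strata[label], n_per_stratum))
    return sample

# Hypothetical population: 25 students in each of grades 9 through 12.
population = [(f"student_{g}_{i}", g) for g in (9, 10, 11, 12) for i in range(1, 26)]
sample = stratified_random_sample(population, lambda person: person[1], 3, seed=2)
print(sample)
```

Every stratum is guaranteed to appear in the sample, which is the main difference from a plain SRS of the same total size.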
Cluster Sample
To take a cluster sample, first divide the population into smaller groups. Ideally, these clusters should mirror the characteristics of the population. Then choose an SRS of the clusters. All individuals in the chosen clusters are included in the sample
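A minimal sketch of cluster sampling, assuming the clusters are hypothetical homerooms of 20 students each: the SRS is taken of the clusters themselves, and everyone in a chosen cluster is included.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Choose an SRS of clusters, then include every individual in those clusters."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # SRS of cluster labels
    individuals = [person for label in chosen for person in clusters[label]]
    return chosen, individuals

# Hypothetical clusters: six homerooms with 20 students each.
homerooms = {
    f"room_{r}": [f"room_{r}_student_{i}" for i in range(1, 21)]
    for r in "ABCDEF"
}
chosen_rooms, sample = cluster_sample(homerooms, 2, seed=3)
print(chosen_rooms, len(sample))
```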
Undercoverage
A sampling error that occurs when some members in the population are left out of the process of choosing the sample
Nonresponse
Occurs when a selected individual cannot be contacted or refuses to cooperate; an example of a nonsampling error
Response Bias
A systematic pattern of incorrect responses in a sample survey.
Wording of Questions
The most important influence on the answers given to a survey. Confusing or leading questions can introduce strong bias, and changes in wording can greatly change a survey’s outcome. Even the order in which questions are asked matters
Observational Study
Observes individuals and measures variables of interest but does not attempt to influence the responses
Experiment
Deliberately imposes some treatment on individuals to measure their responses
Lurking Variable
A variable that is not among the explanatory or response variables in a study but that may influence the response variable
Confounding
When two variables are associated in such a way that their effects on a response variable cannot be distinguished from each other
Experimental Units
The smallest collection of individuals to which treatments are applied
Subjects
Experimental units that are human beings
Treatment
A specific experimental condition applied to the units
Factors
The explanatory variables in an experiment
Level
A specific value of an explanatory variable (factor) in an experiment
Random Assignment
The experimental units are assigned to treatments at random, that is, using some sort of chance process
Control Group
An experimental group whose primary purpose is to provide a baseline for comparing the effects of the other treatments. Depending on the purpose of the experiment, a control group may be given a placebo or an active treatment
Placebo
An inactive (fake) treatment
Control
An important experimental design principle. Researchers should control for lurking variables that might affect the response by using a comparative design and ensuring that the only systematic difference between the groups is the treatment administered
Replication
An important experimental design principle. Use enough experimental units in each group so that any differences in the effects of the treatments can be distinguished from chance differences between the groups
Double Blind
An experiment in which neither the subjects nor those who interact with them and measure the response variables know which treatment a subject received
Single Blind
An experiment in which either the subjects or those who interact with them and measure the response variable, but not both, know which treatment a subject received
Statistically Significant
An observed effect so large that it would rarely occur by chance
Block
A group of experimental units that are known before the experiment to be similar in some way that is expected to affect the response to the treatments
Randomized Block Design
Start by forming blocks consisting of individuals that are similar in some way that is important to the response. Random assignment of treatments is then carried out separately within each block
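The procedure above can be sketched as below: form the blocks first, then shuffle and assign treatments separately inside each block. The subjects, the age-group blocking variable, and the treatment names are hypothetical.

```python
import random

def randomized_block_design(units, block_of, treatments, seed=None):
    """Assign treatments at random, separately within each block."""
    rng = random.Random(seed)
    blocks = {}
    for unit in units:
        blocks.setdefault(block_of(unit), []).append(unit)
    assignment = {}
    for label in sorted(blocks):
        members = blocks[label][:]
        rng.shuffle(members)                      # chance decides the order within the block
        for i, unit in enumerate(members):
            assignment[unit] = treatments[i % len(treatments)]
    return assignment

# Hypothetical subjects: 10 younger and 10 older, blocked by age group.
subjects = [(f"subject_{i}", "younger" if i <= 10 else "older") for i in range(1, 21)]
treatments = ("new drug", "placebo")
assignment = randomized_block_design(subjects, lambda s: s[1], treatments, seed=4)
```

Because the shuffle happens inside each block, each age group ends up split evenly between the two treatments.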
Explanatory Variable
A variable that helps explain or influences changes in a response variable
Response Variable
A variable that measures an outcome of a study
Placebo Effect
Describes the fact that some subjects respond favorably to any treatment, even an inactive one (placebo)
Matched Pairs
A common form of blocking for comparing just two treatments. In some matched pairs designs, each subject receives both treatments in a random order. In others, the subjects are matched in pairs as closely as possible, and each subject in a pair is randomly assigned to receive one of the treatments
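For the second kind of matched pairs design described above, a chance process decides which member of each pair gets which treatment. The sketch below assumes the pairs have already been matched; the subject names and treatment labels are hypothetical.

```python
import random

def matched_pairs_assignment(pairs, treatments=("A", "B"), seed=None):
    """Within each matched pair, randomly decide who receives which treatment."""
    rng = random.Random(seed)
    assignment = {}
    for first, second in pairs:
        order = list(treatments)
        rng.shuffle(order)                     # acts like a coin flip for this pair
        assignment[first], assignment[second] = order
    return assignment

# Hypothetical pairs, already matched as closely as possible.
pairs = [("Ann", "Bea"), ("Cal", "Dev"), ("Eli", "Fay")]
pair_assignment = matched_pairs_assignment(pairs, seed=5)
print(pair_assignment)
```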
Nonsampling Errors
The most serious errors in most careful surveys are nonsampling errors. These have nothing to do with choosing a sample; they are present even in a census. Some common examples of nonsampling errors are nonresponse, response bias, and errors due to question wording
Sampling Frame
The list from which a sample is actually chosen
Table of Random Digits
A long string of digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 with these properties:
• Each entry in the table is equally likely to be any of the 10 digits 0 through 9.
• The entries are independent of each other. That is, knowledge of one part of the table gives no information about any other part.
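To show how such a table is typically used to pick a sample, the sketch below reads a digit string two digits at a time, keeps labels in range, and skips repeats. The digit string, the population size of 30, and the function names are assumptions for illustration; `random_digits` simply generates a line of digits with the two properties listed above.

```python
import random

def random_digits(n, seed=None):
    """Generate n digits, each equally likely to be 0-9 and independent of the others."""
    rng = random.Random(seed)
    return "".join(str(rng.randrange(10)) for _ in range(n))

def select_labels(digits, n, max_label):
    """Read two-digit groups; keep labels 01..max_label, skipping repeats."""
    chosen = []
    for i in range(0, len(digits) - 1, 2):
        label = int(digits[i:i + 2])
        if 1 <= label <= max_label and label not in chosen:
            chosen.append(label)
        if len(chosen) == n:
            break
    return chosen

# Hypothetical line of digits, population labeled 01 to 30, sample of size 4.
line = "19223950340575628713"
print(select_labels(line, 4, 30))  # → [19, 22, 5, 13]
```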
Random Sampling
The use of chance to select a sample; it is the central principle of statistical sampling.
Anonymity
When the names of individuals participating in a study are not known even to the director of the study
Completely Randomized Design
When the treatments are assigned to all the experimental units completely by chance
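A minimal sketch of a completely randomized design, assuming 20 hypothetical experimental units and two treatments: all units are shuffled together and then dealt out to the treatment groups, with no blocking.

```python
import random

def completely_randomized_design(units, treatments, seed=None):
    """Shuffle all units together, then deal them out to the treatments."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

# Hypothetical experimental units and treatments.
units = [f"unit_{i}" for i in range(1, 21)]
groups = completely_randomized_design(units, ("treatment", "control"), seed=7)
print({t: len(members) for t, members in groups.items()})
```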
Confidentiality
A basic principle of data ethics that requires individual data to be kept private
Inference About Cause and Effect
Using the results of an experiment to conclude that the treatments caused the difference in responses. Requires a well-designed experiment in which the treatments are randomly assigned to the experimental units
Inference About The Population
Using information from a sample to draw conclusions about the larger population. Requires that the individuals taking part in a study be randomly selected from the population of interest
Informed Consent
A basic principle of data ethics. Individuals must be informed in advance about the nature of a study and any risk of harm it may bring. Participating individuals must then consent in writing
Lack of Realism
When the treatments, the subjects, or the environment of an experiment are not realistic. Lack of realism can limit researchers’ ability to apply the conclusions of an experiment to the settings of greatest interest.
Margin of Error
A numerical estimate of how far the sample result is likely to be from the truth about the population due to sampling variability
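The definition above does not give a formula. As one hedged illustration, a common large-sample approximation for the margin of error of a sample proportion is z times the square root of p̂(1 − p̂)/n, with z = 1.96 for roughly 95% confidence. The sketch assumes an SRS and a large sample; the poll numbers are hypothetical.

```python
import math

def approx_margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a sample proportion (assumes a large SRS)."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# Hypothetical poll: 50% of 1000 respondents say yes.
moe = approx_margin_of_error(0.5, 1000)
print(f"{moe:.3f}")  # → 0.031
```

Note that this accounts only for sampling variability, not for nonsampling errors such as nonresponse or question wording.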
Sampling Error
Mistakes made in the process of taking a sample that could lead to inaccurate information about the population. Bad sampling methods and undercoverage are common types of sampling error.
Sample Survey
A study that uses an organized plan to choose a sample that represents some specific population. We base conclusions about the population on data from the sample.
Strata
Groups of individuals in a population that are similar in some way that might affect their responses