Gathering Data Flashcards

Question 1

Q

Random

Answer

A

An outcome is random if we know the possible values it can have, but not which particular value it takes

Question 2

Q

Simulation

Answer

A

Models a real-world situation by using random-digit outcomes to mimic the uncertainty of a response variable of interest

Question 3

Q

Simulation component

Answer

A

A component uses equally likely random digits to model simple random occurrences whose outcomes may not be equally likely

Question 4

Q

Trial

Answer

A

The sequence of several components representing events that we are pretending will take place

Question 5

Q

Response variable

Answer

A

Values of the response variable record the results of each trial with respect to what we were interested in
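Ex. The four simulation terms above (simulation, component, trial, response variable) can be tied together in a short sketch. The scenario is an assumption made up for illustration: each cereal box holds one of three prizes, A (20% of boxes), B (30%), or C (50%), and we ask how many boxes it takes to collect all three. The function names are hypothetical.

```python
import random

def component(rng):
    """One component: an equally likely digit 0-9 mapped to unequally likely prizes."""
    digit = rng.randrange(10)
    if digit <= 1:
        return "A"   # digits 0-1: 20% of boxes
    if digit <= 4:
        return "B"   # digits 2-4: 30% of boxes
    return "C"       # digits 5-9: 50% of boxes

def trial(rng):
    """One trial: repeat components until all three prizes have been seen."""
    seen = set()
    boxes = 0
    while len(seen) < 3:
        seen.add(component(rng))
        boxes += 1
    return boxes     # value of the response variable for this trial

rng = random.Random(1)   # arbitrary seed so the run is repeatable
responses = [trial(rng) for _ in range(1000)]
mean_boxes = sum(responses) / len(responses)
print(f"average boxes needed over {len(responses)} trials: {mean_boxes:.2f}")
```

Each call to component is one random occurrence, each call to trial is one sequence of components, and the list responses holds the response variable across all trials.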

Question 6

Q

Population

Answer

A

Entire group of individuals or instances about whom we hope to learn

Question 7

Q

Sample

Answer

A

A representative subset of a population, examined in hope of learning about the population

Question 8

Q

Sample survey

Answer

A

A study that asks questions of a sample drawn from some population in the hope of learning something about the entire population.

Ex. Polls taken to assess voter preferences are common sample surveys

Question 9

Q

Bias

Answer

A

Any systematic failure of a sampling method to represent its population is bias. Biased sampling methods tend to over- or underestimate parameters. It is almost impossible to recover from bias, so efforts to avoid it are well spent. Common errors include:

Relying on voluntary response
Undercoverage of the population
Nonresponse bias
Response bias

Question 10

Q

Randomization

Answer

A

The best defense against bias is randomization, in which each individual is given a fair, random chance at selection.

Question 11

Q

Sample size

Answer

A

The number of individuals in a sample. The sample size, not the fraction of the population sampled, determines how well the sample represents the population.
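Ex. A rough way to check the claim about sample size is to draw repeated samples from populations of very different sizes and compare how much the sample proportion varies. The populations below are hypothetical (40% of individuals have some trait), and the helper name is an assumption for illustration.

```python
import random
import statistics

def spread_of_sample_proportions(pop_size, n, rng, reps=300):
    """Standard deviation of the sample proportion over repeated random samples of size n."""
    # Hypothetical population: 40% of individuals have the trait (coded 1).
    with_trait = int(0.4 * pop_size)
    population = [1] * with_trait + [0] * (pop_size - with_trait)
    props = [sum(rng.sample(population, n)) / n for _ in range(reps)]
    return statistics.stdev(props)

rng = random.Random(2)   # arbitrary seed
small_pop = spread_of_sample_proportions(10_000, 100, rng)     # sample is 1% of population
large_pop = spread_of_sample_proportions(500_000, 100, rng)    # sample is 0.02% of population
bigger_n = spread_of_sample_proportions(500_000, 1_600, rng)   # sample is 0.32% of population
print(f"n=100,  N=10,000:  spread {small_pop:.3f}")
print(f"n=100,  N=500,000: spread {large_pop:.3f}")
print(f"n=1600, N=500,000: spread {bigger_n:.3f}")
```

The first two spreads should come out close to each other even though the sampled fractions differ by a factor of 50, while the larger sample gives a noticeably smaller spread.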

Question 12

Q

Census

Answer

A

A sample that consists of the entire population

Question 13

Q

Population parameter

Answer

A

A numerically valued attribute of a model for a population. We rarely expect to know the true value of a population parameter, but we do hope to estimate it from sampled data.

Ex. The mean income of all employed people in the country is a population parameter.

Question 14

Q

Representative

Answer

A

A sample is said to be representative if the statistics computed from it accurately reflect the corresponding population parameters.

Question 15

Q

Simple random sample

Answer

A

A simple random sample of size n is a sample in which each set of n elements in the population has an equal chance of selection
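Ex. A minimal sketch of drawing an SRS with Python's standard library, assuming a hypothetical sampling frame of 50 numbered individuals. random.sample draws without replacement, so every set of 5 IDs has the same chance of being chosen.

```python
import random

# Hypothetical sampling frame: 50 individuals labeled 1 to 50.
frame = list(range(1, 51))

rng = random.Random(3)        # arbitrary seed so the draw is repeatable
srs = rng.sample(frame, 5)    # 5 distinct individuals, no repeats
print(sorted(srs))
```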

Question 16

Q

Sampling frame

Answer

A

A list of individuals from whom the sample is drawn is called the sampling frame. Individuals who may be in the population of interest, but who are not in the sampling frame, cannot be included in any sample.

Question 17

Q

Sampling variability

Answer

A

The natural tendency of randomly drawn samples to differ from each other. Sometimes called sampling error, sampling variability is not error, but just the natural result of random sampling.
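Ex. Sampling variability shows up as soon as more than one sample is drawn. The sketch below assumes a made-up population of 5,000 values and draws five random samples of 25. Nothing has gone wrong when the five means disagree.

```python
import random
import statistics

rng = random.Random(4)   # arbitrary seed
# Hypothetical population: 5,000 values centered near 50.
population = [rng.gauss(50, 10) for _ in range(5000)]

# Five separate random samples of size 25, each summarized by its mean.
sample_means = [statistics.mean(rng.sample(population, 25)) for _ in range(5)]
for i, m in enumerate(sample_means, start=1):
    print(f"sample {i}: mean = {m:.2f}")
```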

Question 18

Q

Stratified random sample

Answer

A

A sampling design in which the population is divided into several subpopulations, or strata, and random samples are then drawn from each stratum. If the strata are homogeneous, but are different from each other, a stratified random sample may yield more consistent results than an SRS.
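Ex. One way to sketch a stratified random sample is to draw a separate SRS from each stratum. The strata (students by class year), the 5% sampling fraction, and the function name are all assumptions chosen for illustration. Sampling the same fraction from every stratum is one common choice, not the only one.

```python
import random

def stratified_sample(strata, fraction, rng):
    """Draw an SRS covering the same fraction of each stratum."""
    sample = {}
    for name, members in strata.items():
        k = max(1, round(fraction * len(members)))   # at least one per stratum
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical strata: students grouped by class year.
strata = {
    "freshman": [f"F{i}" for i in range(400)],
    "sophomore": [f"S{i}" for i in range(300)],
    "junior": [f"J{i}" for i in range(200)],
    "senior": [f"R{i}" for i in range(100)],
}
picked = stratified_sample(strata, 0.05, random.Random(5))
for name, members in picked.items():
    print(name, len(members))
```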

Question 19

Q

Cluster sample

Answer

A

A sampling design in which entire groups, or clusters, are chosen at random. Cluster sampling is usually selected as a matter of convenience, practicality, or cost. If each cluster is heterogeneous, resembling the population as a whole, a random sample of clusters should be representative of the population.
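Ex. In contrast to stratified sampling, a cluster sample randomizes over the groups themselves and then takes everyone in the chosen groups. The sketch assumes a hypothetical dorm with 20 floors (the clusters) of 8 residents each.

```python
import random

# Hypothetical population: 20 dorm floors (clusters) with 8 residents each.
clusters = {floor: [f"floor{floor}-res{r}" for r in range(8)] for floor in range(20)}

rng = random.Random(6)                              # arbitrary seed
chosen_floors = rng.sample(sorted(clusters), 3)     # choose whole clusters at random
cluster_sample = [person for floor in chosen_floors for person in clusters[floor]]
print(chosen_floors, len(cluster_sample))
```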

Question 20

Q

Multistage sample

Answer

A

Sampling schemes that combine several sampling methods are called multistage samples. For example, a national polling service may stratify the country by geographical regions, select a random sample of cities from each region, and then interview a cluster of residents in each city.

Question 21

Q

Systematic sample

Answer

A

A sample drawn by selecting individuals systematically from a sampling frame. When there is no relationship between the order of the sampling frame and the variables of interest, a systematic sample can be representative.
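Ex. A common way to draw a systematic sample is to take every k-th individual on the list. The sketch below assumes a random starting point within the first k entries, which is a typical convention but is not stated in the definition above. The frame and function name are hypothetical.

```python
import random

def systematic_sample(frame, k, rng):
    """Take every k-th individual, starting from a random position among the first k."""
    start = rng.randrange(k)
    return frame[start::k]

# Hypothetical sampling frame of 100 people in list order.
frame = [f"person{i}" for i in range(100)]
sys_sample = systematic_sample(frame, 10, random.Random(7))
print(sys_sample)
```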

Question 22

Q

Pilot study

Answer

A

A small trial run of a survey to check whether questions are clear. A pilot study can reduce errors due to ambiguous questions.

Question 23

Q

Voluntary response bias

Answer

A

Bias introduced to a sample when individuals can choose on their own whether to participate in the sample. Samples based on voluntary response are always invalid and cannot be recovered, no matter how large the sample size.

Question 24

Q

Convenience sample

Answer

A

A convenience sample consists of the individuals who are conveniently available. Convenience samples often fail to be representative because not every individual in the population is equally convenient to sample.

Question 25

Q

Undercoverage

Answer

A

A sampling scheme that biases the sample in a way that gives a part of the population less representation than it has in the population suffers from undercoverage.

Question 26

Q

Nonresponse bias

Answer

A

Bias introduced when a large fraction of those sampled fails to respond. Those who do respond are likely to not represent the entire population. Voluntary response bias is a form of nonresponse bias, but nonresponse may occur for other reasons. For example, those who are at work during the day won’t respond to a telephone survey conducted only during working hours.

Question 27

Q

Response bias

Answer

A

Anything in a survey design that influences responses falls under the heading of response bias. One typical response bias arises from the wording of questions, which may suggest a favored response.

Ex. Voters are more likely to express support of “the president” than support of the particular person holding the office at that moment.

Question 28

Q

Prospective study

Answer

A

An observational study in which subjects are followed to observe future outcomes. Because no treatments are deliberately applied, a prospective study is not an experiment. Nevertheless, prospective studies focus on estimating differences among groups that might appear as the groups are followed during the course of the study.

Question 29

Q

Experiment

Answer

A

An experiment manipulates factor levels to create treatments, randomly assigns subjects to these treatment levels, and then compares the responses of the subject groups across treatment levels.

Question 30

Q

Random assignment

Answer

A

To be valid, an experiment must assign experimental units to treatment groups at random.

Question 31

Q

Factor

Answer

A

A variable whose values are manipulated by the experimenter. Experiments attempt to discover the effects that differences in factor levels may have on the responses of the experimental units.

Question 32

Q

Response

Answer

A

A variable whose values are compared across different treatments. In a randomized experiment, large response differences can be attributed to the effect of differences in treatment level.

Question 33

Q

Experimental units

Answer

A

Individuals on whom an experiment is performed. Usually called subjects or participants when they are human.

Question 34

Q

Level

Answer

A

The specific values that the experimenter chooses for a factor are called the levels of the factor.

Question 35

Q

Treatment

Answer

A

The process, intervention, or other controlled circumstance applied to randomly assigned experimental units. Treatments are the different levels of a single factor or are made up of combinations of levels of two or more factors.
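Ex. When an experiment has two or more factors, the treatments are all combinations of their levels. The factors and levels below are made up for illustration. itertools.product lists every combination.

```python
from itertools import product

# Hypothetical factors and their levels.
factors = {
    "fertilizer": ["none", "half dose", "full dose"],
    "watering": ["daily", "weekly"],
}

# Each treatment pairs one level of every factor: 3 x 2 = 6 treatments.
treatments = [dict(zip(factors, combo)) for combo in product(*factors.values())]
for t in treatments:
    print(t)
```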

Question 36

Q

Principles of experimental design

Answer

A

Control aspects of the experiment that we know may have an effect on the response, but that are not the factors being studied.
Randomize subjects to treatments to even out effects that we cannot control.
Replicate over as many subjects as possible. Results for a single subject are just anecdotes. If, as often happens, the subjects of the experiment are not a representative sample from the population of interest, replicate the entire study with a different group of subjects, preferably from a different part of the population.
Block to reduce the effects of identifiable attributes of the subjects that cannot be controlled.

Question 37

Q

Completely randomized design

Answer

A

In a completely randomized design, all experimental units have an equal chance of receiving any treatment.
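Ex. One simple way to carry out a completely randomized design is to shuffle all the experimental units and deal them out to the treatments in turn. This is a sketch under assumed names (12 hypothetical plants, three hypothetical treatments) that also keeps the groups equal in size, which the definition does not require.

```python
import random

def completely_randomized(units, treatments, rng):
    """Shuffle all units, then deal them out to the treatments in turn."""
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

# Hypothetical experiment: 12 plants, 3 treatments.
units = [f"plant{i}" for i in range(12)]
groups = completely_randomized(units, ["control", "low", "high"], random.Random(9))
for treatment, members in groups.items():
    print(treatment, sorted(members))
```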

Question 38

Q

Statistically significant

Answer

A

When an observed difference is too large for us to believe that it is likely to have occurred naturally, we consider the difference to be statistically significant. Subsequent chapters will show specific calculations and give rules, but the principle remains the same.

Question 39

Q

Control group

Answer

A

The experimental units assigned to a baseline treatment level, typically either the default treatment, which is well understood, or a null, placebo treatment. Their responses provide a basis for comparison.

Question 40

Q

Blinding

Answer

A

Any individual associated with an experiment who is not aware of how subjects have been allocated to treatment groups is said to be blinded.

Question 41

Q

Single blind and double blind

Answer

A

Two main classes of individuals can affect the outcome of an experiment:
- those who could influence the results (subjects, treatment administrators, technicians)
- those who evaluate the results (judges, treating physicians)
When every individual in either of these classes is blinded, an experiment is said to be single blind. When everyone in both classes is blinded, we call the experiment double blind.

Question 42

Q

Placebo

Answer

A

A treatment known to have no effect, administered to one group so that all groups experience the same conditions. Many subjects respond to such a treatment (a response called the placebo effect). Only by comparing with a placebo can we be sure that the observed effect of a treatment is not due simply to the placebo effect.

Question 43

Q

Placebo effect

Answer

A

The tendency of many human subjects (often 20% or more of experiment subjects) to show a response even when administered a placebo.

Question 44

Q

Blocking

Answer

A

When groups of experimental units are similar, it is often a good idea to gather them together into blocks. By blocking, we isolate the variability attributable to the differences between the blocks so that we can see the differences caused by the treatments more clearly.

Question 45

Q

Randomized block design

Answer

A

Subjects are randomly assigned to treatments within blocks.
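Ex. A randomized block design repeats the random assignment separately inside each block. The blocks (two age bands), subject labels, and treatments below are assumptions for illustration.

```python
import random

def randomized_block(blocks, treatments, rng):
    """Randomly assign treatments separately inside each block."""
    assignment = {}
    for units in blocks.values():
        shuffled = list(units)
        rng.shuffle(shuffled)                  # randomize within this block only
        for i, unit in enumerate(shuffled):
            assignment[unit] = treatments[i % len(treatments)]
    return assignment

# Hypothetical blocks: subjects grouped by age band before assignment.
blocks = {
    "under 40": ["u1", "u2", "u3", "u4"],
    "40 and over": ["o1", "o2", "o3", "o4"],
}
assignment = randomized_block(blocks, ["drug", "placebo"], random.Random(10))
for unit, treatment in sorted(assignment.items()):
    print(unit, treatment)
```

Because each block is shuffled on its own, every block ends up with the same mix of treatments, so differences between blocks cannot be mistaken for treatment effects.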

Question 46

Q

Matching

Answer

A

In a retrospective or prospective study, subjects who are similar in ways not under study may be matched and then compared with each other on the variables of interest. Matching, like blocking, reduces unwanted variation.

Question 47

Q

Confounding

Answer

A

When the levels of one factor are associated with the levels of another factor in such a way that their effects cannot be separated, we say that these two factors are confounded.