Gathering Data Flashcards
Random
An outcome is random if we know the possible values it can have, but not which particular value it takes
Simulation
Models a real-world situation by using random-digit outcomes to mimic the uncertainty of a response variable of interest
Simulation component
A component uses equally likely random digits to model simple random occurrences whose outcomes may not be equally likely
Trial
The sequence of several components representing events that we are pretending will take place
Response variable
Values of the response variable record the results of each trial with respect to what we were interested in
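Ex. The sketch below ties together simulation, component, trial, and response variable. The scenario is an assumption made up for illustration, not part of the flashcards: a prize is in 30% of cereal boxes, and we ask how many boxes must be opened to find one. Each component draws one equally likely digit (0-9) and treats digits 0-2 as "prize", so equally likely digits model outcomes that are not equally likely.

```python
import random

def component(rng):
    """One component: draw an equally likely digit 0-9.
    Digits 0-2 stand for 'prize' (assumed 30% chance)."""
    digit = rng.randrange(10)
    return digit <= 2

def trial(rng):
    """One trial: repeat components until a prize appears.
    Returns the response variable: number of boxes opened."""
    boxes = 0
    while True:
        boxes += 1
        if component(rng):
            return boxes

def simulate(n_trials, seed=0):
    """Run many trials and collect the response variable from each."""
    rng = random.Random(seed)
    return [trial(rng) for _ in range(n_trials)]

results = simulate(1000)
mean_boxes = sum(results) / len(results)
print("Average boxes opened per trial:", mean_boxes)
```

With many trials, the average should settle near 1 / 0.3, or about 3.3 boxes.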
Population
Entire group of individuals or instances about whom we hope to learn
Sample
A representative subset of a population, examined in hope of learning about the population
Sample survey
A study that asks questions of a sample drawn from some population in the hope of learning something about the entire population.
Ex. Polls taken to assess voter preferences are common sample surveys
Bias
Any systematic failure of a sampling method to represent its population is bias. Biased sampling methods tend to over- or underestimate parameters. It is almost impossible to recover from bias, so efforts to avoid it are well spent. Common sources of bias include:
- Relying on voluntary response
- Undercoverage of the population
- Nonresponse bias
- Response bias
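Ex. A rough sketch of why voluntary response is biased, compared with a random sample of the same size. Every number here is an assumption chosen for illustration: a population in which 30% support a proposal, with supporters volunteering 60% of the time and everyone else 10% of the time.

```python
import random

rng = random.Random(1)

# Hypothetical population: 1 = supports the proposal (30%), 0 = does not (70%).
population = [1] * 3000 + [0] * 7000

# Voluntary response: assumed response rates differ by opinion.
respond_prob = {1: 0.6, 0: 0.1}
volunteers = [x for x in population if rng.random() < respond_prob[x]]
voluntary_estimate = sum(volunteers) / len(volunteers)

# Randomization: a simple random sample of the same size.
srs = rng.sample(population, len(volunteers))
srs_estimate = sum(srs) / len(srs)

print("True proportion:            0.30")
print("Voluntary response estimate:", round(voluntary_estimate, 2))
print("Random sample estimate:     ", round(srs_estimate, 2))
```

Under these assumptions the voluntary response estimate lands far above 0.30 however many people respond, while the random sample stays close to it. A larger biased sample does not fix the bias.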
Randomization
The best defense against bias is randomization, in which each individual is given a fair, random chance at selection.
Sample size
The number of individuals in a sample. It is the sample size, not the fraction of the population sampled, that determines how well the sample represents the population.
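Ex. A small check of this claim, under assumed values (a population in which 40% have some trait, and hypothetical population sizes of 10,000 and 1,000,000). The helper name `spread_of_sample_proportions` is made up for this sketch. It draws many samples and measures how much the sample proportions vary.

```python
import random
import statistics

def spread_of_sample_proportions(pop_size, n, reps, rng):
    """Standard deviation of the sample proportion across `reps` random samples of size n."""
    ones = pop_size * 4 // 10  # assumed: 40% of the population has the trait
    population = [1] * ones + [0] * (pop_size - ones)
    props = [sum(rng.sample(population, n)) / n for _ in range(reps)]
    return statistics.stdev(props)

rng = random.Random(3)

# Same sample size (100), very different fractions of the population sampled.
spread_small_pop = spread_of_sample_proportions(10_000, 100, 500, rng)     # 1% sampled
spread_large_pop = spread_of_sample_proportions(1_000_000, 100, 500, rng)  # 0.01% sampled

# Larger sample size (400) from the small population.
spread_n400 = spread_of_sample_proportions(10_000, 400, 500, rng)

print("n=100, N=10,000:   ", round(spread_small_pop, 3))
print("n=100, N=1,000,000:", round(spread_large_pop, 3))
print("n=400, N=10,000:   ", round(spread_n400, 3))
```

The first two spreads should be nearly equal even though the fraction sampled differs by a factor of 100. Quadrupling the sample size roughly halves the spread.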
Census
A sample that consists of the entire population
Population parameter
A numerically valued attribute of a model for a population. We rarely expect to know the true value of a population parameter, but we do hope to estimate it from sampled data.
Ex. The mean income of all employed people in the country is a population parameter.
Representative
A sample is said to be representative if the statistics computed from it accurately reflect the corresponding population parameters.
Simple random sample
A simple random sample of size n is a sample in which each set of n elements in the population has an equal chance of selection
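Ex. One way to draw a simple random sample in Python, assuming the sampling frame is a list of IDs. The student IDs are hypothetical. `random.sample` selects without replacement, giving every set of n items the same chance of being chosen.

```python
import random

# Hypothetical sampling frame: 200 student IDs.
sampling_frame = [f"student_{i:03d}" for i in range(1, 201)]

rng = random.Random(42)  # fixed seed so the draw can be repeated
n = 10
srs = rng.sample(sampling_frame, n)  # no individual can be chosen twice

print(srs)
```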
Sampling frame
A list of individuals from whom the sample is drawn is called the sampling frame. Individuals who may be in the population of interest, but who are not in the sampling frame, cannot be included in any sample.
Sampling variability
The natural tendency of randomly drawn samples to differ from each other. Sometimes called sampling error, sampling variability is not error, but just the natural result of random sampling.
Stratified random sample
A sampling design in which the population is divided into several subpopulations, or strata, and random samples are then drawn from each stratum. If the strata are homogeneous, but are different from each other, a stratified random sample may yield more consistent results than an SRS.
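Ex. A sketch comparing a stratified random sample with an SRS, using made-up data: two hypothetical strata, "A" and "B", whose values are similar within each stratum but differ between them. The stratified sample uses proportional allocation, which is one common choice and an assumption here.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical strata: homogeneous within, different from each other.
strata = {
    "A": [rng.gauss(20, 2) for _ in range(600)],
    "B": [rng.gauss(50, 2) for _ in range(400)],
}
population = strata["A"] + strata["B"]

def srs_mean(n):
    """Mean of a simple random sample of size n from the whole population."""
    return statistics.mean(rng.sample(population, n))

def stratified_mean(n):
    """Mean of a stratified sample: draw from each stratum in proportion to its size."""
    sample = []
    for values in strata.values():
        k = round(n * len(values) / len(population))
        sample.extend(rng.sample(values, k))
    return statistics.mean(sample)

# Repeat each design many times and compare how much the estimates vary.
srs_spread = statistics.stdev([srs_mean(50) for _ in range(300)])
strat_spread = statistics.stdev([stratified_mean(50) for _ in range(300)])

print("Spread of SRS means:       ", round(srs_spread, 2))
print("Spread of stratified means:", round(strat_spread, 2))
```

Because every stratified sample contains the same mix of the two strata, its estimates should vary much less from sample to sample than the SRS estimates.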