Unit 4 Flashcards
Population
The entire group of individuals we want information about
Census
Collects data from every individual in the population
Sample
A subset of individuals in the population from which we actually collect data
Convenience sample
Choosing individuals from the population who are easy to reach
-Often produces unrepresentative data
Bias
The design of a statistical study shows bias if it would consistently underestimate or consistently overestimate the value we want to know
-Convenience and voluntary response sampling are common sources of bias
Voluntary response sample
Consists of people who choose themselves by responding to a general invitation
- Usually not representative of the larger population of interest
- Attracts people who feel strongly about an issue and who often share the same opinion
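A minimal simulation of why voluntary response tends to overestimate strong opinions. Everything here is assumed for illustration: the population split (30% oppose), the response probabilities, and the names `population`, `respond_prob`, and `share`.

```python
import random

rng = random.Random(5)  # fixed seed so the run is repeatable

# Hypothetical population: 30% oppose a policy, 70% support it.
population = ["oppose"] * 3000 + ["support"] * 7000

# Assumed behavior: people who oppose feel strongly and respond more often.
respond_prob = {"oppose": 0.30, "support": 0.05}

# Each person decides for themselves whether to respond.
responses = [p for p in population if rng.random() < respond_prob[p]]

share = responses.count("oppose") / len(responses)
print(f"True share opposed: 0.30, share among responders: {share:.2f}")
```

Under these assumed numbers the responders are mostly opponents even though opponents are a minority of the population, which is the consistent overestimate that the Bias card describes.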
Random sampling
Involves using a chance process to determine which members of a population are included in the sample
Simple random sample (SRS)
An SRS of size n is chosen in such a way that every group of n individuals (every combination) in the population has an equal chance of being selected as the sample
-Gives each member and each combination of members an equal chance of being selected
Choosing an SRS with technology
Step 1: Label. Give each individual in the population a distinct numerical label from 1 to N (population size)
Step 2: Randomize. Use a random number generator to obtain n (sample size) different integers from 1 to N
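The two steps above can be sketched in Python with the standard library. The function name `choose_srs` and the example sizes (N = 40, n = 5) are illustrative assumptions, not part of the flashcards.

```python
import random

def choose_srs(population_size, sample_size, seed=None):
    """Return sample_size distinct labels drawn from 1..population_size."""
    rng = random.Random(seed)
    # Step 1: the labels are simply the integers 1 to N.
    labels = range(1, population_size + 1)
    # Step 2: random.sample draws without replacement, so no label repeats.
    return sorted(rng.sample(labels, sample_size))

labels = choose_srs(population_size=40, sample_size=5, seed=1)
print(labels)
```

Passing a seed only makes the draw repeatable for checking work; leave it out for a fresh sample each time.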
Choosing an SRS with Table D
Step 1: Label. Give each member of the population a numerical label with the same number of digits. Try to use as few digits as possible
Step 2: Randomize. Read consecutive groups of digits of the appropriate length from left to right across a line in Table D. Ignore any group of digits that was not used as a label or that repeats a label already chosen. Stop when you have n individuals
-All labels of the same length have the same chance to be chosen
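A rough sketch of the table procedure, assuming labels run from 01 up to the population size. The function name `srs_from_digit_line` is hypothetical, and the string of digits below only stands in for a row of a random digit table.

```python
def srs_from_digit_line(digits, population_size, sample_size):
    """Read fixed-width groups left to right; skip unused labels and repeats."""
    width = len(str(population_size))   # fewest digits that can label everyone
    digits = digits.replace(" ", "")    # table rows are printed in spaced blocks
    chosen = []
    for i in range(0, len(digits) - width + 1, width):
        label = int(digits[i:i + width])
        # Keep the group only if it is a real label and not already chosen.
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
        if len(chosen) == sample_size:
            break
    return chosen

line = "19223 95034 05756 28713 96409 12531"  # example digits for illustration
print(srs_from_digit_line(line, population_size=30, sample_size=4))
# prints [19, 22, 5, 13]
```

Here the groups 39, 50, 34, 75, 62, and 87 are skipped because no one has those labels. If the row runs out before n labels are found, the sketch returns what it has; by hand you would continue on the next row.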
Stratified random sample
- To get a stratified random sample, start by classifying the population into groups of similar individuals, called strata. Then choose a separate SRS in each stratum and combine these SRSs to form the sample
- Works best when the individuals in each stratum are similar with respect to what is being measured and when there are large differences between strata
- Similar within strata and different between strata. When this holds, gives a more precise estimate than a simple random sample of the same size
- Less variability in the estimates from stratified samples than from SRSs of the same size
- Want each stratum to contain similar individuals and for there to be a large difference between strata
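A minimal sketch of stratified random sampling. The strata (freshmen and seniors), their sizes, the per-stratum sample sizes, and the function name `stratified_sample` are all made up for illustration.

```python
import random

def stratified_sample(strata, per_stratum, seed=None):
    """strata: stratum name -> list of individuals.
    per_stratum: stratum name -> size of the SRS to take in that stratum."""
    rng = random.Random(seed)
    sample = []
    for name, members in strata.items():
        # Separate SRS inside each stratum, then combine.
        sample.extend(rng.sample(members, per_stratum[name]))
    return sample

# Hypothetical strata: individuals grouped by class year.
strata = {
    "freshmen": [f"F{i}" for i in range(1, 101)],
    "seniors": [f"S{i}" for i in range(1, 61)],
}
picked = stratified_sample(strata, {"freshmen": 5, "seniors": 3}, seed=2)
print(picked)
```

Every stratum is guaranteed to appear in the sample, which an SRS of the whole population does not guarantee.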
Cluster sample
- To get one, start by classifying the population into groups of individuals that are located near each other, called clusters. Then choose an SRS of the clusters. All individuals in the chosen clusters are included in the sample
- Sometimes used to save money and time
- Sometimes people take an SRS within each chosen cluster rather than surveying everyone in the cluster
- Doesn't offer the statistical advantage of better information about the population that stratified samples do
- Want each cluster to look like the population, but on a smaller scale
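A minimal sketch of cluster sampling, where the SRS is taken over clusters and everyone in a chosen cluster is surveyed. The homeroom clusters, their sizes, and the function name `cluster_sample` are assumptions for illustration.

```python
import random

def cluster_sample(clusters, num_clusters, seed=None):
    """clusters: cluster name -> list of individuals in that cluster."""
    rng = random.Random(seed)
    # The SRS is of cluster names, not of individuals.
    chosen_names = rng.sample(sorted(clusters), num_clusters)
    # Everyone in each chosen cluster goes into the sample.
    sample = [person for name in chosen_names for person in clusters[name]]
    return chosen_names, sample

# Hypothetical clusters: 8 homerooms of 5 students each.
clusters = {
    f"homeroom_{k}": [f"h{k}_s{i}" for i in range(1, 6)] for k in range(1, 9)
}
chosen, sample = cluster_sample(clusters, num_clusters=2, seed=3)
print(chosen, len(sample))
```

Compare with the stratified card: there, an SRS is taken inside every group; here, whole groups are chosen at random and the rest are left out.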
Multistage sampling
Combines two or more sampling methods, such as stratified and cluster sampling
Inference
- Infer about the population from what we know about the sample
- Inference from convenience or voluntary response samples would be misleading because the method of sampling is biased
Random sampling
- Rely on it to avoid bias in choosing a sample
- Unlikely that the sample results will exactly match those of the entire population
- Properly designed samples avoid systematic bias, but their results are rarely exactly correct and we expect them to vary from sample to sample
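A small simulation of sampling variability under random sampling. The population (10,000 values centered near 70), the sample size of 50, and the 1,000 repetitions are all assumed for illustration.

```python
import random
import statistics

rng = random.Random(4)  # fixed seed so the run is repeatable

# Hypothetical population of 10,000 measurements.
population = [rng.gauss(70, 10) for _ in range(10_000)]
true_mean = statistics.mean(population)

# Take many SRSs of size 50 and record each sample mean.
sample_means = [
    statistics.mean(rng.sample(population, 50)) for _ in range(1000)
]

print("population mean:", round(true_mean, 2))
print("average of sample means:", round(statistics.mean(sample_means), 2))
print("spread of sample means:", round(statistics.stdev(sample_means), 2))
```

Individual sample means land above and below the population mean (variability), but their average sits close to it (no systematic bias).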