Topic 8 - Sample Surveys Flashcards
L.O.
LO6 Use the box model to describe chance and chance variability, including sample surveys and the Central Limit Theorem
Limitations of a census
Census:
- collect data from the whole population
Limitations:
- Time consuming
- Resources
- Cost
- Difficult to ensure participation
Parameters and estimates
Parameter:
- numerical fact about population, usually what we want to know
Estimate:
- A value calculated from the sample that best predicts the parameter
Types of bias
Selection bias:
- The sampling method systematically excludes or over-includes one type of person
Non-Response bias:
- People who don't want to participate may differ from those who do
Interview bias:
- The interviewer chooses who participates in the survey
Measurement bias:
- The form or wording of the survey questions affects the answers
Recall bias:
- People forget things
Social desirability bias:
- People lie
Lack of clarity:
- misinterpret question
Bias and sample size relationship
A larger sample size does not reduce bias; it can repeat (and so amplify) the same bias on a larger scale
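A minimal simulation sketch of this card, assuming a made-up population in which half the people are 1's but the survey can only reach a subgroup that is 70% 1's. The names `reachable` and `biased_estimate` and all the numbers are hypothetical, chosen only for illustration.

```python
import random

rng = random.Random(2)  # fixed seed so the run is repeatable

# Hypothetical population: overall 50% ones, but the survey only reaches
# a subgroup that is 70% ones (a selection bias).
reachable = [1] * 7000 + [0] * 3000
true_proportion = 0.5

def biased_estimate(sample_size):
    """Sample proportion of 1's when drawing only from the reachable subgroup."""
    draws = rng.sample(reachable, sample_size)  # without replacement
    return sum(draws) / sample_size

for n in (50, 500, 5000):
    print(n, biased_estimate(n))
```

As n grows the estimates settle near 0.7 rather than the true 0.5: the chance error shrinks, but the bias stays.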
Types of samples
Simple random sampling:
- Random individuals from population taken without replacement
- Every individual has an equal chance of being selected
Multi-stage Cluster sampling:
- Probability sampling technique, taking samples in stages and simple random sampling in each stage
Convenience Sampling:
- Not good
- non-probability sampling technique
- Subjects selected due to convenience
- NOT representative
Quota sampling:
- Not good
- non-probability sampling technique
- Assembled sample has the same proportions of individuals as the entire population with respect to known characteristics
e.g. population of 1500 males, 1000 females
20% from each = sample of 300 males, 200 females
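The arithmetic in the quota example can be sketched as follows. The helper name `quota_allocation` is hypothetical; the group sizes and the 20% fraction come from the example.

```python
def quota_allocation(group_sizes, fraction):
    """Return how many individuals to take from each group."""
    return {group: round(size * fraction) for group, size in group_sizes.items()}

population = {"males": 1500, "females": 1000}
quotas = quota_allocation(population, 0.20)
print(quotas)  # {'males': 300, 'females': 200}
```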
Modelling a sample survey
A survey can be modelled with a box model
Box = population (it contains the true parameter)
Every sample = New survey
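A rough sketch of the box model for a survey, assuming a hypothetical box of 10,000 tickets of which 60% are 1's. Each call to the illustrative function `run_survey` stands for one new survey, drawn without replacement.

```python
import random

def run_survey(box, sample_size, rng):
    """Draw one sample without replacement and return the proportion of 1's."""
    draws = rng.sample(box, sample_size)
    return sum(draws) / sample_size

rng = random.Random(1)          # fixed seed so the run is repeatable
box = [1] * 6000 + [0] * 4000   # hypothetical population, true proportion 0.6

# Five surveys from the same box give five slightly different estimates.
estimates = [run_survey(box, 500, rng) for _ in range(5)]
print(estimates)
```

The estimates differ from each other and from 0.6 only through chance error.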
Chance errors in Sample Surveys
A sample estimate always contains chance error, because the sample is only part of the population
Box model can be used to quantify size of chance error
What affects accuracy?
Standard Error (SE) will be smaller in larger samples of the population
SE of the sample mean = SD/ root(n), where n is the sample size
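A short sketch of the formula, assuming a box SD of 10 (a made-up value) to show how the SE of the sample mean shrinks as n grows. The function name is illustrative.

```python
import math

def standard_error_of_mean(sd, n):
    """SE of the sample mean for n draws from a box with the given SD."""
    return sd / math.sqrt(n)

for n in (25, 100, 400):
    print(n, standard_error_of_mean(10.0, n))
```

Quadrupling the sample size halves the SE (2.0, 1.0, 0.5 here).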
Surveys drawn without replacement
Correction factor
- Adjusts the SE from the box model to get exact SE
- Involves both sample size and population size
As the population gets very BIG compared to the sample size, the correction factor is almost 1
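A sketch of the correction factor, assuming the standard finite population correction root((N - n)/(N - 1)), where N is the population size and n the sample size. The card does not state the formula, so treat this form as an assumption about what the course uses.

```python
import math

def correction_factor(population_size, sample_size):
    """Finite population correction: sqrt((N - n) / (N - 1))."""
    return math.sqrt((population_size - sample_size) / (population_size - 1))

# Same sample size, growing population: the factor approaches 1.
for N in (1_000, 10_000, 1_000_000):
    print(N, correction_factor(N, 100))
```

Exact SE = correction factor x SE from the box model, so for a very big population the two are almost the same.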
Confidence interval
Describes how precisely the survey measures the population parameter
- Use sample to find an estimate
- Common levels: 68%, 95%, 99.7% (estimate ± 1, 2, 3 SE when the Normal approximation applies)
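A minimal sketch of a confidence interval for a proportion, assuming the Normal approximation holds and estimating the box SD from the sample. The function name and the survey numbers (240 "yes" out of 400) are made up for illustration.

```python
import math

def proportion_confidence_interval(successes, n, num_se=2):
    """Return (low, high) = estimate -/+ num_se * SE for a sample proportion."""
    p_hat = successes / n
    sd_box = math.sqrt(p_hat * (1 - p_hat))  # SD of a 0/1 box, estimated from the sample
    se = sd_box / math.sqrt(n)
    return p_hat - num_se * se, p_hat + num_se * se

# num_se = 1, 2, 3 correspond to roughly 68%, 95%, 99.7% confidence.
low, high = proportion_confidence_interval(240, 400, num_se=2)
print(round(low, 3), round(high, 3))
```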
Simulating confidence intervals
Bootstrapping
- The computational process that allows us to estimate the properties of the population, by using the properties of a particular sample
- Useful when there is no information about the population or where the assumptions about the population are questionable
Take a sample from the population, resample from that original sample many times, then calculate a statistic (e.g. the mean) for each resample and plot the results in a histogram
Potential problem: Normal assumption
- If the sample is large enough, the Central Limit Theorem may apply
- A box-model confidence interval may extend past sensible values (e.g. a proportion of 110%)
- Instead of using a box model, bootstrapping can be used
Example of bootstrapping procedure
- Assume original sample best represents population
- Take a large number of resamples of size n from the original sample with replacement
- For each resample, calculate the proportion of 1's, giving one simulated proportion per resample
- Plot the simulated proportions to investigate the distribution of the sample proportion
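The steps above can be sketched as follows, assuming a hypothetical original sample of 50 tickets with 30 ones. The function name `bootstrap_proportions` and the choice of 1000 resamples are illustrative, not from the course material.

```python
import random

def bootstrap_proportions(sample, num_resamples=1000, seed=0):
    """Resample with replacement and return the proportion of 1's in each resample."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    n = len(sample)
    proportions = []
    for _ in range(num_resamples):
        resample = rng.choices(sample, k=n)  # n draws with replacement
        proportions.append(sum(resample) / n)
    return proportions

original_sample = [1] * 30 + [0] * 20  # hypothetical sample, proportion 0.6
props = bootstrap_proportions(original_sample)
print(min(props), max(props))
```

A histogram of `props` (for example with a plotting library) shows the simulated distribution; every value lies between 0 and 1 by construction.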
Box model vs bootstrapping
Bootstrap proportions never exceed 1, as each one is an actual proportion from a resample
Box model = draws from the population (the box)
Bootstrap = resamples from the original sample
Spreads are similar