Chapter 1 Flashcards
Population
The whole set of items that are of interest
Sample
A subset of the population intended to represent the population. Increasing sample sizes could improve estimates.
Advantages:
- Cheaper
- Quicker
- Less data to process
Disadvantages:
- Data may not be accurate
- Data may not be large enough to represent small sub-groups
Sampling unit
Each individual thing in the population that can be sampled.
Sampling frame
A list of sampling units of a population that have been individually named or numbered.
Census
Data collected from the entire population.
Advantages:
- Should give completely accurate result
Disadvantages:
- Time consuming
- Cannot be used when testing involves destruction
- Large volume of data to process
Random sampling
When we want each sampling unit in our sampling frame to have an equal chance of being chosen in order to avoid bias.
- Simple random sampling
- Systematic sampling
- Stratified sampling
Non-random sampling
- Quota sampling
- Opportunity/convenience sampling
Simple random sampling
Every sampling unit in the sampling frame has an equal chance of being selected.
1. Allocate a number from 1 to N (the number of people/things there are) to each thing.
2. Use a random number generator to select however many different numbers you need.
3. People/things corresponding to these numbers become the sample.
Advantages:
- Bias free
- Easy and cheap to implement
- Each number has a known equal chance of being selected
Disadvantages:
- Not suitable when population size is large
- Sampling frame needed
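The three steps above can be sketched in Python. This is a minimal illustration, not a prescribed method: the function name `simple_random_sample`, the `seed` parameter and the example sizes (600 units, sample of 40) are assumptions made for the sketch.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Return sample_size distinct unit numbers drawn from 1..population_size."""
    rng = random.Random(seed)  # seed is optional, used only for repeatability
    # Step 1: units are numbered 1 to N.
    numbered_units = range(1, population_size + 1)
    # Step 2: draw the required count of different numbers (no repeats).
    chosen = rng.sample(numbered_units, sample_size)
    # Step 3: the units with these numbers form the sample.
    return sorted(chosen)

# Hypothetical example: 40 units from a sampling frame of 600.
chosen = simple_random_sample(population_size=600, sample_size=40, seed=1)
print(chosen)
```

`random.sample` draws without replacement, which matches the requirement for different numbers in step 2.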
Systematic sampling
Required elements are chosen at regular intervals from an ordered list. Take every kth element, where k = population size / sample size, starting at a random item between 1 and k.
Advantages:
- Simple and quick to use
- Suitable for large samples/populations
Disadvantages:
- Sampling frame again needed
- Can introduce bias if sampling frame not random
Systematic sampling example
A telephone directory contains 50,000 names. A researcher wishes to select a systematic sample of 100 names from the directory. Explain in detail how the researcher should obtain such a sample.
k = 50,000/100 = 500. Randomly select a starting number between 001 and 500, then select every 500th name after it until 100 names have been chosen.
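The directory example can be checked with a short Python sketch. The function name `systematic_sample` and the `seed` parameter are assumptions for illustration, and the sketch assumes the population size divides exactly by the sample size, as it does here.

```python
import random

def systematic_sample(population_size, sample_size, seed=None):
    """Return the list positions chosen by systematic sampling."""
    k = population_size // sample_size  # sampling interval
    rng = random.Random(seed)
    start = rng.randint(1, k)  # random start between 1 and k inclusive
    # Take the start position, then every kth position after it.
    return [start + i * k for i in range(sample_size)]

# Telephone directory: 50,000 names, sample of 100, so k = 500.
positions = systematic_sample(50_000, 100, seed=0)
print(positions[:3], "...", positions[-1])
```

Because the start is at most k, the last position never exceeds the population size.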
Stratified sampling
Population divided into strata (distinct groups) and a simple random sample carried out in each group. Used when the population is large and naturally divides into groups.
Same proportion (sample size/population size) sampled from each stratum.
Advantages:
- Reflects population structure
- Guarantees proportional representation of groups within population
Disadvantages:
- Population must be clearly classified into distinct strata
- Selection within each stratum suffers same disadvantages as simple random sampling
Stratified sampling example
A school has 15 classes and a 6th form. In each class there are 30 students. In the 6th form there are 150 students. There are equal numbers of boys and girls in each class and in the 6th form. The headteacher wishes to obtain the opinions of the students about school uniforms. Explain how the headteacher would take a stratified sample of size 40.
Total in school = (15x30) + 150 = 600.
Random sample of 30/600 x 40 = 2 from each of the 15 classes.
Random sample of 150/600 x 40 = 10 from the 6th form.
Label the boys in each class from 1-15 and do the same for the girls. Use random numbers to select 1 girl and 1 boy from each class.
Label the boys in the 6th form 1-75 and the same for the girls. Use random numbers to select 5 different boys and 5 different girls.
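The allocation in this example can be verified with a small Python sketch. The function name `stratified_allocation` and the stratum labels are assumptions for illustration; `Fraction` is used so that the proportions are computed exactly.

```python
from fractions import Fraction

def stratified_allocation(strata_sizes, sample_size):
    """Sample size per stratum: (stratum size / population size) x sample size."""
    total = sum(strata_sizes.values())
    return {
        name: Fraction(size, total) * sample_size
        for name, size in strata_sizes.items()
    }

# School example: 15 classes of 30 students plus a 6th form of 150.
strata = {f"class {i}": 30 for i in range(1, 16)}
strata["6th form"] = 150

allocation = stratified_allocation(strata, 40)
print(allocation["class 1"], allocation["6th form"])
```

Here every stratum works out to a whole number. In general the results may need rounding, which this sketch does not handle.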
Quota sampling
Population divided into groups according to characteristic. A quota of items/people in each group is set to try and reflect the group’s proportion in the whole population. Interviewer selects the actual sampling units.
Advantages:
- Allows small sample to still be representative of population
- No sampling frame required
- Quick, easy, inexpensive
- Allows for easy comparison between different groups in population
Disadvantages:
- Non-random sampling can introduce bias
- Population must be divided into groups, which can be costly or inaccurate
- Increasing scope of study increases number of groups, adding time/expense
- Non-responses are not recorded
Quota sampling example
A lake contains 3 species of fish. There are estimated 1400 trout, 600 bass and 450 pike in the lake. A survey of the health of the fish in the lake is carried out and a sample of 30 fish is chosen. Explain how quota sampling could be used to select the sample of 30 fish.
Trout - 1400/2450 x 30 = 17.14
Bass - 600/2450 x 30 = 7.35
Pike - 450/2450 x 30 = 5.51
Fish are caught from the lake until the quotas of 17 trout, 7 bass and 6 pike are reached. If a fish is caught and its species' quota is already full, that fish is ignored.
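The quota calculation can be reproduced with a short Python sketch. The function name `quota_sizes` is an assumption for illustration, and rounding each share to the nearest whole number is the approach taken in this example rather than a general rule.

```python
def quota_sizes(group_sizes, sample_size):
    """Round each group's proportional share of the sample to a whole number."""
    total = sum(group_sizes.values())
    return {
        group: round(size / total * sample_size)
        for group, size in group_sizes.items()
    }

# Lake example: estimated numbers of each species, sample of 30 fish.
fish = {"trout": 1400, "bass": 600, "pike": 450}
quotas = quota_sizes(fish, 30)
print(quotas)
```

Rounded quotas do not always add up to the intended sample size, so the total should be checked and adjusted by hand if necessary. In this example they do add up to 30.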
Opportunity/convenience sampling
Sample taken from people who are available at time of study, who meet criteria. Interviewer selects the actual sampling units according to the set criteria.
Advantages:
- Easy to carry out
- Inexpensive
Disadvantages:
- Unlikely to provide a representative sample
- Highly dependent on individual researcher
Types of data
- Qualitative/categorical: non-numerical values e.g. colour
- Quantitative: numerical values -> Discrete (can only take specific values e.g. shoe size) or Continuous (can take any decimal value)