Chapter 1: Data Collection Flashcards
Population
the whole set of items that are of interest
Census
observes or measures every member of a population
Sample
a selection of observations taken from a subset of the population which is used to find out information about the population as a whole
Sampling unit
individual units of a population that can be sampled
Sampling frame
list of sampling units that have been individually named or numbered
Census advantages
- Should give a completely accurate result
Census disadvantages
- Time consuming + expensive
- Can’t be used when testing process destroys item
- Hard to process large quantity of data
Sample advantages
- Less time consuming and expensive than a census
- Fewer people have to respond
- Less data to process than a census
Sample disadvantages
- Data may not be as accurate
- Sample may not be large enough to give information about small sub-groups of the population
What are the three types of random sampling?
- Simple random sampling
- Systematic sampling
- Stratified sampling
Describe how to carry out simple random sampling
A sampling frame is needed. Each sampling unit is allocated a unique number, and a selection of these numbers is chosen at random (either by a random number generator (RNG) or by ‘lottery sampling’)
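A minimal Python sketch of the steps above, assuming a hypothetical sampling frame of 100 numbered units and a sample size of 10 (both values are illustrative, not from the notes). Here `random.sample` plays the role of the RNG and selects without replacement.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Pick n distinct sampling units from the frame, each equally likely."""
    rng = random.Random(seed)    # seed is optional; only useful for repeatable demos
    return rng.sample(frame, n)  # selection without replacement

# Hypothetical sampling frame: units numbered 1 to 100
frame = list(range(1, 101))
sample = simple_random_sample(frame, 10, seed=1)
```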
Advantages of simple random sampling
- Free of bias
- Easy + cheap to implement for small populations and small samples
- Each sampling unit has a known and equal chance of selection
Disadvantages of simple random sampling
- Not suitable when the population size or the sample size is large
- Sampling frame needed
Describe how to carry out systematic sampling
Required elements are chosen at regular intervals from an ordered list
e.g. sample size of 20 required from population of 100
- Take every fifth person since 100/20 = 5. Use an RNG to select a number from 1 to 5 for the first person (e.g. if it was 2, take persons 2, 7, 12, 17, …)
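The worked example above can be sketched in Python as follows. The function name and the optional `start` argument are hypothetical, added only so the "start at 2" case can be reproduced; when `start` is omitted a random start from 1 to k is chosen.

```python
import random

def systematic_sample(frame, n, start=None):
    """Take every k-th unit from an ordered list, beginning at a random point."""
    k = len(frame) // n                  # sampling interval, e.g. 100 // 20 = 5
    if start is None:
        start = random.randint(1, k)     # random start from 1 to k inclusive
    # 1-based positions start, start + k, start + 2k, ... (n units in total)
    return [frame[start - 1 + i * k] for i in range(n)]

frame = list(range(1, 101))              # people numbered 1 to 100
sample = systematic_sample(frame, 20)    # random start
fixed = systematic_sample(frame, 20, start=2)
print(fixed[:4])                         # [2, 7, 12, 17]
```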
Advantages of systematic sampling
- Simple + quick to use
- Suitable for large samples and large populations
Disadvantages of systematic sampling
- Sampling frame needed
- Can introduce bias if sampling frame is not random
Describe how to carry out stratified sampling
Population is divided into mutually exclusive strata and a random sample (by RNG) is taken from each. The proportion of each stratum sampled should be the same.
number sampled in a stratum = (number in stratum / number in population) × overall sample size
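A short Python sketch of the formula, assuming a hypothetical population of 200 split into two strata of 120 and 80, with an overall sample size of 50 (these figures are made up for illustration). Rounded stratum sizes may not always add up exactly to the overall sample size, so check the total in practice.

```python
import random

def stratum_sample_sizes(strata_sizes, sample_size):
    """Apply (number in stratum / number in population) x overall sample size."""
    total = sum(strata_sizes.values())
    return {name: round(size * sample_size / total)
            for name, size in strata_sizes.items()}

def stratified_sample(strata, sample_size, seed=None):
    """Take a simple random sample of the right size from each stratum."""
    rng = random.Random(seed)
    sizes = stratum_sample_sizes(
        {name: len(units) for name, units in strata.items()}, sample_size)
    return {name: rng.sample(units, sizes[name])
            for name, units in strata.items()}

# Hypothetical strata: units 1-120 in stratum A, units 121-200 in stratum B
strata = {"A": list(range(1, 121)), "B": list(range(121, 201))}
sizes = stratum_sample_sizes({"A": 120, "B": 80}, 50)
print(sizes)                             # {'A': 30, 'B': 20}
sample = stratified_sample(strata, 50, seed=1)
```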
Advantages of stratified sampling
- Accurately reflects population structure
- Guarantees proportional representation of groups within a population
Disadvantages of stratified sampling
- Population must be clearly classified into distinct strata
- Selection within each stratum suffers from same disadvantages as simple random sampling
What are the two types of non-random sampling?
Quota sampling and opportunity/convenience sampling
Describe quota sampling
Divide the population into groups according to given characteristics (e.g. being left-handed). The size of each group determines the proportion of the sample that should have that characteristic. The interviewer should assess which group people fall into, as part of the interview. Once a quota has been filled, no more people in that group are interviewed.
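One way to picture the quota-filling step in Python. The names, groups and quota sizes below are hypothetical, and `group_of` stands in for the interviewer's judgement about which group a person falls into.

```python
def quota_sample(people, quotas, group_of):
    """Interview people in the order met until every quota is filled."""
    remaining = dict(quotas)             # places still open in each group
    chosen = []
    for person in people:
        group = group_of(person)         # interviewer assesses the group
        if remaining.get(group, 0) > 0:
            chosen.append(person)
            remaining[group] -= 1        # one fewer place left in that group
        if not any(remaining.values()):  # all quotas filled, so stop
            break
    return chosen

# Hypothetical people in the order they are met: (name, handedness)
people = [("Ann", "right"), ("Bob", "right"), ("Cat", "left"),
          ("Dan", "left"), ("Eve", "right"), ("Fay", "right")]
quotas = {"left": 1, "right": 3}
chosen = quota_sample(people, quotas, group_of=lambda p: p[1])
print([name for name, _ in chosen])      # ['Ann', 'Bob', 'Cat', 'Eve']
```

Dan is skipped because the left-handed quota is already full, and Fay is never interviewed because every quota has been met by then.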
Advantages of quota sampling
- Allows a small sample to still be representative of the population
- No sampling frame required
- Quick, easy and inexpensive
- Easy comparison between different groups within a population
Disadvantages of quota sampling
- Non-random sampling can introduce bias
- Population must be divided into groups, which can be costly or inaccurate
- Increasing scope of study increases number of groups, which adds time and expense
- Non-responses are not recorded
Describe opportunity/convenience sampling
Consists of taking the sample from people who are available at the time the study is carried out and who fit the criteria you are looking for (e.g. first 20 people you meet outside a supermarket on a Monday morning who are carrying shopping bags)
Advantages of opportunity/convenience sampling
- Easy to carry out
- Inexpensive
Disadvantages of opportunity/convenience sampling
- Unlikely to provide a representative sample
- Highly dependent on individual researcher
What is quantitative data?
Data associated with numerical observations (e.g. shoe size)
What is qualitative data?
Data associated with non-numerical observations (e.g. hair colour)
What is continuous data?
Data that can take any value in a given range (e.g. time: 2s, 2.1s, 2.01s)
What is discrete data?
Data that can only take specific values (e.g. number of girls in a family, because you can’t have 2.65 girls in a family)
In a grouped frequency table, what do class boundaries tell you?
Maximum and minimum values that belong in each class
In a grouped frequency table, what is the midpoint?
Average of the class boundaries
In a grouped frequency table, what is the class width?
Difference between upper and lower class boundaries
What do you have to be careful about for continuous data in a grouped frequency table?
Data values may have been rounded (e.g. to the nearest mm), so the class boundaries extend half a unit either side of the stated class (e.g. class boundaries would be 29.5 to 31.5 instead of 30-31)
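A small Python sketch tying together class boundaries, midpoint and class width for the 30-31 example, assuming the data were rounded to the nearest whole unit. The function name and the `rounding_unit` parameter are hypothetical.

```python
def class_summary(lower, upper, rounding_unit=1.0):
    """Boundaries, midpoint and width for a class stated as lower-upper."""
    half = rounding_unit / 2
    lower_boundary = lower - half        # smallest value that rounds into the class
    upper_boundary = upper + half        # upper limit of values that round into the class
    midpoint = (lower_boundary + upper_boundary) / 2   # average of the boundaries
    width = upper_boundary - lower_boundary            # difference of the boundaries
    return lower_boundary, upper_boundary, midpoint, width

print(class_summary(30, 31))             # (29.5, 31.5, 30.5, 2.0)
```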