applied statistics - topic 1 - data collection Flashcards
What is a population?
school scenario example
the whole set of items of interest
e.g. sixth form students
What is a sample?
(school scenario example)
What are the pros and cons?
some subset of the population intended to represent the population
(e.g. a class)
✔ cheaper
✔ quicker
✔ less data to process
✘ data may not be accurate
✘ sample may not be large enough to represent small sub-groups
What is a sampling unit?
school scenario example
each individual item/thing in the population that can be sampled
(e.g. a student)
What is a sampling frame?
school scenario example
a list of the sampling units of a population, each individually named or numbered (e.g. a class register)
What is a census?
What are the pros and cons?
when data is collected from the entire population
✔ completely accurate result
✘ time consuming and expensive
✘ can’t be used when testing involves destruction
✘ large volumes of data to process
Simple Random Sampling
- how to carry out?
- pros?
- cons?
Each item in the sampling frame is assigned a number, then a random number generator or ‘lottery sampling’ is used to select the sample.
✔ bias free
✔ easy and cheap to implement
✔ every possible sample of a given size has an equal chance of being selected
✘ not suitable when population size is large
✘ sampling frame needed
✘ can be biased if the sampling frame does not cover the whole population
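Worked example (a minimal sketch, not part of the original cards): simple random sampling from a numbered sampling frame. The function name simple_random_sample, the seed parameter, and the 30-student register are assumptions made for illustration.

```python
import random

def simple_random_sample(frame, sample_size, seed=None):
    """Pick sample_size distinct units from the frame, each equally likely."""
    rng = random.Random(seed)  # seed only makes the example repeatable
    return rng.sample(frame, sample_size)  # sampling without replacement

# Hypothetical sampling frame: a class register numbered 1 to 30
register = list(range(1, 31))
sample = simple_random_sample(register, 5, seed=1)
print(sample)  # five distinct register numbers
```

Here random.sample plays the role of the random number generator: no register number can be picked twice.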
Systematic Random Sampling
- how to carry out?
- pros?
- cons?
elements chosen at regular intervals from an ordered list: take every kth element, where k = popSize/sampSize, starting at a random number between 1 and k
✔ simple and easy to use
✔ suitable for large samples
✘ need sampling frame
✘ can be biased if sampling frame not random
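Worked example (a minimal sketch, not part of the original cards): systematic sampling with interval k = popSize/sampSize and a random start between 1 and k. The function name systematic_sample and the 30-student register are assumptions; the sketch also assumes the sample size is no larger than the frame and rounds k down when the division is not exact.

```python
import random

def systematic_sample(frame, sample_size, seed=None):
    """Take every k-th unit from an ordered frame, from a random start."""
    rng = random.Random(seed)
    k = len(frame) // sample_size  # sampling interval, rounded down
    start = rng.randint(1, k)      # random start between 1 and k inclusive
    # Positions are 1-based: start, start + k, start + 2k, ...
    return [frame[start - 1 + i * k] for i in range(sample_size)]

# Hypothetical sampling frame: a class register numbered 1 to 30
register = list(range(1, 31))
sample = systematic_sample(register, 5, seed=1)
print(sample)  # five register numbers, each 6 apart
```

With 30 students and a sample of 5, k = 6, so the sample is the start number plus every 6th student after it.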
Stratified Random Sampling
- how to carry out?
- pros?
- cons?
population divided into groups (strata) and a simple random sample carried out on each group, with the same proportion taken from each stratum (number from each stratum = stratumSize/popSize × wantedSampleSize)
✔ reflects population structure
✔ guarantees proportional representation of groups within population
✔ good when the sample is large
✘ population must clearly divide into distinct strata
✘ selection in each stratum suffers from same disadvantages as simple random sampling
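Worked example (a minimal sketch, not part of the original cards): stratified sampling, taking stratumSize/popSize × wantedSampleSize units from each stratum by simple random sampling. The function name stratified_sample and the year-group sizes are assumptions. Rounding each stratum's share to a whole number can make the total differ slightly from the wanted sample size when the shares are not whole numbers.

```python
import random

def stratified_sample(strata, sample_size, seed=None):
    """Simple random sample from each stratum, in proportion to its size."""
    rng = random.Random(seed)
    pop_size = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        # stratumSize / popSize x wantedSampleSize, rounded to a whole number
        n = round(len(units) * sample_size / pop_size)
        sample[name] = rng.sample(units, n)
    return sample

# Hypothetical sixth form: 120 students in Year 12, 80 in Year 13
strata = {
    "Year 12": [f"Y12-{i}" for i in range(1, 121)],
    "Year 13": [f"Y13-{i}" for i in range(1, 81)],
}
sample = stratified_sample(strata, 20, seed=1)
print({name: len(units) for name, units in sample.items()})
# Year 12: 120/200 x 20 = 12 students, Year 13: 80/200 x 20 = 8 students
```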
Quota Sampling
- how to carry out?
- pros?
- cons?
population divided into groups according to characteristics; a quota of items/people in each group is set to try to reflect the group's proportion of the whole population; the interviewer selects the actual sampling units
✔ allows small sample to still be representative of population
✔ no sampling frame required
✔ relatively easy and inexpensive
✔ allows for easy comparison between different groups in population
✘ non-random sampling can introduce bias
✘ population must be divided into groups, which can be costly or inaccurate
✘ increasing scope of study increases number of groups, adding time/expense
✘ non-responses aren’t recorded
Opportunity/Convenience Sampling
- how to carry out?
- pros?
- cons?
sample taken from people who are available at time of study and who meet the criteria
✔ easy to carry out
✔ inexpensive
✘ unlikely to provide representative sample
✘ highly dependent on individual researcher
What is qualitative data?
non-numerical values (e.g. colour)
What is quantitative data?
numerical values
What is discrete data?
can only take specific values (e.g. shoe size)
What is continuous data?
can take any value, including decimals (possibly within a specific range)