Statistics - Data collection Flashcards
What is a population?
The whole set of items that are of interest
What is a census?
A survey that observes/measures every member of a population
What is a sample?
A selection of observations taken from a subset of a population, used to find out information about the whole population
What is the advantage of a census?
Should give a completely accurate result
What are the disadvantages of a census?
Time consuming & expensive
Hard to process large quantities of data
Cannot be used when testing process destroys item
What are the advantages of taking a sample?
Less time consuming & expensive than a census
Fewer people have to respond
Less data to process
What are the disadvantages of taking a sample?
Data may not be accurate
Sample may not be large enough to represent all small sub-groups in the population
What happens as the sample size increases?
The sample becomes more accurate
It becomes more representative of the population
What are the individual units of a population called?
Sampling units
What is done to sampling units in order to distinguish them?
They are individually named or numbered to form a list
What is a statistic?
A value calculated from the observations in a single sample
What is a sampling frame?
A list of the sampling units
What happens in random sampling?
Each member of the population has an equal chance of being selected
What are the advantages of using random sampling?
Representative of the population
Removes bias
What are the three methods of random sampling?
Simple random sampling
Systematic sampling
Stratified sampling
How can you perform simple random sampling?
Number each sampling unit
Use a random number generator, or put the numbers into a “hat” and draw them at random
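The two steps above can be sketched in Python. This is only an illustration: the function name, the frame of 50 numbered units and the optional seed are assumptions made for the example, not part of the cards.

```python
import random

def simple_random_sample(sampling_frame, sample_size, seed=None):
    """Pick sample_size distinct units, each equally likely to be chosen."""
    rng = random.Random(seed)  # the seed is optional; it only makes a demo repeatable
    return rng.sample(sampling_frame, sample_size)

# Hypothetical sampling frame: units numbered 1 to 50
frame = list(range(1, 51))
sample = simple_random_sample(frame, 5, seed=1)
print(sample)
```

random.sample draws without replacement, which matches pulling numbers out of a hat without putting them back.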
What are the advantages of simple random sampling?
Free of bias
Easy and cheap for small populations/samples
Each sampling unit has a known/equal chance of selection
What are the disadvantages of simple random sampling?
Not suitable when population/sample size is large
Sampling frame is needed
What is systematic sampling?
Required elements are chosen at regular intervals from an ordered list
E.g. every nth item in the list is taken
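A minimal Python sketch of taking every nth item. It assumes the interval is the list length divided by the sample size and that the starting point is chosen at random within the first interval; both are common conventions rather than something stated on the card, and the names are made up for the example.

```python
import random

def systematic_sample(ordered_list, sample_size, seed=None):
    """Take every k-th item, starting from a random position in the first k items."""
    k = len(ordered_list) // sample_size  # sampling interval (assumed convention)
    rng = random.Random(seed)
    start = rng.randrange(k)              # random start index between 0 and k-1
    return ordered_list[start::k][:sample_size]

# Hypothetical ordered sampling frame: units numbered 1 to 100
frame = list(range(1, 101))
sample = systematic_sample(frame, 10, seed=3)
print(sample)
```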
What are the advantages of systematic sampling?
Simple and quick to use
Suitable for large samples/populations
What are the disadvantages of systematic sampling?
Sampling frame is needed
Can introduce bias if sampling frame is not random
What is stratified sampling?
Population is divided into mutually exclusive strata, and a random sample is taken from each
Strata example - male & female
What rules should be followed for obtaining strata?
The proportion of each stratum in the sample should be the same as its proportion in the population
What is the formula for the number of people to sample from each stratum?
Number sampled from stratum = (number in stratum / number in population) × overall sample size
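The formula can be checked with a short Python sketch. The year groups and their sizes are invented for the example, and rounding to the nearest whole person is an assumption (with awkward numbers the rounded values may not add up exactly to the overall sample size).

```python
def stratum_sample_sizes(stratum_sizes, overall_sample_size):
    """Apply (number in stratum / number in population) x overall sample size."""
    population = sum(stratum_sizes.values())
    return {
        name: round(size * overall_sample_size / population)  # nearest whole person
        for name, size in stratum_sizes.items()
    }

# Hypothetical population of 200 split into two strata
strata = {"Year 12": 120, "Year 13": 80}
print(stratum_sample_sizes(strata, 50))  # {'Year 12': 30, 'Year 13': 20}
```

A simple random sample of the calculated size would then be taken from within each stratum.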
What are the advantages of stratified sampling?
Sample accurately reflects the population structure
Guarantees proportional representation of groups within a population
What are the disadvantages of stratified sampling?
Population must be clearly classified into distinct strata
Selection within each stratum has disadvantages of simple random sampling
What are the two types of non-random sampling?
Quota sampling
Opportunity sampling
What is quota sampling?
An interviewer/researcher selects a sample that reflects the characteristics of the whole population
How is quota sampling done?
An interviewer divides the population into groups according to a given characteristic
People are sampled until the quota for each group is full
What happens with quota sample sizes?
Size of each group determines proportion of sample that should have that characteristic
Each quota has a limit - once it is full, any further person’s data from that group is dismissed
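A rough Python sketch of how quotas fill up. The people, groups and quota sizes are hypothetical, and the function is just one way to model an interviewer working through people in the order they are met.

```python
def quota_sample(people, quotas):
    """people: (name, group) pairs in the order met; quotas: group -> limit."""
    sample = {group: [] for group in quotas}
    for name, group in people:
        if group in sample and len(sample[group]) < quotas[group]:
            sample[group].append(name)
        # otherwise that quota is already full, so the person is not recorded
        if all(len(sample[g]) == quotas[g] for g in quotas):
            break  # every quota is full, so sampling stops
    return sample

# Hypothetical passers-by and quotas
people = [("Ann", "female"), ("Bob", "male"), ("Cat", "female"),
          ("Dee", "female"), ("Ed", "male")]
quotas = {"female": 2, "male": 1}
print(quota_sample(people, quotas))  # {'female': ['Ann', 'Cat'], 'male': ['Bob']}
```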
What are the advantages of quota sampling?
No sampling frame required
Quick, easy and inexpensive
Small sample still representative of population
Allows easy comparison between different groups
What are the disadvantages of quota sampling?
As non-random it introduces bias
Population division can be costly or inaccurate
Non-responses are not recorded
Increasing the number of groups adds time and expense
What is opportunity sampling?
Taking the sample from people who are available at the time the study is carried out and who fit the criteria you are looking for
What are the advantages of opportunity sampling?
Easy to carry out
Inexpensive
What are the disadvantages of opportunity sampling?
Highly dependent on the individual researcher
Unlikely to provide representative sample
What is opportunity sampling also known as?
Convenience sampling
What are quantitative variables/data?
Data/variables associated with numerical observations
What are qualitative variables/data?
Data/variables associated with non-numerical observations
What is a continuous variable?
A variable that can take any value in a given range
E.g. 2 seconds, 2.3 s, 2.02 s
What is a discrete variable?
A variable that can take only specific values in a given range
E.g. can’t have 2.65 people
What are the groups in grouped frequency tables called?
Classes
What information can be found from a grouped frequency table?
Class boundaries tell you the maximum and minimum values in the class
Midpoint is the average of the class boundaries
Class width is the difference between the upper and lower class boundaries
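The midpoint and class width definitions translate directly into a small Python sketch. The class with boundaries 10 and 20 is a made-up example.

```python
def class_summary(lower_boundary, upper_boundary):
    """Return (midpoint, class width) for one class of a grouped frequency table."""
    midpoint = (lower_boundary + upper_boundary) / 2  # average of the boundaries
    width = upper_boundary - lower_boundary           # upper minus lower boundary
    return midpoint, width

# Hypothetical class with boundaries 10 and 20
print(class_summary(10, 20))  # (15.0, 10)
```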
What large data sets will be provided?
Weather data for a number of different locations around the world
What is the daily mean temperature?
Measured in °C
Average of the hourly temperature readings
What is the daily total rainfall?
Includes solid precipitation
Melted before being included in measurements
Amounts less than 0.05 mm are recorded as “trace” or “tr”
What is daily total sunshine?
Recorded to nearest tenth of an hour
What are the daily mean wind direction and windspeed?
Windspeed is measured in knots, averaged over 24 hours
Directions are given as bearings & compass directions
Mean windspeed is also given on the Beaufort scale
What is a knot?
A unit of speed equal to one nautical mile per hour
1 kn ≈ 1.15 mph
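As a quick worked example of the conversion, here is a Python sketch. It uses the approximate factor of 1.15 from the card, and the function name is chosen only for illustration.

```python
MPH_PER_KNOT = 1.15  # approximate conversion factor taken from the card

def knots_to_mph(knots):
    """Convert a speed in knots to miles per hour (approximate)."""
    return knots * MPH_PER_KNOT

print(round(knots_to_mph(20), 1))  # 23.0
```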
What is the daily max gust?
Highest instantaneous wind speed recorded in knots
What is the daily max relative humidity?
% air saturation with water
Above 95% can be misty/foggy
What is daily mean cloud cover?
Measured in oktas - eighths of sky covered by cloud
Goes from 0-8
What is daily mean visibility?
Greatest distance an object can be seen in daylight
Measured in decameters (Dm)
What is daily mean pressure measured in?
Hectopascals (hPa)
What are finite and infinite populations?
Finite - can practically be counted
Infinite - cannot be counted practically
What is cluster sampling?
Divide population into clusters
Randomly select clusters based on sample size
Either use all in cluster or randomly sample
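The steps above can be sketched in Python. The form groups and names are hypothetical, and this version uses everyone in the chosen clusters; a random sample could be taken within each chosen cluster instead.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose n_clusters whole clusters and keep all of their members."""
    rng = random.Random(seed)  # the seed is optional; it only makes a demo repeatable
    chosen = rng.sample(sorted(clusters), n_clusters)  # pick cluster names at random
    return {name: clusters[name] for name in chosen}

# Hypothetical population already divided into clusters (form groups)
clusters = {
    "Form A": ["Ann", "Bob"],
    "Form B": ["Cat", "Dee"],
    "Form C": ["Ed", "Fay"],
}
chosen = cluster_sample(clusters, 2, seed=0)
print(chosen)
```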
What are the advantages of cluster sampling?
More practical in some situations
Other sampling methods can be incorporated into it
What are the disadvantages of cluster sampling?
Less representative as only some clusters sampled
Not always possible to separate into clusters in natural ways
What is self-selection/volunteer sampling?
People choose to be part of the study after advertisement to whole population
Either use all who respond or take sample of them
What are the advantages of self-selection sampling?
Little time or effort for sample members
Volunteers are more likely to respond
Could be only way to get people to take part
What are the disadvantages of self-selection sampling?
Bias: the people who respond may share traits that are not typical of the whole population
What does n/a mean?
Not available
What is 0 on the Beaufort scale?
Calm
Less than 1 knot
What is 1-3 on the Beaufort scale?
Light
1-10 knots
What is 4 on the Beaufort scale?
Moderate
11-16 knots
What is 5 on the Beaufort scale?
Fresh
17-21 knots
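The Beaufort cards above can be collected into one lookup, sketched here in Python. It covers only forces 0 to 5 as listed on the cards and assumes the windspeed has already been rounded to a whole number of knots; the function name is made up for the example.

```python
def beaufort_description(knots):
    """Describe a whole-number windspeed in knots using the bands from the cards."""
    if knots < 1:
        return "Calm"      # Beaufort 0
    if knots <= 10:
        return "Light"     # Beaufort 1-3
    if knots <= 16:
        return "Moderate"  # Beaufort 4
    if knots <= 21:
        return "Fresh"     # Beaufort 5
    return None            # stronger winds are not covered by these cards

print(beaufort_description(12))  # Moderate
```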