Large data set + sampling Flashcards
5 UK weather stations
(N to S)
1. Leuchars
2. Leeming
3. Heathrow
4. Hurn
5. Camborne
3 international weather stations
Northern Hemisphere:
(W to E)
1. Jacksonville
2. Beijing
Southern Hemisphere:
- Perth
When was the data collected?
- May-Oct 1987
- May-Oct 2015
Variables
- total rainfall (mm)
- mean temp (°C)
- total sunshine (hrs to nearest 0.1)
- mean windspeed (kn)
- max gust (kn)
- humidity (%)
- mean visibility (Dm, decametres)
- mean pressure (hPa)
- wind direction
- mean cloud cover (oktas)
- 1 kn = 1.15 mph // Beaufort scale (0-5)
Mean wind speed- UK vs other countries
- UK ~9 kn
- Beijing- 4 kn
- Jacksonville- 5 kn
- Perth- 8 kn
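A quick sketch of converting these mean wind speeds to mph, using 1 kn = 1.15 mph from the Variables card. It assumes the figures above are in knots (the unit given for mean windspeed); the dictionary and function names are made up for illustration.

```python
KN_TO_MPH = 1.15  # conversion factor from the Variables card

# Approximate mean wind speeds from the card, assumed to be in knots
mean_wind_kn = {"UK": 9, "Beijing": 4, "Jacksonville": 5, "Perth": 8}

def knots_to_mph(speed_kn):
    """Convert a wind speed in knots to miles per hour."""
    return speed_kn * KN_TO_MPH

mean_wind_mph = {place: knots_to_mph(kn) for place, kn in mean_wind_kn.items()}

for place, mph in mean_wind_mph.items():
    print(f"{place}: {mph:.2f} mph")
```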
Temp range
- large range in Beijing
- Jacksonville- highest min
- Perth ~UK
How to carry out simple random sampling
- allocate a no. between 1 & N to each person in sampling frame
- use random no. tables/ computer/ calculator to select n (context) diff. no. between 1 & N
- people corresponding to these no. become the sample
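The steps above could be sketched in Python roughly as follows. The sampling frame of 50 people and the function name are hypothetical; `random.sample` stands in for the random no. tables/ calculator.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Pick n different members of the sampling frame, each equally likely."""
    rng = random.Random(seed)
    N = len(frame)
    # Allocate the numbers 1..N to the frame, then draw n different numbers.
    chosen_numbers = rng.sample(range(1, N + 1), n)
    # The people corresponding to those numbers become the sample.
    return [frame[number - 1] for number in chosen_numbers]

# Hypothetical sampling frame with N = 50
frame = [f"person_{i}" for i in range(1, 51)]
sample = simple_random_sample(frame, n=5, seed=1)
print(sample)
```

`random.sample` draws without replacement, so no number is picked twice.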
+ of simple random sampling
- no bias
- easy & cheap
- each no. has a known equal chance of being selected
- of simple random sampling
- not suitable w/ large pop size
- sampling frame needed
How to carry out systematic sampling?
(required elements are chosen at regular intervals)
- randomly select a no. between 1 & k (eg. 001 to 500// 000 to 499)
- select every kth element
– k = pop size/ sample size (eg. k= 50000/100 = 500)
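A minimal sketch of systematic sampling using the example figures above (pop size 50000, sample size 100, so k = 500). The function name and the numbered frame are assumptions for illustration.

```python
import random

def systematic_sample(frame, sample_size, seed=None):
    """Random start between 1 and k, then every k-th element."""
    rng = random.Random(seed)
    k = len(frame) // sample_size   # k = pop size / sample size
    start = rng.randint(1, k)       # random no. between 1 and k (inclusive)
    # Take the start-th element, then every k-th element after it.
    return frame[start - 1::k][:sample_size]

# Hypothetical frame: items numbered 1 to 50000
frame = list(range(1, 50001))
sample = systematic_sample(frame, 100, seed=0)
print(sample[:3])
```

Only the starting point is random; every later element is fixed by k.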
Population def
whole set of items of interest
Sample def
subset of pop intended to represent the pop
What is data collected from the entire pop?
census
+ & - of census
+ completely accurate result
- time consuming & expensive
- cannot be used when testing involves destruction (eg. light bulbs)
- large vol of data to process
+ & - of sample
+ cheaper
+ quicker
+ ↓ data to process
- data might not be accurate
- sample may not be large enough to represent small sub-groups
How to carry out stratified sampling?
- pop divided into groups (strata)
- simple random sampling in each group
- same proportion sampled from each stratum
– proportion = sample size/ pop size
- used for a large sample when the pop is naturally divided into groups
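A sketch of the stratified calculation: the no. taken from each stratum is (stratum size/ pop size) × sample size, followed by simple random sampling within the stratum. The year-group strata and the function name are made-up examples.

```python
import random

def stratified_sample(strata, sample_size, seed=None):
    """Sample the same proportion from each stratum using simple random sampling."""
    rng = random.Random(seed)
    pop_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # no. from stratum = stratum size / pop size * sample size
        n_stratum = round(len(members) / pop_size * sample_size)
        # Simple random sampling within the stratum
        sample[name] = rng.sample(members, n_stratum)
    return sample

# Hypothetical pop of 200 split into two strata
strata = {
    "Year 12": [f"Y12_{i}" for i in range(120)],
    "Year 13": [f"Y13_{i}" for i in range(80)],
}
sample = stratified_sample(strata, 20, seed=2)
print({name: len(chosen) for name, chosen in sample.items()})
```

With other numbers the rounding may leave the total slightly off the target sample size, so the counts may need adjusting by hand.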
+ of stratified sampling
- reflects pop structure
- proportional representation of groups within pop
- of stratified sampling
- pop must be clearly classified into distinct strata
- selection within each stratum has the same disadvantages as simple random sampling
+ of systematic sampling
- simple & quick
- suitable for large samples
- of systematic sampling
- sampling frame needed
- bias if sampling frame is not random (eg. ordered by surname)
How to carry out quota sampling?
- pop divided into groups acc. to characteristics
- quota of items in each group –> reflects that group's proportion in whole pop
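A rough sketch of filling quotas, assuming people are taken in the order they are met until each group's quota is full (so the selection is non-random). The groups, quotas and names are hypothetical.

```python
def quota_sample(arrivals, quotas):
    """Fill each group's quota with the first people met from that group."""
    sample = {group: [] for group in quotas}
    for person, group in arrivals:
        # Accept the person only if their group's quota is not yet full.
        if group in sample and len(sample[group]) < quotas[group]:
            sample[group].append(person)
        # Stop once every quota has been filled.
        if all(len(sample[g]) == quotas[g] for g in quotas):
            break
    return sample

# Hypothetical people in the order the interviewer meets them
arrivals = [
    ("A", "under 30"), ("B", "30+"), ("C", "under 30"),
    ("D", "under 30"), ("E", "30+"), ("F", "30+"),
]
quotas = {"under 30": 2, "30+": 2}
sample = quota_sample(arrivals, quotas)
print(sample)  # {'under 30': ['A', 'C'], '30+': ['B', 'E']}
```

D is passed over because the "under 30" quota is already full, and F is never reached.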
+ of quota sampling
- small sample- still representative of pop
- no sampling frame needed
- quick, easy, cheap
- easy comparison betw. diff. groups
- of quota sampling
- non-random–> bias
- pop must be divided into groups–> costly/ inaccurate
- non-responses –> not recorded