Large Data Set/Sampling Methods Flashcards
Locations included in large data set?
Leuchars: town in Scotland
Leeming: village in North Yorkshire
Heathrow: hamlet in Greater London
Hurn: village in Dorset (South West England)
Camborne: town in Cornwall (South West England)
Beijing: capital city of China
Perth: capital city of Western Australia
Jacksonville: city in Florida
Units/time of measurement of daily mean temp?
Celsius
0900-0900
Units/time of measurement of daily total rainfall?
Millimetres - 1dp
0900-0900
‘Tr’ is amount less than 0.05mm
Units/time of measurement of daily total sunshine?
Hours - 1dp
From midnight
Units/time of measurement of daily max relative humidity?
%
Reading above 95% = mist/fog
Units/time of measurement of daily mean wind speed and direction?
Knots (1 kn ≈ 1.15 mph) - nearest integer
Can also be described qualitatively using the Beaufort scale
Direction measured in degrees - rounded to nearest 10
Averaged for 24 hrs from 0000
Units of daily max gust and direction? What is it?
Knots
- max instantaneous speed over 24 hrs
Units/time of measurement of cloud cover?
Measured in oktas - eighths of the sky covered (0 to 8)
- discrete quantitative data
Units/time of measurement of daily mean visibility?
Decametres (1 Dm = 10 m)
Units/time of measurement of daily mean pressure?
Hectopascals (1hPa = 100 Pa = 1 millibar)
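The unit conversions quoted on the cards above can be checked with a short Python sketch. The function names are made up for illustration, and 1.15 is the approximate knots-to-mph factor from the card, not an exact value.

```python
# Conversion factors as quoted on the flashcards.
KNOT_TO_MPH = 1.15       # approximate: 1 knot is roughly 1.15 mph
DECAMETRE_TO_M = 10      # 1 decametre = 10 metres
HPA_TO_PA = 100          # 1 hectopascal = 100 pascals (= 1 millibar)


def knots_to_mph(knots):
    """Convert a wind speed in knots to (approximate) miles per hour."""
    return knots * KNOT_TO_MPH


def decametres_to_metres(decametres):
    """Convert a visibility in decametres to metres."""
    return decametres * DECAMETRE_TO_M


def hpa_to_pa(hpa):
    """Convert a pressure in hectopascals to pascals."""
    return hpa * HPA_TO_PA


if __name__ == "__main__":
    print(round(knots_to_mph(10), 1), "mph")
    print(decametres_to_metres(250), "m")
    print(hpa_to_pa(1013), "Pa")
```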
What do international cities have data for?
Daily mean temperature
daily total rainfall
daily mean pressure
daily mean windspeed
What data is missing for the UK?
Daily total sunshine, daily mean windspeed and daily maximum gust are missing for the first half of May 1987 for the UK locations
What locations are on the coast?
Jacksonville
Perth - southern hemisphere, so its winter falls during the UK summer
Camborne
Hurn
Leuchars
When did the Great Storm happen and what is its impact?
15-16 Oct 1987
- south east England affected - skews some variables (wind, gust, rainfall) but not sunshine/cloud cover
What is a census? AD/DIS?
Observes or measures every member of the population
AD: gives a completely accurate result
DIS: time consuming/expensive
- hard to process large amount of data
What is a sample? AD/DIS?
Subset of the population that data is collected from
AD: quicker/cheaper than census
- less data to process
DIS: not as accurate
- may not be representative/can introduce bias
Sampling frame?
List of all members of the population
AD/DIS simple random sampling?
AD: free of bias
- easy and cheap for small populations and samples
DIS: not suitable when the population or sample size is large
- sampling frame needed
AD/DIS systematic sampling?
AD: Simple and quick to use
- good for larger samples
DIS: sampling frame needed
- can introduce bias if the sampling frame is not in random order
AD/DIS stratified sampling?
AD: reflects population structure
- guarantees proportional representation of groups within population
DIS: population needs to be clearly classified into distinct strata
- sampling within each stratum has the same DIS as simple random sampling (sampling frame needed)
Quota sampling AD/DIS?
AD: no sampling frame needed
- quick, easy, inexpensive
- allows a small sample to still be representative of the population
DIS: non-random sampling can introduce bias
- population must be divided into groups - can be costly/inaccurate
Opportunity sampling AD/DIS?
AD: easy to carry out
- inexpensive
DIS: unlikely to be representative
- highly dependent on the individual researcher
How to carry out systematic sampling?
Choose members of population at regular intervals
- to find the size of the interval (k):
k = size of population (N) / size of sample (n)
- choose the starting point randomly from the first k members and take every kth member from there
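The systematic sampling steps above can be sketched in Python. This is a minimal illustration, not a standard routine: the function name `systematic_sample`, the `seed` parameter and the choice to round the interval down when N/n is not a whole number are all assumptions made for the example.

```python
import random


def systematic_sample(population, n, seed=None):
    """Return n members of `population` taken at regular intervals.

    The interval is k = N // n (rounded down, an assumption for when
    N / n is not a whole number) and the start is chosen at random
    from the first k members.
    """
    N = len(population)
    if not 0 < n <= N:
        raise ValueError("sample size must be between 1 and the population size")
    k = N // n                      # size of the interval
    rng = random.Random(seed)
    start = rng.randrange(k)        # random index among the first k members
    return [population[start + i * k] for i in range(n)]


if __name__ == "__main__":
    # Hypothetical sampling frame: members numbered 1 to 100.
    frame = list(range(1, 101))
    print(systematic_sample(frame, 20, seed=1))
```

With N = 100 and n = 20 the interval is 5, so the sample is every 5th member from a random start among the first 5.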
How to carry out stratified sample?
Population divided into strata/random sample taken from each stratum
No. of members sampled in each stratum: (number in stratum / size of population (N)) x overall sample size (n)
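As a worked check of stratified sampling, the number taken from each stratum is (number in stratum / size of population) x overall sample size, followed by a simple random sample inside each stratum. The sketch below assumes this proportional rule; the function names and the year-group figures are invented for illustration.

```python
import random


def stratum_sample_sizes(strata_sizes, n):
    """Number to sample from each stratum: (stratum size / N) * n, rounded.

    Rounding can make the totals differ from n by one or two, and Python
    rounds exact halves to the nearest even integer; adjust by hand if needed.
    """
    N = sum(strata_sizes.values())
    return {name: round(size * n / N) for name, size in strata_sizes.items()}


def stratified_sample(strata, n, seed=None):
    """Take a simple random sample of the right size from each stratum."""
    rng = random.Random(seed)
    sizes = stratum_sample_sizes({name: len(m) for name, m in strata.items()}, n)
    return {name: rng.sample(members, sizes[name]) for name, members in strata.items()}


if __name__ == "__main__":
    # Hypothetical population: 120 students in Year 12 and 80 in Year 13.
    print(stratum_sample_sizes({"Year 12": 120, "Year 13": 80}, 50))
```

For a sample of 50 from this population of 200, Year 12 contributes 120/200 x 50 = 30 and Year 13 contributes 80/200 x 50 = 20.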
How to carry out quota sampling?
Population split into groups/members selected until each quota is filled
- size of each quota is proportional to the number of people in the population with that characteristic
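Quota sampling can be sketched as filling each group's quota in the order people happen to be met, with no random selection. The function `quota_sample`, its parameters and the example people below are hypothetical, chosen only to illustrate the idea.

```python
def quota_sample(people, quotas, group_of):
    """Select people in the order encountered until every quota is full.

    people   : iterable of people, in the order the researcher meets them
    quotas   : dict mapping group name -> number required from that group
    group_of : function returning the group name for a person
    """
    remaining = dict(quotas)            # copy so the caller's dict is unchanged
    chosen = []
    for person in people:
        group = group_of(person)
        if remaining.get(group, 0) > 0:
            chosen.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):  # all quotas filled
            break
    return chosen


if __name__ == "__main__":
    # Hypothetical passers-by as (name, group) pairs.
    passers_by = [("A", "adult"), ("B", "child"), ("C", "adult"),
                  ("D", "adult"), ("E", "child")]
    print(quota_sample(passers_by, {"adult": 2, "child": 1}, lambda p: p[1]))
```

Because people are taken as they come, not at random, the result depends on who happens to be available, which is the source of bias noted on the AD/DIS card.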