large data set Flashcards
what are the 5 uk locations
Leuchars - Scottish coast
Leeming - North Yorkshire
Heathrow - Greater London
Hurn - south west England
Camborne - cornwall
what are the 3 international locations
Beijing, Perth, Jacksonville
what time periods does the large data set cover
May to October 1987
May to October 2015
what are the large data set variables
daily mean air temp
daily total rainfall
daily total sunshine
daily maximum relative humidity
daily mean windspeed and direction
daily maximum gust and direction
cloud cover
daily mean visibility
daily mean pressure
daily mean air temp
Celsius (°C), averaged over the 24 hours starting 9am
daily total rainfall
mm, for the 24 hours starting 9am; tr (trace) means less than 0.05mm
daily total sunshine
hours
daily maximum relative humidity
%, above 95% is mist/fog
daily mean windspeed and direction
knots, described using the Beaufort conversion (calm, light etc); direction is given as cardinal (north, south, east, west)
daily maximum gust and direction
maximum instantaneous speed over 24hrs
cloud cover
oktas (eighths of the sky covered)
daily mean visibility
decametres horizontally
daily mean pressure
hectopascals
what is unknown for first half of may in 1987 for uk cities
daily total sunshine, mean windspeed and max gust
what do the international cities contain data for
mean temp, rainfall, pressure, windspeed
which locations are near a coast
Jacksonville, Perth, Camborne, Hurn, Leuchars
which location is in the southern hemisphere
Perth
what variable is discrete
cloud cover
what should you replace tr with
0 or 0.025
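The tr replacement can be sketched in Python. This is only an illustration: the function name `rainfall_to_float` and the readings are made up, and whether to use 0 or 0.025 is left as a parameter.

```python
def rainfall_to_float(value, trace_value=0.025):
    """Turn one rainfall entry into a number; 'tr' becomes trace_value."""
    text = str(value).strip().lower()
    if text == "tr":
        return trace_value
    return float(text)

# Made-up readings in mm, not taken from the large data set
readings = ["0.4", "tr", "12.6", "tr", "0"]
cleaned = [rainfall_to_float(r) for r in readings]
total_rainfall = sum(cleaned)
```

Passing `trace_value=0` gives the other option on the card.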
what happened 15-16 October 1987
great storm
high wind speeds
south England affected
can skew wind/gust/rainfall
how many days does the LDS cover in each year
184
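The 184 can be checked by counting the days from 1 May to 31 October inclusive. A quick sketch, where the helper name `days_inclusive` is made up for illustration:

```python
from datetime import date

def days_inclusive(start, end):
    """Number of days from start to end, counting both end dates."""
    return (end - start).days + 1

days_1987 = days_inclusive(date(1987, 5, 1), date(1987, 10, 31))
days_2015 = days_inclusive(date(2015, 5, 1), date(2015, 10, 31))
# Same total as adding the month lengths: 31 + 30 + 31 + 31 + 30 + 31
```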
what is a census
collects data about all members of a population
advantage of census
fully accurate results
disadvantage of census
time consuming, expensive
what is sampling
collecting data from a subset of the population
what is simple random sampling
every member of the population (and every possible sample of the same size) has an equal probability of being selected for the sample
uniquely number every member and randomly select numbers using a random number generator
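A minimal sketch of the number-and-select method, assuming the population is numbered 1 to N and using Python's built-in random number generator. The function name and the choice of 184 (one number per day) are only for illustration.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Pick sample_size different numbers from 1..population_size at random."""
    rng = random.Random(seed)  # seed only makes the example repeatable
    numbers = range(1, population_size + 1)
    return sorted(rng.sample(numbers, sample_size))

# e.g. choose 10 of the 184 days, each numbered 1 to 184
chosen_days = simple_random_sample(184, 10, seed=1)
```

`rng.sample` never picks the same number twice, so repeats do not need to be thrown away by hand.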
what is systematic sampling
choose members of a population at regular intervals using a list
choose every kth member where k= (size of population)/(size of sample)
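The every-kth-member rule might look like this in Python. The random starting point within the first k members is an assumption added here (the card does not say where to start), and the function name is made up.

```python
import random

def systematic_sample(members, sample_size, seed=None):
    """Take every kth member of an ordered list, k = population / sample size."""
    k = len(members) // sample_size
    rng = random.Random(seed)
    start = rng.randrange(k)  # assumed: random start among the first k members
    return members[start::k][:sample_size]

days = list(range(1, 185))            # 184 days numbered 1 to 184
sample = systematic_sample(days, 23)  # k = 184 / 23 = 8
```

Here every 8th day is chosen, giving 23 days in total.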
what is stratified sampling
population divided into groups and random sample from each group
% taken from each group reflects that group's prevalence in the population
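A sketch of proportional stratified sampling, using a made-up population of 120 year 12 and 80 year 13 students. The function name and data are hypothetical, and rounding can make the total differ slightly from the target when the proportions are not whole numbers.

```python
import random

def stratified_sample(strata, sample_size, seed=None):
    """Randomly sample from each group in proportion to its size."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        n = round(sample_size * len(members) / total)  # this group's share
        sample[name] = rng.sample(members, n)
    return sample

# Hypothetical population split into two strata
strata = {
    "year 12": [f"Y12-{i}" for i in range(120)],
    "year 13": [f"Y13-{i}" for i in range(80)],
}
picked = stratified_sample(strata, 20, seed=0)
```

Year 12 is 120/200 of the population, so it supplies 12 of the 20; year 13 supplies the other 8.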
what is quota sampling
population split into groups and members of population chosen until quota selected (not random)
what is opportunity sampling
sample formed using available members at time of study who fit criteria
pros of systematic sampling
simple, quick, suitable for large samples/populations
disadvantages for systematic sampling
sampling frame needed, bias introduced if frame not random
pros for stratified sampling
accurately reflects population structure
disadvantages for stratified sampling
population must be clearly classified into groups (strata); selection within each stratum has the same disadvantages as simple random sampling
pros of simple random sampling
free of bias, cheap
disadvantages of simple random sampling
not suitable for large samples, sampling frame needed
pros of quota sampling
allows small sample to represent population, no sampling frame, easy, quick, cheap
disadvantages of quota sampling
non random so can introduce bias, population must be divided into groups, non responses not recorded
pros of opportunity (convenience) sampling
easy, inexpensive
is the binomial distribution continuous or discrete
discrete
is the normal distribution continuous or discrete
continuous
convert P(X = a) in a discrete distribution to continuous
P(a - 0.5 < X < a + 0.5)
convert P(X < a) in a discrete distribution to continuous
P(X < a - 0.5)
convert P(X > a) in a discrete distribution to continuous
P(X > a + 0.5)
convert P(X <= a) in a discrete distribution to continuous
P(X <= a + 0.5)
convert P(X >= a) in a discrete distribution to continuous
P(X >= a - 0.5)
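A rough check of why the 0.5 shift helps, assuming the usual setting where a binomial X ~ B(n, p) is approximated by a normal with mean np and variance np(1 - p). The values n = 100, p = 0.5 and a = 55 are chosen only for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, a = 100, 0.5, 55  # illustrative values only

# Exact discrete probability P(X <= 55) from the binomial formula
exact = sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(a + 1))

# Normal approximation with mean np and standard deviation sqrt(np(1-p))
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
with_correction = approx.cdf(a + 0.5)  # P(X <= a) becomes P(Y <= a + 0.5)
without_correction = approx.cdf(a)     # ignoring the correction
```

In this example `with_correction` lands much closer to `exact` than `without_correction` does.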
if x is coded using y = ax + b, what are the mean and SD of x
mean of x = (mean of y - b) / a
SD of x = (SD of y) / a (ignore the sign of a if it is negative)
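The coding rule can be checked numerically. In this sketch the pressures and the code y = 2x - 2000 are made up for illustration, and the population standard deviation (`pstdev`) is used.

```python
from statistics import mean, pstdev

a, b = 2, -2000                     # hypothetical code: y = 2x - 2000
x = [1012, 1015, 1009, 1020, 1018]  # made-up pressures in hPa
y = [a * xi + b for xi in x]        # coded values

# Undo the coding: subtract b then divide by a for the mean,
# divide by the size of a only for the standard deviation
mean_x = (mean(y) - b) / a
sd_x = pstdev(y) / abs(a)
```

Adding b shifts every value by the same amount, so it changes the mean but not the spread.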