Large Data Set Flashcards
advantage of a census
it should give a completely accurate result
disadvantages of a census
- time consuming
- cannot be used when the testing process destroys the item
- hard to process a large quantity of data
advantages of a sample
- less time consuming and expensive than a census
- fewer people have to respond
- less data to process than in a census
disadvantages of a sample
- the data may not be as accurate
- the sample may not be large enough to give information about small sub-groups of the population
define census
observes or measures every member of a population
define sample
a selection of observations taken from a subset of the population
what are sampling units
individual units of a population
what is a sampling frame
a list of the sampling units of a population, individually named or numbered
what are the three methods of random sampling
1. simple random
2. systematic
3. stratified
what is simple random sampling
where every sample of size n has an equal chance of being selected
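A minimal sketch of simple random sampling in Python, assuming the sampling frame is a numbered list of 100 units. The frame size, the seed and the name simple_random_sample are illustrative and not part of the flashcards. random.sample draws without replacement, so every possible sample of size n is equally likely.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Return n units drawn without replacement from the sampling frame."""
    rng = random.Random(seed)  # seeded only so the example is repeatable
    return rng.sample(frame, n)

# Hypothetical sampling frame: units numbered 1 to 100
frame = list(range(1, 101))
sample = simple_random_sample(frame, 10, seed=1)
print(sample)
```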
what is systematic random sampling
the required elements are chosen at regular intervals from an ordered list
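A rough sketch of systematic sampling, assuming an ordered frame of 100 units and a sample of 20, so the interval is every 5th unit. The random starting point and the name systematic_sample are assumptions made for illustration.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Take every k-th unit from an ordered list, starting at a random point."""
    k = len(frame) // n          # sampling interval
    rng = random.Random(seed)
    start = rng.randrange(k)     # random start among the first k units
    return frame[start::k][:n]

# Hypothetical ordered sampling frame: units numbered 1 to 100
frame = list(range(1, 101))
sample = systematic_sample(frame, 20, seed=1)
print(sample)
```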
what is stratified sampling
the population is divided into mutually exclusive strata (males and females, for example) and a random sample is taken from each
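A possible sketch of stratified sampling with proportional allocation, where the number taken from each stratum is (stratum size / population size) x sample size. The population of 60 females and 40 males and the name stratified_sample are made up for illustration.

```python
import random

def stratified_sample(strata, n, seed=None):
    """Take a simple random sample from each stratum, in proportion to its size."""
    rng = random.Random(seed)
    population_size = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        # number from stratum = (stratum size / population size) * sample size
        count = round(len(units) / population_size * n)
        sample[name] = rng.sample(units, count)
    return sample

# Hypothetical population: 60 females and 40 males
strata = {
    "female": [f"F{i}" for i in range(60)],
    "male": [f"M{i}" for i in range(40)],
}
sample = stratified_sample(strata, 10, seed=1)
print({name: len(units) for name, units in sample.items()})  # {'female': 6, 'male': 4}
```

When the proportions do not come out as whole numbers, rounding can make the total differ slightly from n.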
advantages of simple random sampling
- free of bias
- easy and cheap to implement for small populations and small samples
- each sampling unit has a known and equal chance of selection
disadvantages of simple random sampling
- not suitable when the population size or the sample size is large as it is potentially time consuming, disruptive and expensive
advantages of systematic sampling
- simple and quick to use
- suitable for large samples and large populations
disadvantages of systematic sampling
- a sampling frame is needed
- it can introduce bias if the sampling frame is not random
advantages of stratified sampling
- sample accurately reflects the population structure
- guarantees proportional representation of groups within a population
disadvantages of stratified sampling
- population must be clearly classified into distinct strata
- selection within each stratum suffers from the same disadvantages as simple random sampling
2 examples of non-random sampling
- quota
- opportunity
what is quota sampling
an interviewer or researcher selects a sample that reflects the characteristics of the whole population
advantages of quota sampling
- allows a small sample to still be representative of the population
- no sampling frame required
- quick, easy and inexpensive
- allows for easy comparison between different groups within a population
disadvantages of quota sampling
- non-random sampling can introduce a bias
- population must be divided into groups, which can be costly or inaccurate
- increasing scope of study increases number of groups, which adds time and expense
- non-responses are not recorded as such
what is opportunity sampling
- consists of taking the sample from people who are available at the time the study is carried out and who fit the criteria you are looking for
advantages of opportunity sampling
- easy to carry out
- inexpensive
disadvantages of opportunity sampling
- unlikely to provide a representative sample
- highly dependent on the individual researcher
what is a discrete variable
- can take only specific values
LDS: what is the warmest place in the UK
Heathrow
LDS: what are the driest places in the UK
- Heathrow
- Hurn
LDS: what is the order of places in the UK from north to south
- Leuchars
- Leeming
- Heathrow
- Hurn
- Camborne
LDS: what is the wettest, coldest and windiest place in the UK
Leuchars
LDS: why are there loads of missing values in 1987
there was a great storm in the UK in 1987
LDS: what was the windiest month in the UK
May 2015
LDS: rainfall of Perth
- 0 OR
- very high
LDS: describe features of Jacksonville
- hot and humid
- hurricanes (octobers of
LDS: features of Beijing
- inland
- less wind
LDS: what does tr mean
trace = rainfall between 0 and 0.05 mm
LDS: how do you clean data
- delete n/a values
- remove anomalies
- convert the tr values to 0
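A small sketch of two of these cleaning steps on a rainfall column, assuming the readings arrive as text. The sample values and the name clean_rainfall are invented for illustration. Removing anomalies is left out because it needs a judgement about which values are errors.

```python
def clean_rainfall(values):
    """Drop n/a entries and treat trace ('tr') rainfall as 0 mm."""
    cleaned = []
    for value in values:
        text = str(value).strip().lower()
        if text in ("n/a", ""):
            continue                 # missing reading: delete it
        if text == "tr":
            cleaned.append(0.0)      # trace amount: treat as 0
        else:
            cleaned.append(float(text))
    return cleaned

# Hypothetical daily total rainfall readings in mm
raw = ["0.4", "tr", "n/a", "12.6", "tr", "0"]
print(clean_rainfall(raw))  # [0.4, 0.0, 12.6, 0.0, 0.0]
```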
LDS: example of a qualitative piece of data
daily mean wind speed - Beaufort scale
- fresh, light, moderate & strong
LDS: example of a discrete variable
cloud coverage
- 0 to 8 oktas
what is an outlier
an unusual data value that lies outside the overall pattern of the data
what is an anomaly
an outlier that is an error and should be removed from the data
LDS: what is daily mean temp measured in
degrees C
LDS: what is daily total rainfall measured in
mm
LDS: what is daily mean windspeed measured in
knots (also given as a Beaufort scale category)
LDS: what is wind gust measured in
knots
- 1 kn = 1.15 mph
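A quick sketch of the knots to mph conversion, using the approximate factor of 1.15 from the card. The name knots_to_mph and the 20 kn gust are illustrative.

```python
KNOT_IN_MPH = 1.15  # approximate conversion factor from the flashcard

def knots_to_mph(knots):
    """Convert a speed in knots to miles per hour."""
    return knots * KNOT_IN_MPH

# Hypothetical gust of 20 kn
print(round(knots_to_mph(20), 1))  # 23.0
```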
LDS: what is wind direction given as
a bearing
LDS: what is daily max rel. humidity measured in
%
- above 95% gives misty and foggy conditions
LDS: what is daily mean total cloud measured in
oktas (eighths)
- 0 to 8
LDS: what is daily mean pressure measured in
hPa (hectopascals)
- 1 hPa = 100 Pa
LDS: what is daily mean visibility measured in
Dm
- decametres