Statistics 1.1 - Data collection Flashcards

1
Q

What is a population?

A

The whole set of items that are of interest.

2
Q

What is a census?

A

A census is a form of data collection that observes or measures every member of a population.

3
Q

What is a sample?

A

A selection of observations taken from a subset of the population which is used to find out information about the population as a whole.

4
Q

What is/are the advantage/s of a census?

A

It should give a completely accurate result.

5
Q

What is/are the advantage/s of a sample?

A

It is less time-consuming and less expensive than a census. Fewer people have to respond, so there is less data to process than in a census.

6
Q

What is/are the disadvantage/s of a census?

A

It is time-consuming and expensive, it cannot be used when the testing process destroys the item, and it is hard to process such a large quantity of data.

7
Q

What effects does sample size have on a sample?

A

The size of the sample depends on the required accuracy and the available resources. Generally, the larger the sample, the more accurate it is, but the greater the resources required. If the population is very varied, a larger sample is needed than for a uniform population, to account for the variation. Different samples can lead to different conclusions.
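The effect of sample size can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population (the whole numbers 0 to 999), the function name `spread_of_sample_means` and the number of repeats are all hypothetical choices made for illustration.

```python
import random
import statistics

def spread_of_sample_means(population, n, repeats=200, seed=0):
    """Take many simple random samples of size n and report how much their means vary."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]
    # A smaller standard deviation means different samples agree more closely
    return statistics.stdev(means)

population = list(range(1000))  # hypothetical population, for illustration only
small = spread_of_sample_means(population, 5)
large = spread_of_sample_means(population, 100)
print(small, large)
```

In a run like this, the means of the larger samples should cluster more tightly than the means of the smaller ones, which is one way of seeing why larger samples are generally more accurate.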

8
Q

What is a sampling unit?

A

A sampling unit is an individual unit of a population. Sampling units are often individually named or numbered to form a list known as a sampling frame.

9
Q

Describe random sampling.

A

In random sampling, every member of the population has an equal chance of being selected. The sample should therefore be representative of the population. Random sampling also helps to remove bias from a sample.
There are three main methods of random sampling:
Simple random sampling
Systematic sampling
Stratified sampling

10
Q

Describe simple random sampling.

A

A simple random sample of size n is one where every sample of size n has an equal chance of being selected.
To carry out a simple random sample, a sampling frame is needed, usually a list of people or things. Each person or thing is allocated a unique number and a selection of these numbers is chosen at random.
There are two methods of choosing the numbers: generating random numbers (using a calculator, computer or random number table) or lottery sampling.
In lottery sampling, the members of the sampling frame could be written on tickets and placed into a ‘hat’. The required number of tickets would then be drawn out.
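The random-number method can be sketched in Python. The sampling frame below (members numbered 1 to 50) and the function name `simple_random_sample` are hypothetical, chosen only for illustration.

```python
import random

def simple_random_sample(sampling_frame, n, seed=None):
    """Return n distinct members of the sampling frame, chosen at random."""
    rng = random.Random(seed)  # optional seed makes the draw repeatable
    # sample() draws without replacement, so every set of n members is equally likely
    return rng.sample(sampling_frame, n)

# Hypothetical sampling frame: 50 members, each allocated a unique number
frame = list(range(1, 51))
sample = simple_random_sample(frame, 5, seed=1)
print(sample)
```

Drawing without replacement mirrors lottery sampling: once a ticket has been taken out of the hat it cannot be drawn again.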

11
Q

Describe systematic sampling.

A

In systematic sampling, the required elements are chosen at regular intervals from an ordered list.
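A minimal sketch of this idea follows. It assumes the common approach of taking every k-th element, where k is the population size divided by the sample size, starting from a randomly chosen position within the first k elements. The card itself only specifies "regular intervals", so the random start and the function name `systematic_sample` are assumptions.

```python
import random

def systematic_sample(ordered_list, n, seed=None):
    """Pick n elements at a regular interval k, starting from a random position."""
    k = len(ordered_list) // n  # sampling interval (assumed: population size / sample size)
    if k < 1:
        raise ValueError("sample size is larger than the list")
    rng = random.Random(seed)
    start = rng.randrange(k)  # random starting index among the first k elements
    return [ordered_list[start + i * k] for i in range(n)]

# Hypothetical ordered list of 100 numbered items, sample of 20 (so k = 5)
frame = list(range(1, 101))
sample = systematic_sample(frame, 20, seed=0)
print(sample)
```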

12
Q

Describe stratified sampling.

A

In stratified sampling, the population is divided into mutually exclusive strata and a random sample is taken from each.
The same proportion of each stratum should be sampled. A simple formula can be used to calculate the number of units that should be sampled from each stratum:
The number sampled in a stratum = (number in stratum / number in population) × overall sample size
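As a worked example of the formula, the sketch below uses a hypothetical population of 120 Year 12 students and 80 Year 13 students with an overall sample size of 40. The function name `stratum_sample_sizes` and the figures are assumptions made for illustration.

```python
def stratum_sample_sizes(stratum_sizes, overall_sample_size):
    """Apply (number in stratum / number in population) x overall sample size to each stratum."""
    population = sum(stratum_sizes.values())
    return {
        name: round(size * overall_sample_size / population)  # round to a whole number of units
        for name, size in stratum_sizes.items()
    }

# Hypothetical strata: 120 + 80 = 200 students, sample of 40
sizes = stratum_sample_sizes({"Year 12": 120, "Year 13": 80}, 40)
print(sizes)  # {'Year 12': 24, 'Year 13': 16}
```

When the formula does not give whole numbers, the rounded values may not add up exactly to the overall sample size, so they may need a small adjustment.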

13
Q

What is/are the advantage/s of simple random sampling?

A

It is free of bias, it is easy and cheap to implement for small populations and small samples, and each sampling unit has a known and equal chance of selection.

14
Q

What is/are the disadvantage/s of simple random sampling?

A

It isn’t suitable when the population size or the sample size is large. It also requires a sampling frame.

15
Q

What is/are the advantage/s of systematic sampling?

A

It is simple and quick to use and suitable for large samples and large populations. If the ordered list is arranged by group, it also tends to give roughly proportional representation of groups within a population, although this is not guaranteed.

16
Q

What is/are the disadvantage/s of systematic sampling?

A

It requires a sampling frame and can introduce bias if the sampling frame is not itself random.

17
Q

What is/are the advantage/s of stratified sampling?

A

The sample will accurately reflect the population structure, and it guarantees proportional representation of the groups within a population.

18
Q

What is/are the disadvantage/s of stratified sampling?

A

The population has to be clearly classified into distinct strata. Selection within each stratum also suffers from the same disadvantages as simple random sampling.

19
Q

Describe non-random sampling.

A

There are two main types of non-random sampling:
Quota sampling
Opportunity sampling

20
Q

Describe quota sampling.

A

In quota sampling, an interviewer or researcher selects a sample that reflects the characteristics of the whole population.
The population is divided into groups according to a given characteristic. The size of each group determines the proportion of the sample that should have that characteristic.
Interviewing/testing continues until all quotas have been filled. If a person refuses to be interviewed or the quota into which they fit is full, then they are simply ignored.
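The quota-filling procedure can be sketched as follows. The people, groups and quotas are hypothetical, and the function name `quota_sample` is an assumption made for illustration.

```python
def quota_sample(people, quotas):
    """Work through people in the order they are met until every quota is filled."""
    remaining = dict(quotas)  # places still to fill in each group
    sample = []
    for name, group, agrees in people:
        if not any(remaining.values()):
            break  # all quotas filled, so interviewing stops
        if not agrees:
            continue  # refusals are simply ignored
        if remaining.get(group, 0) > 0:
            sample.append(name)
            remaining[group] -= 1
        # otherwise the quota for this group is already full and the person is ignored
    return sample

# Hypothetical passers-by: (name, group, agrees to be interviewed)
passers_by = [
    ("Ann", "adult", True),
    ("Bob", "child", True),
    ("Cat", "adult", False),
    ("Dan", "adult", True),
    ("Eve", "adult", True),
    ("Fay", "child", True),
]
sample = quota_sample(passers_by, {"adult": 2, "child": 1})
print(sample)  # ['Ann', 'Bob', 'Dan']
```

Who ends up in the sample depends on the order in which people are encountered, which is why this counts as non-random sampling.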

21
Q

Describe opportunity sampling.

A

Opportunity sampling consists of taking the sample from people who are available at the time the study is carried out and who fit the criteria you are looking for. Opportunity sampling is sometimes called convenience sampling.

22
Q

What is/are the advantage/s of quota sampling?

A

It allows a small sample to still be representative of the population. It doesn’t require a sampling frame, and it is quick, easy and inexpensive. It also allows for easy comparison between different groups within a population.

23
Q

What is/are the disadvantage/s of quota sampling?

A

Non-random sampling can introduce bias. The population must be divided into groups, which can be costly or inaccurate. Increasing the scope of the study increases the number of groups, which adds time and expense. Non-responses aren’t recorded as such.

24
Q

What is/are the advantage/s of opportunity sampling?

A

It is both easy to carry out and inexpensive.

25
Q

What is/are the disadvantage/s of opportunity sampling?

A

It is unlikely to provide a representative sample, and it is highly dependent on the individual researcher carrying out the research.

26
Q

What are quantitative variables and data?

A

Variables or data that are associated with numerical observations. For example, a number can be given to shoe size, so shoe size is a quantitative variable.

27
Q

What are qualitative variables and data?

A

Variables or data that are associated with non-numerical observations. For example, a number cannot be given to hair colour, so hair colour is a qualitative variable.

28
Q

What is a continuous variable?

A

A variable that can take any value in a given range. For example, time is continuous since it can take any value.

29
Q

What is a discrete variable?

A

A variable that can only take specific values in a given range. For example, the number of children in a family is discrete, since it can only take whole-number values: you can’t have 0.5 children.

30
Q

Describe how data can be displayed in a grouped frequency table or as grouped data.

A

When data is presented in a grouped frequency table, the specific data values are not shown. The groups are more commonly known as classes.
Class boundaries tell you the maximum and minimum values that belong to each class.
The midpoint is the average of the class boundaries.
The class width is the difference between the upper and lower class boundaries.

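The midpoint and class width can be computed directly from the class boundaries. In this sketch the class (boundaries 10 and 20) and the function name `class_summary` are hypothetical.

```python
def class_summary(lower_boundary, upper_boundary):
    """Return the midpoint and class width for one class of a grouped frequency table."""
    midpoint = (lower_boundary + upper_boundary) / 2  # average of the class boundaries
    width = upper_boundary - lower_boundary           # upper boundary minus lower boundary
    return midpoint, width

# Hypothetical class with boundaries 10 and 20
print(class_summary(10, 20))  # (15.0, 10)
```
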
31
Q

What is the large data set?

A

The large data set consists of weather data samples provided by the Met Office for five UK weather stations and three overseas weather stations over two periods of time: May to October 1987 and May to October 2015.

32
Q

What and where are the eight weather stations that provide data for the large data set?

A

Leuchars - Scotland
Leeming - Northern England
Heathrow - London, Southeast England
Hurn - South England
Camborne - Southwest England
Jacksonville - Florida, USA
Beijing - China
Perth - Australia

33
Q

What are the variables measured in the large data set?

A

Daily mean temperature (measured in °C)
Daily total rainfall (measured in mm; any non-zero amounts less than 0.05 mm are recorded as ‘tr’ or ‘trace’)
Daily total sunshine (recorded to the nearest tenth of an hour)
Daily mean wind direction and windspeed (measured in knots and also categorised according to the Beaufort scale)
Daily maximum gust (measured in knots)
Daily maximum relative humidity (given as a percentage of air saturation with water vapour; relative humidities above 95% give rise to misty and foggy conditions)
Daily mean cloud cover (measured in oktas, or eighths of the sky covered by cloud)
Daily mean visibility (measured in decameters or Dm)
Daily mean pressure (measured in hectopascals or hPa)
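
Because rainfall entries can be the text ‘tr’ rather than a number, they need cleaning before any calculation. The sketch below treats a trace as 0 mm, which is an assumption (one common convention) and not something these cards specify. The readings and the function name `rainfall_to_mm` are hypothetical.

```python
def rainfall_to_mm(entry, trace_value=0.0):
    """Convert one rainfall entry to a number of millimetres."""
    text = str(entry).strip().lower()
    if text in ("tr", "trace"):
        return trace_value  # assumption: a trace amount is counted as 0 mm
    return float(text)

# Hypothetical daily total rainfall entries
readings = ["0.4", "tr", "0", "12.6"]
values = [rainfall_to_mm(r) for r in readings]
print(values)  # [0.4, 0.0, 0.0, 12.6]
```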