Chapter 1 - Data Collection Flashcards

1
Q

What is a population?

A

The whole set of items that are of interest (the thing being surveyed is the population)

2
Q

What is a census?

A

A census observes or measures every member of a population

3
Q

What is a sample?

A

A sample is a selection of observations taken from a subset of the population, which is used to find out information about the population as a whole

4
Q

Census advantages:

A
  • Gives a completely accurate result
  • Representative of everyone (and smaller subgroups)
5
Q

Census disadvantages:

A
  • Time consuming
  • Expensive
  • Cannot be used when the testing process destroys the item
  • Hard to process large quantities of data
6
Q

Sample advantages:

A
  • Less time consuming
  • Less expensive
  • Fewer people have to respond
  • Less data to process
7
Q

Sample disadvantages:

A
  • The data may be less accurate
  • The sample may not be large enough to represent small subgroups
  • The results could be biased
8
Q

What are individual units of a population known as?

A

Sampling units (e.g. each person in a large survey is a sampling unit)

9
Q

What is a sampling frame?

A

A list of individually numbered or named sampling units of a population

10
Q

What does the size of a sample depend on?

A

The required accuracy and available resources

11
Q

How is the validity of a sample affected by its size?

A
  • Generally, the larger the sample, the more accurate it is (but you will require greater resources)
  • If the population is varied, you need a larger sample than if the population were uniform
  • Different samples can lead to different conclusions due to the natural variation in a population
12
Q

Why do we use random sampling?

A

We randomly sample because it means every member of the population has an equal chance of being selected. The sample should therefore be representative of the population. It also helps to remove bias from the sample

13
Q

What are the 3 methods of random sampling?

A
  • Simple random sampling
  • Stratified sampling
  • Systematic sampling
14
Q

Simple random sampling:

A

To carry out a simple random sample, you need a sampling frame, usually a list of people or things. Each person or thing is allocated a unique number and a selection of these numbers is chosen at random. Selections can be made using a random number generator or lottery-style sampling (e.g. names pulled from a hat)
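
A minimal Python sketch of the random number generator approach. The sampling frame of 50 numbered students and the function name `simple_random_sample` are made up for illustration:

```python
import random

# Hypothetical sampling frame: 50 named sampling units, numbered 1 to 50.
sampling_frame = {number: f"Student {number}" for number in range(1, 51)}

def simple_random_sample(frame, sample_size, seed=None):
    """Choose sample_size distinct numbers from the frame at random."""
    rng = random.Random(seed)  # seed only makes the example repeatable
    chosen_numbers = rng.sample(sorted(frame), sample_size)
    return [frame[number] for number in chosen_numbers]

sample = simple_random_sample(sampling_frame, 5, seed=1)
print(sample)
```

`random.sample` never picks the same number twice, which matches drawing names from a hat without putting them back.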

15
Q

Stratified sampling:

A

In stratified sampling, the population is divided into mutually exclusive strata (distinct subgroups of the population, e.g. males and females) and a random sample is taken from each. The number selected from each stratum is proportional to the size of that stratum within the population
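
A short Python sketch of the proportional calculation: number sampled from a stratum = (stratum size ÷ population size) × overall sample size. The two year groups and the function name `stratified_sample` are hypothetical:

```python
import random

def stratified_sample(strata, sample_size, seed=None):
    """Take a random sample from each stratum, in proportion to its size."""
    rng = random.Random(seed)
    population_size = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        # Rounding means the parts may not always add up to sample_size exactly.
        number_needed = round(len(units) * sample_size / population_size)
        sample[name] = rng.sample(units, number_needed)
    return sample

# Hypothetical population: 120 students in Year 12 and 80 in Year 13.
strata = {
    "Year 12": [f"Y12-{i}" for i in range(1, 121)],
    "Year 13": [f"Y13-{i}" for i in range(1, 81)],
}
sample = stratified_sample(strata, 20, seed=1)
print({name: len(chosen) for name, chosen in sample.items()})
```

With these made-up numbers, Year 12 contributes 120 ÷ 200 × 20 = 12 students and Year 13 contributes 80 ÷ 200 × 20 = 8.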

16
Q

Systematic sampling:

A

In systematic sampling, the required sampling units are chosen at regular intervals from an ordered list

The size of the interval depends on the size of the population and the size of the sample required. Divide the population size by the sample size and round down to get the interval. Choose a random starting point between 1 and this value, then take every unit that falls at that interval along the ordered list
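
A minimal Python sketch of this procedure. The population of 100 numbered units and the function name `systematic_sample` are made up, and drawing the starting point from 1 up to and including the interval is an assumption of this sketch:

```python
import random

def systematic_sample(ordered_list, sample_size, seed=None):
    """Select units at a regular interval from an ordered list."""
    rng = random.Random(seed)
    interval = len(ordered_list) // sample_size   # divide and round down
    start = rng.randint(1, interval)              # random start, counting from 1
    positions = range(start, len(ordered_list) + 1, interval)
    # Positions count from 1, Python lists from 0, hence the "- 1".
    return [ordered_list[p - 1] for p in positions][:sample_size]

# Hypothetical ordered list: sampling units numbered 1 to 100.
ordered_list = list(range(1, 101))
sample = systematic_sample(ordered_list, 20, seed=1)
print(sample)
```

Here the interval is 100 ÷ 20 = 5, so the sample is every 5th unit after a random start between 1 and 5.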

17
Q

Advantages of simple random sampling:

A
  • Free of bias
  • Easy and cheap to implement for small populations and small samples
  • Each sampling unit has a known and equal chance of selection
18
Q

Disadvantages of simple random sampling:

A
  • Not suitable when the population size or the sample size is large as it is potentially time consuming, disruptive and expensive
  • A sampling frame is needed
  • Could exclude minorities
19
Q

Advantages of Systematic sampling:

A
  • Simple and quick to use
  • Suitable for large samples and large populations
  • Cheap and easy
20
Q

Disadvantages of Systematic sampling:

A
  • A sampling frame is needed
  • It can introduce bias if the sampling frame is not random
21
Q

Advantages of Stratified sampling:

A
  • Sample accurately reflects the population structure
  • Guarantees proportional representation of groups within a population
  • Small sample sizes required - saves time and money
22
Q

Disadvantages of Stratified sampling:

A
  • Population must be clearly classified into distinct strata
  • Selection within each stratum suffers from the same disadvantages as simple random sampling
  • The requirement of a sampling frame means it’s time consuming and expensive
23
Q

What are the 2 types of non-random sampling?

A
  • Quota Sampling
  • Opportunity Sampling
24
Q

Quota Sampling:

A

In quota sampling, the interviewer/researcher first determines the different characteristics of the population that they wish to represent. These groups will be mutually exclusive in the same way that the strata for a stratified sample are

It is then determined how many people you wish to question from each group (this can be determined in the same manner as for a stratified sample)

As an interviewer, you would then meet members of the population, assess which group they fall into, and allocate them to the appropriate quota

Once you have met your quota for a group, you no longer include any further members into that group

You continue this process until your quota for each group is filled
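
The steps above can be sketched in Python. The age groups, the names and the function name `fill_quotas` are hypothetical:

```python
def fill_quotas(people_met, quotas):
    """Allocate people to groups, in the order met, until every quota is full."""
    sample = {group: [] for group in quotas}
    for name, group in people_met:
        # Only include the person if their group's quota is not yet met.
        if group in sample and len(sample[group]) < quotas[group]:
            sample[group].append(name)
        # Stop as soon as the quota for each group is filled.
        if all(len(sample[g]) == quotas[g] for g in quotas):
            break
    return sample

quotas = {"under 30": 2, "30 and over": 1}
people_met = [
    ("Asha", "under 30"),
    ("Ben", "30 and over"),
    ("Cara", "30 and over"),   # quota for this group is already full
    ("Dev", "under 30"),
    ("Eli", "under 30"),       # never reached: all quotas full after Dev
]
sample = fill_quotas(people_met, quotas)
print(sample)
```

The sample depends on who the interviewer happens to meet first, which is why quota sampling is non-random.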

25
Q

Opportunity sampling:

A

Opportunity sampling consists of taking the sample from people who are available at the time of the study, and who fit the relevant criteria

For example, if you wish to find out the purchasing habits of shoppers from a particular store, you may choose to question those who are leaving the store and have made a purchase

26
Q

Advantages of Quota sampling:

A

• Allows a small sample to still be representative of the population
• No sampling frame required
• Quick, easy and inexpensive
• Allows for easy comparison between different groups within a population

27
Q

Disadvantages of Quota sampling:

A

• Non-random sampling can introduce bias
• Population must be divided into groups, which can be costly or inaccurate
• Increasing scope of study increases number of groups, which adds time and expense
• Non-responses are not recorded - leads to bias

28
Q

Advantages of Opportunity sampling:

A
  • Easy
  • Cheap
  • Quick
  • No sampling frame required
29
Q

Disadvantages of Opportunity sampling:

A
  • Unlikely to provide a representative sample
  • Highly dependent on the individual researcher
31
Q

What is quantitative data?

A

Variables or data associated with numerical observations are called quantitative variables or quantitative data

For example, you can give a number to shoe size so shoe size is a quantitative variable

32
Q

What is qualitative data?

A

Variables or data associated with non-numerical observations are called qualitative variables or qualitative data

For example, you can’t give a number to hair colour (blonde, red, brunette). Hair colour is a qualitative variable

33
Q

What is a continuous variable?

A

A variable that can take any value in a given range is a continuous variable

For example, time can take any value, e.g. 2 seconds, 2.1 seconds, 2.01 seconds etc.

34
Q

What is a discrete variable?

A

A variable that can take only specific values in a given range is a discrete variable

For example, the number of girls in a family is a discrete variable as you can’t have 2.65 girls in a family

35
Q

Grouped frequency tables:

A

When data is presented in a grouped frequency table, the specific data values are not shown. The groups are more commonly known as classes

• Class boundaries tell you the maximum and minimum values that belong in each class
• The midpoint is the average of the class boundaries
• The class width is the difference between the upper and lower class boundaries
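
A small Python sketch of the midpoint and class width calculations. The class boundaries below are made up for illustration:

```python
def class_midpoint(lower, upper):
    """Average of the lower and upper class boundaries."""
    return (lower + upper) / 2

def class_width(lower, upper):
    """Difference between the upper and lower class boundaries."""
    return upper - lower

# Hypothetical classes, given as (lower boundary, upper boundary).
classes = [(10, 20), (20, 25), (25, 40)]
for lower, upper in classes:
    print(lower, upper, class_midpoint(lower, upper), class_width(lower, upper))
```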

36
Q

LDS - Daily mean temperature:

A

Daily mean temperature in °C - this is the average of the hourly temperature readings during a 24-hour period

37
Q

LDS - Daily total rainfall:

A

Daily total rainfall including solid precipitation such as snow and hail, which is melted before being included in any measurements - amounts less than 0.05 mm are recorded as ‘tr’ or ‘trace’

In an exam question, treat ‘tr’ as 0
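
A minimal Python sketch of cleaning rainfall entries before calculating with them, assuming the entries arrive as text. The readings and the function name `rainfall_value` are hypothetical:

```python
def rainfall_value(entry):
    """Convert a daily rainfall entry to a number of millimetres."""
    text = str(entry).strip().lower()
    if text in ("tr", "trace"):
        return 0.0   # trace amounts are treated as 0
    return float(text)

readings = ["0.4", "tr", "2.1", "trace"]
values = [rainfall_value(reading) for reading in readings]
print(values)
```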

38
Q

LDS - Daily total sunshine:

A

Daily total sunshine recorded to the nearest tenth of an hour

39
Q

LDS - Daily mean wind direction and windspeed:

A

Daily mean wind direction and windspeed in knots, averaged over 24 hours from midnight to midnight. Mean wind directions are given as bearings and as cardinal (compass) directions. The data for mean windspeed is also categorised according to the Beaufort scale

40
Q

LDS - Beaufort scale:

A
41
Q

LDS - Daily maximum gust:

A

Daily maximum gust in knots - this is the highest instantaneous windspeed recorded. The direction from which the maximum gust was blowing is also recorded

42
Q

LDS - Daily maximum relative humidity:

A

Daily maximum relative humidity, given as a percentage of air saturation with water vapour. Relative humidities above 95% give rise to misty and foggy conditions

43
Q

LDS - Daily mean cloud cover:

A

Daily mean cloud cover measured in ‘oktas’ or eighths of the sky covered by cloud

44
Q

LDS - Daily mean visibility:

A

Daily mean visibility measured in decametres (Dm). This is the greatest horizontal distance at which an object can be seen in daylight

45
Q

LDS - Daily mean pressure:

A

Daily mean pressure measured in hectopascals (hPa)

46
Q

How are missing values from the Large Data Set represented?

A

n/a or ‘not available’
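
A short Python sketch of one way to handle these entries when calculating a mean. Leaving out the missing days is an assumption of this sketch, not a rule from the data set, and the gust readings and function name `mean_ignoring_missing` are made up:

```python
def mean_ignoring_missing(entries):
    """Mean of the numerical entries, leaving out any marked n/a."""
    values = [float(e) for e in entries if str(e).strip().lower() != "n/a"]
    if not values:
        return None   # nothing left to average
    return sum(values) / len(values)

daily_max_gust = ["23", "n/a", "31", "18"]
print(mean_ignoring_missing(daily_max_gust))
```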