Sampling, Data Presentation and Interpretation Flashcards

1
Q

What is the Population?

A

The population is the whole group of subjects for statistical analysis.

2
Q

What does it mean if a population is finite?

A

A finite population is one whose members can be counted in practice, not just in theory.

3
Q

What does it mean if a population is infinite?

A

If a population is infinite it means it is not countable in practice.

4
Q

What is a census?

A

A census is when you collect information from every member of a population. It’s easier to carry out if the population is fairly small and easily accessible.

5
Q

What are advantages of a census?

A

It’s an accurate representation of the population because every member has been surveyed, so it’s unbiased.

6
Q

What are disadvantages of a census?

A
  • If a population is large it can take a lot of time, effort and money to carry out.
  • It can be difficult to make sure all members are surveyed. If some are missed, the survey may be biased.
  • If the tested items are used up or damaged in some way by doing a census, a census is impractical.
7
Q

What is a sample?

A

A sample is a selected group from a population.

8
Q

What are advantages of sampling?

A
  • It’s quicker and cheaper than a census, and it’s easier to get hold of all the required information.
  • It’s the only option when surveyed items are used up or damaged.
9
Q

What are disadvantages of sampling?

A
  • Different samples can give different results, so you could happen to select one that doesn’t accurately reflect the population.
  • Samples can easily be affected by sampling bias.
10
Q

What is a representative sample?

A

A sample that is similar to the population in a way that gives similar results. If a sample is not representative it may be biased.

11
Q

How do you avoid sampling bias?

A
  • Select from the correct population and make sure no member of the population is excluded.
  • Select your sample at random - if members are linked in some way it can cause bias.
  • Make sure all your sample members respond.
12
Q

What is simple random sampling?

A

Every member of the population has an equal chance of being selected for the sample and each selection is independent of the others.

13
Q

How do you select a simple random sample?

A
  • Give a number to each population member from a full list of the population.
  • Generate a list of random numbers and match them to the numbered members to select your sample.
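
The two steps above can be sketched in Python. This is a minimal illustration, assuming sampling without replacement (a repeated random number is never used twice); the function name `simple_random_sample` and the population of 100 labelled members are hypothetical.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Select sample_size distinct members, each equally likely to be chosen."""
    rng = random.Random(seed)
    # Step 1: number every member from 1 to N using the full list.
    numbered = {i: member for i, member in enumerate(population, start=1)}
    # Step 2: generate distinct random numbers and match them to members.
    chosen_numbers = rng.sample(range(1, len(numbered) + 1), sample_size)
    return [numbered[n] for n in chosen_numbers]

# Hypothetical population of 100 members.
members = [f"member_{i}" for i in range(1, 101)]
sample = simple_random_sample(members, 10, seed=1)
```

Passing a `seed` only makes the selection repeatable for checking; leave it as `None` for a fresh random sample.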
14
Q

What are advantages of simple random sampling?

A

Every member of the population has an equal chance of being selected, so the selection is unbiased.

15
Q

What are disadvantages of simple random sampling?

A

It can be inconvenient if the population is spread over a large area. For example, in a nationwide sample it might be difficult to track down the selected members.

16
Q

What is systematic sampling?

A

Systematic sampling selects every nth member from the population you’re investigating.

17
Q

How do you select a systematic sample?

A
  • Number each member of the population from a full list.
  • Calculate a regular interval to use by dividing the population size by the sample size.
  • Generate a random starting point to choose the first member of your sample (this must be between 1 and your regular interval).
  • Keep adding the interval to the starting point to select your sample.
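
The steps above can be sketched in Python. This is an illustration only, assuming the population is already held as an ordered list and that the interval is rounded down when the division is not exact; `systematic_sample` and the 200 numbered items are hypothetical.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Take every nth member, starting from a random position within the first interval."""
    rng = random.Random(seed)
    # Interval = population size / sample size (rounded down if not exact).
    interval = len(population) // sample_size
    # Random starting point between 1 and the interval, inclusive.
    start = rng.randint(1, interval)
    # Keep adding the interval to the starting point.
    positions = [start + k * interval for k in range(sample_size)]
    # Positions are 1-based, list indices are 0-based.
    return [population[p - 1] for p in positions]

# Hypothetical production run of 200 numbered items.
items = list(range(1, 201))
sample = systematic_sample(items, 20, seed=0)
```

Here the interval is 200/20 = 10, so the sample is one item from positions 1 to 10 followed by every 10th item after it.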
18
Q

What are advantages of systematic sampling?

A
  • It can be used for quality control on a production line - a machine can be set up to sample every nth item.
  • It should give an unbiased sample.
19
Q

What are disadvantages of systematic sampling?

A

If the interval coincides with a pattern in the population, the sample could be biased.

20
Q

What is stratified sampling?

A

If a population is divided into categories you can use the same proportion of each category in the sample as there is in the population.

21
Q

How do you select a stratified sample?

A
  • Divide the population into categories.
  • Calculate the number needed for each category in the sample using the formula: (size of category in pop./total size of pop.)*total sample size.
  • Randomly select the sample for each category.
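
The formula and steps above can be sketched in Python. This is a rough illustration, assuming each category is stored as a list of its members; the function name and the two year groups are hypothetical. Rounding each category's count can make the total differ slightly from the intended sample size, which would need adjusting by hand.

```python
import random

def stratified_sample(strata, sample_size, seed=None):
    """Sample each category in proportion to its share of the population."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # (size of category / total size of population) * total sample size
        k = round(len(members) * sample_size / total)
        # Randomly select within the category.
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical population: 120 students in Year 12 and 80 in Year 13.
strata = {
    "Year 12": [f"Y12_{i}" for i in range(120)],
    "Year 13": [f"Y13_{i}" for i in range(80)],
}
sample = stratified_sample(strata, 20, seed=0)
```

With these numbers the sample contains (120/200)*20 = 12 students from Year 12 and (80/200)*20 = 8 from Year 13.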
22
Q

What are advantages of stratified sampling?

A
  • If the categories are disjoint (there is no overlap), this should give a representative sample.
  • It’s useful when results may vary depending on categories.
23
Q

What are disadvantages of stratified sampling?

A

The extra detail needed can make it expensive.

24
Q

What is quota sampling?

A

An interviewer is given a quota of people in each category to interview (e.g. 20 men and 20 women). They choose people to interview until the quotas are fulfilled.

25
Q

How do you select a quota sample?

A
  • Divide a population into categories.
  • Give each category a quota (number of members to sample).
  • Collect data until the quotas are met in all categories (without using random sampling).
26
Q

What are advantages of quota sampling?

A
  • It is easy for the interviewer as they don’t need access to the whole population or a list of every member.
  • The interviewer continues to sample until all quotas are met so non-response is less of a problem.
27
Q

What are disadvantages of quota sampling?

A

It can be biased by the interviewer - selection isn’t random so they might exclude some of the population.

28
Q

What is opportunity sampling?

A

Opportunity (or convenience) sampling is where the sample is chosen from a section of the population that is most convenient for the sampler.

29
Q

How do you select an opportunity sample?

A

Choose members of the population that are easiest to sample.

30
Q

What are advantages of opportunity sampling?

A

Data can be gathered very quickly and easily.

31
Q

What are disadvantages of opportunity sampling?

A

It isn’t random and can be very biased - there is no attempt to make the sample representative.

32
Q

What is cluster sampling?

A

Cluster sampling divides the population into groups (clusters) that are each expected to give similar results, then selects some of the clusters for the sample.

33
Q

How do you select a cluster sample?

A
  • Divide the population into clusters covering the whole population, where no member of the population belongs to multiple clusters.
  • Randomly select clusters to use in the sample based on the required sample size.
  • Either use all the members of the selected clusters (a one stage cluster sample) or randomly sample within each cluster (two stage cluster sampling).
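
The steps above can be sketched in Python. This is an illustration only, assuming the clusters are already defined as non-overlapping lists; `cluster_sample`, its parameters and the five towns are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, per_cluster=None, seed=None):
    """Randomly pick clusters, then use all members or a random subset of each."""
    rng = random.Random(seed)
    # Randomly select which clusters to use.
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        members = clusters[name]
        if per_cluster is None:
            # One-stage: use every member of the selected cluster.
            sample.extend(members)
        else:
            # Two-stage: randomly sample within the selected cluster.
            sample.extend(rng.sample(members, per_cluster))
    return chosen, sample

# Hypothetical population: 5 towns of 50 people each.
clusters = {
    f"town_{t}": [f"town_{t}_person_{i}" for i in range(50)] for t in "ABCDE"
}
chosen_one, one_stage = cluster_sample(clusters, 2, seed=0)
chosen_two, two_stage = cluster_sample(clusters, 2, per_cluster=10, seed=0)
```

The one-stage call returns all 100 people from two towns; the two-stage call returns 10 people from each of two towns.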
34
Q

What are advantages of cluster sampling?

A
  • It can be quicker and cheaper in certain situations.
  • You can incorporate other sampling methods, making it quite adaptable.

35
Q

What are disadvantages of cluster sampling?

A
  • Because you only sample certain clusters the results could be less representative.
  • It’s not always possible to separate a population into clusters in a natural way.
36
Q

What is self-selection sampling?

A

Self-selection (or volunteer) sampling is where people choose to be part of the sample.

37
Q

How do you select a self-selection sample?

A
  • Advertise or appeal to the whole population for participation in the sample (possibly offering payment).
  • Either use everyone who responds as the sample or take a sample of them to best represent the population.
38
Q

What are advantages of self-selection sampling?

A
  • It requires little time or effort in finding sample members, as they contact you.
  • People who have volunteered are more likely to respond.
  • It could be the only way to get people to take part in a study, or to find members of a population.
39
Q

What are disadvantages of self-selection sampling?

A

There can easily be trends within the respondents, such as people having strong opinions, which would lead to bias.

40
Q

What makes up data?

A

Observations/measurements, each recording a value of a particular variable.

41
Q

What is a qualitative variable?

A

A variable that takes non-numerical values (e.g. names, colours).

42
Q

What is a quantitative variable?

A

A variable that takes numerical values (e.g. height, age).

43
Q

What is a discrete quantitative variable?

A

A variable that can only take certain values - there are ‘gaps’ between possible values (e.g. a shoe size can’t be 9.664).

44
Q

What is a continuous quantitative variable?

A

A variable that can take any value within a particular range (e.g. height or mass)

45
Q

What is an upper class boundary?

A

The largest data value that would be included in that class. (If the data is continuous the upper class boundary of a class will be the same as the lower class boundary of the next class.)

46
Q

What is a lower class boundary?

A

The lowest value that would be included in that class.

47
Q

How do you find the class width of a class?

A

upper class boundary - lower class boundary

48
Q

How do you find the midpoint of a class?

A

(lower class boundary + upper class boundary)/2

49
Q

How do you plot a frequency polygon?

A

Take the midpoint of each class. Plot the midpoints on the x-axis against the corresponding frequencies on the y-axis, then join the points with straight lines.

50
Q

How do you find the frequency density?

A

frequency/class width
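
As a quick worked sketch in Python, with made-up classes given as (lower class boundary, upper class boundary, frequency); the function name is hypothetical:

```python
def frequency_density(lower, upper, frequency):
    """Frequency density = frequency / class width."""
    class_width = upper - lower
    return frequency / class_width

# Hypothetical grouped data: (lower boundary, upper boundary, frequency).
classes = [(0, 10, 5), (10, 20, 12), (20, 40, 8)]
densities = [frequency_density(lo, hi, f) for lo, hi, f in classes]
```

The last class is twice as wide as the others, so its 8 values give a density of 8/20 = 0.4, lower than the first class's 5/10 = 0.5 even though its frequency is higher.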

51
Q

How do you plot a histogram?

A

Plot the values of the variable on the x-axis and the frequency density on the y-axis. Plot boxes the width of the class width and height of the frequency density.

52
Q

How do you find frequency on a histogram?

A

The area of a bar (class width * frequency density) gives the frequency for that class.

53
Q

How do you find the proportion of data values in a class from a histogram?

A

area of class/total area of all bars
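
A small Python sketch of this calculation, reading each bar off a histogram as (class width, frequency density); the three bars are made-up values:

```python
# Hypothetical bars read from a histogram: (class width, frequency density).
bars = [(10, 0.5), (10, 1.2), (20, 0.4)]

# Area of each bar = class width * frequency density.
areas = [width * density for width, density in bars]

# Proportion of data values in the second class.
proportion = areas[1] / sum(areas)
```

Here the second bar has area 12 out of a total area of 25, so it holds 12/25 = 0.48 of the data.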

54
Q

How do you draw a stem and leaf diagram?

A

Use the last digit of each value as the leaf and the remaining leading digits as the stem. Write the stems in order in a column and list the leaves in order beside each stem. Always write a key.

55
Q

What does it mean if data is symmetrical?

A

The data is symmetrical about the mean and the median, which are roughly equal.

56
Q

What does it mean if data is positively skewed?

A

The data is concentrated in the lower part of the range (on the left), with a longer tail towards the higher values.

57
Q

What does it mean if data is negatively skewed?

A

The data is concentrated in the upper part of the range (on the right), with a longer tail towards the lower values.

58
Q

What does it mean if data is unimodal?

A

The data has one point where the distribution ‘peaks’.

59
Q

What does it mean if data is bimodal?

A

The data has two points where the distribution ‘peaks’.

60
Q

What is a measure of location (or of central tendency)?

A

A value that summarises where the ‘centre’ of the data lies.

61
Q

What is the formula for the mean?

A

The sum of all values/The number of values

62
Q

What is the formula for the mean of frequency data?

A

(The sum of (frequency * data value))/The total frequency
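
A minimal Python sketch of this formula, assuming the frequency table is stored as a dictionary of data value to frequency; the table itself is made up:

```python
def mean_from_frequencies(table):
    """Mean = sum(frequency * data value) / total frequency."""
    total_frequency = sum(table.values())
    weighted_sum = sum(value * freq for value, freq in table.items())
    return weighted_sum / total_frequency

# Hypothetical table: number of siblings -> number of students.
siblings = {0: 3, 1: 5, 2: 2}
mean_siblings = mean_from_frequencies(siblings)
```

For this table the calculation is (0*3 + 1*5 + 2*2)/10 = 9/10 = 0.9.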

63
Q

What is the formula for the combined mean of two data sets?

A

((size1 * mean1) + (size2 * mean2))/(size1 + size2)
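
A short Python sketch of the formula, using made-up sizes and means:

```python
def combined_mean(size1, mean1, size2, mean2):
    """Combined mean = total of all values / total number of values."""
    # size * mean recovers the sum of the values in each data set.
    return (size1 * mean1 + size2 * mean2) / (size1 + size2)

# Hypothetical data sets: 10 values with mean 4 and 30 values with mean 6.
overall = combined_mean(10, 4.0, 30, 6.0)
```

The result is (40 + 180)/40 = 5.5, which sits closer to 6 because the second data set is larger. Simply averaging the two means would give 5, which is wrong unless the sets are the same size.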

64
Q

What is the median?

A

The value in the middle of the data set when all the data values are placed in order of size.

65
Q

How do you find the median?

A

Put the data values in order, then:

  • If (number of values)/2 is not a whole number round it up to find the position of the median.
  • If (number of values)/2 is a whole number the median is halfway between the value in this position and the next value.
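
The rule above can be sketched in Python as follows. The function name `find_median` is hypothetical, and the standard library's `statistics.median` gives the same result.

```python
import math

def find_median(values):
    """Median by position: order the data, then use n/2 to find the middle."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        # n/2 is not a whole number: round it up to get the position.
        position = math.ceil(n / 2)
        return ordered[position - 1]  # positions are 1-based
    # n/2 is a whole number: halfway between this position and the next.
    position = n // 2
    return (ordered[position - 1] + ordered[position]) / 2

odd_case = find_median([7, 1, 3])       # 3 values, position 2
even_case = find_median([4, 1, 3, 2])   # 4 values, between positions 2 and 3
```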
66
Q

What is the mode?

A

Most frequently occurring data value.

67
Q

How do you find the modal class of grouped data?

A

The modal class is the class with the highest frequency density. If all the classes are the same width, then this will just be the class with the highest frequency.

68
Q

How do you estimate the mean of a grouped data set?

A

(The sum of (frequency * midpoint of class))/The total frequency
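
A minimal Python sketch of this estimate, assuming each class is given as (lower class boundary, upper class boundary, frequency); the classes are made up:

```python
def estimate_grouped_mean(classes):
    """Estimated mean = sum(frequency * class midpoint) / total frequency."""
    total_frequency = sum(f for _, _, f in classes)
    weighted_sum = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return weighted_sum / total_frequency

# Hypothetical grouped data: (lower boundary, upper boundary, frequency).
classes = [(0, 10, 5), (10, 20, 12), (20, 40, 8)]
estimated_mean = estimate_grouped_mean(classes)
```

The midpoints are 5, 15 and 30, so the estimate is (25 + 180 + 240)/25 = 17.8. It is only an estimate because the individual values within each class are unknown.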

69
Q

How do you estimate the median of a grouped data set?

A

Find which class the median is in, using the value of n/2, then use linear interpolation.

70
Q

How do you do linear interpolation?

A

Assume the values are evenly spread within the class containing the median m. Then (m - lower class boundary)/class width = (n/2 - cumulative frequency before the class)/frequency of the class. Write out this equation with the known values and solve for m.
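
A sketch of linear interpolation for the median of grouped data, written in Python. It assumes values are evenly spread within the median class and that each class is given as (lower class boundary, upper class boundary, frequency); the function name and the data are hypothetical.

```python
def estimate_grouped_median(classes):
    """Estimate the median of grouped data by linear interpolation."""
    n = sum(f for _, _, f in classes)
    target = n / 2  # position of the median
    cumulative = 0  # cumulative frequency before the current class
    for lower, upper, f in classes:
        if cumulative + f >= target:
            # Fraction of the way into this class, applied to its width.
            fraction = (target - cumulative) / f
            return lower + fraction * (upper - lower)
        cumulative += f

# Hypothetical grouped data: (lower boundary, upper boundary, frequency).
classes = [(0, 10, 5), (10, 20, 12), (20, 40, 8)]
estimated_median = estimate_grouped_median(classes)
```

Here n/2 = 12.5, which falls in the 10 to 20 class (cumulative frequency 5 before it, frequency 12), so m = 10 + (7.5/12)*10 = 16.25.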

71
Q

What are the qualities of the mean?

A
  • It is a good average as it uses all of the data.
  • It can be heavily affected by extreme values/outliers and by skew.
  • It can only be used with quantitative data.
72
Q

What are the qualities of the median?

A
  • The median is not affected by extreme values or by data that is skewed.
  • Estimating the median for grouped data requires a lot of work.
73
Q

What are the qualities of the mode?

A
  • It can be used with qualitative data.
  • Data can have more than one mode, and if every value occurs only once then there is no mode.

74
Q

What is dispersion?

A

Dispersion is how spread out the data values are. The simplest measure of dispersion is the range.

75
Q

What is the equation for range?

A

Range = highest value - lowest value