Statistics Flashcards

1
Q

Three methods of Random Sampling

A

1- simple random sampling
2- systematic sampling
3- stratified sampling

2
Q

State what a simple random sample is

A

A simple random sample of size ‘n’ is one where every possible sample of size ‘n’ has an equal chance of being selected.

  • e.g. each person in a group is allocated a number, and a selection of these numbers is chosen at random
3
Q

Two methods of choosing the unique numbers when simple random sampling

A
  • generating random numbers using a calculator, computer or random number table
  • lottery sampling: names are written on IDENTICAL tickets and drawn from a ‘hat’
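The random-number method above can be sketched in Python. This is a minimal illustration, assuming a hypothetical population numbered 1 to 100; the function name `simple_random_sample` and the `seed` parameter are made up for this example.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Pick n distinct members so that every subset of size n is equally likely."""
    rng = random.Random(seed)          # a seed only makes the draw repeatable
    return rng.sample(population, n)   # sampling WITHOUT replacement

people = list(range(1, 101))           # hypothetical population numbered 1..100
sample = simple_random_sample(people, 10, seed=1)
print(sample)
```

`random.sample` never picks the same number twice, which matches drawing tickets from a hat without putting them back.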
4
Q

State what systematic sampling is

A

The required elements are chosen at regular intervals from an ordered list.

  • e.g. if you needed a sample of size 20 from a population of 100, you would take every 5th person in the ordered list (100 / 20 = 5). NOTE: the first person should be chosen at RANDOM from the first 5.
      • e.g. if the 2nd person is chosen first, the sampled people would be 2, 7, 12, 17, etc.
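The example above (population of 100, sample of 20, interval of 5) can be sketched as follows. This is only an illustration: the function name `systematic_sample` and the numbered population are assumptions, not part of the original card.

```python
import random

def systematic_sample(ordered_list, n, seed=None):
    """Take every k-th element, starting from a random point in the first k."""
    k = len(ordered_list) // n         # sampling interval, e.g. 100 // 20 = 5
    rng = random.Random(seed)
    start = rng.randrange(k)           # random 0-based start within the first k
    return [ordered_list[start + i * k] for i in range(n)]

people = list(range(1, 101))           # hypothetical ordered list numbered 1..100
sample = systematic_sample(people, 20, seed=3)
print(sample)

# If the random start happens to be the 2nd person, the sample is fixed:
from_second = list(range(2, 101, 5))   # 2, 7, 12, 17, ...
print(from_second[:4])
```

Only the starting point is random; once it is chosen, every other member of the sample is determined by the interval.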
5
Q

State what stratified sampling is

A

The population is divided into mutually exclusive strata (e.g. males and females, age range categories, etc.) and a random sample is taken from each.

6
Q

What should you remember about each stratum sampled in stratified sampling?

A

The proportion sampled from each stratum must be the same.

-e.g. if there are 150 in a population (100 males and 50 females) and 75 were required to be sampled, then there should be 50 males and 25 females in the sample

7
Q

State the formula used to calculate the number of people we should sample from each stratum

A

number sampled in a stratum = (number in stratum / number in population ) x overall required sample size
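
The formula can be checked against the 100 males / 50 females example with a short sketch. The function name `stratum_sample_sizes` and the dictionary layout are assumptions made for illustration.

```python
def stratum_sample_sizes(strata_sizes, sample_size):
    """Apply (number in stratum / number in population) x required sample size."""
    population = sum(strata_sizes.values())
    # Multiply before dividing so whole-number answers come out exactly.
    return {name: size * sample_size / population
            for name, size in strata_sizes.items()}

sizes = stratum_sample_sizes({"males": 100, "females": 50}, 75)
print(sizes)  # {'males': 50.0, 'females': 25.0}
```

When the result is not a whole number it has to be rounded to a whole number of people; the card does not say how, so the sketch leaves the values unrounded.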

8
Q

Advantages of simple random sampling (3)

A
  • free of bias
  • easy and cheap to implement for small populations and small samples
  • each sampling unit has a known and equal chance of selection
9
Q

Disadvantages of simple random sampling (2)

A
  • not suitable when the population size or sample size is too large
  • a sampling frame is needed
10
Q

Advantages of systematic sampling (2)

A
  • simple and quick to use
  • suitable for large samples and large populations

11
Q

Disadvantages of systematic sampling (2)

A
  • a sampling frame is needed
  • it can introduce bias if the ordering of the sampling frame is not random (e.g. the list follows a repeating pattern)

12
Q

Advantages of stratified sampling (2)

A
  • sample accurately reflects the population structure
  • guarantees proportional representation of certain groups within a population

13
Q

Disadvantages of stratified sampling (2)

A
  • population must be clearly classified into distinct strata (strata meaning - groups/categories)
  • selection within each stratum suffers from the same disadvantages as simple random sampling (not suitable when population/sample is too large + sampling frame needed)
14
Q

Two types of non-random sampling

A
  • quota sampling
  • opportunity sampling

15
Q

State what quota sampling is

A

When an interviewer or researcher selects a sample that reflects the characteristics of the whole population.

16
Q

How quota sampling works

A

The population is divided into groups according to a given characteristic.

The size of each group determines the proportion of the sample that should have that characteristic.

17
Q

State what opportunity sampling is

A

It consists of taking the sample from people who are available at the time the study is carried out and who fit the criteria you’re looking for.

-e.g. first 20 people you meet outside a supermarket on a Monday morning who are carrying shopping bags

18
Q

Advantages of quota sampling (4)

A
  • allows a small sample to still be representative of the population
  • no sampling frame needed
  • quick, easy, inexpensive
  • allows for easy comparison between different groups within a population
19
Q

Disadvantages of quota sampling (4)

A
  • non-random sampling can introduce bias
  • population must be divided into groups, which can be costly or inaccurate
  • increasing scope of study increases number of groups, which adds time and expense
  • non-responses are not recorded as such
20
Q

Advantages of opportunity sampling (2)

A
  • easy to carry out
  • inexpensive

21
Q

Disadvantages of opportunity sampling (2)

A
  • unlikely to provide a representative sample
  • highly dependent on individual researcher

22
Q

What are data/variables with numerical observations called?

A

QUANTITATIVE data/variables

-e.g. shoe sizes are numbers

23
Q

What are data/variables with non-numerical observations called?

A

QUALITATIVE data/variables

-e.g. hair colour, you can’t give a number to each colour

24
Q

Give an example of a continuous variable

A

(any from)

  • height
  • weight
  • time
25
Q

Give an example of a discrete variable

A

(any from)

  • number of people
  • number given when a dice is rolled
26
Q

Continuous variables can …

A

… take ANY value in a given range

27
Q

Discrete variables can …

A

… take ONLY SPECIFIC values in a given range

28
Q

Mode is …

A

the value that occurs most often.

29
Q

Median is …

A

the middle value when the values are all put in order.

30
Q

Equation for median is…

A

(n + 1)/2 = x

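- ‘x’ being the position of the median, i.e. the median is the ‘x’th value when the data set is put in order (if ‘x’ is not a whole number, take the mean of the values either side of it)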
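
The position rule (n + 1)/2 can be sketched in Python as below. The function name `median_by_position` is made up for this example, and averaging the two neighbouring values when the position ends in .5 is the usual convention rather than something stated on the card.

```python
def median_by_position(values):
    """Find the ((n + 1) / 2)th value of the ordered data."""
    ordered = sorted(values)
    position = (len(ordered) + 1) / 2          # 1-based position of the median
    if position.is_integer():
        return ordered[int(position) - 1]      # a single middle value
    lower = ordered[int(position) - 1]         # value just below the position
    upper = ordered[int(position)]             # value just above the position
    return (lower + upper) / 2

odd_case = median_by_position([7, 3, 5])       # position 2   -> 5
even_case = median_by_position([4, 1, 3, 2])   # position 2.5 -> 2.5
print(odd_case, even_case)
```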

31
Q

Mean is …

A

the sum of all the data values divided by the number of values (the “average” of the data)

32
Q

Mean can be calculated using the formula:

A

x̄ = Σx / n

Where:

  • ‘x̄’ is called ‘x bar’. This represents the mean
  • Σx is the sum of all of the data values
  • n is the number of data values
33
Q

Mean from a frequency table can be calculated using the formula:

A

x̄ = Σ(xf) / Σf

Where:

  • ‘x̄’ is called ‘x bar’. This represents the mean
  • Σ(xf) is the sum of the products of the data values (‘x’) and their frequencies (‘f’)
      • e.g. (x₁ × f₁) + (x₂ × f₂) + (x₃ × f₃) + … = Σ(xf)
  • Σf is the sum of the frequencies
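
As a worked illustration of x̄ = Σ(xf) / Σf, here is a short sketch. The table values are invented for the example, and `frequency_table_mean` is a hypothetical helper name.

```python
def frequency_table_mean(table):
    """table is a list of (x, f) pairs: a data value and its frequency."""
    sum_xf = sum(x * f for x, f in table)   # Σ(xf)
    sum_f = sum(f for _, f in table)        # Σf
    return sum_xf / sum_f

# Hypothetical table: value 1 occurs twice, 2 occurs 3 times, 3 occurs 5 times.
table = [(1, 2), (2, 3), (3, 5)]
mean = frequency_table_mean(table)          # Σ(xf) = 2 + 6 + 15 = 23, Σf = 10
print(mean)  # 2.3
```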
34
Q

Is the median of a set of data affected by extreme values?

A

No, as the median depends only on the middle value(s) of the ordered data, so making an extreme value more extreme does not change it

35
Q

Is the mean of a set of data affected by extreme values?

A

Yes, as it takes into account each value from the whole data set when calculating the mean of a set of data.
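
The difference between the two measures can be seen with a small made-up data set. This sketch uses Python's standard `statistics` module; the numbers are purely illustrative.

```python
from statistics import mean, median

data = [1, 2, 3, 4, 5]
with_extreme = [1, 2, 3, 4, 500]   # largest value swapped for an extreme one

mean_before, median_before = mean(data), median(data)
mean_after, median_after = mean(with_extreme), median(with_extreme)

print(mean_before, median_before)  # mean and median agree here
print(mean_after, median_after)    # only the mean has moved
```

The mean jumps from 3 to 102 because every value feeds into Σx, while the median stays at 3 because the middle value is unchanged.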

36
Q

Is the mode useful if each value in a set of data occurs only once?

A

No, you need at least one value that occurs more often than the others; otherwise no value stands out.

37
Q

What is it called when a set of data has two modes?

A

Bimodal
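
The mode cards above can be illustrated with `statistics.multimode` from Python's standard library (available from Python 3.8), which returns every value tied for most frequent. The data sets are invented for the example.

```python
from statistics import multimode

one_mode = multimode([1, 2, 2, 3])          # 2 occurs most often
two_modes = multimode([1, 2, 2, 3, 3, 4])   # 2 and 3 tie: bimodal
all_tied = multimode([1, 2, 3])             # every value occurs once

print(one_mode, two_modes, all_tied)
```

When every value occurs once, all of them are returned, which shows why the mode is not useful for that kind of data.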

38
Q

What value of x would you use, when given a frequency table with class intervals (e.g. 30 - 31, 32 - 33, …etc.)?

A

You would take the midpoint of the class interval (for this e.g. 30.5, 32.5, …etc.).

39
Q

When the mean is calculated from a frequency table, is it always going to be completely accurate?

A

No, it will be an estimate.

  • This is because you’re using the midpoint of each class interval. The true values could be anywhere within that interval.
      • e.g. if the interval is 30-31 mm, the midpoint is 30.5 mm, which you use to calculate the mean. However, all the values could potentially be 30.1 mm, and you cannot tell this from a frequency table. Therefore it is an estimate.
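
A sketch of the estimated mean from class intervals, using midpoints as the x values. The frequencies below are invented for illustration, and `estimated_mean` is a hypothetical helper name.

```python
def estimated_mean(classes):
    """classes is a list of (lower, upper, frequency) tuples."""
    sum_fx = 0.0
    sum_f = 0
    for lower, upper, f in classes:
        midpoint = (lower + upper) / 2   # stands in for every value in the class
        sum_fx += midpoint * f
        sum_f += f
    return sum_fx / sum_f

# Hypothetical grouped table with class intervals 30-31, 32-33, 34-35 (mm).
classes = [(30, 31, 2), (32, 33, 5), (34, 35, 3)]
estimate = estimated_mean(classes)       # (61 + 162.5 + 103.5) / 10
print(estimate)  # 32.7
```

The result is only an estimate because the raw values inside each class are unknown.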

40
Q

Formula used to calculate the LOWER quartile

A

Position of L.Q = n/4

It will be the (n/4)th value when the data is put in increasing order.

41
Q

Formula used to calculate the UPPER quartile

A

Position of U.Q = 3n/4

It will be the (3n/4)th value when the data is put in increasing order.

42
Q

What is a percentile?

A

A percentile is one of the values that divide the ordered data set into 100 equal parts.

E.g. the 10th percentile lies one-tenth of the way through the data set.

43
Q

Interpolation is when you …

A

… assume that the data values are evenly distributed within each class. (See page 26 of the Stats+Mechanics Y1 book for a clear example of how to interpolate.)
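
Under that even-spread assumption, a value at a given position inside a class can be estimated by standard linear interpolation. The sketch below is not taken from the book referenced above; the function name, parameter names and numbers are all assumptions made for illustration.

```python
def interpolate(position, lower_bound, class_width, cum_freq_before, class_freq):
    """Estimate the value at `position`, assuming values are spread evenly
    across the class that contains it."""
    fraction = (position - cum_freq_before) / class_freq   # how far into the class
    return lower_bound + fraction * class_width

# Hypothetical case: 15 values lie below a class running from 30 to 32
# that contains 10 values, and we want the 20th value.
value = interpolate(position=20, lower_bound=30, class_width=2,
                    cum_freq_before=15, class_freq=10)
print(value)  # 31.0
```

The 20th value is 5 of the 10 values into the class, so it is placed halfway across the class width.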

44
Q

How to calculate range from a set of data?

A

LARGEST - smallest = range

45
Q

How to calculate interquartile range from a set of data?

A

UPPER quartile - lower quartile = IQR

46
Q

What is the interpercentile range?

A

(Higher given percentile) - (lower given percentile) = interpercentile range

47
Q

Upper quartile is represented as …

A

Q_3

48
Q

Lower quartile is represented as …

A

Q_1

49
Q

Median is represented as …

A

Q_2

50
Q

Variance is …

A

… the average of the squared distances from the mean.

51
Q

Why are the deviations squared when calculating the variance?

A

To eliminate negative deviations (from values below the mean), so that they do not cancel out the positive ones

52
Q

Standard deviation is …

A

… a measure of the amount of variation of a set of values.

Basically, it is how spread out the data is.

53
Q

Formula for Variance

A

σ² = (Σx^2 / n) - (Σx / n)^2

54
Q

Formula for Standard Deviation

A

σ = √ (σ²)
… which is just square rooting variance so…
σ = √ [ (Σx^2 / n) - (Σx / n)^2 ]
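
The two formulas above can be checked with a short sketch. The data set is invented so that the arithmetic comes out in whole numbers, and the function names are hypothetical.

```python
import math

def variance(values):
    """σ² = (Σx² / n) - (Σx / n)²"""
    n = len(values)
    mean_of_squares = sum(x * x for x in values) / n
    square_of_mean = (sum(values) / n) ** 2
    return mean_of_squares - square_of_mean

def standard_deviation(values):
    """σ is the square root of the variance."""
    return math.sqrt(variance(values))

data = [2, 4, 4, 4, 5, 5, 7, 9]      # Σx = 40, Σx² = 232, n = 8
var = variance(data)                 # 232/8 - (40/8)² = 29 - 25
sd = standard_deviation(data)
print(var, sd)  # 4.0 2.0
```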

55
Q

Relationship between standard deviation and variance

A

σ = √(σ²)
… so to find standard deviation when you’ve got a value for variance, all you need to do is SQUARE ROOT it!

Where:

  • σ² is the variance; and
  • σ is the standard deviation
56
Q

Formula for Variance (in a frequency table)

A

σ² = (Σf(x^2) / Σf) - (Σfx / Σf)^2

57
Q

Formula for Standard Deviation (in a frequency table)

A

σ = √ (σ²)
… which is just square rooting the formula for variance so…
σ = √[ (Σf(x^2) / Σf) - (Σfx / Σf)^2 ]
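
The frequency-table versions can be sketched the same way. The table is invented for illustration and the function name `frequency_table_variance` is hypothetical.

```python
import math

def frequency_table_variance(table):
    """σ² = (Σf(x²) / Σf) - (Σfx / Σf)², where table holds (x, f) pairs."""
    sum_f = sum(f for _, f in table)
    sum_fx = sum(f * x for x, f in table)
    sum_fx2 = sum(f * x * x for x, f in table)
    return sum_fx2 / sum_f - (sum_fx / sum_f) ** 2

# Hypothetical table: 1 occurs once, 2 occurs twice, 3 occurs once.
table = [(1, 1), (2, 2), (3, 1)]                 # Σf = 4, Σfx = 8, Σfx² = 18
ft_variance = frequency_table_variance(table)    # 18/4 - (8/4)² = 4.5 - 4
ft_sd = math.sqrt(ft_variance)
print(ft_variance, ft_sd)
```

Expanding the table into the raw list 1, 2, 2, 3 and applying the ungrouped formula gives the same variance of 0.5.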