Statistics Flashcards

Question 1

Q

Three methods of Random Sampling

Answer

A

1- simple random sampling
2- systematic sampling
3- stratified sampling

Question 2

Q

State what a simple random sample is

Answer

A

A simple random sample of size ‘n’ is one where every possible sample of size ‘n’ has an equal chance of being selected.

e.g. each person in a group is allocated a unique number, and a selection of these numbers is chosen at random

Question 3

Q

Two methods of choosing the unique numbers when simple random sampling

Answer

A

generating random numbers using a calculator, computer or random number table
lottery sampling: names written on IDENTICAL tickets are drawn from a ‘hat’
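
A minimal Python sketch of the "random numbers from a computer" method. The frame of 100 numbered people and the function name are assumptions made for illustration only.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Pick n distinct members; every possible group of size n is equally likely."""
    rng = random.Random(seed)  # the seed is only here to make the sketch repeatable
    return rng.sample(frame, n)

# Hypothetical sampling frame: people numbered 1 to 100
frame = list(range(1, 101))
sample = simple_random_sample(frame, 10, seed=1)
print(sample)
```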

Question 4

Q

State what systematic sampling is

Answer

A

The required elements are chosen at regular intervals from an ordered list.

-e.g. if you needed a sample size of 20 and had a population of 100, you would take every 5th person in that population (100 / 20 = 5). NOTE: the first person should be chosen at RANDOM from the first 5.
-e.g. if the 2nd person is chosen first, then the next sampled people would be the 7th, 12th, 17th, etc.
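
A minimal Python sketch of systematic sampling, assuming the population size divides exactly by the sample size. The list of 100 numbered people and the function name are made up for illustration.

```python
import random

def systematic_sample(ordered_list, n, seed=None):
    """Take every k-th element after a random start within the first k."""
    k = len(ordered_list) // n     # sampling interval, e.g. 100 // 20 = 5
    rng = random.Random(seed)      # seeded only so the sketch is repeatable
    start = rng.randrange(k)       # random 0-based index among the first k elements
    return ordered_list[start::k][:n]

# Hypothetical ordered list: people numbered 1 to 100
people = list(range(1, 101))
sample = systematic_sample(people, 20, seed=0)
print(sample)
```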

Question 5

Q

State what stratified sampling is

Answer

A

The population is divided into mutually exclusive strata (males and females, age range categories, etc.) and a random sample is taken from each.

Question 6

Q

What should you remember about each stratum sampled in stratified sampling?

Answer

A

The proportion sampled from each stratum must be the same.

-e.g. if there are 150 in a population (100 males and 50 females) and 75 were required to be sampled, then there should be 50 males and 25 females in the sample

Question 7

Q

State the formula used to calculate the number of people we should sample from each stratum

Answer

A

number sampled in a stratum = (number in stratum / number in population ) x overall required sample size
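
A minimal Python sketch of this formula, using the 100 males / 50 females example with a required sample of 75. The function name is an assumption; results that are not whole numbers would still need rounding.

```python
def stratum_sample_sizes(strata_sizes, total_sample):
    """number sampled = (number in stratum / number in population) x sample size."""
    population = sum(strata_sizes.values())
    # Multiply before dividing to keep the arithmetic exact where possible
    return {name: size * total_sample / population
            for name, size in strata_sizes.items()}

sizes = stratum_sample_sizes({"males": 100, "females": 50}, 75)
print(sizes)  # {'males': 50.0, 'females': 25.0}
```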

Question 8

Q

Advantages of simple random sampling (3)

Answer

A

free of bias
easy and cheap to implement for small populations and small samples
each sampling unit has a known and equal chance of selection

Question 9

Q

Disadvantages of simple random sampling (2)

Answer

A

not suitable when the population size or sample size is too large
a sampling frame is needed

Question 10

Q

Advantages of systematic sampling (2)

Answer

A

simple and quick to use
suitable for large samples and large populations

Question 11

Q

Disadvantages of systematic sampling (2)

Answer

A

a sampling frame is needed
it can introduce bias if the sampling frame is not random

Question 12

Q

Advantages of stratified sampling (2)

Answer

A

sample accurately reflects the population structure
guarantees proportional representation of certain groups within a population

Question 13

Q

Disadvantages of stratified sampling (2)

Answer

A

population must be clearly classified into distinct strata (strata meaning - groups/categories)
selection within each stratum suffers from the same disadvantages as simple random sampling (not suitable when population/sample is too large + sampling frame needed)

Question 14

Q

Two types of non-random sampling

Answer

A

quota sampling
opportunity sampling

Question 15

Q

State what quota sampling is

Answer

A

When an interviewer or researcher selects a sample that reflects the characteristics of the whole population.

Question 16

Q

How quota sampling works

Answer

A

Population divided into groups according to a given characteristic.

The size of each group determines the proportion of the sample that should have that characteristic.

Question 17

Q

State what opportunity sampling is

Answer

A

It consists of taking the sample from people who are available at the time the study is carried out and who fit the criteria you’re looking for.

-e.g. first 20 people you meet outside a supermarket on a Monday morning who are carrying shopping bags

Question 18

Q

Advantages of quota sampling (4)

Answer

A

allows a small sample to still be representative of the population
no sampling frame needed
quick, easy, inexpensive
allows for easy comparison between different groups within a population

Question 19

Q

Disadvantages of quota sampling (4)

Answer

A

non-random sampling can introduce bias
population must be divided into groups, which can be costly or inaccurate
increasing scope of study increases number of groups, which adds time and expense
non-responses are not recorded as such

Question 20

Q

Advantages of opportunity sampling (2)

Answer

A

easy to carry out
inexpensive

Question 21

Q

Disadvantages of opportunity sampling (2)

Answer

A

unlikely to provide a representative sample
highly dependent on individual researcher

Question 22

Q

What are data/variables with numerical observations called?

Answer

A

QUANTITATIVE data/variables

-e.g. shoe sizes are numbers

Question 23

Q

What are data/variables with non-numerical observations called?

Answer

A

QUALITATIVE data/variables

-e.g. hair colour, you can’t give a number to each colour

Question 24

Q

Give an example of a continuous variable

Answer

A

(any from)

height
weight
time

Question 25

Q

Give an example of a discrete variable

Answer

A

(any from)

number of people
number given when a dice is rolled

Question 26

Q

Continuous variables can …

Answer

A

… take ANY value in a given range

Question 27

Q

Discrete variables can …

Answer

A

… take ONLY SPECIFIC values in a given range

Question 28

Q

Mode is …

Answer

A

the value that occurs most often.

Question 29

Q

Median is …

Answer

A

the middle value when the values are all put in order.

Question 30

Q

Equation for median is…

Answer

A

(n + 1)/2 = x

- ‘x’ being the ‘x’th value when the data set is put in order
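
A minimal Python sketch of the (n + 1)/2 rule. It assumes the usual convention that when (n + 1)/2 ends in .5, the median is halfway between the two values either side of that position.

```python
def median(values):
    """Find the ((n + 1) / 2)th value of the ordered data."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2                 # 1-based position in the ordered list
    if position.is_integer():
        return ordered[int(position) - 1]  # odd n: a single middle value
    below = ordered[int(position) - 1]     # even n: position ends in .5,
    above = ordered[int(position)]         # so average the two neighbours
    return (below + above) / 2

print(median([3, 1, 2]))     # 2
print(median([4, 1, 3, 2]))  # 2.5
```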

Question 31

Q

Mean is …

Answer

A

the “average of the data”

Question 32

Q

Mean can be calculated using the formula:

Answer

A

x̄ = Σx / n

Where:

‘x̄’ is called ‘x bar’. This represents the mean
Σx is the sum of all of the data values
n is the number of data values

Question 33

Q

Mean in a frequency table can be calculated using the formula:

Answer

A

x̄ = Σ(xf) / Σf

Where:

‘x̄’ is called ‘x bar’. This represents the mean
Σ(xf) is the sum of the products of the data values (‘x’) and their frequencies (‘f’)
-e.g. (x1 * f1) + (x2 * f2) + (x3 * f3) + … = Σ(xf)
Σf is the sum of the frequencies
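
A minimal Python sketch of x̄ = Σ(xf) / Σf. The frequency table is invented for illustration.

```python
def frequency_mean(table):
    """table is a list of (x, f) pairs: a data value and its frequency."""
    total_xf = sum(x * f for x, f in table)  # Σ(xf)
    total_f = sum(f for _, f in table)       # Σf
    return total_xf / total_f

# Hypothetical table: value 1 occurs twice, 2 occurs 3 times, 3 occurs 5 times
table = [(1, 2), (2, 3), (3, 5)]
print(frequency_mean(table))  # Σ(xf) = 23, Σf = 10, so the mean is 2.3
```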

Question 34

Q

Is the median of a set of data affected by extreme values?

Answer

A

No, as the median depends only on the middle value(s) of the ordered data, so making the largest or smallest values more extreme does not change it.

Question 35

Q

Is the mean of a set of data affected by extreme values?

Answer

A

Yes, as every value in the data set is used when calculating the mean.

Question 36

Q

Is mode useful if in a set of data, each value only occurs once?

Answer

A

No, you need at least one value which occurs more often than the others; otherwise no value stands out.

Question 37

Q

What is it called when a set of data has two modes?

Answer

A

Bimodal

Question 38

Q

What value of x would you use, when given a frequency table with class intervals (e.g. 30 - 31, 32 - 33, …etc.)?

Answer

A

You would take the midpoint of the class interval (for this e.g. 30.5, 32.5, …etc.).

Question 39

Q

When the mean is calculated from a frequency table, is it always going to be completely accurate?

Answer

A

No, it will be an estimate.

This is because you use the midpoint of each class interval, while the true values could lie anywhere within that interval.

-e.g. if the interval is 30-31 mm, the midpoint of 30.5 mm is used to calculate the mean. However, all the values could potentially be 30.1 mm, and you cannot tell this from a frequency table. Therefore it is an estimate.
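
A minimal Python sketch of estimating the mean from class midpoints. The classes 30-31 and 32-33 come from the example above; the frequencies (4 and 6) are invented for illustration.

```python
def estimated_mean(classes):
    """classes is a list of (lower, upper, frequency) triples."""
    total_xf = 0.0
    total_f = 0
    for lower, upper, f in classes:
        midpoint = (lower + upper) / 2  # stands in for every value in the class
        total_xf += midpoint * f
        total_f += f
    return total_xf / total_f

# Hypothetical frequencies: 4 values in 30-31 and 6 values in 32-33
classes = [(30, 31, 4), (32, 33, 6)]
print(estimated_mean(classes))  # (30.5 * 4 + 32.5 * 6) / 10 = 31.7
```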

Question 40

Q

Formula used to calculate the LOWER quartile

Answer

A

L.Q = n/4

It will be the (n/4)th value when the data is put in increasing order.

Question 41

Q

Formula used to calculate the UPPER quartile

Answer

A

U.Q = 3n/4

It will be the (3/4 of n)th value when the data is put in increasing order.

Question 42

Q

What is a percentile?

Answer

A

It is one of the values that divide the ordered data set into 100 equal parts.

E.g. the 10th percentile lies one-tenth of the way through the data set.

Question 43

Q

Interpolation is when you …

Answer

A

… assume that the data values are evenly distributed within each class. (Go to page 26 of Stats+Mechanics Y1 book for clear example of how to interpolate)
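
A minimal Python sketch of interpolation in a grouped frequency table, not the textbook's own example. It assumes classes are given by their lower and upper boundaries and that the median of grouped data sits at position n/2; the table itself is invented.

```python
def interpolate(classes, position):
    """Estimate the value at a given position, assuming values are
    evenly spread within each class. classes: (lower, upper, frequency)."""
    cumulative = 0
    for lower, upper, f in classes:
        if f > 0 and cumulative + f >= position:
            fraction = (position - cumulative) / f  # how far into this class
            return lower + fraction * (upper - lower)
        cumulative += f
    raise ValueError("position is beyond the end of the data")

# Hypothetical table with class boundaries 0-10, 10-20, 20-30
classes = [(0, 10, 5), (10, 20, 10), (20, 30, 5)]
n = sum(f for _, _, f in classes)  # 20 values in total
print(interpolate(classes, n / 2))  # 10 + (10 - 5) / 10 * 10 = 15.0
```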

Question 44

Q

How to calculate range from a set of data?

Answer

A

LARGEST - smallest = range

Question 45

Q

How to calculate interquartile range from a set of data?

Answer

A

UPPER quartile - lower quartile = IQR
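
A minimal Python sketch of range and IQR using the n/4 and 3n/4 positions. The rule for picking a value is an assumption not stated on these cards: if the position is a whole number, take the midpoint of that value and the next; otherwise round the position up. Other conventions give slightly different quartiles.

```python
import math

def value_at_position(ordered, position):
    """Assumed rule: whole position -> midpoint with the next value,
    otherwise round the position up. Positions are 1-based."""
    if position == int(position):
        p = int(position)
        return (ordered[p - 1] + ordered[p]) / 2
    return ordered[math.ceil(position) - 1]

def spread_summary(values):
    ordered = sorted(values)
    n = len(ordered)
    lower_q = value_at_position(ordered, n / 4)      # L.Q position = n/4
    upper_q = value_at_position(ordered, 3 * n / 4)  # U.Q position = 3n/4
    return {
        "range": ordered[-1] - ordered[0],  # LARGEST - smallest
        "lower quartile": lower_q,
        "upper quartile": upper_q,
        "IQR": upper_q - lower_q,           # UPPER quartile - lower quartile
    }

summary = spread_summary([1, 2, 3, 4, 5, 6, 7, 8])
print(summary)
```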

Question 46

Q

What is the interpercentile range?

Answer

A

(Higher given percentile) - (lower given percentile) = interpercentile range

Question 47

Q

Upper quartile is represented as …

Answer

A

Q3

Question 48

Q

Lower quartile is represented as …

Answer

A

Q1

Question 49

Q

Median is represented as …

Answer

A

Q2

Question 50

Q

Variance is …

Answer

A

… the average of the squared distances from the mean.

Question 51

Q

Why are the deviations squared when calculating the variance?

Answer

A

To eliminate the negative deviations (from values below the mean), which would otherwise cancel out the positive ones.

Question 52

Q

Standard deviation is …

Answer

A

… a measure of the amount of variation of a set of values.

Basically, it is how widespread the data is.

Question 53

Q

Formula for Variance

Answer

A

σ² = (Σx^2 / n) - (Σx / n)^2

Question 54

Q

Formula for Standard Deviation

Answer

A

σ = √ (σ²)
… which is just square rooting variance so…
σ = √ [ (Σx^2 / n) - (Σx / n)^2 ]
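
A minimal Python sketch of these two formulas ("mean of the squares minus the square of the mean"). The data set is invented for illustration.

```python
import math

def variance(values):
    """σ² = (Σx² / n) - (Σx / n)²"""
    n = len(values)
    mean_of_squares = sum(x ** 2 for x in values) / n
    square_of_mean = (sum(values) / n) ** 2
    return mean_of_squares - square_of_mean

def standard_deviation(values):
    """σ is the square root of the variance."""
    return math.sqrt(variance(values))

# Hypothetical data: Σx = 40, Σx² = 232, n = 8
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(data))            # 232/8 - (40/8)² = 29 - 25 = 4.0
print(standard_deviation(data))  # 2.0
```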

Question 55

Q

Relationship between standard deviation and variance

Answer

A

σ = √(σ²)
… so to find standard deviation when you’ve got a value for variance, all you need to do is SQUARE ROOT it!

Where:

σ² is the variance and;
σ is the standard deviation

Question 56

Q

Formula for Variance (in a frequency table)

Answer

A

σ² = (Σf(x^2) / Σf) - (Σfx / Σf)^2

Question 57

Q

Formula for Standard Deviation (in a frequency table)

Answer

A

σ = √ (σ²)
… which is just square rooting the formula for variance so…
σ = √[ (Σf(x^2) / Σf) - (Σfx / Σf)^2 ]
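
A minimal Python sketch of the frequency-table versions of the formulas. The table is invented for illustration.

```python
import math

def frequency_variance(table):
    """σ² = (Σf(x²) / Σf) - (Σfx / Σf)², with table as (x, f) pairs."""
    total_f = sum(f for _, f in table)           # Σf
    total_fx = sum(f * x for x, f in table)      # Σfx
    total_fx2 = sum(f * x ** 2 for x, f in table)  # Σf(x²)
    return total_fx2 / total_f - (total_fx / total_f) ** 2

def frequency_standard_deviation(table):
    return math.sqrt(frequency_variance(table))

# Hypothetical table: Σf = 4, Σfx = 12, Σf(x²) = 44
table = [(1, 1), (3, 2), (5, 1)]
print(frequency_variance(table))            # 44/4 - (12/4)² = 11 - 9 = 2.0
print(frequency_standard_deviation(table))  # square root of 2, about 1.414
```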