Statistics Flashcards

1
Q

What is a population

A

The whole set of items that are of interest

2
Q

What is a census

A

Measures every member of a population

3
Q

What is a sample

A

A selection of observations taken from a subset of the population to find out information about the population as a whole

4
Q

What are the advantages of a census

A

Represents total population

Provides all relevant data

5
Q

What are the advantages of sampling

A

Quicker
Easier
Cheaper

6
Q

What are the disadvantages of a census

A

Time consuming
Difficult
Expensive
May be impossible to get everyone

7
Q

What are the disadvantages of sampling

A

May be incomplete or may not be representative

8
Q

How does convenience/ opportunity sampling work

A

Taking a sample of people who are available at the time and fit the criteria

9
Q

What is a random sample without replacement called

A

Simple random sampling

10
Q

What is a random sample with replacement called?

A

Unrestricted random sampling
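
A minimal sketch of the difference between the two kinds of random sample, using Python's standard `random` module. The numbered sampling frame of 20 units and the seed are illustrative assumptions, not part of the cards.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable (assumption)
population = list(range(1, 21))  # hypothetical sampling frame of 20 numbered units

# Simple random sample: drawn without replacement, so no unit appears twice
simple = rng.sample(population, k=5)

# Unrestricted random sample: drawn with replacement, so repeats are possible
unrestricted = rng.choices(population, k=5)

print(simple)
print(unrestricted)
```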

11
Q

How does stratified random sampling work (basically)?

A

The population is divided into strata, and a random sample is taken from each stratum in proportion to its size
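
The proportional allocation can be sketched as below. The function name `stratified_sample`, the two year-group strata and the seed are hypothetical, chosen only for illustration. Rounding each allocation can make the total differ slightly from the requested sample size when the proportions are not whole numbers.

```python
import random

def stratified_sample(strata, sample_size, seed=0):
    """Draw a random sample from each stratum, proportional to its size."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Each stratum's share of the sample matches its share of the population
        k = round(sample_size * len(members) / total)
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical population: 60 students in one stratum, 40 in the other
strata = {
    "Year 12": [f"Y12-{i}" for i in range(60)],
    "Year 13": [f"Y13-{i}" for i in range(40)],
}
result = stratified_sample(strata, sample_size=10)
print({name: len(chosen) for name, chosen in result.items()})
```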

12
Q

What is quota sampling?

A

Similar to stratified but sample is not random

Population is divided into groups with a given characteristic and the size of the groups determines the proportion of the sample that should have that characteristic. The most convenient people with that characteristic are chosen until the quota is filled

13
Q

What must a sampling method be for it to be random?

A

Each unit must have an equal chance of being chosen

14
Q

Is systematic sampling random

Why

A

No

It is impossible for consecutive names in the sampling frame to both be in the same sample

15
Q

How do you take a systematic sample

A

Work out the ‘skip size’ by dividing the total population by the desired sample size, rounding to the nearest integer
Use an RNG to select a starting point, which will be the first sampling unit
Add the ‘skip size’ to this number repeatedly, taking the members of the population who correspond to the numbers generated
Continue until the sample size has been obtained
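
The steps above can be sketched in Python as follows. The function name `systematic_sample` and the frame of 100 numbered units are hypothetical. Picking the random start within the first skip interval is an assumption, since the card only says to use an RNG for the starting point.

```python
import random

def systematic_sample(frame, sample_size, seed=0):
    """Take every skip-th unit from the sampling frame after a random start."""
    rng = random.Random(seed)
    skip = round(len(frame) / sample_size)  # the 'skip size'
    start = rng.randrange(skip)             # random start within the first interval (assumption)
    # Step through the frame in jumps of `skip`, stopping at the required size
    return frame[start::skip][:sample_size]

frame = list(range(1, 101))  # hypothetical sampling frame numbered 1 to 100
sample = systematic_sample(frame, sample_size=10)
print(sample)
```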

16
Q

Give the strengths and weaknesses of random sampling

A

Strengths:
Free of bias
Cheap/easy for small samples
Each sampling unit has equal chance of being chosen

Weaknesses:
Not suitable for larger populations
Sampling frame needed

17
Q

Strengths and weaknesses of stratified sampling

A

Strengths:
Accurately reflects structure of population
Guarantees proportional representation of groups within a population
Weaknesses:
Population must be clearly classified into distinct strata
Random selection within strata suffers same disadvantages as random sampling

18
Q

Strengths and weaknesses of quota sampling

A

Advantages :
Allows small sample to be representative
No sampling frame needed
Quick/easy/cheap
Allows comparison between different groups

Disadvantages:
Can be biased
Division of population can be costly & inaccurate
Increasing scope of study increases number of groups

19
Q

Disadvantages and advantages of systematic sampling

A

Advantages:
Simple/ quick
Suitable for large populations

20
Q

Advantages and disadvantages of opportunity sampling

A

Advantages:
Inexpensive
Easy
Quick

Disadvantages:
Unlikely to be representative
Highly dependent on individual researcher

21
Q

What is qualitative data

A

Non-numerical, e.g. colour

22
Q

What are the different kinds of quantitative data

A

Discrete - only takes specific values, e.g. shoe size, number of people (NB can still be infinite)

Continuous - can take any decimal value

23
Q

3 measures of centre

A

Mean, median, mode

24
Q

What is the mean

A

The sum of the data divided by the number of values

25
Q

Define median

A

The middle value when data is ordered from smallest to largest
If there are an even number of values, the median is halfway between the two central values

26
Q

Define mode

A

Most common value

There can be one mode, two modes (bi-modal) or no mode
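
As a quick check of the three measures of centre, here is a small sketch using Python's standard `statistics` module. The data values are made up for illustration.

```python
import statistics

data = [2, 3, 3, 5, 7, 10]  # illustrative values, already in order

mean = statistics.mean(data)        # sum of the data divided by the number of values
median = statistics.median(data)    # even count, so halfway between the two central values
modes = statistics.multimode(data)  # list of the most common value(s)

# A data set with two equally common values is bi-modal
bimodal = statistics.multimode([1, 1, 2, 2, 3])

print(mean, median, modes, bimodal)
```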

27
Q

Advantages and disadvantages of mean

A

Advantages:
Includes all data

Disadvantages:
Susceptible to outliers
When data is grouped it is an estimate of the mean

28
Q

Advantages and disadvantages of median

A

Advantages:
Less sensitive to outliers

Disadvantages:
Positional only
Grouped data requires interpolation

29
Q

Strengths and weaknesses of mode

A

Strengths:
Can be used for qualitative

Weaknesses:
Only relevant if there are high frequencies
Can be misleading
Doesn’t consider the numerical value of the data

30
Q

Name the types of measures of spread

A

Standard deviation
Interquartile range
Range

31
Q

Formulas for standard deviation

A

\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}

Or

\sigma = \sqrt{\frac{\sum x^2}{n} - \bar{x}^2}
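
The two forms of the formula give the same value, which can be checked numerically. This is a small sketch with made-up data, dividing by n as on the card.

```python
import math

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values
n = len(data)
mean = sum(data) / n

# First form: root of the mean squared deviation from the mean
sd_definition = math.sqrt(sum((x - mean) ** 2 for x in data) / n)

# Second form: root of (mean of the squares minus the square of the mean)
sd_shortcut = math.sqrt(sum(x * x for x in data) / n - mean ** 2)

print(sd_definition, sd_shortcut)
```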

32
Q

IQR method

A

Upper quartile - lower quartile
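
A short sketch of the calculation using Python's `statistics.quantiles`. Quartile conventions vary between textbooks and software, so the `method="inclusive"` choice here is an assumption and may not match the interpolation rule used on a given course. The data are illustrative.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # illustrative values

# quantiles(n=4) returns the three cut points: lower quartile, median, upper quartile
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

iqr = q3 - q1  # upper quartile minus lower quartile
print(q1, q3, iqr)
```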

33
Q

Positives and negatives of standard deviation

A

Advantages:
Includes all data

Disadvantages:
Susceptible to outliers

34
Q

Advantages and disadvantages of IQR

A

Advantages
Less sensitive to outliers

Disadvantages
Positional only and 50% is arbitrary
Grouped data requires interpolation

35
Q

Disadvantages of range

A

Highly susceptible to outliers

36
Q

What is variance

A

Standard deviation squared

37
Q

How does adding/ subtracting affect the mean

A

Increases/ decreases by that amount

38
Q

How does multiplying/dividing affect mean

A

Multiplied/ divided by that factor

39
Q

How does adding/ subtracting affect the standard deviation

A

No effect

40
Q

How does multiplying/dividing affect the standard deviation

A

Multiplied/ divided by that factor
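
The four coding rules for the mean and standard deviation can be checked with a small sketch. The data, the shift of 10 and the factor of 3 are arbitrary choices for illustration.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]       # illustrative values
shifted = [x + 10 for x in data]      # add 10 to every value
scaled = [3 * x for x in data]        # multiply every value by 3

base_mean, base_sd = statistics.mean(data), statistics.pstdev(data)

# Adding shifts the mean by the same amount but leaves the spread unchanged
shifted_mean, shifted_sd = statistics.mean(shifted), statistics.pstdev(shifted)

# Multiplying scales both the mean and the standard deviation by the factor
scaled_mean, scaled_sd = statistics.mean(scaled), statistics.pstdev(scaled)

print(base_mean, base_sd)
print(shifted_mean, shifted_sd)
print(scaled_mean, scaled_sd)
```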

41
Q

What are the first 13 square numbers

A

1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169

42
Q

What is r for PMCC

A

Sample PMCC

43
Q

What is rho

A

Population PMCC

44
Q

When comparing median/ means what should you reference

A

Compare size with reference to actual values and context
A larger value suggests that sample's values are larger on average
% difference should be calculated if >2 marks

45
Q

If mean/ median and IQR/ standard deviation are close what does this suggest

A

Samples from the same population

46
Q

Define population

A

All the data of a given group

47
Q

Define Sample

A

A selection of some parts of the population

48
Q

If a sampling method is random what does this mean

A

Each member has an equal and fair chance of being selected

49
Q

What does independence mean

A

One outcome is unaffected by another outcome

50
Q

What is discrete uniform distribution

A

A random variable with an equal chance for each outcome
P(X = x) = k, where k = 1 / (number of possible outcomes)

51
Q

What are the requirements for a binomial distribution

A

Fixed number of trials
Two possible outcomes per trial
Constant probability
Independence
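
When these four conditions hold, the probability of exactly k successes follows the standard binomial formula. A minimal sketch, where the function name `binomial_pmf` and the example values n = 4, p = 0.5 are illustrative assumptions:

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p): n fixed trials, constant success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Probability of exactly 2 successes in 4 independent trials with p = 0.5
two_in_four = binomial_pmf(2, 4, 0.5)

# The probabilities over all possible outcomes 0..n sum to 1
total = sum(binomial_pmf(k, 4, 0.5) for k in range(5))

print(two_in_four, total)
```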

52
Q

What is the area under a normal distribution curve

A

1 or 100%

53
Q

3 standard deviations stat

A

About 99.7% of the data lies within 3 standard deviations of the mean
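
The figure can be checked with Python's `statistics.NormalDist`, here using the standard normal distribution (mean 0, standard deviation 1) as a sketch:

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area under the curve between 3 standard deviations below and above the mean
within_3_sd = z.cdf(3) - z.cdf(-3)

print(round(within_3_sd, 4))
```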

54
Q

What does it mean if a sample is truncated?

What can we do with this

A

Zero lies less than 2 standard deviations below the mean

Reject as not normally distributed

55
Q

What must you remember for normal approximations

A

p is close to 0.5
n is large
CONTINUITY CORRECTIONS

56
Q

Why do we do continuity corrections

A

To change discrete (binomial) data into continuous (for normal)
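
A sketch of the effect of a continuity correction, comparing an exact binomial probability with its normal approximation. The values n = 100, p = 0.5 and the cut-off 55 are illustrative assumptions.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.5  # illustrative: n is large and p is close to 0.5

# Exact binomial probability P(X <= 55)
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(56))

# Approximating normal distribution with the same mean and standard deviation
approx_dist = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

# With the continuity correction, P(X <= 55) becomes P(Y < 55.5)
corrected = approx_dist.cdf(55.5)

# Without the correction the approximation is noticeably worse
uncorrected = approx_dist.cdf(55)

print(exact, corrected, uncorrected)
```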

57
Q

Define H0

A

Population parameter you are comparing sample to

58
Q

H1

A

The claim of how the sample might differ from population parameter

59
Q

Population parameter

A

The value that defines a distribution
(For binomial it is ‘p’)
(For normal it is mu (μ) and variance)

60
Q

Define critical value

A

The first value in the critical region for which sample results would have a chance below significance level of occurring

61
Q

Define critical region

A

The range of values for which H0 is rejected

62
Q

Define p value

A

The probability of getting a result at least as extreme as the one from your sample, assuming H0 (the assumed population parameter) is true

63
Q

Define significance level

A

The probability threshold for the test: a result with a probability below the significance level is considered unlikely enough, assuming H0, that it is reasonable to reject H0

64
Q

Define test statistic

A

The value you get from your sample to compare with the critical value

65
Q

When does the critical region start exactly for binomial and normal

A

Binomial: critical value will be first value within critical region

Normal: critical region will always represent exact significance level
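
For the binomial case, the critical value can be found by accumulating tail probabilities. This is a sketch for a one-tailed (upper) test; the function name `upper_critical_value` and the example H0: p = 0.5 with n = 20 at the 5% level are illustrative assumptions. Because the binomial is discrete, the actual tail probability at the critical value is usually below the stated significance level rather than equal to it.

```python
from math import comb

def upper_critical_value(n, p, alpha):
    """Smallest c with P(X >= c) <= alpha for X ~ B(n, p), or None if there is none."""
    tail = 0.0
    critical = None
    # Work down from the largest outcome, adding each probability to the upper tail
    for k in range(n, -1, -1):
        tail += comb(n, k) * p**k * (1 - p) ** (n - k)
        if tail > alpha:
            break  # including k would push the tail above the significance level
        critical = k
    return critical

c = upper_critical_value(n=20, p=0.5, alpha=0.05)

# Actual probability of landing in the critical region under H0
actual_level = sum(comb(20, k) * 0.5**20 for k in range(c, 21))

print(c, round(actual_level, 4))
```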

66
Q

When is PMCC not a good estimator

A

Outside original sample
PMCC is weak
Used to make a prediction about a different population

67
Q

P(A’)

A

Probability of A not occurring (complement)