Statistics Flashcards
What is a sample
Some subset of the population intended to represent the population
What is a sampling unit
Each individual in the population that can be sampled
What is a sampling frame
A list of all the sampling units in the population, each individually named or numbered
Advantages of using a census
Should give a completely accurate result.
Disadvantages of using a census
Time consuming and expensive.
Cannot be used when testing involves destruction of the item.
Large volume of data to process.
Advantages of using a sample
Cheaper
Quicker
Less data to process
Disadvantages of using a sample
Data may not be as accurate as a census.
Sample may not be large enough to give information about small sub-groups of the population.
How to carry out simple random sampling
Each item in the sampling frame is given an identifying number. Select the sample using a random number generator or 'lottery sampling' (names drawn from a hat).
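The random number generator method above can be sketched in Python. This is only an illustration: the function name simple_random_sample and the frame of 50 numbered units are assumptions, not part of the original cards.

```python
import random

def simple_random_sample(sampling_frame, n, seed=None):
    """Pick n distinct units so that every possible sample of size n is equally likely."""
    rng = random.Random(seed)             # seed only makes the example repeatable
    return rng.sample(sampling_frame, n)  # sampling without replacement

# Hypothetical sampling frame: units numbered 1 to 50
frame = list(range(1, 51))
sample = simple_random_sample(frame, 5, seed=1)
print(sample)
```

random.sample never picks the same unit twice, which matches drawing names from a hat without putting them back.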
What is simple random sampling
Every possible sample of a given size has an equal chance of being selected
Advantages of simple random sampling
Avoids bias
Easy and cheap to implement
Each number (sampling unit) has an equal chance of being selected
Disadvantages of simple random sampling
Not suitable when population size is large
Sampling frame is needed
How to carry out systematic sampling
Required elements are chosen at regular intervals from an ordered list, starting from a randomly chosen element
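A minimal sketch of choosing at regular intervals, assuming the interval is k = population size // sample size and the starting point is picked at random from the first k units. The function name systematic_sample and the example frame are hypothetical.

```python
import random

def systematic_sample(sampling_frame, n, seed=None):
    """Take every k-th unit from an ordered list, starting at a random point."""
    k = len(sampling_frame) // n   # sampling interval (assumes n <= population size)
    rng = random.Random(seed)
    start = rng.randrange(k)       # random start among the first k units
    return sampling_frame[start::k][:n]

# Hypothetical sampling frame: units numbered 1 to 100, sample of 20
frame = list(range(1, 101))
sample = systematic_sample(frame, 20, seed=0)
print(sample)
```

With 100 units and a sample of 20 the interval is 5, so consecutive chosen units are always 5 apart.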
Advantages of systematic sampling
Simple
Quick
Suitable for larger samples / populations
Disadvantages of systematic sampling
Sampling frame needed
Can introduce bias if the sampling frame is not random
When is stratified sampling used
Used when the sample is large and the population naturally divides into groups
How to carry out stratified sampling
Population divided into groups (strata) and a simple random sample carried out in each group.
The same proportion, sample size (n) / population size (N), is sampled from each stratum.
Used when sample is large and population naturally divides into groups.
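The proportion rule above can be worked through in a short Python sketch. The strata names and sizes are made up for illustration, and the function name stratum_sample_sizes is an assumption.

```python
def stratum_sample_sizes(strata_sizes, n):
    """Number to sample from each stratum = stratum size x (n / N), rounded."""
    N = sum(strata_sizes.values())  # total population size
    return {name: round(size * n / N) for name, size in strata_sizes.items()}

# Hypothetical population: 120 students in Year 12 and 80 in Year 13, sample of 50
sizes = stratum_sample_sizes({"Year 12": 120, "Year 13": 80}, 50)
print(sizes)  # {'Year 12': 30, 'Year 13': 20}
```

When the proportions do not give whole numbers, rounding can make the total differ slightly from n, so the result should be checked. A simple random sample of the stated size is then taken within each stratum.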
Advantages of stratified sampling
Reflect population structure
Guarantees proportional representation of groups within the population
Disadvantages of stratified sampling
Population must be clearly classified into distinct strata.
Selection within each stratum suffers from same disadvantages as simple random sampling.
How to carry out quota sampling
Population divided into groups according to characteristic. A quota of items/people in each group is set to try and reflect the group’s proportion in the whole population. Interviewer selects the actual sampling units.
Advantages of quota sampling
Allows small sample to still be representative of population.
No sampling frame required.
Quick, easy, inexpensive.
Allows for easy comparison between different groups in population
Disadvantages of quota sampling
Non-random sampling can introduce bias.
Population must be divided into groups, which can be costly or inaccurate.
Increasing scope of study increases number of groups, adding time/expense.
Non-responses are not recorded.
How to carry out opportunity sampling
Sample taken from people who are available at the time of study, and who meet the criteria
Advantages of opportunity sampling
Easy to carry out
Inexpensive
Disadvantages of opportunity sampling
Unlikely to provide a representative sample
Highly dependent on the individual researcher
What is the range and how do you calculate it
It's a measure of spread (variation)
Highest value - lowest value
What are percentiles
Values that divide the ordered data into 100 equal parts
E.g. Q1 = 25th percentile
How to find an interpercentile range
Subtract the lower percentile from the higher percentile
How to find the mean
Mean = total of the values / number of values
or
Mean = Σx / n
How to find mean from a table
Mean = Σ(midpoint × frequency) / n, where n is the total frequency
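As a worked example of the grouped-table method, the sketch below multiplies each class midpoint by its frequency, adds the results and divides by the total frequency. The class boundaries and frequencies are invented for illustration, and grouped_mean is a hypothetical name.

```python
def grouped_mean(classes):
    """Estimate the mean from (lower bound, upper bound, frequency) rows."""
    total_f = sum(f for _, _, f in classes)                     # n
    total_fx = sum((lo + hi) / 2 * f for lo, hi, f in classes)  # sum of midpoint x frequency
    return total_fx / total_f

# Hypothetical grouped data: (lower, upper, frequency)
table = [(0, 10, 4), (10, 20, 6), (20, 40, 10)]
print(grouped_mean(table))  # 20.5
```

The result is only an estimate, because the midpoint stands in for every value in its class.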
What is the median
Middle data value when all the data values are placed in order of size
What is the mode
Most frequently occurring data value
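The mean, median and mode can be checked on a small data set with Python's standard statistics module. The data values below are made up for illustration.

```python
import statistics

# Hypothetical data set
data = [2, 3, 3, 5, 7]

mean_value = statistics.mean(data)      # total of the values / number of values
median_value = statistics.median(data)  # middle value once the data is in order
mode_value = statistics.mode(data)      # most frequently occurring value

print(mean_value, median_value, mode_value)
```

Here the total is 20 over 5 values, the middle of the ordered list is 3, and 3 appears most often.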
Definition of an outlier
An observation that lies outside the overall pattern of distribution
Give a reason to support the use of histograms to represent this data
……. is continuous
The data is grouped
What are the features of a histogram
No spaces between bars
Area is proportional to frequency
Frequency density equation
Frequency density = frequency / class width
What does the area equal on a histogram
Area = k × frequency, where k is a constant
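A short sketch of the frequency density calculation, using invented classes and a hypothetical function name frequency_densities. It also shows that, when the bar height is the frequency density itself, each bar's area (height × class width) comes back to the frequency, which is the case k = 1.

```python
def frequency_densities(classes):
    """Frequency density = frequency / class width for (lower, upper, frequency) rows."""
    return [f / (hi - lo) for lo, hi, f in classes]

# Hypothetical grouped data: (lower, upper, frequency)
table = [(0, 10, 4), (10, 20, 6), (20, 40, 10)]
densities = frequency_densities(table)
print(densities)  # [0.4, 0.6, 0.5]

# Area of each bar = frequency density x class width
areas = [fd * (hi - lo) for fd, (lo, hi, _) in zip(densities, table)]
print(areas)
```

The widest class has the largest frequency but not the tallest bar, which is why histograms plot frequency density rather than frequency.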
Statistical distributions
What does X mean
A random variable
(It doesn’t have a fixed value)
What does x mean
A particular value X can take
What is a discrete random variable
A random variable that can only take certain numerical values
What do all the probabilities in a distribution add up to
1
How do you tell which value is the mode in terms of probability
The value with the biggest probability
What is the binomial coefficient equation
(n choose r) = nCr = n! / (r! (n - r)!)
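The binomial coefficient formula can be checked numerically. The sketch below builds it from factorials and compares it with Python's built-in math.comb; the function name binomial_coefficient is an assumption for illustration.

```python
import math

def binomial_coefficient(n, r):
    """nCr = n! / (r! (n - r)!), the number of ways to choose r items from n."""
    return math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

print(binomial_coefficient(5, 2))  # 10
print(math.comb(5, 2))             # 10, the standard-library equivalent
```

For n = 5 and r = 2 this is 120 / (2 × 6) = 10.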
Probability
What does ∩ mean
Intersection: A ∩ B means both A and B happen (the overlap of the two regions on a Venn diagram)
What does ∪ mean
Union: A ∪ B means A or B or both happen (one or both regions shaded on a Venn diagram)
What does mutually exclusive mean
The events cannot happen at the same time (no overlap on a Venn diagram)
What is the mutually exclusive equation
P(A ∪ B) = P(A) + P(B)
What does independent events mean
One event has no effect on the probability of the other
Independent events equation
P(A ∩ B) = P(A) × P(B)
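Both rules can be checked on a simple example. The sketch below assumes one roll of a fair six-sided die; the events A, B, C and D and the helper prob are invented for illustration.

```python
from fractions import Fraction

# Hypothetical experiment: one roll of a fair six-sided die
outcomes = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & outcomes), len(outcomes))

# Independent events: A = even number, B = 1 or 2
A, B = {2, 4, 6}, {1, 2}
print(prob(A & B) == prob(A) * prob(B))  # True

# Mutually exclusive events: C = roll a 1, D = roll a 2 (no overlap)
C, D = {1}, {2}
print(prob(C | D) == prob(C) + prob(D))  # True
```

Set intersection (&) plays the role of ∩ and set union (|) plays the role of ∪. Fractions are used so the comparisons are exact.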
What is the formula of the regression line
y = a + bx
What is interpolation
Is it reliable
Estimating using values of x within the data range
It's generally reliable
What is extrapolation
Is it reliable
Estimating using values of x outside the data range
It's unreliable
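A small sketch of using a regression line y = a + bx to make an estimate and labelling it as interpolation or extrapolation. The values of a and b, the data range and the function name predict are all hypothetical.

```python
def predict(a, b, x, x_min, x_max):
    """Estimate y from y = a + bx and say whether x is inside the data range."""
    y = a + b * x
    kind = "interpolation" if x_min <= x <= x_max else "extrapolation"
    return y, kind

# Hypothetical regression line y = 2 + 0.5x fitted to data with x between 10 and 30
print(predict(2, 0.5, 20, 10, 30))  # (12.0, 'interpolation')
print(predict(2, 0.5, 50, 10, 30))  # (27.0, 'extrapolation')
```

The second estimate should be treated with caution, because there is no evidence that the linear relationship continues beyond x = 30.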
What variable do you put on the x-axis
Independent (explanatory)
What variable do you put on the y-axis
Dependent (response)
What is a hypothesis
A statement made about the value of a population parameter that we wish to test by collecting evidence in the form of a sample
What is a null hypothesis
The hypothesis that is assumed to be correct (the default), written H0
What is an alternative hypothesis
The hypothesis that there has been some change in the population parameter, written H1
What is a test statistic
The evidence from the sample: the result of the experiment, or the statistic calculated from the sample
What is the level of significance
The probability threshold for the test: the maximum probability of incorrectly rejecting the null hypothesis that we are willing to accept
What is the critical region
The range of values of the test statistic that would lead to you rejecting H0
What is the actual significance level
The actual probability of the test statistic falling in the critical region, assuming H0 is true
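The critical region and actual significance level can be illustrated with a one-tailed test. The cards do not name a distribution, so the sketch assumes a binomial test statistic X ~ B(n, p) and a lower-tail test; the function names and the numbers n = 10, p = 0.5 and a 5% level are hypothetical.

```python
from math import comb

def binomial_pmf(n, p, r):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def lower_critical_region(n, p, alpha):
    """Largest c with P(X <= c) <= alpha under H0, plus the actual significance level.

    Returns (None, 0.0) if even P(X = 0) is greater than alpha.
    """
    cumulative = 0.0
    c, actual = None, 0.0
    for r in range(n + 1):
        cumulative += binomial_pmf(n, p, r)
        if cumulative > alpha:
            break
        c, actual = r, cumulative
    return c, actual

# Hypothetical test: H0: p = 0.5, H1: p < 0.5, n = 10, 5% significance level
print(lower_critical_region(10, 0.5, 0.05))  # (1, 0.0107421875)
```

The critical region is X ≤ 1, because P(X ≤ 1) = 11/1024 is below 5% while P(X ≤ 2) = 56/1024 is above it. The actual significance level is therefore about 1.07%, not 5%.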