Statistics Basics Flashcards
Simple random sampling
Randomly selecting from everybody in the population, so that each individual has an equal chance of being picked.
Stratified sampling
Dividing the population into different groups (strata) and randomly picking from each stratum in proportion to its share of the overall group. The strata are usually large.
Systematic sampling
Selecting every nth member of an ordered list. The attribute being studied should be randomly distributed with respect to the list order.
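The simple random, stratified, and systematic cards above can be sketched in Python. This is a minimal illustration using only the standard library; the population of ID numbers, the stratum names, and the function names are all hypothetical.

```python
import random


def simple_random_sample(population, k, rng):
    """Pick k members so that every member has the same chance."""
    return rng.sample(population, k)


def systematic_sample(population, step, rng):
    """Pick every step-th member, starting from a random offset."""
    start = rng.randrange(step)
    return population[start::step]


def stratified_sample(strata, k, rng):
    """Sample from each stratum in proportion to its share of the population."""
    total = sum(len(members) for members in strata.values())
    sample = []
    for members in strata.values():
        # Rounding means the final size can differ slightly from k.
        quota = round(k * len(members) / total)
        sample.extend(rng.sample(members, quota))
    return sample


rng = random.Random(0)  # fixed seed so the sketch is repeatable
population = list(range(100))  # hypothetical ID numbers 0..99
# Hypothetical strata: 60% of the population in one group, 40% in the other.
strata = {"first_years": population[:60], "second_years": population[60:]}

srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)
stratified = stratified_sample(strata, 10, rng)
```

With a 60/40 split and a sample of 10, the stratified sample takes 6 members from the first stratum and 4 from the second.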
Convenience sampling
Based on ease of selection. E.g., people physically closer to you are more likely to be picked than someone in the back row who you can’t even really see.
Cluster random sampling
Dividing the population into distinct, coherent areas (clusters), then randomly selecting whole areas to assess.
Snowball sampling
Finding people who are suitable for the study and then asking them to refer others they know who would also be suitable for the study.
What is probability sampling
Sampling that is representative of the population, as every individual has a known (often equal) probability of being selected.
For symmetric data we use…
mean and SD
For asymmetric data we use…
median and IQR
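A small sketch of why the two cards above pair up the way they do, using Python's standard `statistics` module. Both data sets are made up for illustration, and `statistics.quantiles` uses its default method for the quartiles, which can differ slightly from other textbook conventions.

```python
import statistics


def mean_sd(values):
    """Summary for roughly symmetric data."""
    return statistics.mean(values), statistics.stdev(values)


def median_iqr(values):
    """Summary for skewed data; less affected by extreme values."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return statistics.median(values), q3 - q1


symmetric = [4, 5, 5, 6, 6, 6, 7, 7, 8]  # made-up symmetric data
skewed = [1, 1, 2, 2, 3, 3, 4, 5, 40]    # made-up data with one extreme value

print(mean_sd(symmetric), median_iqr(symmetric))
print(mean_sd(skewed), median_iqr(skewed))
```

In the symmetric set the mean and median agree. In the skewed set the single extreme value pulls the mean and SD upward, while the median and IQR barely move.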
When is a z score used
When the values in question do not fall on the reference points of the 68-95-99.7 rule (1, 2, or 3 SDs from the mean).
Steps of a basic z score
- Calculate the z score for each value: z = (value - mean) / SD.
- Look each z score up in the table to find the corresponding area above these values.
- Add or subtract the areas to leave only the desired area.
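The steps above can be sketched in Python. As an assumption, `math.erf` stands in for the printed z table, and `area_below` returns the area below a z score; a table that lists the area above gives 1 minus this value. The heights example (mean 170, SD 10) is hypothetical.

```python
import math


def z_score(x, mean, sd):
    """Step 1: how many SDs the value x lies from the mean."""
    return (x - mean) / sd


def area_below(z):
    """Step 2: area under the standard normal curve below z (the table lookup)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def area_between(low, high, mean, sd):
    """Step 3: subtract the two areas to keep only the region between low and high."""
    return area_below(z_score(high, mean, sd)) - area_below(z_score(low, mean, sd))


# Hypothetical example: proportion of heights between 165 and 185
# when the mean is 170 and the SD is 10 (z = -0.5 and z = 1.5).
p = area_between(165, 185, 170, 10)
print(round(p, 4))  # about 0.6247
```

Neither 165 nor 185 sits a whole number of SDs from the mean, which is exactly the case where the 68-95-99.7 rule is not enough.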
What is a t distribution
Like the normal distribution but takes degrees of freedom into account.
It has a flatter peak and longer tails than the normal distribution.
As the following increase:
- degrees of freedom
- sample size
the t distribution becomes more like the normal distribution.
What is degrees of freedom
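The number of data values that are free to change once a summary statistic has been fixed. E.g., for a sample of n values with a known mean, n - 1 values can change.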
What is the central limit theorem
As n, the size of each sample, increases, the distribution of the sample means becomes less skewed (each extreme value has less influence on the mean of a larger sample).
When we plot the means of many samples, the graph looks approximately normally distributed, even if the initial data is skewed, provided each sample is large enough.
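A small simulation can illustrate the card above. This sketch assumes an exponential population (strongly right-skewed) purely as an example, and uses a simple moment-based skewness measure; the function names are hypothetical.

```python
import random
import statistics


def sample_means(n, repeats, rng):
    """Means of `repeats` samples, each of size n, from a right-skewed population."""
    means = []
    for _ in range(repeats):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


def skewness(values):
    """Moment-based skewness: 0 for a symmetric shape, positive for a right skew."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - m) / sd) ** 3 for v in values])


rng = random.Random(42)  # fixed seed so the sketch is repeatable
means_small_n = sample_means(2, 2000, rng)   # each mean comes from only 2 values
means_large_n = sample_means(50, 2000, rng)  # each mean comes from 50 values

print(skewness(means_small_n), skewness(means_large_n))
```

The means built from samples of 50 should show much less skew than the means built from samples of 2, even though both draw on the same skewed population.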
What is standard error
The standard deviation of the sampling distribution of a statistic (e.g., the sample mean).
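For the sample mean, the standard error is commonly estimated from a single sample as the sample SD divided by the square root of n. A minimal sketch, with made-up data and a hypothetical function name:

```python
import math
import statistics


def standard_error(sample):
    """Estimated standard error of the mean: sample SD / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


sample = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data, mean 5
se = standard_error(sample)
print(round(se, 3))  # about 0.756
```

Because n is in the denominator, larger samples give a smaller standard error, meaning the sample mean is a more precise estimate of the population mean.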
Why is hypothesis testing used
To analyse whether the results in a sample could be due to chance, and so whether they are likely to reflect the total population the sample came from.