Data Sampling Flashcards
What are the different sampling methods?
- Population
- Random
- Systematic
- Stratified
- Cluster
- Consecutive
- Convenience
What is a Population sample?
Measuring everyone in the population
e.g. Facebook users: Facebook holds data on its entire user population
It is rare to have such a large and rich dataset
What is Random sampling?
Use a random process to select the sample, so every member of the population has the same chance of being picked
What is a Systematic sample?
Apply a rule to pick the sample e.g. every 5th person
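A minimal sketch of the "every 5th person" rule. The function name `systematic_sample` and the ID list are hypothetical, chosen only for illustration; starting at a random offset within the first step is a common convention rather than something the card specifies.

```python
import random

def systematic_sample(items, step, start=None):
    """Pick every `step`-th item, beginning at offset `start`.

    If `start` is not given, a random offset within the first
    `step` items is used (an assumption, not part of the card).
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    if start is None:
        start = random.randrange(step)
    return items[start::step]

people = list(range(1, 21))  # hypothetical person IDs 1..20
sample = systematic_sample(people, step=5, start=4)
print(sample)  # [5, 10, 15, 20]
```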
What is a Stratified sample?
Split the population into layers (strata) and sample from each one
e.g. 10 from england, 10 from wales
or 10 women 10 men
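A rough sketch of the "10 from each stratum" idea. The helper `stratified_sample`, the `country` field and the made-up records are all assumptions for illustration; it takes a simple random sample inside each stratum, which is one common way to do it.

```python
import random

def stratified_sample(records, key, per_stratum, seed=None):
    """Group records by `key`, then randomly pick `per_stratum` from each group.

    Raises ValueError if any stratum has fewer than `per_stratum` members.
    """
    rng = random.Random(seed)
    strata = {}
    for record in records:
        strata.setdefault(key(record), []).append(record)
    return {
        label: rng.sample(members, per_stratum)
        for label, members in strata.items()
    }

# Hypothetical population: 100 people split between two countries.
people = [
    {"id": i, "country": "England" if i % 2 else "Wales"}
    for i in range(100)
]
sample = stratified_sample(people, key=lambda p: p["country"], per_stratum=10, seed=0)
for country, chosen in sample.items():
    print(country, len(chosen))
```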
What is a Cluster sample?
Pick one place (cluster) and sample there, e.g. one hospital
What is a Consecutive sample?
e.g. ‘start on Tuesday and collect people until 100 is reached’
What is Convenience sampling?
e.g. going downstairs to the cafe now to pick a sample
What is the Source Population?
The overall population that we are trying to study e.g. patients on chemo in the southwest
Statistically, a population has a value (e.g. its mean) which can almost never be truly known, but we can estimate it
What is the study population?
Where is the sample coming from? e.g. a database of all patients on chemotherapy in the southwest
How does sample size affect the estimated mean?
If the sample is random, the larger the sample size, the closer the mean of the sample tends to be to the mean of the actual overall population
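A small simulation sketch of this effect. The population (100,000 simulated heights with mean 170 and SD 10), the seed and the helper `average_error` are all assumptions made up for the example.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population of simulated heights.
population = [rng.gauss(170, 10) for _ in range(100_000)]
population_mean = statistics.fmean(population)

def average_error(n, repeats=200):
    """Average distance between a random sample's mean and the population mean."""
    errors = []
    for _ in range(repeats):
        sample = rng.sample(population, n)
        errors.append(abs(statistics.fmean(sample) - population_mean))
    return statistics.fmean(errors)

# The average error should shrink as the sample size grows.
for n in (10, 100, 1000):
    print(n, round(average_error(n), 3))
```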
What is the Sampling Distribution of the mean?
- Small sample from a normally distributed population is taken (e.g. 10)
- Mean is found of this sample
- Repeat steps 1 and 2 lots of times
- Plot the distribution of the means - will look like a normal distribution
- The mean of the sample means = the population mean
AKA if we take loads of repeated samples we can eventually guess something we can’t actually measure
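The steps above can be sketched as a simulation. The population values (mean 50, SD 8), the seed and the number of repeats are assumptions picked for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical normally distributed population.
population = [rng.gauss(50, 8) for _ in range(50_000)]

sample_means = []
for _ in range(5_000):                    # step 3: repeat lots of times
    sample = rng.sample(population, 10)   # step 1: take a small sample
    sample_means.append(statistics.fmean(sample))  # step 2: find its mean

# The mean of the sample means should sit very close to the population mean,
# and the sample means should be less spread out than the raw data.
print("population mean:     ", statistics.fmean(population))
print("mean of sample means:", statistics.fmean(sample_means))
print("SD of population:    ", statistics.pstdev(population))
print("SD of sample means:  ", statistics.stdev(sample_means))
```

Plotting a histogram of `sample_means` would show the roughly normal shape described in step 4.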
What is the Standard Deviation?
Descriptive statistic
- Standard deviation of a sample taken from a population
- It measures the variability or ‘width/spread’ of the data
- It does not systematically shrink or grow as the sample gets larger; it settles towards the population standard deviation
What is the Standard Error?
Inferential statistic
- Theoretical **standard deviation of the sampling distribution of the mean** (all the means of samples plotted)
- Gives a measure of the **precision** of the estimate of the mean (aka the smaller the standard error, the closer a sample mean is likely to be to the true mean of the population, so the more precise your guess is)
- It is always smaller than the standard deviation of the sample (for any sample of more than one observation)
- It gets smaller as the samples get larger - as the samples get bigger, their means get closer to the true mean
- **It gets larger as the standard deviation of the population gets larger** - a more varied population gives a bigger standard error, e.g. heights of all humans on earth would have a bigger standard error than heights of all adults on earth
What is the formula for the standard error?
SE = SD/√n
SE is smaller than SD whenever n > 1, because SD is divided by √n, which is then greater than 1
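A worked sketch of the formula, using the sample standard deviation for SD. The function name `standard_error` and the five height values are made up for illustration.

```python
import math
import statistics

def standard_error(sample):
    """SE = SD / sqrt(n), with SD taken as the sample standard deviation."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

heights = [160, 165, 170, 175, 180]  # hypothetical sample, n = 5

sd = statistics.stdev(heights)
se = standard_error(heights)
print(round(sd, 3))  # 7.906
print(round(se, 3))  # 3.536
```

Here SD is divided by √5 ≈ 2.236, so SE comes out smaller than SD.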