Sampling Error and Bias Flashcards

1
Q

Why does increasing sample size reduce standard error?

A

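Standard error is the sample standard deviation divided by the square root of the sample size, so it shrinks as the sample grows. By the law of large numbers the sample mean settles towards the population mean, and extreme values have less influence on the average (their effect is diluted).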
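As a rough illustration of why a larger sample gives a smaller standard error, the sketch below computes the standard error of the mean as s / sqrt(n). The function name and the example numbers are hypothetical and not part of the original card.

```python
import math
import statistics


def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


def standard_error_from_sd(sd, n):
    """Same quantity when the standard deviation is already known."""
    return sd / math.sqrt(n)


# Hypothetical numbers: the same spread (sd = 10) at two sample sizes.
se_small = standard_error_from_sd(10, 25)
se_large = standard_error_from_sd(10, 100)
```

Quadrupling the sample size halves the standard error, because n sits under a square root.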

2
Q

What are the 2 ways to increase power of a study?

A

Increase sample size

Reduce variability - sample from a more homogeneous population

3
Q

What are type I and type II errors?

A

Type I is where you wrongly reject the null hypothesis - concluding a difference exists when it doesn’t in reality (a false positive).

Type II is where you wrongly fail to reject the null hypothesis - concluding no difference exists when it does in reality (a false negative).

4
Q

What is random error and how is it measured?

A

The natural variation between samples that arises from drawing a random sample rather than measuring the whole population. Measured by standard error.

5
Q

How can you reduce the effect of random error?

A

Increasing sample size

6
Q

What are the types of systematic error (bias)?

A

Measurement bias, sampling (selection) bias and reporting bias

7
Q

How can sampling/selection bias occur?

A

Sample drawn not representative of the population

  • undercoverage e.g. online surveys underrepresent elderly
  • sample frame error (when the sample frame includes people that would never be involved)
  • non-response bias (survey doesn’t account for non-response)

Baseline characteristics of the 2 groups being compared are not equal
  • e.g. the experimental group is selected but the controls are healthy volunteers (voluntary response bias)

8
Q

How can measurement bias occur?

A
  • Variation in measurements
  • Different data collectors might vary in method
  • Instruments not correctly calibrated
  • Performance bias (e.g. cases more likely to have a knowledge of the disease and symptoms + better previous medical records)
  • Detection bias (e.g. investigators paying more attention to symptoms of those known to be in case/experimental group)
9
Q

How can reporting bias occur?

A
  • Citation bias (not citing papers that contradict your argument)
  • Publication bias (not reporting non-significant results)
  • Language bias (only reporting English studies)
10
Q

What are types of sampling scheme?

A
  • Simple random sampling
  • Systematic sampling
  • Cluster sampling
  • Stratified sampling
11
Q

Describe the steps of simple random sampling

A
  1. Define and identify the survey population
  2. Define the sampling frame (all units in a list)
  3. Number each unit
  4. Determine the sample size
  5. Randomly draw units until the sample size is reached (usually with a random number generator)
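The steps above can be sketched in a few lines of Python. This is only a minimal sketch: the frame of 500 numbered units, the sample size of 20 and the function name are made up for illustration.

```python
import random


def simple_random_sample(sampling_frame, sample_size, seed=None):
    """Draw units at random, without replacement, until the sample size is reached."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    # random.sample gives every unit in the frame the same chance of selection
    return rng.sample(sampling_frame, sample_size)


# Hypothetical sampling frame: units numbered 1..500 (steps 2 and 3)
frame = list(range(1, 501))
sample = simple_random_sample(frame, sample_size=20, seed=1)
```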
12
Q

What are the advantages of simple random sampling?

A
  • Statistically the optimal method (each unit has an equal likelihood of being chosen)
  • Sampling error can easily be calculated
  • Simple to do
13
Q

What are the disadvantages of simple random sampling?

A
  • Creating a sample frame can be difficult (not always detailed records of population)
  • Can have logistical challenges if random units chosen are far from each other
  • Minorities can easily be missed out
14
Q

What is the difference between sampling with replacement or without?

A

Sampling without replacement means that the probability of being chosen changes after each unit is drawn (1/N for the first draw, 1/(N-1) for the second, and so on), so the draws do not have equal probability. However, sampling with replacement often makes no sense - e.g. you don’t want the same person to fill out the questionnaire twice.
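A small Python sketch of the two approaches, using the standard library's `random.sample` (without replacement) and `random.choices` (with replacement). The five unit labels are hypothetical.

```python
import random

rng = random.Random(0)
units = ["A", "B", "C", "D", "E"]  # hypothetical sampling frame

# Without replacement: a drawn unit leaves the pool, so it cannot appear twice.
without_replacement = rng.sample(units, k=3)

# With replacement: every draw is made from the full list,
# so the same unit may be picked more than once.
with_replacement = rng.choices(units, k=3)
```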

15
Q

What are the steps of systematic sampling?

A
  1. Define and identify the sampling population
  2. Create the sample frame (e.g. population of 10,182)
  3. Arrange the units in a sequence (e.g. alphabetically by surname)
  4. Determine sample size needed (e.g. 320)
  5. Divide total sampling frame by sample size (e.g. 10,182/320 ≈ 32)
  6. Choose a random starting point (between 1 and 32)
  7. Draw units at regular intervals defined in step 5 (every 32nd unit after the first was chosen randomly)
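The steps above might look like this in Python. It is a sketch under assumptions: the frame is just the numbers 1 to 10,182 (the frame size from step 2), and the function name and seed are hypothetical.

```python
import random


def systematic_sample(sampling_frame, sample_size, seed=None):
    """Pick a random start, then take every k-th unit from an ordered frame."""
    rng = random.Random(seed)
    interval = round(len(sampling_frame) / sample_size)  # step 5
    start = rng.randint(1, interval)                     # step 6 (1-based position)
    picked = sampling_frame[start - 1::interval]         # step 7
    return interval, start, picked


# Hypothetical ordered frame of 10,182 numbered units, target sample of 320
frame = list(range(1, 10183))
interval, start, picked = systematic_sample(frame, 320, seed=7)
```

Because the interval is rounded up to 32, the draw returns slightly fewer than 320 units; rounding the interval down instead would give slightly more.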
16
Q

What are the advantages of systematic sampling?

A
  • Ensures representativity
  • Simple to do
  • Sampling error easy to determine
17
Q

What are the disadvantages of systematic sampling?

A
  • Creating sample frame can be difficult
  • If there’s a pattern in the ordered sampling frame, units or subgroups can end up with unequal probabilities of being chosen (e.g. if the sample frame alternated male/female and the sampling interval was even, the sample would include only 1 gender)
18
Q

Why would you use cluster sampling?

A

Because random sampling can be logistically challenging, and it can be more practical to cluster the population and sample from representative clusters, e.g. schools/community centres

19
Q

What are the steps of cluster sampling?

A
  1. List the potential clusters
  2. Create a cumulative list of all the units in all the clusters
  3. Calculate the systematic sampling interval (by dividing cumulative total population by number of clusters wanted)
  4. Choose random number at which to start (between 1 and sampling interval)
  5. Select the unit at each sampling interval; the cluster containing that unit is the cluster chosen
  6. Continue until the right number of clusters
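One way to sketch the steps above in Python, using a cumulative list of cluster sizes. The school names, sizes and function name are hypothetical, and this is an illustration rather than a definitive procedure.

```python
import random
from bisect import bisect_left
from itertools import accumulate


def select_clusters(cluster_sizes, n_clusters, seed=None):
    """Systematically select clusters from a cumulative list of their units."""
    rng = random.Random(seed)
    names = list(cluster_sizes)                                  # step 1
    cumulative = list(accumulate(cluster_sizes[n] for n in names))  # step 2
    interval = cumulative[-1] // n_clusters                      # step 3
    start = rng.randint(1, interval)                             # step 4
    chosen = []
    for i in range(n_clusters):                                  # steps 5 and 6
        unit = start + i * interval
        # first cluster whose cumulative total reaches this unit number
        chosen.append(names[bisect_left(cumulative, unit)])
    return chosen


# Hypothetical clusters: five schools with their pupil counts
schools = {"School A": 120, "School B": 300, "School C": 80,
           "School D": 200, "School E": 100}
chosen = select_clusters(schools, n_clusters=2, seed=4)
```

Note that a cluster larger than the sampling interval can be hit more than once, and larger clusters are more likely to be selected than smaller ones.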
20
Q

What is the issue with variability in cluster sampling?

A

There’s a higher covariance inside clusters, meaning units within a cluster are likely to be more similar to one another than to units outside it (e.g. kids from the same school are likely to be from the same socio-economic group). This gives a high intra-class correlation coefficient, so each extra unit from the same cluster adds less new information. The result is a higher variance of the estimate and therefore more sampling error than a simple random sample of the same size. Can counteract by increasing sample size, but this can be inefficient.

21
Q

What are the advantages of cluster sampling?

A
  • More practical when dealing with a dispersed population
  • Can be the only way to sample if you don’t have a sampling frame

22
Q

What are the disadvantages of cluster sampling?

A
  • Covariance problem (less variability between units within a cluster than between units in different clusters). This increases sampling error - increased standard error, so a larger sample size is needed.
  • Fewer clusters are logistically easier but give more sampling error and a lower sample size
  • Given the way the clusters are chosen it is important each cluster is the same size so that none are more likely to be chosen
23
Q

Why might you choose a stratified sampling scheme?

A

If your population includes minorities at low frequency that your study requires to be represented

24
Q

How does stratified sampling work?

A

The sampling frame is divided into homogeneous subgroups (strata) and then the units are chosen from them using random sampling

25
Q

In stratified sampling, how is it ensured that the same representation of each sub-group in the main population is maintained in the sample?

A

Usually by sampling proportional to size (calculate a sample fraction by dividing sample size by population - the sample fraction is a %, e.g. 22%, so take 22% of each subgroup into the sample).
This can mean that the number of units from a small subgroup is less than what is required from sample size calculations. You can then either increase the % for all (might lead to too high a total sample size) or sample disproportionate to size by taking fewer from the biggest stratum and more from the smallest stratum (the smaller group is then overrepresented, but the effect can be corrected afterwards)
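The proportional approach can be sketched as below. The strata names and sizes are hypothetical, chosen so that the sample fraction works out to the 22% used in the example above.

```python
def proportional_allocation(strata_sizes, sample_size):
    """Give each stratum the same sample fraction as the whole population."""
    population = sum(strata_sizes.values())
    return {
        name: round(size * sample_size / population)
        for name, size in strata_sizes.items()
    }


# Hypothetical population of 1,000 with a sample of 220 (sample fraction 22%)
strata = {"majority": 900, "minority": 100}
allocation = proportional_allocation(strata, sample_size=220)
```

Here the minority stratum contributes only 22 units, which shows how a small subgroup can fall below the number needed for its own analysis. With less tidy numbers, rounding can also make the allocations add up to slightly more or less than the target sample size.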

26
Q

What are the advantages of stratified sampling?

A
  • Representation of minority groups
  • If units within each stratum are more homogeneous than the overall population, it can give better precision (focus on each stratum, then synthesise results after)
  • Can have strata within strata but this can increase sample size needed
27
Q

What are the disadvantages of stratified sampling?

A
  • Can be very difficult to classify strata (not everyone fits to 1 clearly)
  • Hard to measure standard error
  • Sample sizes at the individual stratum level may be low, meaning high random error and potentially a loss of precision
28
Q

What should you keep in mind when choosing a sampling scheme?

A
  • Population to be studied (size/geographical distribution)
  • Availability of sample frame (is there a list of all units?)
  • Level of precision required
  • Resources available
29
Q

Why do you need to calculate sample size?

A

Too small:

  • May miss a significant effect (type 2 error)
  • Estimates of effect too imprecise
  • Unethical - put patients at risk for no scientific end

Too big:

  • Costly, wasting resources, takes too long to complete research
  • Unethical - give patients inferior treatment
30
Q

What do you need to consider when calculating sample size?

A
  • Gives an approximation of sample size (50s, 100s not 53 and 112)
  • Most assume simple random sampling
  • Different calculations depending on study
  • Assume random variation, won’t be appropriate if there’s systematic bias
  • Assumes very large populations
31
Q

What does a sample size calculation for a survey require?

A
  1. Confidence level (z-score)
  2. Precision you want (e.g. 10% = 0.1)
  3. Proportion
32
Q

What do you do if you have no estimates for proportion, and why?

A

Assume 0.5 (50%) because this will give the largest sample size
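A common textbook formula that uses these three inputs is n = z² × p(1 - p) / d², where d is the precision. The sketch below assumes that formula, simple random sampling and a very large population; the function name and example values are hypothetical.

```python
import math
from statistics import NormalDist


def survey_sample_size(confidence, precision, proportion=0.5):
    """Sample size for estimating a proportion: n = z^2 * p(1 - p) / d^2."""
    # two-sided z-score for the confidence level (about 1.96 for 95%)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = z ** 2 * proportion * (1 - proportion) / precision ** 2
    return math.ceil(n)  # round up to a whole number of units


# Hypothetical inputs: 95% confidence, precision of 10% (0.1)
n_unknown_p = survey_sample_size(0.95, 0.1)        # no estimate, so assume 0.5
n_low_p = survey_sample_size(0.95, 0.1, 0.2)       # estimated proportion of 20%
```

The term p(1 - p) is largest at p = 0.5, which is why assuming 0.5 gives the largest (most conservative) sample size.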

33
Q

What is the most appropriate measure to manipulate if you need a smaller sample size? (e.g. if low prevalence)

A

Precision

34
Q

What is the power of a study?

A

The ability to detect a difference between 2 groups when one really exists (the probability of avoiding a type 2 error)

35
Q

What can increase the power of a study?

A

Increasing sample size and reducing variation (sampling from a more homogeneous population)

36
Q

Why is it important to have sufficient power?

A

Reduces the risk of a type 2 error (missing a real effect)

37
Q

What do you need to calculate sample size of a comparative study?

A
  1. Threshold for a significant result (the significance level, which limits type 1 errors)
  2. The power of a study (usually 80-90%)
  3. The baseline level measure of interest (usually of control)
  4. The minimum effect size you are aiming to detect (e.g. clinical significance)
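To show how the four inputs combine, here is a sketch using a standard normal-approximation formula for comparing two proportions, with equal group sizes. The choice of formula, the function name and the example values are assumptions for illustration; other study designs use different calculations.

```python
import math
from statistics import NormalDist


def two_group_sample_size(alpha, power, baseline, min_effect):
    """Approximate number needed per group to compare two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_power = NormalDist().inv_cdf(power)          # desired power
    p1 = baseline                                  # e.g. rate in the control group
    p2 = baseline + min_effect                     # rate if the minimum effect exists
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_power) ** 2 * variance / (p1 - p2) ** 2
    return math.ceil(n)


# Hypothetical inputs: 5% significance, baseline 20%, minimum effect of 10 points
n_power_80 = two_group_sample_size(0.05, 0.80, 0.20, 0.10)
n_power_90 = two_group_sample_size(0.05, 0.90, 0.20, 0.10)
```

Asking for more power, or for a smaller minimum effect, pushes the required sample size up.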