Feldmand. Module 5 - Sample size, power & tests Flashcards

1
Q

Effect of an inadequate Sample Size

A
  • If a study has an inadequate sample size, then a result with a null finding (no statistically significant association detected) is uninformative
  • A true lack of association is difficult or impossible to distinguish from a true association that cannot be detected statistically because of inadequate power
2
Q

Type I error

A

[2x2 table]
– α is the false-positive error rate, the probability of making a Type I error (α is the p-value threshold below which you reject H0, often 0.05)
– Even if H0 is true, in repeated samples, we will reject H0 a proportion α of the time (we detect a difference when one doesn’t exist)

H0: you’re not pregnant. (rejected)
H1: YOU’RE PREGNANT — accepted
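
The "repeated samples" statement can be checked with a small simulation. This is only a sketch under stated assumptions: the card names no particular test, so a two-sided one-sample z-test with known standard deviation is assumed, and the function name `simulate_type_i_rate` and its parameters are hypothetical.

```python
import random
from statistics import NormalDist, fmean

def simulate_type_i_rate(alpha=0.05, n=30, trials=5000, seed=1):
    """Fraction of repeated samples in which a true H0 (mean = 0) is rejected."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where H0 is true (mean 0, sd 1).
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = fmean(sample) * n ** 0.5  # z = mean / (sd / sqrt(n)) with sd = 1
        if abs(z) > z_crit:
            rejections += 1  # a false positive: H0 rejected although it is true
    return rejections / trials

print(simulate_type_i_rate())  # should land close to alpha
```

The rejection fraction stays near α whatever the sample size, because α is fixed by the choice of rejection threshold rather than by n.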

3
Q

Type II error

A

– β is the false-negative error rate, the probability of making a Type II error
– Traditionally, β = 0.20. Thus, there is a 20% chance of failing to reject H0 when the alternative is true.

H0: YOU’RE NOT PREGNANT. — accepted
H1: you’re pregnant

4
Q

Power

A
  • The power of a test (1–β) is the probability of rejecting H0 when HA is true, i.e., detecting a difference when one really exists!
  • We want power to be as large as possible
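
As an illustration of what 1–β means numerically, the sketch below computes power for a two-sided one-sample z-test. The choice of test, the function name `z_test_power`, and the example numbers are assumptions made for illustration; they are not taken from the card.

```python
from statistics import NormalDist

def z_test_power(effect, sd, n, alpha=0.05):
    """Probability of rejecting H0 when the true mean differs from H0 by `effect`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Under the alternative, the z statistic is centred on effect * sqrt(n) / sd.
    shift = effect * n ** 0.5 / sd
    # Rejection happens in either tail of the two-sided test.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)

# Hypothetical example: true difference of 0.5 sd units.
for n in (10, 32, 100):
    print(n, round(z_test_power(effect=0.5, sd=1.0, n=n), 3))
```

With a true effect of zero the function returns α, which links power back to the Type I error rate.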
5
Q

Trade-off Between α and β

A
  • Both types of error should ideally be minimized
  • However, for a fixed sample size, a decrease in one type of error is achieved at the expense of an increase in the other
  • For a given α and magnitude of effect (RR or OR), β can be reduced only by increasing the sample size
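
The trade-off can be made concrete with a short sketch. It assumes a two-sided one-sample z-test on a mean difference rather than the RR or OR mentioned above, purely to keep the arithmetic simple; `type_ii_error` and the example values are hypothetical.

```python
from statistics import NormalDist

def type_ii_error(effect, sd, n, alpha):
    """beta: probability of failing to reject H0 when the true difference is `effect`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect * n ** 0.5 / sd
    power = std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)
    return 1 - power

# Fixed sample size: making alpha stricter raises beta.
for alpha in (0.10, 0.05, 0.01):
    print("n=20, alpha =", alpha, "beta =", round(type_ii_error(0.5, 1.0, 20, alpha), 3))

# Fixed alpha and effect: only a larger sample lowers beta.
for n in (20, 40, 80):
    print("alpha=0.05, n =", n, "beta =", round(type_ii_error(0.5, 1.0, n, 0.05), 3))
```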
6
Q

Sample Size Considerations

A

• Sample size calculations should be done at the start of the study to ensure adequate power. The calculation differs depending on whether a categorical or a continuous variable is being estimated.

The factors to take into consideration include:

  • accuracy required,
  • sampling method to be used,
  • size of the smallest subgroup
  • actual variability of the variable of interest in the population.

• To calculate the sample size, you need to know:
– Desired values of α and β
– Baseline (nondiseased or nonexposed) exposure or outcome rates
– Expected magnitude of effect (RR or OR):
  • Often based on previous studies or reports
  • The minimum effect the investigator considers worth detecting

7
Q

What is precision (d)?

A

Precision (d) is the maximum distance, in either direction, between the sample estimate and the true population proportion that the investigator considers acceptable

8
Q

Sample size formula for Estimation of level of disease occurrence

A

n = zα^2 * [ P(1-P) / d^2]

If the sample size n is greater than 10% of the total population size, correction:

new n = 1 / [1/n* + 1/N]

n* = n obtained above, N = population size

(P = expected proportion with disease, d = acceptable precision, zα = 1.96 for a 95% CI)
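
A minimal sketch of the two formulas on this card, assuming the result is rounded up to the next whole individual (the card does not state a rounding rule). The function name `sample_size_prevalence` and the example values are hypothetical.

```python
import math

def sample_size_prevalence(p, d, z=1.96, population=None):
    """n = z^2 * P(1-P) / d^2, with the finite-population correction if n > 10% of N."""
    n = z ** 2 * p * (1 - p) / d ** 2
    if population is not None and n > 0.10 * population:
        # Correction from the card: new n = 1 / (1/n* + 1/N)
        n = 1 / (1 / n + 1 / population)
    return math.ceil(n)  # assumption: round up to a whole individual

# Hypothetical example: expected prevalence 50%, precision of +/- 5 percentage points.
print(sample_size_prevalence(0.5, 0.05))                   # 385
print(sample_size_prevalence(0.5, 0.05, population=1000))  # 278
```

Using P = 0.5 gives the largest n for a given d, so it is a common conservative choice when the prevalence is unknown.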

9
Q

Sample size formula to detect disease

A

a: Finite populations:

n = [ 1 - ( 1 - β)^(1/d) ] [ (N - d/2) + 1/2 ]

(β = confidence level (as a proportion), i.e. the probability of observing at least one diseased if the prevalence is d/N; N = population size; n = sample size; d = number of diseased. Note that β here is a confidence level, not the Type II error rate.)

b: Infinite populations (> 1000):

n = [ log(1 - β) ] / [ log( 1 - (d / N) )]

(n = sample size, β = level of confidence, d = number of diseased, N = population size)
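
Both formulas can be sketched as follows. The function names are hypothetical, the infinite-population version takes the prevalence d/N directly as one argument, and rounding up to a whole individual is an assumption.

```python
import math

def sample_size_detect_finite(population, diseased, confidence=0.95):
    """n = [1 - (1 - conf)^(1/d)] * [(N - d/2) + 1/2] for a finite population."""
    n = (1 - (1 - confidence) ** (1 / diseased)) * (population - diseased / 2 + 0.5)
    return math.ceil(n)  # assumption: round up

def sample_size_detect_infinite(prevalence, confidence=0.95):
    """n = log(1 - conf) / log(1 - prevalence) for a very large population."""
    n = math.log(1 - confidence) / math.log(1 - prevalence)
    return math.ceil(n)  # assumption: round up

# Hypothetical example: 5% prevalence, 95% confidence of finding at least one case.
print(sample_size_detect_finite(population=500, diseased=25))  # 56
print(sample_size_detect_infinite(prevalence=0.05))            # 59
```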

10
Q

Sample size calculation for: Probability of not detecting disease

A

In the case of importation of animals, it may be necessary to quantify the probability of failing to detect any positives in a sample. The formula assumes that the population size is infinite and that the prevalence (prev) is known.

p = ( 1 - prev )^n

(p = probability of failure to detect positives; n = sample size, prev=prevalence)
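
A short sketch of the formula, with a hypothetical function name and example values chosen only for illustration:

```python
def prob_no_positives(prevalence, n):
    """p = (1 - prev)^n: chance that a sample of n contains no positives."""
    return (1 - prevalence) ** n

# Hypothetical example: 1% prevalence in the source population.
for n in (50, 100, 300):
    print(n, round(prob_no_positives(0.01, n), 3))
```

At 1% prevalence, a sample of 100 still misses every positive more than a third of the time, which is why low-prevalence screening needs large samples.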

11
Q

Sample size for estimation of continuous- type outcome variable

A

n = [ (zα + zβ) * sd / L ]^2

zα = 1.96 if α = 0.05 (two-sided); zβ = 1.28 if power = 0.90; zβ = 0 if power = 0.50
sd = standard deviation of the variable; L = required precision of the estimate, expressed in the units of the parameter of interest
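
A minimal sketch of this calculation. It treats both zα and zβ as positive standard normal quantiles that are added together, which is an assumption about the sign convention, and it rounds up. The function name `sample_size_continuous` and the example values are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_continuous(sd, precision, alpha=0.05, power=0.5):
    """n = [(z_alpha + z_beta) * sd / L]^2, both z values taken as positive quantiles."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = std_normal.inv_cdf(power)           # about 1.28 for power 0.90, 0 for 0.50
    n = ((z_alpha + z_beta) * sd / precision) ** 2
    return math.ceil(n)  # assumption: round up

# Hypothetical example: sd of 10 units, estimate wanted to within 2 units.
print(sample_size_continuous(sd=10, precision=2))             # 97
print(sample_size_continuous(sd=10, precision=2, power=0.9))  # 263
```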
12
Q

Test Accuracy

A
  • How good is the test at identifying individuals with and without the disease?
  • The sensitivity of the test is the likelihood of a positive test among those with the disease (How good is the test at identifying individuals WITH the disease?)
  • The specificity of the test is the likelihood of a negative test among those without the disease (How good is the test at identifying individuals WITHOUT the disease?)
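
Both definitions reduce to simple proportions from a 2x2 table of test result against true disease status. The counts below are invented for illustration, and the function names are hypothetical.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of diseased individuals who test positive."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Proportion of non-diseased individuals who test negative."""
    return true_neg / (true_neg + false_pos)

# Hypothetical counts: 100 diseased (90 test positive), 200 non-diseased (160 test negative).
print(sensitivity(true_pos=90, false_neg=10))   # 0.9
print(specificity(true_neg=160, false_pos=40))  # 0.8
```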
13
Q

Test Accuracy

A

• To calculate the sensitivity and specificity, we must know the truth in the population from another source, a gold standard
– The gold standard may be an established test already in use, or the result of a more definitive and often more invasive test
• There is an inverse association between sensitivity and specificity, and therefore one must trade one for the other

14
Q

Predictive Value

A

• Predictive value positive (PVP)
– If the test results are positive for this patient, what is the probability that this patient has the disease?
• Predictive value negative (PVN)
– If the test result is negative, what is the probability that this patient does not have disease?
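
Both predictive values can be read off the same 2x2 table, but along the test-result margin rather than the disease margin. The counts and function names below are hypothetical.

```python
def predictive_value_positive(true_pos, false_pos):
    """Proportion of positive test results that are truly diseased."""
    return true_pos / (true_pos + false_pos)

def predictive_value_negative(true_neg, false_neg):
    """Proportion of negative test results that are truly non-diseased."""
    return true_neg / (true_neg + false_neg)

# Hypothetical counts: 90 true positives, 40 false positives,
# 160 true negatives, 10 false negatives.
print(round(predictive_value_positive(90, 40), 3))   # 0.692
print(round(predictive_value_negative(160, 10), 3))  # 0.941
```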

15
Q

How do test characteristics vary?

A

• Sensitivity and specificity are characteristics of the test itself and do not vary with disease prevalence
• Predictive value, however, is affected by:
– Prevalence of disease:
  • Low prevalence of disease results in low positive predictive value
  • Test results must be interpreted in the context of the disease prevalence in the population
  • It is most productive and efficient to use a test in “high prevalence” populations, e.g., high-risk groups
– Specificity of test: the higher the specificity, the higher the positive predictive value
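
The dependence on prevalence and specificity follows from Bayes' theorem, and can be sketched as below. The function name `predictive_values` and the example sensitivity, specificity and prevalence values are hypothetical.

```python
def predictive_values(sens, spec, prev):
    """Return (PVP, PVN) for a test applied in a population with prevalence `prev`."""
    true_pos = sens * prev
    false_neg = (1 - sens) * prev
    true_neg = spec * (1 - prev)
    false_pos = (1 - spec) * (1 - prev)
    pvp = true_pos / (true_pos + false_pos)
    pvn = true_neg / (true_neg + false_neg)
    return pvp, pvn

# Same test (90% sensitivity, 90% specificity) at different prevalences.
for prev in (0.01, 0.10, 0.50):
    pvp, pvn = predictive_values(0.9, 0.9, prev)
    print(prev, round(pvp, 3), round(pvn, 3))

# Raising specificity at low prevalence improves PVP markedly.
print(round(predictive_values(0.9, 0.99, 0.01)[0], 3))
```

At low prevalence most positives are false positives even for a fairly accurate test, which is the reason for targeting testing at high-risk groups.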
