Module 11 Flashcards
What is one of the primary purposes of epidemiology?
To find the causes of dz
-The statistical discovery of risk and protective factors associated with health outcomes
-i.e., of factors that cause or prevent those outcomes
What branch of epidemiology can help find the etiology of a dz?
Analytic epidemiology
-Tries to test "a priori" hypotheses (theoretical ideas) about the causes or determinants of health in populations
What questions do analytic studies answer?
Is there a statistical association between exposure and dz?
Is the association “real”, i.e., causal?
This is to r/o the 1st of the possible explanations for the results: That the results were entirely d/t chance or random occurrences.
Parameter definition
Actual indicator in the entire source pop or pop of interest
Statistic definition
Indicator that the investigator found in the sample or study pop
What are the major sources of error in epidemiologic research?
Random (or chance) errors (reliability)
Systematic errors (bias)
What is random error?
Another way of saying there is a lack of reliability in the study
What does high random error indicate?
Low reliability
What does low random error indicate?
High reliability
Imprecise results are __________
Unreliable results
How are imprecise results unreliable results?
If the results are imprecise, the possibility of random error is also high.
How do imprecise results occur?
When the factor being measured is not measured sharply
Analogous to aiming a rifle at a target that is not in focus
How can precision be increased?
By increasing sample size or the number of measurements
In what ways do random errors reflect fluctuations occurring by chance around a true value of a parameter?
Sampling error
Variability in the data itself
Imprecision in measurement
Reliability relates to ______
Repeatability
When does sampling error occur?
Brief reminder: epidemiologic studies draw inferences about the experiences of an entire pop based on an eval of only a sample
Occurs when the sample selected is not representative of the source pop for reasons other than systematic bias in the way ppl were recruited.
How is sample size related to chance?
Characteristics of participants in a sample may vary from sample to sample. As a result, an association between an exposure and outcome, or lack thereof, may vary by chance.
Sample size is inversely related to chance error. The larger the sample size, the more pop variability it can reflect, and the smaller the role of chance.
To minimize sampling error d/t chance, increase the sample size.
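A minimal simulation sketch of the point above, with all numbers invented: drawing repeated samples from a hypothetical pop with a true dz prevalence of 20% shows the sample statistic scattering less as sample size grows.

```python
import random
import statistics

# Hypothetical illustration (all numbers invented): draw repeated samples from
# a pop whose true dz prevalence (the parameter) is 20%, and watch how much
# the sample prevalence (the statistic) scatters at different sample sizes.
random.seed(1)
TRUE_PREVALENCE = 0.20

def sample_prevalence(n):
    """Prevalence observed in one random sample of size n."""
    return sum(random.random() < TRUE_PREVALENCE for _ in range(n)) / n

spreads = {}
for n in (25, 250, 2500):
    estimates = [sample_prevalence(n) for _ in range(200)]
    spreads[n] = statistics.stdev(estimates)
    print(f"n = {n:4d}: spread (SD) of sample prevalences = {spreads[n]:.4f}")
```

The spread shrinks as n grows, which is exactly why increasing sample size minimizes sampling error d/t chance.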
How does variability affect error?
If there is a lot of variability in the data, there is more possibility of random error than if the data are less variable.
If the pop contains a lot of variability, the chances of drawing an unrepresentative sample go up.
Imprecision in measurement details
The lack of agreement in results from time to time reflects random error inherent in the type of measurement procedure employed.
What is an example of imprecision in measurement?
Two readings of BP of the same persons using the same instrument may be different
How do we decrease random error d/t imprecision in measurement?
Take more measurements and average them
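A short sketch of why averaging works, using invented numbers: a hypothetical "true" systolic BP of 120 mmHg measured with an assumed random error of SD 8 mmHg.

```python
import random
import statistics

# Hypothetical sketch: a person's "true" systolic BP is 120 mmHg, but every
# reading carries random measurement error (assumed SD of 8 mmHg; both
# numbers invented). Averaging several readings reduces the random error.
random.seed(2)
TRUE_BP = 120.0

def one_reading():
    return TRUE_BP + random.gauss(0, 8)

single = [one_reading() for _ in range(500)]
averaged = [statistics.mean(one_reading() for _ in range(5)) for _ in range(500)]

print(f"SD of single readings:    {statistics.stdev(single):.2f}")
print(f"SD of 5-reading averages: {statistics.stdev(averaged):.2f}")
```

The averaged readings scatter much less around the true value than single readings do.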
When is it difficult to get a statistically significant result?
When the sample size is small, or measurements have wide variation and/or are imprecise
What makes it harder to reject the null hypothesis?
Increasing random error
What do statistically insignificant results not represent?
A study limitation, unless the study was statistically underpowered (sample size was too small) or measurements were too imprecise
Definition of hypothesis
A tentative suggestion that certain associations exist
How is hypothesis used in epi?
To eval suggestions about cause-effect relationships
On what are hypotheses based?
On learned and scientific observation from which theories or predictions are made
Hypotheses are developed using scientific approaches: the analysis uses facts, proceeds in a manner that makes common sense, and rests on rational scientific knowledge.
What type of reasoning does hypothesis generation involve?
Inductive reasoning
No “a priori hypothesis”
Often based on descriptive or preliminary data
Leads to the creation of a tentative hypothesis
What type of reasoning does hypothesis testing involve?
Deductive reasoning
There is an “a priori” hypothesis
Confirmed (or not) in an analytic study
Seven steps in hypothesis testing for assessing an association
- Formulate the null hypothesis
- Formulate the alternative (a priori, research) hypothesis
- Set the significance level (usually 0.05) and the sample size
- Recruit the sample.
- Collect the data
- Analyze the data (odds ratio, relative risk, etc.) and calculate the test statistic (p-value or confidence interval)
- Either reject or fail to reject the null hypothesis and draw conclusions from the results.
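The analysis and decision steps above can be sketched in code. A minimal Python sketch on a hypothetical 2x2 table (all counts invented), using the standard chi-square test of association for a 2x2 table:

```python
import math

# Hypothetical 2x2 table (counts invented for illustration)
a, b = 30, 70   # exposed:     diseased, not diseased
c, d = 15, 85   # not exposed: diseased, not diseased
n = a + b + c + d

# Step: analyze the data (measure of association)
odds_ratio = (a * d) / (b * c)

# Step: calculate the test statistic (chi-square for a 2x2 table, 1 df)
chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# For 1 df, the upper-tail p-value can be computed with the error function
p_value = math.erfc(math.sqrt(chi2 / 2))

# Step: reject or fail to reject the null at alpha = 0.05
alpha = 0.05
print(f"OR = {odds_ratio:.2f}, chi-square = {chi2:.2f}, p = {p_value:.4f}")
print("Reject the null" if p_value < alpha else "Fail to reject the null")
```

With these invented counts the p-value falls below 0.05, so the null would be rejected.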
Null hypothesis definition
There is no difference among the groups being compared
There is no relationship between the exposure and the dz
How is the research question expressed?
The alternative hypothesis expressed in question form
Calculating significance tests
To decide whether to reject or fail to reject the null hypothesis, a test statistic is computed and compared with a critical value obtained from a set of statistical tables.
However, calculating a measure of association (e.g., an odds ratio) is only the first step; we must then ask whether it is statistically significant.
Significance level: α
This is arbitrarily set. It is usually 0.05, but could be 0.01, 0.001, or 0.1.
The significance level is the chance of rejecting the null hypothesis when, in fact, it is true (type I error)
The P-value, what is it?
All tests of statistical significance lead to a probability statement, which is usually expressed as a p-value.
P-value definition
The p-value is the probability of obtaining findings at least as extreme as those observed by chance alone, given that there is truly no relationship between exposure and dz.
What does a p-value really tell you?
In other words, if the null hypothesis is true, and there really is NO relationship between the exposure and the dz, what is the probability of getting the result that you did by chance?
What does a p-value of 0.05 mean?
If the exposure is NOT related to the dz (i.e., the null hypothesis is true):
-5% of the time we would get a result as extreme as, or more extreme than, the one we got
-95 times out of 100 we would get a less extreme result
-Only 5 times out of 100 would the result be as extreme or more extreme
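This "probability under the null" idea can be simulated directly. A hedged sketch with all numbers invented: both groups share the same 20% risk (so the null is true), and we count how often a risk difference at least as extreme as an assumed observed difference of 0.12 arises by chance.

```python
import random

# Hedged simulation (all numbers invented): if the null is true (same 20%
# risk in both groups of 100 ppl), how often does a risk difference at least
# as extreme as an assumed observed difference of 0.12 arise by chance alone?
random.seed(3)
N_PER_GROUP = 100
TRUE_RISK = 0.20
OBSERVED_DIFF = 0.12

def cases():
    """Number of cases in one group when the null is true."""
    return sum(random.random() < TRUE_RISK for _ in range(N_PER_GROUP))

def simulated_diff():
    """Absolute risk difference between two groups drawn under the null."""
    return abs(cases() - cases()) / N_PER_GROUP

trials = [simulated_diff() for _ in range(10_000)]
p_sim = sum(d >= OBSERVED_DIFF for d in trials) / len(trials)
print(f"Simulated two-sided p-value: {p_sim:.3f}")
```

The fraction of null-hypothesis worlds producing a result that extreme is the simulated p-value.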
Rejecting the null hypothesis
If p <0.05, we conclude that chance is an unlikely explanation for the finding.
-These results would occur by chance only very rarely
The null hypothesis is rejected and the statistical association is said to be significant.
Failing to reject the null hypothesis
If p ≥ 0.05, we conclude that chance cannot be excluded as an explanation for the finding
-And we fail to reject the null hypothesis
-That is:
–These results are likely to occur by chance at least 5% of the time.
Statistical significance and chance
A statistically significant finding does not mean that the results DID NOT occur by chance.
-Only that it is unlikely that the findings did occur by chance
A non-significant finding does not mean that there is not an association
-Only that the probability that our results are due to chance is too high for us to feel comfortable enough to reject the null.
In statistics, nothing is ______
Absolute
No p-value, however small, completely excludes chance.
No p-value, however large, completely establishes that chance is the explanation.
Definitions of confidence interval
A range of values that one can say, with a stated degree of confidence, contains the true pop value (the parameter)
A computed interval of values that, with a given probability, contain the true value of the population parameter
Confidence interval indicates the range within which the true magnitude of effect (parameter) lies with a certain degree of assurance
Format of confidence interval
The degree of confidence is usually stated as a percentage
Commonly the 95% confidence interval is used. This corresponds to an alpha of 0.05
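As a sketch, the common large-sample (Woolf) 95% confidence interval for an odds ratio, computed on the log scale and exponentiated back; the 2x2 counts are hypothetical.

```python
import math

# Hypothetical 2x2 table (counts invented): exposed diseased/not diseased,
# then not-exposed diseased/not diseased.
a, b, c, d = 30, 70, 15, 85

odds_ratio = (a * d) / (b * c)

# Woolf (log-scale) standard error of ln(OR) for a 2x2 table
se_log_or = math.sqrt(1/a + 1/b + 1/c + 1/d)
z = 1.96   # corresponds to 95% confidence, i.e., alpha = 0.05

lower = math.exp(math.log(odds_ratio) - z * se_log_or)
upper = math.exp(math.log(odds_ratio) + z * se_log_or)

print(f"OR = {odds_ratio:.2f}, 95% CI: ({lower:.2f}, {upper:.2f})")
print("Statistically significant" if not (lower <= 1.00 <= upper)
      else "Not statistically significant")
```

Because the null value 1.00 is outside this interval, the result would be called statistically significant at alpha = 0.05.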
What two things does confidence interval tell us?
Whether the statistic (called the point estimate) is statistically significant compared to a predetermined cut-off point (equivalent to using a p-value)
How precise the statistic (or point estimate) is
What is the null value?
The value indicating no association
For all ratios, such as relative risks, odds ratios, prevalence ratios, standardized mortality ratios: null = 1.00
For correlation coefficients or other proportions: null = 0.00
Rejecting the null hypothesis if the null value is not included inside the confidence interval
For ratios, this means that 1.00 is NOT in the confidence interval
Then p < the value of alpha that we choose
We say the results are statistically significant
Failing to reject the null hypothesis if the null value is included in the confidence interval
For ratios, this means that 1.00 IS in the confidence interval
Then, the corresponding p-value is, by definition, > alpha.
We cannot reject the null hypothesis, and we say the results are not statistically significant.
Confidence intervals give all of the information of a p-value plus…
The expected range of effect sizes
What our odds ratios or relative risks might have been if we had a different sample of ppl
What is the width of the confidence interval influenced by?
The variability of the data, sample size, the precision of your measurements, and the percentage of the confidence interval that you choose.
Narrow vs wide confidence intervals
A narrow confidence interval is very precise. A wide one lacks precision.
Confidence intervals get narrower if your data are more reliable, meaning there is less random error.
-Confidence intervals get narrower if there are more ppl in your sample
-Confidence intervals get narrower (more precise or more certain) if the underlying data have less variation/scatter
-Confidence intervals get narrower if measurements are precise
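A sketch of the sample-size bullet above, assuming the same hypothetical 2x2 table as before: quadrupling every cell (same odds ratio, four times the ppl) narrows the 95% CI.

```python
import math

def woolf_ci(a, b, c, d, z=1.96):
    """Large-sample (Woolf) 95% CI for the odds ratio of a 2x2 table."""
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1/a + 1/b + 1/c + 1/d)
    return (math.exp(math.log(odds_ratio) - z * se),
            math.exp(math.log(odds_ratio) + z * se))

# Hypothetical counts; the second table is the first with every cell x4,
# so the odds ratio is identical but the sample is larger.
small = woolf_ci(30, 70, 15, 85)
large = woolf_ci(120, 280, 60, 340)

print(f"n = 200: 95% CI ({small[0]:.2f}, {small[1]:.2f}), "
      f"width {small[1] - small[0]:.2f}")
print(f"n = 800: 95% CI ({large[0]:.2f}, {large[1]:.2f}), "
      f"width {large[1] - large[0]:.2f}")
```

Same point estimate, more ppl, narrower (more precise) interval.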
Meaning of a 90% confidence interval
If we drew 100 samples and repeated the study 100 times….
And it is a 90% confidence interval,
90 times out of 100, our odds ratio would fall within this confidence interval
-10 times out of 100, our results would be outside the original confidence interval
-We can be 90% sure that the true value (parameter) is within this confidence interval (given there is no bias)
-alpha = 0.1
What to take into account with confidence intervals that have amounts that are below 1
Because ratios below 1.00 are compressed into the range between 0 and 1, such intervals appear narrower than they actually are; on the ratio (log) scale they can be just as wide as intervals above 1.00.
Are statistical and clinical (or practical) significance the same?
No
In very large samples, while small differences in dz frequency or low magnitudes of relative risk may be statistically significant, they may have no clinical or practical significance.
Conversely, with small sample sizes, large differences or measures of effect may be clinically important, even though they may not be statistically significant.
Statistical significance depends partly on what?
Sample size
A small difference (or weak odds ratio, relative risk, correlation coefficient, etc.) may achieve statistical significance if the sample size is large.
A large difference (or strong odds ratio, relative risk, correlation coefficient, etc.) may not achieve statistical significance if the sample size is too small. Insufficient statistical power.
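The sample-size dependence above can be sketched numerically: the same weak association (risk 10.5% vs 10.0%, RR = 1.05) tested at two hypothetical sample sizes, with all counts invented.

```python
import math

def chi2_p(a, b, c, d):
    """Chi-square (1 df) p-value for a 2x2 table, via the error function."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return math.erfc(math.sqrt(chi2 / 2))

# Small hypothetical study: 1,000 per group, risks 10.5% vs 10.0%
p_small = chi2_p(105, 895, 100, 900)
# Huge hypothetical study: 100,000 per group, the very same weak RR
p_large = chi2_p(10_500, 89_500, 10_000, 90_000)

print(f"RR = 1.05 at n =   2,000: p = {p_small:.3f}")
print(f"RR = 1.05 at n = 200,000: p = {p_large:.6f}")
```

The identical, clinically trivial effect is non-significant in the small study but highly "significant" in the huge one.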
What is the meaning of statistically significant?
A statistically significant association (usually p<0.05, or a 95% CI that does not include 1.00) means that the likelihood of getting results as extreme as ours purely by chance is low
So, we reject the null hypothesis, and say our results were unlikely to have been d/t chance, and were driven by something, but we still do not conclude they were causal, or always important
What does statistical significance indicate?
We have jumped the first hurdle in hypothesis testing.
We can then say that there appears to be an association between exposure and outcome, and that this association is not likely to be d/t chance or random error
We cannot yet determine cause
What do statistical tests not tell you?
If the study design is valid
If the results are clinically, biologically, or sociologically meaningful
If the association is causal
Definition of attributable risk
The quantity of dz incidence (or mortality) that can be attributed to a specific exposure
Question and purpose of relative risk
Is there an association between an exposure and a dz?
Valuable in etiologic studies
Questions and purpose of attributable risk
Given that the exposure is a cause of the dz, how much of the dz can be attributed to the exposure?
What is the public health impact of this exposure?
How much of the risk (incidence) of dz can we hope to prevent if we are able to eliminate exposure to the agent in question?
Valuable in the making of clinical or policy decisions and evaluating the impact of a prevention program
What is the purpose of risk difference and attributable risk among the exposed?
Used to guide clinical advice
What is another name for attributable risk among the exposed?
Etiologic fraction
What is the purpose of population risk difference and population attributable risk?
Used to guide policy decisions or to help decide whether to implement a given intervention
What is background risk?
A certain amount of dz occurs in the not exposed. This is the background risk
The attributable risk uncovers how much is attributable to the exposure, apart from the background risk.
What is incidence in the exposed a combination of?
Incidence not due to the exposure (background incidence) + incidence due to the exposure
Risk difference formula
Incidence in the exposed group - incidence in the not exposed group
or
Ie - Ine
Interpretation sentence for risk difference
_____ per _____ of the incident cases of ______ among ______ are attributable to their _______.
Attributable risk among the exposed formula
Risk difference/incidence in the exposed group
or
(Ie - Ine) / Ie
Interpretation of attributable risk among the exposed
____% of incident cases of _____ among _____ are attributable to their _____.
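A worked sketch of the two formulas above, with hypothetical incidences (per 1,000 person-years) for the exposed and not-exposed groups; both numbers are invented.

```python
# Hypothetical incidences, per 1,000 person-years (invented for illustration)
I_e = 28.0    # incidence in the exposed
I_ne = 10.0   # incidence in the not exposed (the background risk)

# Risk difference: Ie - Ine
risk_difference = I_e - I_ne

# Attributable risk among the exposed: (Ie - Ine) / Ie
ar_exposed = (I_e - I_ne) / I_e

print(f"Risk difference: {risk_difference:.0f} per 1,000 person-years")
print(f"Attributable risk among the exposed: {ar_exposed:.0%}")
```

Read off: 18 per 1,000 of the incident cases among the exposed, or about 64% of their cases, are attributable to the exposure.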
What questions are asked in terms of population attributable risk?
What would the impact on the entire pop be if _____ were eliminated?
How much (or what proportion) of a dz in the entire pop could be prevented if we could eliminate this exposure?
By how much would the overall dz rate decrease?
What is the impact of an entirely successful health promotion program to eliminate an exposure?
Definition of population attributable risk
The proportion of the rate of the dz in the pop that is d/t the exposure.
Impact of removing an exposure from a population: prevalence of exposure
If a higher proportion of ppl have the exposure, the impact on the population of eliminating it will be higher.
Impact of removing an exposure from a pop: relative risk
If the association between exposure and dz is stronger, the impact on the pop of eliminating it will be higher.
Impact of removing an exposure from a pop: incidence rate of the dz
The higher the incidence rate of the dz, the greater the number (not proportion) of cases that could be prevented
Influence of the prevalence of the exposure on the population attributable risk
If the exposure to be eliminated is smoking, and 100% of the pop smokes, the population attributable risk = the attributable risk among the exposed (all are exposed).
If 0% of the population smokes, the population attributable risk = 0 (none are exposed).
If 1-99% of the pop smokes, the impact lies between 0 and the attributable risk among the exposed
Importance of the prevalence of exposure to the population attributable risk
If the prevalence of exposure in that pop is low, regardless of a high relative risk, the impact of eliminating that exposure will also be low.
If the prevalence of exposure in that pop is high, regardless of a moderate relative risk, the impact of eliminating that exposure can also be high.
What is incidence in the total population a combination of?
Incidence due to all other causes
and
Incidence d/t the exposure of interest
Definition of risk difference
The absolute amount of incidence of dz in the exposed that is attributable to the exposure
Population risk difference definition
The absolute amount of dz incidence in the total pop attributable to the exposure
Population risk difference formula
Incidence of the dz in the entire pop - incidence of dz in the not exposed
or
Ip - Ine
Interpretation of pop risk difference
_____per _____ of the incident cases of _____ in the entire pop are attributable to ______.
Population attributable risk definition
The proportion of risk in the entire pop d/t the exposure
Population attributable risk formula
Population risk difference / incidence of the dz in the entire population
or
(Ip - Ine) / Ip
Interpretation of population attributable risk
___% of incident cases of ___ in the whole population are attributable to _____.
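A worked sketch of the population-level formulas above, assuming (all numbers invented) that 20% of the pop is exposed and the same hypothetical incidences as earlier.

```python
# Hypothetical inputs (invented): incidences per 1,000 person-years and the
# prevalence of exposure in the pop.
I_e, I_ne = 28.0, 10.0
prevalence_exposed = 0.20

# Incidence in the entire pop is a weighted average of the two groups
I_p = prevalence_exposed * I_e + (1 - prevalence_exposed) * I_ne

# Population risk difference: Ip - Ine
population_risk_difference = I_p - I_ne

# Population attributable risk: (Ip - Ine) / Ip
population_attributable_risk = (I_p - I_ne) / I_p

print(f"Ip = {I_p:.1f} per 1,000 person-years")
print(f"Population risk difference: {population_risk_difference:.1f} per 1,000")
print(f"Population attributable risk: {population_attributable_risk:.0%}")
```

Note how the low prevalence of exposure (20%) shrinks the population-level impact: only about 26% of cases in the whole pop are attributable to the exposure, versus about 64% among the exposed.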