Biostat Flashcards

1
Q

What is surveillance?

A

The systematic (ongoing) collection of relevant data (disease, injury, hazard) and their constant evaluation and dissemination to all who need to know (for the purpose of prevention)

2
Q

What is the surveillance cycle?

A
  1. Plan a change or test
  2. Do the change or test
  3. Observe effects
  4. Study the results
  5. Repeat
3
Q

What is the goal of surveillance?

A

Continuous improvement

4
Q

What are the levels of prevention?

A

Primary, secondary, tertiary

5
Q

Describe primary prevention

A

Predisease (no known risk factors or disease susceptibility)
Examples: health promotion activities such as exercise and specific protections such as immunizations, automobile safety measures, recommended nutritional supplements

6
Q

Describe secondary prevention

A

Latent disease
Example: screening (in populations and of individuals) for early detection of disease and early treatment of disease (e.g. mammography)

7
Q

Describe tertiary prevention

A

Symptomatic disease (initial care, subsequent care)
Examples: Disability limitation (e.g. medical or surgical treatment to limit damage from a disease)
Rehabilitation (e.g. rehabilitation after a stroke)

8
Q

Sentinel health event

A

An unnecessary disease, disability, or untimely death which is preventable and whose occurrence serves as a warning signal that preventative and/or medical care may need to be improved

9
Q

What are some goals of surveillance?

A

Estimate magnitude and determinants, targeted intervention, track trends and distribution, identify failure of prevention (sentinel health events)

10
Q

What morbidity measures are used to describe disease occurrence?

A

Incidence (cumulative incidence and incidence density) and prevalence (period prevalence and point prevalence)

11
Q

What is incidence?

A

An estimate of the risk or probability of developing a disease during a specified time period

12
Q

Incidence density

A

# of new cases during a specified time period
___________________________________
population at risk of disease during the same time period (measured as person-time)
(x 1,000)

13
Q

Cumulative incidence

A

# of new cases during a specified time period
___________________________________
population at risk
(x 1,000)
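
The two incidence formulas above can be sketched in a few lines of Python. The function names and the cohort numbers are hypothetical and chosen only for illustration; this is a sketch, not a reference implementation.

```python
def cumulative_incidence(new_cases, population_at_risk, per=1000):
    """New cases divided by the population at risk, scaled to `per` people."""
    return new_cases / population_at_risk * per


def incidence_density(new_cases, person_time_at_risk, per=1000):
    """New cases divided by person-time at risk, scaled to `per` person-years."""
    return new_cases / person_time_at_risk * per


# Hypothetical cohort: 2,000 people at risk at the start of the year, 30 new cases,
# and 1,950 person-years of follow-up actually observed.
ci = cumulative_incidence(new_cases=30, population_at_risk=2000)
density = incidence_density(new_cases=30, person_time_at_risk=1950)

print(f"Cumulative incidence: {ci:.1f} per 1,000 people")
print(f"Incidence density: {density:.1f} per 1,000 person-years")
```

The density is slightly higher here because people who become cases or drop out stop contributing person-time to the denominator.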

14
Q

Which (incidence density or cumulative incidence) is more precise?

A

Incidence density

15
Q

What is prevalence?

A

Describes the burden of disease in a population

16
Q

Prevalence calculation

A

total # of cases of disease during a time period (or at one point in time)
____________________________________________
total (usually mid-period) population during the same time period

17
Q

What is the relationship between incidence and prevalence?

A

When the disease is stable:

Prevalence = incidence x duration of disease

18
Q

What happens when new treatments that increase longevity of a particular disease are discovered?

A

The prevalence of the disease will increase, even if incidence rates remain the same, since the duration of disease is longer
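
A minimal sketch of the relationship prevalence = incidence x duration, assuming a stable disease. The function name and all numbers are hypothetical.

```python
def steady_state_prevalence(incidence_rate, mean_duration_years):
    """Prevalence = incidence x duration; only valid when the disease is stable."""
    return incidence_rate * mean_duration_years


# Hypothetical disease: 2 new cases per 1,000 person-years.
incidence = 0.002

# A new treatment doubles how long patients live with the disease.
before = steady_state_prevalence(incidence, mean_duration_years=5)
after = steady_state_prevalence(incidence, mean_duration_years=10)

print(f"Prevalence before: {before:.3f}, after: {after:.3f}")
```

Incidence is unchanged in both calls, so the rise in prevalence comes entirely from the longer duration.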

19
Q

What are the main sources of morbidity data?

A

Public health surveillance, health surveys, registries

20
Q

What are the 2 main surveillance systems?

A

Active and passive
Active involves outreach by some public authority (most complete and accurate, but expensive)
Passive relies on physicians to report

21
Q

Sentinel surveillance

A

A surveillance system that uses a prearranged sample of sources who have agreed to report all cases of one or more notifiable diseases
Often uses largest hospitals in a geographic area
Data are not generalizable to the geographic population

22
Q

Syndromic surveillance

A

Developed for early detection of a large-scale release of a biological agent; current surveillance goals reach beyond terrorism preparedness
Focuses on the early symptom (prodrome) period before clinical or laboratory confirmation of a particular disease
Gathers information about patients’ symptoms during the early phases of illness

23
Q

What are health surveys also called?

A

Prevalence studies

Since they allow for the estimation of the proportion of the population with a particular health problem

24
Q

What are the limitations of morbidity data in the U.S.?

A

Severity of illness (only more severe are likely to be reported), access to care, validity of screening test

25
Q

What is ICD-10?

A

Codes that are used to classify all causes of death on the death certificate
Promotes international comparability of mortality statistics

26
Q

What affects accuracy of death information?

A

Who fills out the form
If they follow the instructions
If they were the patient’s private physician
If an autopsy was performed

27
Q

What is crude death rate? What is it poor in?

A

A very rough measure of the level of mortality in a population
A particularly poor measure when comparing 2 or more populations which have differing age distributions

28
Q

What is the calculation for crude death rate?

A

# of deaths in one year
___________________ (x 1,000)
total mid-year population

29
Q

Where is info needed to calculate crude death rate taken from?

A
# of deaths ----> Vital Registration System
Total mid-year population ----> Census Bureau
30
Q

Age-specific death rate

A

# of deaths in one year in age group a
______________________________ (x 1,000)
mid-year population of age group a
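
As a rough illustration, the crude and age-specific death rates can be computed from the same table of deaths and mid-year populations. The age groups and counts below are made up, and `rate_per_1000` is a hypothetical helper.

```python
# Hypothetical population: (deaths in one year, mid-year population) per age group.
age_groups = {
    "0-39": (50, 60_000),
    "40-64": (150, 30_000),
    "65+": (400, 10_000),
}


def rate_per_1000(deaths, population):
    """Deaths divided by mid-year population, expressed per 1,000."""
    return deaths / population * 1000


total_deaths = sum(deaths for deaths, _ in age_groups.values())
total_population = sum(pop for _, pop in age_groups.values())

crude_rate = rate_per_1000(total_deaths, total_population)
age_specific = {
    group: rate_per_1000(deaths, pop) for group, (deaths, pop) in age_groups.items()
}

print(f"Crude death rate: {crude_rate:.1f} per 1,000")
for group, rate in age_specific.items():
    print(f"Age {group}: {rate:.1f} per 1,000")
```

The single crude figure hides a wide spread between age groups, which is why it compares poorly across populations with different age distributions.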

31
Q

What is the calculation for infant mortality rate? What is it a good indication of?

A

# of deaths in one year among children under 1 year of age
________________________________ (x 1,000)
total live births
Good indicator of health of a population because it tells you about services available to mothers and babies

32
Q

Cause-specific death rate

A

Deaths due to a cause during a specified time period
______________________________________(x 100,000)
total population during that time period

33
Q

What is case fatality rate?

A

Presents the risk of dying during a defined period for those who have a particular disease
Often used during a disease outbreak
Can be used for non-infectious diseases

34
Q

Case fatality rate calculation

A
# of deaths during a specified time period after disease onset
_____________________________________ (x 100)
# of individuals with that disease during that time period
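
A small sketch of the case fatality calculation, using a hypothetical outbreak. The function name and the counts are assumptions for illustration only.

```python
def case_fatality_rate(deaths, cases):
    """Percentage of people with the disease who die during the defined period."""
    if cases <= 0:
        raise ValueError("cases must be a positive number")
    return deaths / cases * 100


# Hypothetical outbreak: 300 people develop the disease and 12 of them die.
cfr = case_fatality_rate(deaths=12, cases=300)
print(f"Case fatality rate: {cfr:.1f}%")
```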
35
Q

What is proportionate mortality?

A

Presents the proportion of total deaths that are due to a specific cause
Tells us, within a population, the relative importance of specific cause of death in the total mortality picture
Each cause is expressed as a percentage of all deaths, and the sum of the causes must add to 100%
These proportions are not mortality rates, because the denominator is all deaths rather than the population in which the deaths occurred

36
Q

Proportionate mortality calculation

A

# of deaths due to cause x during a specified time period
______________________________________ (x 100)
total # of deaths during that time period

37
Q

Proportionate mortality ratio (PMR)

A

Comparison of 2 proportionate mortalities
A PMR greater than 1 indicates that a particular cause accounts for a greater proportion of deaths in the population of interest than you might expect
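
Proportionate mortality and the PMR can be sketched as follows. The death counts for both populations are invented, and `proportionate_mortality` is a hypothetical helper name.

```python
def proportionate_mortality(deaths_by_cause):
    """Return each cause's share of all deaths, as a percentage."""
    total = sum(deaths_by_cause.values())
    return {cause: n / total * 100 for cause, n in deaths_by_cause.items()}


# Hypothetical death counts for a population of interest and a comparison population.
study = {"heart disease": 300, "cancer": 250, "injury": 50, "other": 400}
reference = {"heart disease": 200, "cancer": 250, "injury": 50, "other": 500}

pm_study = proportionate_mortality(study)
pm_reference = proportionate_mortality(reference)

# PMR: ratio of the two proportionate mortalities for the same cause.
pmr_heart = pm_study["heart disease"] / pm_reference["heart disease"]

print(f"Heart disease share of deaths in the study population: {pm_study['heart disease']:.0f}%")
print(f"PMR for heart disease: {pmr_heart:.2f}")
```

Because the denominator is all deaths rather than the population, nothing here says how likely anyone is to die; it only ranks causes against each other.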

38
Q

What is a confounder?

A

A variable which is related to both study variables and obscures the relationship b/w the variables

39
Q

What are the 2 methods for controlling for a confounding factor?

A
  1. Calculate specific rates (stratification)
  2. Use an adjustment or standardization procedure; these procedures allow for adjustment of confounders while providing a summary measure that is easy to work with
40
Q

What are the 2 types of standardization procedures?

A
  1. Direct method of rate adjustment - choose a standard population distribution; calculate adjusted rates by applying the age specific death rates to a standard age distribution
  2. Indirect method of rate adjustment - choose a standard set of rates; calculate standardized mortality ratios by applying a standard set of rates to the age distribution of populations of interest
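
The two procedures above can be sketched side by side. All rates, populations, and function names are hypothetical; the point is only to show which inputs each method needs.

```python
# Hypothetical age-specific death rates (deaths per person per year) and populations.
study_rates = {"young": 0.001, "middle": 0.005, "old": 0.040}
study_population = {"young": 20_000, "middle": 30_000, "old": 50_000}
standard_rates = {"young": 0.002, "middle": 0.006, "old": 0.030}
standard_population = {"young": 60_000, "middle": 30_000, "old": 10_000}


def directly_adjusted_rate(rates, std_population, per=1000):
    """Apply the study's age-specific rates to a standard age distribution."""
    expected_deaths = sum(rates[g] * std_population[g] for g in rates)
    return expected_deaths / sum(std_population.values()) * per


def standardized_mortality_ratio(observed_deaths, std_rates, population):
    """Observed deaths divided by deaths expected under a standard set of rates."""
    expected_deaths = sum(std_rates[g] * population[g] for g in population)
    return observed_deaths / expected_deaths


adjusted = directly_adjusted_rate(study_rates, standard_population)

# The indirect method only needs the total observed deaths, not the study's own rates.
observed = sum(study_rates[g] * study_population[g] for g in study_population)
smr = standardized_mortality_ratio(observed, standard_rates, study_population)

print(f"Directly adjusted rate: {adjusted:.1f} per 1,000")
print(f"SMR: {smr:.2f}")
```

In practice the observed death count would come straight from the data; it is derived from the study rates here only to keep the example self-consistent.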
41
Q

What are the outcomes of the 2 standardization procedures?

A

Direct method of rate adjustment - directly adjusted rate

Indirect method of rate adjustment - Standardized Mortality Ratio (SMR)

42
Q

Standardized Mortality Ratios (SMR)

A

Ratio of the number of observed deaths to expected deaths (if your groups experienced the mortality rates of a standard population), often expressed as a percentage

43
Q

What are advantages and disadvantages of crude rate?

A

Advantages - Simple to calculate

Disadvantages - Does not account for the impact of confounders

44
Q

What are advantages and disadvantages of specific rate?

A

Advantages - Controls confounders; can see more details

Disadvantages - Need detailed data for rates; cumbersome

45
Q

What are advantages and disadvantages of directly-adjusted?

A

Advantages - controls confounders; summary measures

Disadvantages - need detailed data for rates; can miss details; not a real rate (relative)

46
Q

What are advantages and disadvantages of indirectly-adjusted (SMR)?

A

Advantages - controls confounders; summary measure; fewer data needs
Disadvantages - SMR is not a rate; can miss details

47
Q

When to use which method of rate adjustment?

A

Specific rates - if data are of good quality and few comparisons are needed
Direct method - if summary measure is preferred and data quality is good
Indirect method - use with very small numbers, because rates are unstable

48
Q

What factors can rates be adjusted for?

A

Any factor that might confound results, not just age

49
Q

What are the 2 general types of life tables?

A

Cohort life tables

Period life tables

50
Q

Cohort life tables

A

A real group of people or cohort who are followed over time to profile their mortality experience

51
Q

Period life tables

A

A hypothetical population of 100,000 who experience current mortality trends
The probability that an individual will die during any particular year is calculated, using age-specific death rates
End result of this table is the life expectancy

52
Q

Descriptive epidemiology

A

Generates hypotheses

Why this distribution in person/place/time?

53
Q

Case series

A

Several cases strung together over a certain time period

54
Q

Analytic epidemiology

A

Tests hypotheses

  1. Concerned with the determinants of disease (etiological factors)
  2. Concerned w/ explaining different disease rates in different populations
  3. May form the basis for control of disease
55
Q

Common characteristics in cohort

A

Date of birth
Exposure
Disease
Treatment

56
Q

Relative risk

A

Rate in exposed persons/rate in non-exposed persons

57
Q

Attributable risk

A
Incidence rate (exposed group) - incidence rate (unexposed group)
Attributes risk to that exposure
58
Q

Population attributable risk proportion

A
Incidence rate (total population) - incidence rate (unexposed group)
______________________________________________
Incidence rate (total population)
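
A minimal sketch of the three cohort measures on the preceding cards (relative risk, attributable risk, population attributable risk proportion). The cohort counts and the function name are hypothetical.

```python
def cohort_measures(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Compute relative risk, attributable risk, and PAR proportion from cohort counts."""
    rate_exposed = exposed_cases / exposed_total
    rate_unexposed = unexposed_cases / unexposed_total
    rate_total = (exposed_cases + unexposed_cases) / (exposed_total + unexposed_total)
    return {
        "relative_risk": rate_exposed / rate_unexposed,
        "attributable_risk": rate_exposed - rate_unexposed,
        "par_proportion": (rate_total - rate_unexposed) / rate_total,
    }


# Hypothetical cohort: 30 cases among 1,000 exposed, 10 cases among 1,000 unexposed.
measures = cohort_measures(30, 1000, 10, 1000)

print(f"Relative risk: {measures['relative_risk']:.1f}")
print(f"Attributable risk: {measures['attributable_risk']:.3f}")
print(f"PAR proportion: {measures['par_proportion']:.0%}")
```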
59
Q

What is the null hypothesis regarding relative risk?

A

Relative risk = 1

60
Q

What is the null hypothesis regarding attributable risk?

A

Attributable risk = 0

61
Q

Relative risk assesses what of a risk factor?

A

Etiological strength

62
Q

Attributable risk assesses what of a risk factor?

A

Public health impact

63
Q

What is the effect of non-response in a cohort study with respect to exposure (yes) and outcome (no)?

A

Incorrect estimate of exposure

64
Q

What is the effect of non-response in a cohort study with respect to exposure (no) and outcome (yes)?

A

Incorrect estimate of outcome

65
Q

What is the effect of non-response in a cohort study with respect to exposure (yes) and outcome (yes)?

A

Incorrect estimate of association, exposure, and outcome

66
Q

What is the effect of non-response in a cohort study with respect to exposure (no) and outcome (no)?

A

No problem

67
Q

Confounding variable

A

A variable associated w/ both outcome and risk factor(s) of interest

68
Q

How can confounders be dealt with?

A

Matching (efficient, difficult, $$$)
Stratification (less efficient, less difficult)
Regression (after the fact)

69
Q

Matching to deal w/ confounders

A

○ For every person who is a smoker and works in a coal mine, they are matched w/ a non-smoker who works in a coal mine; they are paired
○ Differ only in this one thing (smoking vs. non-smoking)
○ Can see differential in lung cancer rates and now it isn’t due to confounding variables since the smokers and non-smokers are matched in all the other things
○ Efficient since w/ relatively few people can pick up effect of smoking

70
Q

Stratification to deal w/ confounders

A

○ Not as powerful statistically
○ Instead of waiting for one-to-one matches, just do the study any way you want and then divide it into strata at the end
○ Can make batches (such as those who don’t have coal mining as occupation, and comparing smoking and non-smoking)
○ Ratio of death holds up because looking at them in different batches (one batch could be coal-mining and another batch could be non-coal-mining)

71
Q

Regression to deal w/ confounders

A

○ Can do a regression equation that predicts risk w/ and w/o the confounding variable
○ Can get the statistical significance of things

72
Q

Case-control study

A

Assemble data based on outcomes; start w/ batch of cases and batch of controls and ask them if the suspected cause was present

73
Q

Odds ratio

A

Odds ratio = (AD)/(BC)
where A = exposed cases, B = exposed controls, C = unexposed cases, D = unexposed controls

74
Q

Population attributable risk (case-control studies)

A

p (odds ratio - 1)
_____________
p (odds ratio - 1) + 1

where p = b/(b+d) and is an estimate of the proportion of the population exposed
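
The odds ratio and the case-control population attributable risk can be sketched together. The cell labels are an assumption inferred from p = b/(b+d): a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls. The counts are made up.

```python
def odds_ratio(a, b, c, d):
    """Cross-product ratio of a 2x2 case-control table."""
    return (a * d) / (b * c)


def population_attributable_risk(a, b, c, d):
    """PAR estimated from case-control data, with exposure prevalence taken from controls."""
    p = b / (b + d)  # estimated proportion of the population exposed
    excess = p * (odds_ratio(a, b, c, d) - 1)
    return excess / (excess + 1)


# Hypothetical study: 100 cases (40 exposed) and 100 controls (20 exposed).
a, b, c, d = 40, 20, 60, 80

print(f"Odds ratio: {odds_ratio(a, b, c, d):.2f}")
print(f"Population attributable risk: {population_attributable_risk(a, b, c, d):.0%}")
```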

75
Q

Compare case-control studies and cohort studies

A

Case-control studies: done w/ smaller samples, relatively cheap, relatively fast to complete, better for rare diseases, subject to more bias in exposure info, subject to incomplete data due to lack of recall or record, able to provide data on the odds ratio as an estimate of relative risk, but no incidence

Cohort studies: done w/ larger samples, relatively expensive, relatively slow to complete, better for rare exposures, subject to more bias in disease diagnosis, subject to incomplete data due to loss of follow-up, able to provide data on the relative risk, as well as the incidence of disease outcome

76
Q

Cross-sectional study

A

Presence of disease or infectious agent in a given population at a given time. Gives population prevalence. At best, may describe temporal association (at that time). Provides no info about:

  1. risk factors
  2. transmission of disease
  3. duration of disease
  4. outcome of disease
77
Q

Selection bias

A

Choosing which drug you take, based on symptoms (“confounding by indication” is a subset of this) - people with worst symptoms pick most-active drug

78
Q

Sampling bias

A

Who volunteers to be in a study in the first place? Health-aware or needing money/treatment

79
Q

Recall bias

A

People w/ disease might be motivated to remember risk factors

80
Q

Lead-time bias

A

Can detect tiny nodules, so time from detection/diagnosis to death increases, but disease treatment might not necessarily be more effective

81
Q

Surveillance bias

A

Knowing that you are in a study improves your health behaviors/adherence

82
Q

Late-look bias

A

Using clinic patient population, dead people are not included; less severe symptoms/cases/outcomes are over-represented

83
Q

Sample size considerations in clinical trials

A

If sample size is too small (underpowered), you are exposing people to risk w/o reasonable prospect of an informative experiment (and also wasting resources and polluting the literature with a study affected by Type II error, which will delay further progress)

If sample size is too big (excess statistical power), you are exposing more people to risk than is necessary for a definitive study; and if the new treatment is valuable, you are unnecessarily delaying progression to an available, effective treatment

84
Q

Randomization

A

Patients are randomized into treatment groups, so that high-risk and low-risk pts are just as likely to end up in one treatment as the other treatment
Evens out the distribution of risk factors (even unknown risk factors)
Gets rid of conscious or unconscious bias in assignment of treatments

85
Q

Double-blind

A

Neither the patients nor those assessing the results are aware of which treatment was employed in the specific patient

86
Q

Phases of clinical trials

A

Phase I - Human pharmacology
Phase II - Therapeutic exploratory
Phase III - Therapeutic confirmatory
Phase IV - Therapeutic Use

87
Q

Cross-over design

A

Instead of being on either drug A or drug B, each person takes drug A and then drug B or vice versa

88
Q

NNT

A

“Number needed to treat”
How many ppl do you treat before you expect to effect one additional positive outcome?
To compute NNT, need to subtract the rate in the treatment group from the rate in the control group and then invert it (divide the difference into 1)
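
The NNT arithmetic described above, as a short sketch. The event rates are hypothetical, and the rates are assumed to describe a bad outcome that treatment reduces.

```python
def number_needed_to_treat(control_event_rate, treatment_event_rate):
    """Invert the difference between the control and treatment event rates."""
    risk_difference = control_event_rate - treatment_event_rate
    if risk_difference <= 0:
        raise ValueError("treatment does not reduce the event rate")
    return 1 / risk_difference


# Hypothetical trial: 20% of controls and 15% of treated patients have the event.
nnt = number_needed_to_treat(control_event_rate=0.20, treatment_event_rate=0.15)
print(f"NNT: about {round(nnt)} patients")
```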

89
Q

When do you reject the null hypothesis?

A

When p < α

90
Q

Type I error

A

Also called α error

Declaring that there is something significant when there isn’t

91
Q

Type II error

A

Also called β error

Failing to detect that alternate hypothesis is true

92
Q

Statistical power

A

Ability to detect

= 1 - (β error rate)

93
Q

How does statistical power vary?

A

Increases with increased sample size
Decreases with increased standard deviation (bc more difficult to distinguish a difference b/w groups when there is increased inherent variability)
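
One way to see both effects is a rough power calculation for comparing two group means. This sketch assumes a two-sided z-test with equal group sizes and a known standard deviation, and it ignores the tiny contribution of the opposite tail; the function name and inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approximate_power(difference, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sample z-test to detect a true mean difference."""
    standard_normal = NormalDist()
    critical_z = standard_normal.inv_cdf(1 - alpha / 2)
    standard_error = sd * sqrt(2 / n_per_group)
    return standard_normal.cdf(abs(difference) / standard_error - critical_z)


baseline = approximate_power(difference=5, sd=10, n_per_group=50)
larger_sample = approximate_power(difference=5, sd=10, n_per_group=100)
noisier_data = approximate_power(difference=5, sd=20, n_per_group=50)

print(f"n=50,  SD=10: power = {baseline:.2f}")
print(f"n=100, SD=10: power = {larger_sample:.2f}")
print(f"n=50,  SD=20: power = {noisier_data:.2f}")
```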

94
Q

Nominal measurement scale

A

Categorical data w/ no order

E.g. what is your gender? Male/female

95
Q

Ordinal measurement scale

A

Ranked or scaled data

A 4 doesn’t necessarily mean twice as bad as 2

96
Q

Numerical quantitative

A

Underlying continuous distribution

Weight, height, cholesterol

97
Q

Numerical discrete

A

Number of office visits, number of dental caries

can’t have 1/2 office visit

98
Q

Median

A

(N+1) x 0.50 = position of the 50% value in an ordered array

This is a common percentile value

99
Q

Range

A

Difference b/w the largest and smallest values

100
Q

Interquartile range

A

Difference b/w the 75% and 25% values
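
The position rule from the median card, (N+1) x percentile, can be applied to the 25% and 75% values as well. This is a sketch with made-up data; `value_at_percentile` is a hypothetical helper, and other software may use slightly different quartile conventions.

```python
def value_at_percentile(values, p):
    """Value at position (N + 1) * p in the ordered data, interpolating between neighbors."""
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) * p  # 1-based position in the ordered array
    if position <= 1:
        return ordered[0]
    if position >= n:
        return ordered[-1]
    lower = int(position)        # whole part of the position
    fraction = position - lower  # how far to move toward the next value
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])


data = [12, 4, 9, 2, 15, 7, 4, 10, 5]  # hypothetical measurements, N = 9

median = value_at_percentile(data, 0.50)
q1 = value_at_percentile(data, 0.25)
q3 = value_at_percentile(data, 0.75)
data_range = max(data) - min(data)
iqr = q3 - q1

print(f"Median: {median}, range: {data_range}, interquartile range: {iqr}")
```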

101
Q

Standard deviation

A

Measurement of the deviation of each data point from the arithmetic mean of that set of data

102
Q

When are the mean, median, and mode the same?

A

When data is unimodal and symmetrical

103
Q

Right skew

A

Positive skew

Mean is larger than median

104
Q

Left skew

A

Negative skew

Median is larger than mean

105
Q

Choice of summary measures

A

Normally distributed data: Mean and SD

Skewed data: Median and interquartile range

106
Q

Choice of statistical tests

A

Normally distributed data: parametric tests

Skewed data: non-parametric tests

107
Q

Box and whiskers plot

A

Bottom of box is 25%
Top of box is 75%
Middle line is 50%
Interquartile range: top value (75%) minus the bottom value (25%) of the box
Whiskers extend from the ends of the box to the outermost data points that fall within the distance computed: quartile value +/- 1.5 x (interquartile range)
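
The whisker rule can be sketched with Python's standard library. The data are hypothetical, and `statistics.quantiles` uses one of several common quartile conventions, so other tools may give slightly different box edges.

```python
from statistics import quantiles

data = [2, 4, 4, 5, 7, 9, 10, 12, 15, 40]  # hypothetical values; 40 sits far from the rest

q1, median, q3 = quantiles(data, n=4)  # 25%, 50%, and 75% values
iqr = q3 - q1

# Limits for the whiskers: quartile value +/- 1.5 x (interquartile range).
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

# Whiskers stop at the outermost data points that fall within those limits.
inside = [x for x in data if lower_limit <= x <= upper_limit]
whisker_low, whisker_high = min(inside), max(inside)
outliers = [x for x in data if x < lower_limit or x > upper_limit]

print(f"Box: {q1} to {q3}, median {median}")
print(f"Whiskers: {whisker_low} to {whisker_high}, outliers: {outliers}")
```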

108
Q

Two types of box plots

A

Quantile (all % values) and outlier (w/ outliers)

109
Q

Sensitivity

A

P(T+ / D+)

Proportion of people who test positive given that disease positive

110
Q

Specificity

A

P(T- / D-)

Proportion of people who test negative given that disease negative

111
Q

Predictive value of a positive test

A

P(D+ / T+)
Proportion of people who are classified as positive by the screening test for a particular disease and who in fact have the disease

112
Q

Predictive value of a negative test

A

P(D- / T-)
Proportion of people who are classified as negative by the screening test for a particular disease and who in fact don’t have the disease
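
The four measures on the preceding cards all come from the same 2x2 table of test result against true disease status. A minimal sketch, with hypothetical counts and a hypothetical function name:

```python
def screening_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, and predictive values from 2x2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # P(T+ given D+)
        "specificity": tn / (tn + fp),  # P(T- given D-)
        "ppv": tp / (tp + fp),          # P(D+ given T+)
        "npv": tn / (tn + fn),          # P(D- given T-)
    }


# Hypothetical screening of 1,000 people, 100 of whom have the disease.
metrics = screening_metrics(tp=90, fp=50, fn=10, tn=850)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity condition on disease status, while the predictive values condition on the test result, so only the latter shift when the number of diseased people in the table changes.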

113
Q

What determines if a test is reliable?

A

Prevalence of disease in population

114
Q

Likelihood ratio for positive results from a test

A

LR positive = TPR/FPR = Sensitivity/(1-Specificity)
TPR = true positive rate
FPR = false positive rate

115
Q

Likelihood ratio for negative results from a test

A

LR negative = FNR/TNR = (1-Sensitivity)/Specificity
FNR = false negative rate
TNR = true negative rate
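
Both likelihood ratios follow directly from sensitivity and specificity. A short sketch, with hypothetical test characteristics:

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR positive, LR negative) for a diagnostic test."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


# Hypothetical test: 90% sensitive and 80% specific.
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.90, specificity=0.80)
print(f"LR positive: {lr_pos:.2f}, LR negative: {lr_neg:.3f}")
```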

116
Q

What does an ROC plot tell you?

A

Which diagnostic test offers the best trade-off b/w true and false positives
Describes test performance in terms of area under the ROC plot
Allows comparison of different tests’ performance across a range of potential positivity criteria

117
Q

In an ROC plot, which curve is better?

A

Upper, leftmost curve