Epi Methods 753 Flashcards

1
Q

2 categories where 1 is reference group (typically “unexposed”)

A

Dichotomous Variable

2
Q

Parameterization of variable into discrete categories

A

Categorical Variable

3
Q

Categorical variable that assigns 0 or 1

A

Binary Variable

4
Q

Categorical variable that doesn’t have ordering/order not of interest; collection of k-1 binary indicator variables

A

Nominal Variable
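
A minimal Python sketch of the k-1 indicator coding described on this card. The function name `indicator_variables` and the blood-type data are illustrative assumptions, not part of the deck.

```python
def indicator_variables(values, reference):
    """Code a nominal variable as k-1 binary indicators, omitting the reference level."""
    levels = sorted(set(values) - {reference})
    return {level: [1 if v == level else 0 for v in values] for level in levels}


# Hypothetical data: 4 categories, so 3 indicator columns with "O" as reference.
blood_type = ["A", "B", "O", "AB", "O"]
dummies = indicator_variables(blood_type, reference="O")
for level, column in dummies.items():
    print(level, column)
```

Records in the reference category are 0 on every indicator, which is why only k-1 columns are needed.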

5
Q

Categorical variable that has ordering/order of interest; collection of binary variables assigned score; step between categories constrained to be equal

A

Ordinal Variable

6
Q

Tests Ho that B=0, where B is coefficient for category score variable; if p<0.05, estimated step from one category to next differs significantly from 0

A

Mantel Test for Trend

7
Q

Variable can take any value between lower & upper limit

A

Continuous Variable

8
Q

Divide continuous variable by factor; coefficient of variable affected

A

Rescaling

9
Q

Subtract constant (e.g. mean) from continuous variable; intercept affected

A

Centering
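
The rescaling and centering cards can be checked numerically. The sketch below fits a one-predictor least-squares line three ways; the data (age and systolic blood pressure) and the helper `fit_line` are hypothetical and chosen to be exactly linear so the arithmetic is easy to follow.

```python
def fit_line(x, y):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical, exactly linear data: sbp = 86 + 0.8 * age.
age = [30, 40, 50, 60, 70]
sbp = [110, 118, 126, 134, 142]

raw = fit_line(age, sbp)                          # slope per 1 year of age
rescaled = fit_line([a / 10 for a in age], sbp)   # slope per 10 years of age
centered = fit_line([a - 50 for a in age], sbp)   # intercept = mean SBP at age 50

print("raw:", raw)
print("rescaled:", rescaled)
print("centered:", centered)
```

Dividing age by 10 multiplies the slope by 10 and leaves the intercept alone; subtracting 50 leaves the slope alone and moves the intercept to the expected outcome at age 50.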

10
Q

Closely related to counterfactual; compare observed outcome to non-observed (counterfactual) outcome; estimate measures of causal effect by measures of association assuming exchangeability (differences due to confounding)

A

Potential Outcomes

11
Q

Observed outcomes in unexposed are good stand-in for unobserved potential outcomes for exposed persons under no exposure & vice versa; not testable but met in expectation with randomization

A

Exchangeability Assumption

12
Q

Comparison of pre-treatment covariates in exposed & unexposed groups; comparability doesn’t guarantee assumption met

A

Exchangeability Assessment

13
Q

Relax exchangeability assumption to be conditional on covariates; assumes no unmeasured confounders

A

Conditional Exchangeability

14
Q

Used to assess how average value of continuous outcome varies systematically with X’s; E[Y] = B0+B1X1+…; B1=average difference (cross-sectional) or change (longitudinal) in Y per 1-unit X1

A

Linear Regression

15
Q

Used to assess how log odds binary outcome varies systematically with X’s; log(odds Y)=B0+B1X1+…; B1=difference in log(odds Y) per 1-unit X1; PrOR for cross-sectional or ROR for longitudinal

A

Logistic Regression

16
Q

Risk or prevalence > 10%

A

OR Overestimates RR or PrR (OR farther from null)
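
A small worked comparison of the OR and RR at low and high baseline risk. The 2x2 counts are made up for illustration, and the cell labels (a, b, c, d) are an assumed convention.

```python
def risk_ratio_and_odds_ratio(a, b, c, d):
    """a, b = exposed cases, non-cases; c, d = unexposed cases, non-cases."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    risk_ratio = risk_exposed / risk_unexposed
    odds_ratio = (a * d) / (b * c)
    return risk_ratio, odds_ratio


# Hypothetical tables with the same RR of 2 but different outcome frequency.
rare = risk_ratio_and_odds_ratio(2, 98, 1, 99)       # risks of 2% and 1%
common = risk_ratio_and_odds_ratio(40, 60, 20, 80)   # risks of 40% and 20%

print("rare outcome   RR=%.2f OR=%.2f" % rare)
print("common outcome RR=%.2f OR=%.2f" % common)
```

With the rare outcome the OR stays close to the RR; with the common outcome the OR moves noticeably farther from 1.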

17
Q

Used to assess how log probability binary outcome varies systematically with X’s; log(Pr(Y=1))=B0+B1X1+…; B1=difference in log(prob Y) per 1-unit X1; PrR for cross-sectional or RR for longitudinal

A

Log-Binomial Regression

18
Q

Path from E to O that starts with E & all arrows point in same direction

A

Causal Path

19
Q

Any other path from E to O; unconditionally open backdoor paths are confounded vs. unconditionally closed backdoor paths are blocked at collider

A

Non-Causal Path

20
Q

Covariate set that leaves all causal paths open & non-causal paths closed vs. does this without any extra variables

A

Sufficient vs. Minimally Sufficient

21
Q

Variables only causally associated with exposure; decreases precision if put into model

A

Instrument

22
Q

Not necessary for confounder control but may increase precision

A

Variables Associated with Outcome

23
Q

Confounding is causal concept but collapsibility is statistical concept; depends on prevalence of outcome & type of measure of association

A

Problems with Collapsibility Definition

24
Q

Stratify table by exposure, do not include outcome, & do not include p-values

A

Causal Inference Table 1

25
Q

Give similar results when number of confounders is small & no confounders are continuous

A

Stratified Analysis vs. Regression

26
Q

Expresses incomplete adjustment of confounding variables due to mismeasurement or misspecification

A

Residual Confounding

27
Q

Prioritize accurate representation & interpretation of exposure but fit for confounders

A

Causal Inference Modeling

28
Q

2 or more risk factors modify effect of each other with regard to occurrence/level of outcome; effect of E on O differs across strata of X; potential outcomes indexed by E only & estimated conditional on X (1 exchangeability assumption)

A

Effect Measure Modifier

29
Q

Risk of O in presence of both E & X differs from what would be expected based on effect of E alone & X alone; potential outcomes indexed by both E & X (2 exchangeability assumptions)

A

Causal Interaction

30
Q

Difference of risk differences (linear), ratio of odds ratios (logistic), or ratio of risk ratios (log-binomial)

A

Coefficient of Product Term
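
To see how the product-term contrast depends on the scale, the sketch below computes all three contrasts from one set of four stratum risks. It assumes a saturated model with binary E and X (so the model reproduces the stratum risks exactly); the risks and the function name are hypothetical.

```python
def interaction_contrasts(r00, r10, r01, r11):
    """Risks indexed as r<E><X>; returns the three product-term contrasts."""
    def odds(risk):
        return risk / (1 - risk)

    difference_of_rds = (r11 - r01) - (r10 - r00)                     # linear scale
    ratio_of_rrs = (r11 / r01) / (r10 / r00)                          # log-binomial scale
    ratio_of_ors = (odds(r11) / odds(r01)) / (odds(r10) / odds(r00))  # logistic scale
    return difference_of_rds, ratio_of_rrs, ratio_of_ors


# Hypothetical risks: E doubles the risk in both strata of X.
dd, rrr, ror = interaction_contrasts(r00=0.10, r10=0.20, r01=0.30, r11=0.60)
print("difference of RDs: %.3f" % dd)
print("ratio of RRs:      %.3f" % rrr)
print("ratio of ORs:      %.3f" % ror)
```

Here the ratio of RRs is 1 (no multiplicative modification on the risk scale) while the difference of RDs and the ratio of ORs are both away from their null values, so the answer depends on the measure chosen.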

31
Q

Difference of risk differences expressed as proportion of reference risk (R00); RR11-RR10-RR01+1

A

Relative Excess Risk due to Interaction (RERI)
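
A short check that the two ways of writing RERI agree: from risk ratios (RR11 - RR10 - RR01 + 1) and from risks divided by the reference risk. The numbers and function names are hypothetical.

```python
def reri_from_risk_ratios(rr11, rr10, rr01):
    """RERI using risk ratios that all share the doubly unexposed reference."""
    return rr11 - rr10 - rr01 + 1


def reri_from_risks(r00, r10, r01, r11):
    """Same quantity: difference of risk differences divided by reference risk."""
    return (r11 - r10 - r01 + r00) / r00


# Hypothetical risks 0.10, 0.20, 0.30, 0.60 give RR10 = 2, RR01 = 3, RR11 = 6.
reri_a = reri_from_risk_ratios(rr11=6.0, rr10=2.0, rr01=3.0)
reri_b = reri_from_risks(r00=0.10, r10=0.20, r01=0.30, r11=0.60)
print(reri_a, round(reri_b, 6))
```

A RERI above 0 suggests more-than-additive joint effects; 0 corresponds to exact additivity.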

32
Q

P-value of Wald test for interaction coefficient; LRT (or F-test for linear models); underpowered & likely to return false negatives

A

Test of Homogeneity

33
Q

Used to identify potential associations with outcome; hypothesis generating; non-causal, potential for multiple comparisons, different across studies

A

Risk Factor Analysis

34
Q

Stratify table by outcome, no p-values

A

Risk Factor Analysis Table 1

35
Q

Prioritize interpretability & model fit of all covariates

A

Risk Factor Analysis Modeling

36
Q

Assign predicted probability of condition based on baseline characteristics; use logistic regression then convert to probability

A

Prediction Model
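
The "convert to probability" step is the inverse logit. A minimal sketch follows; the intercept and coefficients are invented for illustration and do not come from any fitted model.

```python
import math


def predicted_probability(intercept, coefficients, covariates):
    """Inverse logit of the linear predictor: p = 1 / (1 + exp(-log_odds))."""
    log_odds = intercept + sum(b * x for b, x in zip(coefficients, covariates))
    return 1 / (1 + math.exp(-log_odds))


# Hypothetical model: log odds = -3.0 + 0.05 * age + 0.7 * smoker.
p_patient = predicted_probability(-3.0, [0.05, 0.7], [60, 1])

# Sanity check: odds of 0.25 correspond to a probability of 0.25 / 1.25 = 0.2.
p_quarter_odds = predicted_probability(math.log(0.25), [], [])

print(round(p_patient, 3), round(p_quarter_odds, 3))
```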

37
Q

Use baseline characteristics to predict current disease stage; useful if gold standard is invasive or expensive; assessed against gold standard

A

Diagnostic Model

38
Q

Use baseline characteristics to predict future disease state; assessed by future outcomes data

A

Prognostic Model

39
Q

Degree of closeness of measured/predicted quantity to actual/gold standard value

A

Accuracy

40
Q

Accuracy of output from prediction model applied to data used to develop model; calibration & discrimination

A

Model Accuracy

41
Q

Accuracy of output from prediction model applied to data not used to develop model; calibration & discrimination

A

Model Predictive Accuracy

42
Q

Ability to correctly estimate disease state or risk/probability of future event

A

Calibration

43
Q

Ability to separate persons with/without disease or various disease states

A

Discrimination

44
Q

Prioritize model fit & parsimony

A

Prediction Modeling

45
Q

Continuous outcome; R2 closer to 1 vs. intercept = 0 & slope = 1

A

Good Discrimination vs. Calibration

46
Q

Plot sensitivity vs. 1-specificity for binary outcome; each point corresponds to different cutoff of what defines “positive test”

A

ROC Curve
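
A sketch of how the points on an ROC curve are generated: each distinct predicted probability is tried as the cutoff for a "positive test". The five predictions and outcomes are hypothetical.

```python
def roc_points(predicted, outcome):
    """Return (cutoff, sensitivity, 1 - specificity), one tuple per cutoff."""
    n_cases = sum(outcome)
    n_noncases = len(outcome) - n_cases
    points = []
    for cutoff in sorted(set(predicted), reverse=True):
        # "Positive test" means predicted probability at or above the cutoff.
        true_pos = sum(1 for p, y in zip(predicted, outcome) if p >= cutoff and y == 1)
        false_pos = sum(1 for p, y in zip(predicted, outcome) if p >= cutoff and y == 0)
        points.append((cutoff, true_pos / n_cases, false_pos / n_noncases))
    return points


# Hypothetical predicted probabilities and observed outcomes (1 = disease).
predicted = [0.9, 0.8, 0.4, 0.3, 0.2]
outcome = [1, 0, 1, 0, 0]
points = roc_points(predicted, outcome)
for cutoff, sensitivity, false_positive_rate in points:
    print(cutoff, round(sensitivity, 2), round(false_positive_rate, 2))
```

Lowering the cutoff can only raise sensitivity and 1-specificity, which is why the curve runs from the lower left to the upper right.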

47
Q

Area under ROC curve; if one person with & one without disease were randomly selected, probability that person with disease has higher predicted probability

A

C-Statistic
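
The pairwise definition on this card can be computed directly. The sketch below counts concordant case/non-case pairs, scoring ties as one half (a common convention, assumed here); the data are hypothetical.

```python
def c_statistic(predicted, outcome):
    """Share of case/non-case pairs in which the case has the higher prediction."""
    cases = [p for p, y in zip(predicted, outcome) if y == 1]
    noncases = [p for p, y in zip(predicted, outcome) if y == 0]
    concordant = 0.0
    for p_case in cases:
        for p_noncase in noncases:
            if p_case > p_noncase:
                concordant += 1.0
            elif p_case == p_noncase:
                concordant += 0.5   # assumed convention: ties count as half
    return concordant / (len(cases) * len(noncases))


# Hypothetical predictions: 2 cases, 3 non-cases, so 6 pairs; 5 are concordant.
c_stat = c_statistic([0.9, 0.8, 0.4, 0.3, 0.2], [1, 0, 1, 0, 0])
print(round(c_stat, 3))
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect separation.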

48
Q

Measure of calibration for binary outcomes; measures closeness of distributions of observed & predicted values; tests Ho that observed=expected, p<0.05 indicates poor fit

A

Hosmer-Lemeshow X2 Goodness-of-Fit Test
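
A rough sketch of the Hosmer-Lemeshow statistic itself: sort by predicted probability, split into groups, and sum (observed - expected)^2 / expected over events and non-events in each group. The grouping rule (equal-size groups by rank) and the toy data are assumptions; the p-value step (chi-square, conventionally with groups - 2 degrees of freedom) is left out because it needs a chi-square distribution function.

```python
def hosmer_lemeshow_statistic(predicted, outcome, groups=10):
    """Sum of (O - E)^2 / E over events and non-events within risk groups."""
    pairs = sorted(zip(predicted, outcome), key=lambda pair: pair[0])
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        observed_events = sum(y for _, y in chunk)
        expected_events = sum(p for p, _ in chunk)
        observed_nonevents = len(chunk) - observed_events
        expected_nonevents = len(chunk) - expected_events
        # Assumes expected counts are strictly between 0 and the group size.
        statistic += (observed_events - expected_events) ** 2 / expected_events
        statistic += (observed_nonevents - expected_nonevents) ** 2 / expected_nonevents
    return statistic


# Hypothetical toy data: predictions of 0.5 with half vs. all persons having events.
stat_good = hosmer_lemeshow_statistic([0.5] * 4, [1, 0, 1, 0], groups=2)
stat_poor = hosmer_lemeshow_statistic([0.5] * 4, [1, 1, 1, 1], groups=1)
print(stat_good, stat_poor)
```

A larger statistic means observed and expected counts are farther apart, i.e. poorer calibration.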

49
Q

Stratify by dataset, no p-values

A

Prediction Table 1

50
Q

Describes models whose output reflects statistical “noise” in particular dataset rather than underlying, stable relationships that may be reproducible

A

Overfitting

51
Q

Correlation between predictors high enough to degrade precision of regression coefficient estimates substantially for some/all correlated predictors; do not tolerate VIF > 10

A

Collinearity
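
For the special case of two predictors, the variance inflation factor reduces to 1 / (1 - r^2), where r is their Pearson correlation. The sketch below uses that two-predictor shortcut only (with more predictors, each one is regressed on all the others); the data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF = 1 / (1 - r^2); valid as written only when there are two predictors."""
    r = pearson_r(x1, x2)
    return 1 / (1 - r ** 2)


# Hypothetical predictors with r^2 = 0.6, so VIF = 2.5.
vif = vif_two_predictors([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(vif, 3))
```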

52
Q

Measurement of predictive accuracy; discrimination/calibration of model on data not used to derive model

A

Validation

53
Q

Predictive accuracy measured within same population (training set vs. validation set); e.g. split sample or k-fold cross-validation

A

Internal Validation
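
A bare-bones sketch of how cross-validation with k folds partitions one dataset into training and validation sets. The function name is hypothetical, and records are assigned to folds by position without shuffling, which is a simplifying assumption (real analyses usually randomize first).

```python
def k_fold_splits(n_records, k):
    """Return k (training, validation) index pairs; each record is validated once."""
    folds = [list(range(start, n_records, k)) for start in range(k)]
    splits = []
    for i, validation in enumerate(folds):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((training, validation))
    return splits


splits = k_fold_splits(n_records=10, k=5)
for training, validation in splits:
    print("train on", training, "validate on", validation)
```

The model is refit on each training set and its calibration and discrimination are measured on the matching validation set, then averaged across folds.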

54
Q

Predictive accuracy measured within different population

A

External Validation

55
Q

2-state, non-recurrent event

A

Outcome for Survival Analysis

56
Q

Reflects beginning of time individuals biologically & methodologically at risk; elapsed time measured from this point (aligns individuals)

A

Time Origin

57
Q

Yardstick by which time is measured; controls for that measurement of time

A

Time Metric

58
Q

Time at beginning of individual’s observation in study

A

Entry Time

59
Q

Time origin < study entry; assume individuals representative of all other participants & those who don’t enter at all

A

Late Entry

60
Q

Time during which study outcome cannot occur because individual not under observation; downwardly biased outcome rate & upwardly biased survival curve

A

Immortal Person-Time

61
Q

Exclusion of prevalent cases

A

Left Censoring

62
Q

Individual did not experience outcome under follow-up & can’t be further observed (no longer methodologically at risk); administrative censoring, LTFU, or competing risk; assumed to be non-informative

A

Right Censoring

63
Q

Assumption that risk of outcome at any given moment of follow-up is similar across individuals

A

Equivalence of Person-Time at Risk

64
Q

Group of individuals aligned by time origin & at risk for event at time t; used for comparisons in survival analysis; assembled at each time of event (continuous) or period (discrete)

A

Risk Set

65
Q

Instantaneous rate of event at time point among those who survive without event to that time point; estimated using p(t)/width

A

Continuous Time Hazard

66
Q

Conditional probability of event in time period among those who survive without event to that time period; # events/#at risk; determines whether risk is increasing, decreasing, or constant

A

Discrete Time Hazard

67
Q

Cumulative probability of surviving beyond time j; S(tj-1)(1-h(tj)) or S(tj-1)(1-p(tj)); plot using Kaplan-Meier

A

Survival

68
Q

Cumulative probability of having event at or before time j; complement of survival function; plot using Kaplan-Meier

A

Cumulative Incidence

69
Q

Cumulation of hazard between t0 & tj for individual; shape represents behavior of hazard function in continuous time; estimated using Kaplan-Meier –> plot -ln(S(tij))

A

Cumulative Hazard
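
The hazard, survival, cumulative incidence, and cumulative hazard cards chain together in a simple life table. The sketch below assumes discrete periods, no competing risks, and made-up counts; `life_table` is a hypothetical helper.

```python
import math


def life_table(at_risk, events):
    """Per period: h = events / at risk, S(tj) = S(tj-1) * (1 - h(tj))."""
    rows = []
    survival = 1.0
    for period, (n, d) in enumerate(zip(at_risk, events), start=1):
        hazard = d / n
        survival *= 1 - hazard
        rows.append({
            "period": period,
            "hazard": hazard,
            "survival": survival,
            "cumulative_incidence": 1 - survival,
            "cumulative_hazard": -math.log(survival),
        })
    return rows


# Hypothetical risk sets; the drop in number at risk includes censoring.
table = life_table(at_risk=[100, 80, 60], events=[10, 8, 6])
for row in table:
    print({key: round(value, 4) for key, value in row.items()})
```

With a constant hazard of 0.1 per period, survival falls to 0.9, 0.81, and 0.729, and the cumulative hazard rises in equal steps.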

70
Q

One record for each person-period when individual at risk (often multiple rows of data per person); define late entries, exclude person-time prior to study entry or after study exit, & identify gaps

A

Discrete Time Data Setup
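
A sketch of expanding one record per person into one record per person-period at risk. It assumes integer periods, with a person observed in periods entry+1 through exit; the field names and example persons are hypothetical.

```python
def person_period_rows(person_id, entry, exit, event):
    """One row per period at risk; event = 1 only in the final period if it occurred."""
    rows = []
    for period in range(entry + 1, exit + 1):
        rows.append({
            "id": person_id,
            "period": period,
            "event": int(event and period == exit),
        })
    return rows


# Person 1 enters at the time origin and has the event in period 3.
on_time = person_period_rows(person_id=1, entry=0, exit=3, event=True)
# Person 2 is a late entry (first observed in period 3) and is censored after period 4.
late = person_period_rows(person_id=2, entry=2, exit=4, event=False)

for row in on_time + late:
    print(row)
```

The late entrant contributes no rows for periods 1 and 2, which keeps person-time before study entry out of the risk sets.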

71
Q

Models discrete-time hazard function for truly discrete hazard; log hazard odds=[a1D1+…]+BXi; aj=log hazard odds for time period j when X’s=0 (estimates hazard in each time period); B=log hazard OR in exposed vs. unexposed

A

Pooled Logistic Regression

72
Q

Truly discrete hazard (hazard is conditional probability & constant within each time period) & proportional hazard odds (hazard OR constant across periods)

A

Assumptions of Pooled Logistic Regression

73
Q

Models discrete-time hazard function for underlying continuous event processes; ln(-ln(1-h(tij|Xij)))=[a1D1+…]+BX1; B=log HR outcome in exposed vs. unexposed

A

Discrete Time Proportional Hazards Regression (cloglog)
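
A numeric illustration of why the coefficient is a log hazard odds ratio under the logit link but a log hazard ratio under the complementary log-log link. The period intercept (alpha) and coefficient (beta) are arbitrary values chosen for illustration.

```python
import math


def hazard_from_logit(eta):
    """Invert log(h / (1 - h)) = eta."""
    return 1 / (1 + math.exp(-eta))


def hazard_from_cloglog(eta):
    """Invert ln(-ln(1 - h)) = eta."""
    return 1 - math.exp(-math.exp(eta))


alpha = -2.0          # hypothetical intercept for one time period
beta = math.log(2)    # hypothetical coefficient for exposure

# Logit link: exp(beta) is the ratio of hazard odds.
h0, h1 = hazard_from_logit(alpha), hazard_from_logit(alpha + beta)
hazard_odds_ratio = (h1 / (1 - h1)) / (h0 / (1 - h0))

# Cloglog link: exp(beta) is the ratio of -ln(1 - h), the within-period
# cumulative hazard of the underlying continuous-time process.
g0, g1 = hazard_from_cloglog(alpha), hazard_from_cloglog(alpha + beta)
hazard_ratio = math.log(1 - g1) / math.log(1 - g0)

print(round(hazard_odds_ratio, 6), round(hazard_ratio, 6))
```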

74
Q

Continuous-time hazard & proportional hazards

A

Assumptions of Discrete Time Proportional Hazards Regression

75
Q

One record for each individual (can be multiple if time-varying covariates); define late entries & exclude person-time prior to study entry or after study exit

A

Continuous Time Data Setup

76
Q

Models continuous-time hazard function; log(h(t))=log(h0(t))+B1X1+…; B1=log HR outcome in exposed vs. unexposed; semi-parametric; sensitive to ties

A

Cox Proportional Hazards Regression

77
Q

Shape of hazard allowed to vary & proportional hazards

A

Assumptions of Cox Proportional Hazards Regression

78
Q

Constant ratio between curves for plot of H(t) vs. time or parallel lines for plot of ln(H(t)) vs. time; horizontal line or correlation of 0 for plot of Schoenfeld residuals vs. time

A

Assessing Proportional Hazards Assumption

79
Q

Tests Ho of no difference between survival functions; p<0.05 indicates survival differs in at least 1 group

A

Log-Rank Test

80
Q

Unit of analysis is time period in which variable is constant; include additional rows of data for each transition time

A

Time-Varying Covariates

81
Q

Conditional logistic regression to calculate matched OR (discordant pairs); rare disease assumption met & OR is valid estimate of HR (representative subsample & cohort is reasonable size)

A

Analysis for Nested Case-Control Studies

82
Q

Cox proportional hazards regression with late entries for cases outside subcohort; rare disease assumption met (cohort reasonable size & few ties)

A

Analysis for Case-Cohort Studies

83
Q

Used to assess how log incidence rate of count outcome varies systematically with X’s; log(IRk)=uj+B0+B1X1+… or log(A)=uj+B0+B1X1+…+log(T); B1=difference in log(IR Y) per 1-unit X1 (same as log IRR)

A

Poisson Regression

84
Q

Equivalent to hazard when hazard is constant or average hazard when hazard isn’t constant

A

Incidence Rate

85
Q

Communication, no meaningful time origin, no multi-level data, or outcome is count

A

Reasons to Estimate IR

86
Q

Offset

A

ln(person-time) in Poisson Regression
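
A worked example of the offset: adding log(T) to the linear predictor turns a model for the log rate into a model for the log count. The event counts and person-time are hypothetical; with a single binary exposure the fitted rates equal the crude rates, so no model fitting is needed to see the arithmetic.

```python
import math


def incidence_rate(events, person_time):
    """Events per unit of person-time."""
    return events / person_time


# Hypothetical data: 30 events in 1000 person-years exposed, 15 in 1500 unexposed.
ir_exposed = incidence_rate(30, 1000.0)
ir_unexposed = incidence_rate(15, 1500.0)
irr = ir_exposed / ir_unexposed

b0 = math.log(ir_unexposed)   # log rate when X = 0
b1 = math.log(irr)            # log incidence rate ratio

# Expected count for the exposed bin: exp(B0 + B1 * X + log(T)).
expected_exposed_events = math.exp(b0 + b1 * 1 + math.log(1000.0))

print(round(irr, 3), round(expected_exposed_events, 3))
```

Because the offset's coefficient is fixed at 1, the remaining coefficients keep their interpretation as log rates and log rate ratios regardless of how much person-time each row contributes.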

87
Q

Each row corresponds to one bin of person-time; each row needs covariate(s) values, # events, & amount of person-time

A

Poisson Data Setup

88
Q

Constant multiplicative effect, constant average hazard, mean=variance

A

Assumptions of Poisson Regression

89
Q

Used to assess how log incidence rate of count outcome varies systematically with X’s; relaxes mean=variance assumption using dispersion parameter (a); log(IRk)=uj+B0+B1X1+… or log(A)=uj+B0+B1X1+…+log(T); B1=difference in log(IR Y) per 1-unit X1 (same as log IRR)

A

Negative Binomial Regression

90
Q

LRT with Ho: a=0; if p<0.05 then use NB

A

Evaluating Overdispersion

91
Q

Variance>mean in dataset where outcome assumed to be Poisson distributed; may occur if confounder not included in model or outcomes correlated across time bins; can produce underestimated SE & overestimated test statistics

A

Overdispersion
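
A crude screen for overdispersion is to compare the sample variance of the counts with their mean; under a Poisson distribution the ratio should be near 1. This is only a rough unadjusted check on invented data, not the likelihood ratio test from the previous card.

```python
def variance_to_mean_ratio(counts):
    """Sample variance divided by sample mean; about 1 if counts are Poisson."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return variance / mean


# Hypothetical event counts per person-time bin: many zeros plus a few large values.
counts = [0, 1, 0, 2, 9, 0, 1, 12, 0, 5]
ratio = variance_to_mean_ratio(counts)
print(round(ratio, 2))
```

A ratio well above 1, as here, is the pattern that motivates comparing Poisson with negative binomial regression.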