General Epi Flashcards

1
Q

What is Berkson’s Bias?

A

A selection bias in hospital-based case-control studies: each condition carries its own risk of hospitalization, so patients with more than one condition are over-represented among inpatients, which can create a spurious association between the conditions.

2
Q

repeated measures

A

when subjects are measured at multiple time points

3
Q

cluster randomized trials

A

Trials that assign interventions to groups of people (for example, schools, classrooms, cities, clinics, or communities) rather than to individual subjects

4
Q

If a statistical test assuming observations are independent is used with correlated observations in analyzing within-subject or within-cluster effects, what happens to the p-values?

A

The p-values are overestimated, which decreases statistical power and increases the type II error rate
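A minimal sketch of why this happens, using made-up before/after measurements on five subjects. The data and the function names `paired_t` and `unpaired_t` are illustrative assumptions. The paired statistic works on within-subject differences; the unpaired statistic wrongly treats the two columns as independent samples.

```python
import math
from statistics import mean, variance

def paired_t(x, y):
    """t statistic from within-subject differences (respects the correlation)."""
    diffs = [b - a for a, b in zip(x, y)]
    se = math.sqrt(variance(diffs) / len(diffs))
    return mean(diffs) / se

def unpaired_t(x, y):
    """Two-sample t statistic with unpooled variances (wrongly assumes independence here)."""
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(y) - mean(x)) / se

# Hypothetical data: the same five subjects measured twice.
before = [10.0, 12.0, 14.0, 16.0, 18.0]
after = [11.0, 13.5, 15.0, 17.5, 19.0]

t_paired = paired_t(before, after)
t_unpaired = unpaired_t(before, after)
print(f"paired t = {t_paired:.2f}, unpaired t = {t_unpaired:.2f}")
```

The unpaired statistic is much smaller because between-subject variability swamps the consistent within-subject change, so the independence-based test reports a larger p-value.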

5
Q

If a statistical test assuming observations are independent is used with correlated observations in analyzing between-subject or between-cluster effects, what happens to the p-values?

A

The p-values are underestimated, which increases apparent statistical power and increases the type I error rate

6
Q

What test?

Continuous or ordinal, non-normally distributed, independent

A

Wilcoxon rank sum test

7
Q

What test?

Continuous or ordinal, non-normally distributed, correlated

A

Wilcoxon signed rank test

8
Q

What tests?

Continuous, normally distributed, independent

A

Two-sample t-test
ANOVA
Linear regression

9
Q

What tests?

Continuous, normally distributed, correlated

A

Paired t-test
Repeated-measures ANOVA or
Mixed models or
hierarchical linear models

10
Q

What tests?

Binary/categorical, independent

A

Chi-squared (χ²) test or
Fisher’s exact test or
Logistic regression

11
Q

What tests?

Binary/categorical, correlated

A

McNemar χ² test (for 2x2 data) or
McNemar exact test (for 2x2 data) or
Conditional logistic regression or
Generalized estimating equations
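As one illustration, the McNemar χ² statistic for paired 2x2 data uses only the discordant pairs. The sketch below assumes the uncorrected form (b - c)² / (b + c) with 1 degree of freedom; the function name and the counts are hypothetical.

```python
import math

def mcnemar_chi2(b, c):
    """McNemar chi-squared statistic and p-value (1 df, no continuity correction).

    b and c are the two discordant-pair counts from a paired 2x2 table.
    """
    if b + c == 0:
        raise ValueError("no discordant pairs")
    stat = (b - c) ** 2 / (b + c)
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical counts of discordant pairs.
stat, p_value = mcnemar_chi2(b=15, c=5)
print(f"chi2 = {stat:.2f}, p = {p_value:.4f}")
```

With few discordant pairs the exact version of the test is usually preferred over this large-sample approximation.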

12
Q

what is the logit function?

A

ln(p / (1 - p)) - aka log odds
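A small sketch of the logit and its inverse; the function names `logit` and `inv_logit` are chosen here for illustration.

```python
import math

def logit(p):
    """Log odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1 - p))

def inv_logit(x):
    """Inverse of logit: maps log odds back to a probability."""
    return 1 / (1 + math.exp(-x))

print(logit(0.5))             # even odds give log odds of zero
print(inv_logit(logit(0.8)))  # round trip back to roughly 0.8
```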

13
Q

another name for log odds

A

logit

14
Q

what do you use to hypothesis test in a logistic regression model

A

Wald test (Wald chi-squared) for single betas and
likelihood ratio test for testing ALL betas

15
Q

conditional exchangeability

A

Conditional exchangeability essentially means that, even if confounding variables that affect the outcome differ between the treatment and control groups, once we look only at individuals who share a single value of the confounding variable, treatment assignment within each stratum is “as if” random.

16
Q

ITT

A

Intention to treat: ITT analysis includes every randomized subject and analyzes them according to their randomized treatment assignment. It ignores noncompliance, protocol deviations, withdrawal, and anything else that happens after randomization. ITT analysis preserves the prognostic balance generated by the original random treatment allocation.

17
Q

compare and contrast iptw vs propensity score matching

A

Propensity score matching discards some samples, while IPTW uses all samples

Propensity scores reduce the dimensionality of the data/confounders to one metric, which is beneficial but limits the ability to analyze the effects of specific confounders

Matching may not perfectly balance confounders, as there are many ways to generate similar scores

IPTW: extreme propensity scores produce very large weights, so the tails can have an outsized impact and bias the results; matching discards these
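To make the point about tails concrete, here is a minimal sketch of unstabilized IPTW weights, assuming the propensity scores have already been estimated. The function name and all numbers are hypothetical.

```python
def iptw_weights(treated, propensity):
    """Unstabilized inverse-probability-of-treatment weights.

    treated: list of 0/1 indicators; propensity: estimated P(treated | covariates).
    """
    weights = []
    for t, ps in zip(treated, propensity):
        if not 0 < ps < 1:
            raise ValueError("propensity must be strictly between 0 and 1")
        weights.append(1 / ps if t == 1 else 1 / (1 - ps))
    return weights

# Hypothetical scores; the last treated person was very unlikely to be treated.
treated = [1, 0, 1, 0, 1]
propensity = [0.50, 0.50, 0.80, 0.20, 0.02]
weights = iptw_weights(treated, propensity)
print(weights)
```

The treated person with a propensity of 0.02 counts as roughly 50 people in the weighted analysis, which is why extreme scores can dominate IPTW estimates.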

18
Q

What are the 8 assumptions for MSMs (marginal structural models)?

A

  1. Time ordering (exposure precedes the outcome; confounders precede exposure and outcome)
  2. No interference (an individual’s counterfactual outcome under treatment does not depend on others’ treatment values)
  3. Consistency (the observed outcome is one of all possible counterfactual outcomes, namely the one under the treatment actually received)
  4. No unmeasured confounding / conditional exchangeability / ignorable treatment assignment
  5. Experimental treatment assignment / positivity (you need to observe all levels of treatment within each stratum of the covariates in the real data)
  6. Correct model specification
  7. No selection bias (selection bias limits the ability to make inferences about your target population and may distort your estimates of effect)
  8. No measurement error

19
Q

How can selection bias affect your analysis?

A

It can change the magnitude and even the direction of the estimate of effect

20
Q

positivity assumption

A

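The probability of treatment must be strictly between 0 and 1 for every value of the covariates. Checking it means determining whether, for any value of the covariates, the probability of treatment was either 0 or 1 (a violation)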

21
Q

How can you correct for a positivity assumption violation?

A

Drop persons who violate positivity, i.e. restrict to the population in which there are both treated and untreated persons
◦ Limitation: this may limit generalizability and induce selection bias

Try dropping variables from the treatment model to determine whether 1 or 2 variables are driving the violation
◦ If the variable is not a possible confounder, remove it from the treatment model
◦ If the variable is a possible confounder, consider “coarsening” the variable classification
◦ E.g. transform a linear variable into quartiles; collapse categories/levels of a variable
◦ Limitation: this may induce some residual confounding

An alternative is to use a different estimator (g-computation; doubly robust estimation)

22
Q

region of common support

A

In propensity score matching, it’s the range of propensity scores that contains both treated and untreated individuals; individuals outside this region cannot be matched

23
Q

What can you do for a linear in the logit violation?

A

  1. transform the variable (log, square, z-score)
  2. discretize / categorize the variable
  3. convert to ordinal - same as above

24
Q

If your results are not linear in the logit, what does that mean?

A

It means the relationship between your predictors and outcomes may be quadratic, exponential or some other form

25
Q

If something is not linear in the logit, and you use the results anyway, what is likely to occur?

A

The model still fits the best straight line on the logit scale, so it will give you that, but the line describes the data poorly and your confidence intervals will be huge

26
Q

In logistic regression, if the residuals are non-linear, what does this mean?

A

Possible that predictors need to be modeled as quadratic or similar.

27
Q

How do you get an overall p-value for interaction in multivariable logistic regression?

A

Use the likelihood ratio test

28
Q

How do you get an overall p-value for interaction in multivariable logistic regression?

A

Use the likelihood ratio test: compare a model that includes the interaction terms with a model that doesn’t

29
Q

What % difference in Beta coefficients does Kristin recommend to identify confounding or interaction?

A

A 10% difference in beta coefficients

30
Q

Can p-values be used to determine if something has an interaction or is a confounder?

A

NO! Only Beta coefficients/OR effect sizes.

31
Q

What is residual confounding?

A

Confounding that remains even after adjustment

32
Q

What’s the general effect range (in OR) that can sometimes be attributed to residual confounding?

A

0.6 to 1.6

33
Q

Can residual confounding obscure relationships?

A

Yes!

34
Q

What is interaction?

A

When an effect size is significantly different in subgroups

35
Q

Are effect modification and interaction the same thing?

A

YES!

36
Q

If an interaction variable has a huge Beta coefficient, but is not significant, is that evidence of an interaction?

A

NO!

37
Q

What is a bootstrap sample?

A

A sample WITH REPLACEMENT of size n from a dataset of size n

38
Q

What’s the SE for a log odds ratio, ln(OR)?

A

(1/a + 1/b + 1/c + 1/d)^(1/2)
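A short sketch of how this standard error is typically used: build a Wald 95% confidence interval on the log scale, then exponentiate. It assumes the usual 2x2 odds ratio (a*d)/(b*c); the function name and cell counts are hypothetical.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Wald 95% CI computed on the log scale.

    a, b, c, d are the four cell counts of a 2x2 table; all must be > 0.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, se_log_or, (lower, upper)

# Hypothetical cell counts.
odds_ratio, se_log_or, ci = odds_ratio_ci(a=20, b=80, c=10, d=90)
print(f"OR = {odds_ratio:.2f}, SE(ln OR) = {se_log_or:.3f}, CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```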

39
Q

If you take 200 bootstrap samples from n=300 observations, how many observations will you sample per bootstrap?

A

300

40
Q

Is bootstrap with or without replacement?

A

with

41
Q

T or F: The bootstrap allows you to do repeated sampling and thus empirically estimate the distribution of almost any statistic

A

TRUE

42
Q

Approx what percent of original observations will be left out of any bootstrap sample?

A

36.8%
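The 36.8% figure can be checked with a quick simulation. This sketch reuses the 200 samples of n=300 from the earlier card; the stand-in data and the seed are arbitrary assumptions.

```python
import random

def bootstrap_sample(data, rng):
    """Draw one bootstrap sample: size n, with replacement, from n observations."""
    return [rng.choice(data) for _ in data]

n = 300
data = list(range(n))    # stand-in for 300 observations
rng = random.Random(0)   # fixed seed so the run is repeatable

# Fraction of original observations missing from each of 200 bootstrap samples.
left_out = []
for _ in range(200):
    sample = bootstrap_sample(data, rng)
    left_out.append(1 - len(set(sample)) / n)

average_left_out = sum(left_out) / len(left_out)
expected_left_out = (1 - 1 / n) ** n   # approaches 1/e (about 0.368) as n grows
print(f"simulated {average_left_out:.3f}, theoretical {expected_left_out:.3f}")
```

Each observation is missed by one draw with probability 1 - 1/n, so it is missed by all n draws with probability (1 - 1/n)^n, which tends to 1/e.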

43
Q

What is 10-fold cross-validation?

A

Fit the model on 9/10ths of the data and validate against the remaining 1/10th. Do this 10 times, holding out a different tenth each time. Summarize (e.g. average the area under the curve)
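The splitting step can be sketched as follows. `k_fold_indices` is a hypothetical helper, and the model fitting and AUC calculation are left as a comment because they depend on the model being validated.

```python
import random

def k_fold_indices(n, k=10, seed=0):
    """Yield (train_indices, test_indices) for k-fold cross-validation."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]   # k nearly equal folds
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

splits = list(k_fold_indices(n=100, k=10))
for train, test in splits:
    # Fit the model on `train`, score it (e.g. AUC) on `test`, then average the 10 scores.
    assert not set(train) & set(test)
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

Every observation lands in exactly one test fold, so each is used for validation once and for fitting nine times.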

44
Q

What’s the simple equation for OR in a 2x2

A

(a*d)/(b*c)

45
Q

What measure of association is used in a case-control study?

A

Odds Ratios

46
Q

What measure of association is used in a cohort study?

A

Risk Ratios

47
Q

Can you ascertain incidence from a case-control study?

A

no!

48
Q

Can you ascertain incidence from a cohort study?

A

yes!

49
Q

What is the rare disease assumption?

A

OR ≈ RR for diseases with low incidence

50
Q

What’s a commonly used level of power?

A

0.8 or 80%

51
Q

What does 80% power mean?

A

We have an 80% probability of correctly rejecting the null hypothesis when it is in fact false.

52
Q

What is the alpha in power calcs?

A

It’s the threshold of significance, typically .05: the p-value below which you reject the null hypothesis

53
Q

What is the overlap between distributions affected by?

A

The means and standard deviations

54
Q

What’s the effect size? (d)

A

(estimated difference in means) / (pooled estimated standard deviation)

55
Q

Equation for the pooled estimated standard deviation

A

((S1^2 + S2^2) / 2)^(1/2), where S1 and S2 are the standard deviations of the two groups
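The two formulas combine as in this sketch, which assumes equal group sizes as the simple pooled formula does. The function names and group summaries are hypothetical.

```python
import math

def pooled_sd(sd1, sd2):
    """Pooled SD for two groups of equal size: sqrt((s1^2 + s2^2) / 2)."""
    return math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)

def effect_size_d(mean1, mean2, sd1, sd2):
    """Standardized difference in means (often called Cohen's d)."""
    return (mean1 - mean2) / pooled_sd(sd1, sd2)

# Hypothetical group summaries.
d = effect_size_d(mean1=10.0, mean2=8.0, sd1=2.0, sd2=2.0)
print(d)
```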

56
Q

How do you find the means and standard deviations for power calcs?

A

estimate w/ prior data
literature search
educated guess

57
Q

Once you have the parameters, how do you do power calcs/sample size calcs?

A

Google “statistics power calculator”
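As a rough idea of what such a calculator does, the sketch below uses the normal approximation for a two-sided, two-sample comparison of means. This is an assumption made for illustration; online calculators often use the t distribution and report slightly larger numbers. `n_per_group` is a hypothetical name.

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided, two-sample comparison of means.

    Uses the normal approximation n = 2 * (z_{1-alpha/2} + z_{power})^2 / d^2.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)

print(n_per_group(d=0.5))   # effect size 0.5, alpha 0.05, power 80%
```

Halving the effect size roughly quadruples the required sample size, because d enters the formula squared.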

58
Q

What is C in the DAG below?

A --> C <-- B

A

A collider

59
Q

What’s a DAG structure for a collider?

A

A --> C <-- B

60
Q

If there is a collider between A and B, are they independent?

A

YES, as long as we do not condition on the collider and no other open path connects them

61
Q

What happens if we condition on G below?

A <-- G --> B

A

We block the path through G, which breaks the association between A and B that G creates

62
Q

What happens if we condition upon G below?

A --> G <-- B

A

We unblock the path through G and create an association between A and B
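A small simulation can illustrate this. It assumes A and B are independent standard normal variables and G is their sum plus noise, and it treats "conditioning" as restricting to the stratum G > 1; the seed and sample size are arbitrary.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)
n = 20000
a = [rng.gauss(0, 1) for _ in range(n)]
b = [rng.gauss(0, 1) for _ in range(n)]                   # generated independently of a
g = [ai + bi + rng.gauss(0, 1) for ai, bi in zip(a, b)]   # collider: A --> G <-- B

r_all = pearson_r(a, b)
# Conditioning on G, done here by keeping only the stratum G > 1.
kept = [(ai, bi) for ai, bi, gi in zip(a, b, g) if gi > 1]
r_stratum = pearson_r([p[0] for p in kept], [p[1] for p in kept])
print(f"all: r = {r_all:.3f}; within G > 1: r = {r_stratum:.3f}")
```

In the full sample the correlation is near zero, while within the stratum it is clearly negative: among people with high G, a low A must be offset by a high B.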

63
Q

What is a backdoor path?

A

A path from treatment to outcome that starts with an arrow going into the treatment

64
Q

Is A <-- X --> Y a backdoor path from A to Y?

A

YES!

65
Q

Is A --> X --> Y a backdoor path from A to Y?

A

NO - X is just a mediator

66
Q

What is the backdoor path criterion?

A

a set of variables X is sufficient to control for confounding if

a. it blocks all backdoor paths from treatment to outcome

AND

b. it does not include any descendants of treatment

67
Q

How many backdoor paths are there in this picture?

A

2

68
Q

What sets can you control on to block the backdoor paths?

A

V

VZ

ZW

VZW

69
Q

What is wrong with controlling for Z here?

A

It’s a collider, so it opens up a path from W to V

70
Q

What does instrumental variable analysis do?

A

Exploits ‘natural experiments’ to address unmeasured and residual confounding

71
Q

What is an instrumental variable?

A

A naturally occurring phenomenon that imperfectly randomizes people to an exposure or treatment

72
Q

What are the assumptions of IVA (instrumental variable analysis)?

A

  1. The instrument must be related to the exposure or treatment
  2. The instrument must be unrelated to confounders
  3. The instrument must have no direct effect on the outcome; it may affect the outcome only through its effect on the exposure or treatment

73
Q

What is the C statistic in propensity score matching?

A

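The probability that a randomly chosen treated individual has a higher propensity score than a randomly chosen untreated individual, i.e. how well the propensity score identifies treated individuals as such.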

74
Q

What’s a way to test for interaction in a stratified dataset?

A

The Breslow-Day test

75
Q

Equation for Odds ratio in 2x2

A

(a*d)/(b*c)

76
Q

Equation for risk ratio in 2x2

A

(A/(A+B)) / (C/(C+D))

77
Q

Chi squared formula

A
78
Q

Equation for attributable risk in exposed (AR)

A

(A/(A+B)) - (C/(C+D))

Incidence in Exposed - Incidence in Unexposed

79
Q

Equation for attributable risk (AR) %

A

(Incidence in exposed - Incidence in unexposed) / (Incidence in exposed)

80
Q

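Equation for population attributable risk (PAR), as a proportion of population incidence

A

(Incidence in population - Incidence in unexposed) / Incidence in population

OR

((A+C)/(A+B+C+D) - C/(C+D)) / ((A+C)/(A+B+C+D))

where A+C is the total number of cases, so (A+C)/(A+B+C+D) is the incidence in the whole population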
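The 2x2 measures on these cards can be computed together, as in the sketch below. It assumes the cohort layout implied by the risk ratio formula: a and b are the exposed with and without disease, c and d the unexposed with and without disease, so the total number of cases is a + c. The function name and counts are hypothetical.

```python
def risk_measures(a, b, c, d):
    """Measures of association from a cohort 2x2 table.

    a: exposed with disease, b: exposed without disease,
    c: unexposed with disease, d: unexposed without disease.
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    risk_population = (a + c) / (a + b + c + d)
    return {
        "RR": risk_exposed / risk_unexposed,
        "AR": risk_exposed - risk_unexposed,
        "AR%": (risk_exposed - risk_unexposed) / risk_exposed,
        "PAR": risk_population - risk_unexposed,
        "PAR%": (risk_population - risk_unexposed) / risk_population,
    }

# Hypothetical cohort: 100 exposed and 100 unexposed people.
measures = risk_measures(a=30, b=70, c=10, d=90)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```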

81
Q

How do you assess additive interaction in a cohort study?

A

Compare the expected and observed AR (attributable risk)

82
Q

How do you assess multiplicative interaction in a cohort study?

A

Compare observed and expected RR (risk ratios / relative risk)
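A sketch of both comparisons for two binary exposures. It assumes the usual expectations: excess risks add under no additive interaction, and risk ratios multiply under no multiplicative interaction. The function name and the four risks are hypothetical.

```python
def interaction_check(r00, r10, r01, r11):
    """Compare observed joint effects with additive and multiplicative expectations.

    r00: risk with neither exposure, r10: exposure 1 only,
    r01: exposure 2 only, r11: both exposures.
    """
    observed_ar = r11 - r00
    expected_ar = (r10 - r00) + (r01 - r00)    # additive: excess risks add
    observed_rr = r11 / r00
    expected_rr = (r10 / r00) * (r01 / r00)    # multiplicative: risk ratios multiply
    return {
        "observed_AR": observed_ar, "expected_AR": expected_ar,
        "observed_RR": observed_rr, "expected_RR": expected_rr,
    }

# Hypothetical risks in the four exposure groups.
result = interaction_check(r00=0.02, r10=0.04, r01=0.06, r11=0.12)
print(result)
```

In this example the observed AR exceeds the expected AR while the observed and expected RR agree, so the same data show interaction on the additive scale but not on the multiplicative scale.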