Biostats Flashcards

1
Q

Cross-sectional study

A

Collects data from a group of people to assess disease frequency at a particular point in time
May show risk association, but not causality
“What’s happening?”
Measures prevalence

2
Q

Case-control study

A

Compares group with disease to a group without disease
Looks for prior exposure/risk
Retrospective
“What happened?”
Measures odds ratio: OR = [(a/c)/(b/d)] = (ad)/(bc)

3
Q

Cohort study

A

Compares initially disease-free people in two groups to see who develops disease: one with exposure/risk, and one without exposure/risk
Can show if exposure/risk increases disease likelihood
Retrospective OR prospective
“Who will develop/developed disease?”
Measures relative risk: RR = [a/(a+b)]/[c/(c+d)]

4
Q

Twin concordance, adoption studies

A

Measure heritability and environmental influence
Mono- vs dizygotic twins
Siblings with biological vs adoptive parents

5
Q

Clinical trial phase goals

A

I: Is it safe?
II: Does it work?
III: Is it as good as or better than current treatments?
IV: Can it stay?

6
Q

Odds ratio

A

Odds that a group with disease was exposed to a risk divided by the odds that the group without the disease was exposed
OR = (a/c)/(b/d) = (ad)/(bc)
Typically used for case-control studies
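A minimal Python sketch of the formula above. It assumes the usual 2x2 layout, which the cards do not spell out: a = exposed with disease, b = exposed without disease, c = unexposed with disease, d = unexposed without disease. The counts are hypothetical.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio from 2x2 counts: (a/c) / (b/d) = (a*d) / (b*c)."""
    if b == 0 or c == 0:
        raise ValueError("b and c must be non-zero")
    return (a * d) / (b * c)


# Hypothetical counts: 20 exposed cases, 80 exposed controls,
# 10 unexposed cases, 90 unexposed controls.
print(odds_ratio(20, 80, 10, 90))  # 2.25
```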

7
Q

Relative risk

A

Risk of developing disease in the exposed group divided by risk in the unexposed group
RR = [a/(a+b)]/[c/(c+d)]
Typically used in cohort studies
If prevalence is low, OR ~ RR
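A short Python sketch of RR, and of why OR approximates RR when disease is rare. It assumes the 2x2 layout a = exposed with disease, b = exposed without disease, c = unexposed with disease, d = unexposed without disease; both sets of counts are made up for illustration.

```python
def relative_risk(a, b, c, d):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Cross-product ratio (a*d) / (b*c)."""
    return (a * d) / (b * c)


# Hypothetical cohorts of 1000 exposed and 1000 unexposed people.
common = (200, 800, 100, 900)  # disease is common: OR overstates RR
rare = (2, 998, 1, 999)        # disease is rare: OR is close to RR

for label, cells in (("common", common), ("rare", rare)):
    rr = relative_risk(*cells)
    oratio = odds_ratio(*cells)
    print(label, round(rr, 3), round(oratio, 3))
```

When disease is rare, b and d are nearly the full group sizes, so a/(a+b) is close to a/b and c/(c+d) is close to c/d.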

8
Q

Attributable risk

A

Difference in risk between exposed and unexposed groups, i.e. proportion of disease occurrences attributable to an exposure
AR = a/(a+b) - c/(c+d)
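As a rough illustration in Python, assuming the 2x2 layout a = exposed with disease, b = exposed without disease, c = unexposed with disease, d = unexposed without disease (the counts are hypothetical):

```python
def attributable_risk(a, b, c, d):
    """AR = risk in exposed minus risk in unexposed."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed - risk_unexposed


# Hypothetical: 20% of exposed vs 10% of unexposed develop disease.
ar = attributable_risk(20, 80, 10, 90)
print(round(ar, 3))  # 0.1
```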

9
Q

Relative risk reduction

A

Proportion of risk reduction attributable to an intervention as compared to a control
RRR = 1 - RR = 1 - [a/(a+b)]/[c/(c+d)]
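A minimal Python sketch, assuming that in a treatment trial the "exposed" row is the intervention group: a = treated with event, b = treated without event, c = control with event, d = control without event. The trial counts are invented.

```python
def relative_risk_reduction(a, b, c, d):
    """RRR = 1 - RR, where RR = treated risk / control risk."""
    treated_risk = a / (a + b)
    control_risk = c / (c + d)
    return 1 - treated_risk / control_risk


# Hypothetical trial: 10% event rate on treatment vs 20% on control.
print(relative_risk_reduction(10, 90, 20, 80))  # 0.5
```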

10
Q

Absolute risk reduction

A

Difference in risk attributable to the intervention as compared to the control
ARR = c/(c+d) - a/(a+b) = -AR

11
Q

Number needed to treat

A

NNT = 1/ARR (mnemonic: “treat” has more letters than “harm”, just as ARR has more letters than AR)
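A small Python sketch of the calculation. The function name and the event rates are hypothetical; rounding up to a whole patient is a common reporting convention rather than something stated on the card.

```python
import math


def number_needed_to_treat(control_risk, treated_risk):
    """NNT = 1 / ARR, where ARR = control risk - treated risk."""
    arr = control_risk - treated_risk
    if arr <= 0:
        raise ValueError("no risk reduction: NNT is undefined")
    return 1 / arr


# Hypothetical event rates: 20% on control, 13% on treatment (ARR = 0.07).
nnt = number_needed_to_treat(0.20, 0.13)
print(round(nnt, 2))   # 14.29
print(math.ceil(nnt))  # 15 patients, rounding up
```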

12
Q

Number needed to harm

A

NNH = 1/AR

13
Q

Bias in recruiting participants

A

Selection, sampling, referral, allocation bias
E.g. Berkson bias - study population is from a hospital and less healthy than the general population
Healthy worker effect - study population is healthier than the general population (opposite of Berkson)
Non-response - nonrespondents differ from participants meaningfully
Randomize to reduce

14
Q

Procedure bias

A

Subjects in different groups are not treated the same
Includes detection bias: Those with a risk factor undergo greater diagnostic scrutiny than those without the risk
Use blinding and placebos to reduce

15
Q

Recall bias

A

Awareness of disorder alters recall by subjects
Common in retrospective studies
Decrease time from exposure to follow-up to reduce

16
Q

Observer-expectancy bias

A

Researcher’s belief in a treatment’s efficacy changes outcomes
AKA Pygmalion effect or self-fulfilling prophecy
Use blinding and placebos to reduce

17
Q

Confounding bias

A

Factor is related to both exposure and outcome, but is not on the causal pathway
Reduce with multiple/repeat studies, matching of patients with similar characteristics in both control and treatment groups, crossover studies where subjects act as their own controls

18
Q

Lead-time bias

A

Early detection is confused with increased survival
Especially important for studies of long-term chronic disease
Reduce by measuring “back-end” survival, i.e. adjusting survival for disease severity at time of diagnosis

19
Q

Hawthorne effect

A

AKA observer effect

Subjects tend to change their behavior when they know they’re being observed

20
Q

alpha definition

A

Probability of making a type I error (finding a difference between control and experimental groups when one does not exist)

21
Q

beta definition

A

Probability of making a type II error (stating there is no difference between control and experimental groups when one does exist)
beta increases as alpha decreases

22
Q

Power

A

1 - beta

Increases as beta decreases: increased precision, increased effect size, or INCREASED SAMPLE SIZE
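To see these relationships numerically, here is a rough Python sketch. It uses a two-sided one-sample z-test with known standard deviation as a stand-in, which is an assumption made only because its power has a simple closed form; `effect_size` is the true mean difference in standard-deviation units, and all inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(effect_size, n, alpha=0.05):
    """Power (1 - beta) of a two-sided one-sample z-test."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # rejection cutoff
    shift = effect_size * sqrt(n)               # mean of the test statistic
    # Probability the statistic lands in either rejection tail.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


for n in (16, 32, 64):
    print(n, round(z_test_power(0.5, n), 3))
```

Power rises with sample size and effect size, and falls when alpha is made stricter, which matches beta increasing as alpha decreases.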

23
Q

t-test

A

Checks differences between the MEANS OF 2 GROUPS

E.g. BP between males/females
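A minimal sketch of the test statistic in Python, using only the standard library. It computes Welch's version of the two-sample t statistic (unequal variances allowed), which is one common variant and is an assumption here; the blood pressure values are invented, and turning t into a p-value would additionally need the t distribution.

```python
from math import sqrt
from statistics import mean, variance


def welch_t_statistic(group1, group2):
    """Difference in means divided by its standard error."""
    se = sqrt(variance(group1) / len(group1) + variance(group2) / len(group2))
    return (mean(group1) - mean(group2)) / se


# Hypothetical systolic BP readings (mmHg).
males = [120, 125, 130, 135, 140]
females = [110, 115, 120, 125, 130]
print(welch_t_statistic(males, females))  # 2.0
```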

24
Q

ANOVA

A

Checks differences between the MEANS OF AT LEAST 3 GROUPS

E.g. BP between members of 3 ethnic groups

25
Q

Chi-square test

A

Checks differences between 2 or more PERCENTAGES OR PROPORTIONS OF CATEGORICAL OUTCOMES
E.g. Percentage of members of 3 ethnic groups with HTN
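The example above can be sketched in Python as a chi-square test of independence on a contingency table. The counts are hypothetical (three groups of 100 people, with and without HTN); the function returns only the statistic and degrees of freedom, not a p-value.

```python
def chi_square_statistic(table):
    """Return (chi-square statistic, degrees of freedom) for a 2D count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if group and outcome were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Rows: three groups; columns: [HTN, no HTN].
htn_by_group = [[30, 70], [20, 80], [10, 90]]
print(chi_square_statistic(htn_by_group))  # (12.5, 2)
```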

26
Q

Ordinal data

A

Data ordered by a position on a scale
Usually categorical - cannot perform arithmetic with these
E.g. Runners finishing in 1st, 3rd, 5th places
Qualitative - Non-parametric

27
Q

Interval data

A

Data measured along a scale in which each position is equidistant
Quantitative - Parametric
Differences between values are meaningful and comparable (e.g. the gap from 2 to 3 equals the gap from 7 to 8), but the scale has no true zero
E.g. Happiness scale from 1-10 or Runners finishing a 5k between 18:00-18:59, 19:00-19:59, 20:00-20:59, etc.

28
Q

Nominal data

A

Data differentiated by a simple naming system
Usually categorical - E.g. “employee”
May have a number assigned, but is not ordinal (E.g. Runner’s ID number or an athlete’s jersey number)
Qualitative - Non-parametric

29
Q

Ratio data

A

Data in which numbers are multiples of each other and can be mathematically compared. Zero has a meaning on the scale used for this data
E.g. Runner’s finishing time for a race
Quantitative - Parametric

30
Q

Continuous data

A

Measured along a continuous scale allowing for infinitely fine subdivision
Vs. discrete where data falls into bins like with interval data

31
Q

Parametric data

A

Quantitative, forms predictable distributions (e.g. normal)

Can use arithmetic to gain insight into the datasets

32
Q

Non-parametric data

A

Qualitative, does not assume any distribution

33
Q

Likelihood ratio for a positive test

A

Sensitivity/(1-Specificity)

34
Q

Likelihood ratio for a negative test

A

(1-Sensitivity)/Specificity
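Both likelihood ratios can be sketched in a few lines of Python. The sensitivity and specificity values are hypothetical.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a test with the given sensitivity and specificity."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


# Hypothetical test: 90% sensitive, 85% specific.
lr_pos, lr_neg = likelihood_ratios(0.90, 0.85)
print(round(lr_pos, 2), round(lr_neg, 3))  # 6.0 0.118
```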

35
Q

Sensitivity

A

Chance a test detects disease when it is present
(True-positive rate)
a/(a+c)
TP/(TP+FN)

36
Q

Specificity

A

Chance a test indicates no disease when none is present
(True-negative rate)
d/(b+d)
TN/(TN+FP)

37
Q

Positive predictive value

A

Proportion of positive test results that are true positives
a/(a+b)
TP/(TP+FP)

38
Q

Negative predictive value

A

Proportion of negative test results that are true negatives
d/(c+d)
TN/(TN+FN)

39
Q

Incidence

A

New cases occurring during a particular time period

N(new cases)/N(at risk)

40
Q

Prevalence

A

Proportion of a population affected by a disease at a given point in time
N(w/disease)/N(population)
Increases w/ incidence
Decreases w/ death or recovery of affected people

41
Q

Standard error of the mean

A

Used for samples of a population

SEM = s/sqrt(n), where s = stddev of sample
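A minimal Python sketch of the formula, using the sample standard deviation (n - 1 denominator) from the standard library. The sample values are made up.

```python
from math import sqrt
from statistics import stdev


def standard_error_of_mean(sample):
    """SEM = s / sqrt(n), with s the sample standard deviation."""
    return stdev(sample) / sqrt(len(sample))


sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical measurements
print(round(standard_error_of_mean(sample), 4))  # 0.9129
```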

42
Q

Correlation coefficient

A

r
Always between -1 and 1
Closer to -1 = stronger negative correlation; closer to 1 = stronger positive correlation; near 0 = weak or no linear correlation

43
Q

Coefficient of determination

A

r^2
Always between 0 and 1
Represents the proportion of variance in the dependent variable (y) explained by the independent variable (x) in a linear fit:
y = a + bx
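As a rough illustration, r and r^2 can be computed by hand in Python. This sketch assumes r is the Pearson correlation coefficient, which the cards do not name explicitly; the data points are invented.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 4), round(r ** 2, 4))  # 0.7746 0.6
```

Here r^2 = 0.6, i.e. about 60% of the variance in y is explained by the linear relationship with x.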