Biostats Flashcards

1
Q

Type I error

A

study incorrectly rejects a null hypothesis that is true. The rate of type I errors is denoted by α and usually reflects the significance level of a test. A higher α increases the likelihood of a type I error and decreases the likelihood of a type II error. The main effect of a smaller sample size is to increase the probability of a type II error rather than a type I error.

2
Q

Type II error in relation to null hypothesis

A

fails to reject a null hypothesis (H0) that is false.

3
Q

what odds ratio means

A
  • measure of association between an exposure and an outcome. In this case, it represents the odds that an outcome (eg, major cardiovascular event) will occur in the presence of a particular exposure (eg, intensive statin therapy) compared to the odds of that outcome in a control group.
  • An OR >1 means that the exposure is associated with higher odds of the outcome and an OR <1 means that the exposure is associated with lower odds of the outcome.
4
Q

interpretation of negative likelihood ratio

A

negative likelihood ratio (LR-) represents the value of a negative test result and equals (1 - sensitivity) / specificity. The smaller the LR-, the less likely it is that the disease is actually present after a negative result.
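
A minimal sketch of the calculation, assuming the standard definition LR- = (1 - sensitivity) / specificity. The function name and the example sensitivity and specificity are made up for illustration.

```python
def negative_likelihood_ratio(sensitivity, specificity):
    """LR- = (1 - sensitivity) / specificity."""
    return (1 - sensitivity) / specificity

# Hypothetical test: 90% sensitive, 80% specific.
lr_negative = negative_likelihood_ratio(0.90, 0.80)
print(round(lr_negative, 3))
```

A value well below 1 means a negative result makes disease noticeably less likely.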

5
Q

Pearson chi-squared test use

A

test for an association between 2 categorical variables (eg, gender and disease status)
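
As a rough sketch of what the test computes: the Pearson chi-squared statistic sums (observed - expected)^2 / expected over the cells of the table, with expected counts derived from the row and column totals. The function name and the counts below are hypothetical.

```python
def chi_squared_statistic(table):
    """Pearson chi-squared statistic for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic

# Hypothetical counts: rows = men/women, columns = disease/no disease.
counts = [[30, 70], [20, 80]]
print(round(chi_squared_statistic(counts), 2))
```

For a 2x2 table (1 degree of freedom), a statistic above about 3.84 corresponds to p < 0.05.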

6
Q

paired t-test use

A

test the difference between 2 paired means; patients serve as their own control (eg, mean blood pressure before and after treatment in the same subjects).
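
A small sketch of the paired t statistic, assuming the usual formula t = mean of the within-subject differences / (SD of the differences / sqrt(n)). The blood pressure values are invented for illustration.

```python
import math
import statistics

def paired_t_statistic(before, after):
    """t statistic for paired measurements on the same subjects."""
    differences = [a - b for a, b in zip(after, before)]
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)  # sample SD of the differences
    return mean_diff / (sd_diff / math.sqrt(len(differences)))

# Hypothetical systolic blood pressure before and after treatment.
before = [150, 142, 160, 155, 148]
after = [140, 138, 150, 149, 141]
print(round(paired_t_statistic(before, after), 2))
```

The statistic is compared against a t distribution with n - 1 degrees of freedom.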

7
Q

standardized incidence ratio

A
  • measure used to determine if the occurrence of cancer in a small population is high or low relative to an EXPECTED value derived from a larger comparison population.
  • dividing observed cases (OC) by the expected cases (EC); the formula is SIR = OC / EC.
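
The formula above can be sketched as follows. Deriving the expected cases from a reference rate is an assumption for illustration, and all numbers are hypothetical.

```python
def standardized_incidence_ratio(observed_cases, expected_cases):
    """SIR = OC / EC."""
    return observed_cases / expected_cases

# Hypothetical: reference rate of 2 cases per 1,000 applied to a town of 10,000.
expected = 10_000 * 2 / 1_000   # 20 expected cases
observed = 30
print(standardized_incidence_ratio(observed, expected))
```

An SIR above 1 means more cases were observed than expected; below 1 means fewer.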
8
Q

standardized mortality ratio

A

adjusted measure of overall mortality and is calculated by dividing the observed number of deaths in the population of interest (eg, miners) by the expected number derived from the reference population (“standard”)

9
Q

verification bias + how to avoid

A
  • Study uses gold standard testing selectively in order to confirm a positive (or negative) result of preliminary testing (eg, it is not feasible to biopsy everyone, so only some screened patients go on to biopsy). This can result in overestimates (or underestimates) of sensitivity (or specificity).
  • perform gold standard testing in a random sample of participants with negative results, as seen in this study on cervical cancer
10
Q

selection bias

A

results from the manner in which study participants are selected or lost to follow-up. Randomization in a clinical trial reduces selection bias

11
Q

observer bias

A

observer responsible for recording results is influenced by prior knowledge about participants or study details. Blinded studies (as in this case) usually avoid this bias by preventing observers from knowing which treatment or intervention the participants are receiving; this leads to a more objective measurement of outcomes.

12
Q

contamination bias

A

control group unintentionally receives the treatment or the intervention, thereby reducing the difference in outcomes between the control and treatment group.

13
Q

attributable risk percent meaning

A

measure of excess risk. It estimates the proportion of the disease in exposed subjects that is attributed to exposure status.
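
A brief sketch, assuming the standard formula ARP = (risk in exposed - risk in unexposed) / risk in exposed, which is equivalent to (RR - 1) / RR. The risks used are hypothetical.

```python
def attributable_risk_percent(risk_exposed, risk_unexposed):
    """Fraction of disease in the exposed that is attributable to the exposure."""
    return (risk_exposed - risk_unexposed) / risk_exposed

# Hypothetical risks: 20% in exposed, 5% in unexposed.
arp = attributable_risk_percent(0.20, 0.05)
print(round(arp * 100, 1))
```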

14
Q

Population attributable risk

A

estimates the proportion of disease in the population that is attributed to the exposure
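
A minimal sketch of this proportion, assuming the common formula (risk in total population - risk in unexposed) / risk in total population. The function name and risks are hypothetical.

```python
def population_attributable_fraction(risk_population, risk_unexposed):
    """Fraction of disease in the whole population attributable to the exposure."""
    return (risk_population - risk_unexposed) / risk_population

# Hypothetical risks: 10% in the whole population, 5% in the unexposed.
paf = population_attributable_fraction(0.10, 0.05)
print(round(paf * 100, 1))
```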

15
Q

factorial study design (or fully crossed design)

A

type of experimental study design that utilizes ≥2 interventions and all combinations of these interventions

16
Q

Pragmatic study

A

Seeks to determine whether an intervention works in real-life conditions.

17
Q

cross-sectional study

A

type of observational study in which a specific population or group is studied at one specific point in time, therefore providing a cross section of the group at that particular time point.

18
Q

Net clinical benefit measures

A

Measure of an intervention’s possible benefit minus its possible harm.
eg, benefit (reduced risk of death from any cause/myocardial infarction/stroke) minus harm (increased risk of intracranial bleeding).

19
Q

reasons why you do intent to treat analysis

A
  • preserve randomization
  • avoid the effects of crossover and dropout, which may break randomization and affect the outcome. For example, if the sickest patients drop out at a higher rate, even an ineffective treatment may appear beneficial if analysis is performed on only those who finished the treatment.
20
Q

odds ratio vs. relative risk

A

odds ratio = case-control and cross-sectional studies

relative risk = cohort study
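
A short sketch of both measures from a 2x2 table, assuming the usual cell labels: a = exposed and diseased, b = exposed and nondiseased, c = unexposed and diseased, d = unexposed and nondiseased. The counts are hypothetical.

```python
def odds_ratio(a, b, c, d):
    """OR = (a / b) / (c / d) = ad / bc."""
    return (a * d) / (b * c)

def relative_risk(a, b, c, d):
    """RR = risk in exposed / risk in unexposed."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed

# Hypothetical counts: 20/100 exposed and 10/100 unexposed develop disease.
print(odds_ratio(20, 80, 10, 90))
print(relative_risk(20, 80, 10, 90))
```

The two values are close when the outcome is rare and diverge as it becomes more common.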

21
Q

prospective cohort study design

A

2 groups of subjects (ie, cohorts) are selected based on their exposure status (risk factor, no risk factor). The cohorts are then followed across time and the incidence of the disease (ie, postpartum depression [PPD]) is compared between groups.

22
Q

case control study design

A

two groups (1 diseased, 1 nondiseased), then you look back in time to compare risk factor frequency

23
Q

cross sectional study design

A

separate groups by risk factor status (present or absent), then compare disease prevalence at a single point in time.

24
Q

best study design to investigate outbreak of infectious disease

A

case-control study: most appropriate study design to investigate an outbreak of an acute infectious disease. It generally allows for quick localization of the outbreak source.

25
Q

other caveat with confidence intervals

A
  • overlapping CIs between two treatment groups mean you cannot conclude that there is a statistically significant difference between the groups (so compare the CI ranges of the two groups)
26
Q

type I error in short

A

false positive (rejecting a null hypothesis when the null hypothesis is true)

27
Q

case fatality rate

A

proportion of patients with a particular disease who die from the disease

28
Q

point of Time-to-event data

A

Data collected when the elapsed time before an event occurs is of significant interest. It accounts not only for the total number of events in both groups, but also for the timing of the events, which can be important when assessing treatment benefit. It is commonly used in survival analysis where the event of interest is death.
- for instance, patients in the treatment group may, on average, have a longer survival time than the patients in the control group despite an equivalent 2-year mortality risk. Thus, with time-to-event data, the intervention can show a statistically significant benefit even if mortality risk at a fixed time point is unchanged.

29
Q

how to reduce effect modification

A
  • Stratification (report separate measures of outcome for each level of an effect modifier)
30
Q

standard deviation rule

A

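68-95-99.7 (in a normal distribution, about 68%, 95%, and 99.7% of values fall within 1, 2, and 3 standard deviations of the mean)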
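
A quick check of the rule, assuming a normal distribution: the fraction of values within k standard deviations of the mean is erf(k / sqrt(2)). The function name is made up for illustration.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    # Prints approximately 68.3, 95.4, and 99.7 percent.
    print(k, round(fraction_within(k) * 100, 1))
```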

31
Q

use of multiple logistic regression

A

used to estimate the association between ≥2 independent variables and 1 dichotomous dependent variable (ie, with 2 outcomes). For example, this regression model would evaluate the association between presence or absence of type 2 diabetes mellitus (dichotomous dependent variable) and obesity while adjusting for age and gender (3 independent variables).

32
Q

healthy worker effect and how to prevent

A

special type of selection bias that usually occurs in occupational cohort studies when the general population is used as the reference group. The general population consists of healthy and unhealthy individuals; those who are unhealthy are less likely to be employed, whereas the employed workforce tends to have fewer sick individuals. Consequently, comparisons of mortality rates between an employed population and the general population are usually biased. To reduce it, use another employed population, rather than the general population, as the reference group.

33
Q

to remember when calculating incidence

A

Subtract existing (prevalent) cases of disease from the denominator, since only people still at risk can become new cases
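
A minimal sketch of the adjustment: people who already have the disease are not at risk of becoming new cases, so they are removed from the denominator. The function name and numbers are hypothetical.

```python
def cumulative_incidence(new_cases, population, existing_cases):
    """New cases divided by the population still at risk."""
    at_risk = population - existing_cases
    return new_cases / at_risk

# Hypothetical: 1,000 people, 100 already diseased, 45 new cases this year.
print(cumulative_incidence(45, 1_000, 100))
```

Using the full 1,000 as the denominator would understate the incidence (4.5% instead of 5%).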

34
Q

contingency table set-up

A

Diseased, nondiseased on top

Exposed, nonexposed on left

35
Q

graph of correlation coefficient of 1

A

positive linear

36
Q

meaning of coefficient of determination

A

Expresses the percentage of the variability in the outcome factor that is explained by the predictor factor (if the correlation coefficient is -0.8 between homocysteine level and folic acid intake, 0.64 or 64% of the variability in homocysteine level is explained by folic acid intake)

37
Q

how to calculate coefficient of determination

A

square the correlation coefficient
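
As a one-line illustration, using a correlation coefficient of -0.8 (the sign disappears on squaring). The function name is hypothetical.

```python
def coefficient_of_determination(r):
    """r squared: fraction of variability explained."""
    return r ** 2

print(round(coefficient_of_determination(-0.8), 2))
```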

38
Q

question testing correlation coefficient explanation

A

suggests association between two variables, does not imply causation

39
Q

absolute risk reduction

A

difference in risk between two groups
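
A small sketch, assuming ARR is taken as the risk in the control group minus the risk in the treatment group. The risks are invented for illustration.

```python
def absolute_risk_reduction(control_risk, treatment_risk):
    """ARR = risk in control group - risk in treatment group."""
    return control_risk - treatment_risk

# Hypothetical: 10% event rate with placebo, 6% with treatment.
arr = absolute_risk_reduction(0.10, 0.06)
print(round(arr, 2))
```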

40
Q

interpretation of p value

A
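  • represents the probability of obtaining a result at least as extreme as the one observed, assuming the null hypothesis is true (it is not the probability that the null hypothesis is true)

- p-value of 0.01 means that, if there were truly no association between the variables, results this extreme would occur by chance about 1% of the time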

41
Q

meaning of a value of 1.0 for odds ratio or relative risk

A

no association between exposure and outcome

42
Q

relation between confidence interval and p value

A

Assuming the 95% CI for an OR or RR does not include 1.0 (the null value), the association is statistically significant at the 5% level. Thus, the p value must be below 0.05.

43
Q

what to notice about CI

A

width reflects precision and depends on sample size, with a narrower CI suggesting a bigger sample size

44
Q

meaning of standard error of the mean

A
  • Shows how precisely the sample represents the study population.
45
Q

standard error of mean calculation

A

SD/square root of sample size

46
Q

relation of standard error of mean to sample size

A
  • inverse (as sample increases, standard error of mean decreases)
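
A short sketch of the inverse relation, using SEM = SD / sqrt(n). The SD and sample sizes are hypothetical.

```python
import math

def standard_error_of_mean(sd, n):
    """SEM = SD / sqrt(n)."""
    return sd / math.sqrt(n)

# Same SD, larger sample: the SEM shrinks.
print(standard_error_of_mean(10, 25))
print(standard_error_of_mean(10, 100))
```

Quadrupling the sample size halves the SEM.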
47
Q

TP value in positive predictive value equation

A

TP = true positives: diseased patients who test positive. Not the actual (total) number of diseased patients, which is TP + FN.
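
A minimal sketch of the equation, assuming PPV = TP / (TP + FP), where FP is the number of nondiseased patients who test positive. The counts are hypothetical.

```python
def positive_predictive_value(true_positives, false_positives):
    """PPV = TP / (TP + FP): fraction of positive tests that are correct."""
    return true_positives / (true_positives + false_positives)

# Hypothetical: 90 diseased and 30 nondiseased patients test positive.
print(positive_predictive_value(90, 30))
```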

48
Q

length-time bias

A

screening test preferentially detects less aggressive forms of a disease and therefore increases apparent survival time.

49
Q

control group in case control studies

A

group lacks disease and has variable level of exposure (you don’t choose group based on exposure status)

50
Q

selection bias is generally defined as

A

study population not representative of true population

51
Q

susceptibility bias

A

treatment regimen for a patient depends on the severity of the patient’s condition

52
Q

In order for something to be considered as confounding it must…

A

be associated with both the exposure and the outcome under study (and not lie on the causal pathway between them)

53
Q

meaning of Z score

A

indicates how many standard deviations a given value is from the mean

54
Q

process of calculating Z score

A

Subtract mean from all values, then divide by the standard deviation
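
The calculation can be sketched as below. The mean, SD, and observed value are hypothetical.

```python
def z_score(value, mean, sd):
    """Number of standard deviations a value lies from the mean."""
    return (value - mean) / sd

# Hypothetical: value of 130 in a distribution with mean 100 and SD 15.
print(z_score(130, 100, 15))
```

A negative Z score means the value lies below the mean.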

55
Q

use of two-sample t test

A

compare means of two independent groups

56
Q

use of paired t test

A

compare means of two DEPENDENT groups (often two means from the same individuals, eg, baseline BMI and BMI after treatment)

57
Q

typical chi-square test question

A

shows data in a contingency table, with the outcome categorized under the column headers (eg, high or normal) and cross-tabulated against exposure

58
Q

Fisher’s exact test

A

Categorical data, just like chi-square test but for smaller sample sizes (commonly when an expected cell count falls below 5; some sources use 10 as the cutoff)

59
Q

meaning of alpha

A

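alpha = probability of committing a type I error

- the significance threshold the p value is compared against (the result is statistically significant when p < alpha)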

60
Q

meaning of beta

A

beta = probability of committing a type II error

- power of the study = 1 - beta

61
Q

type I error

A

Wrongfully concluding that there is an association between exposure and disease when in fact there is none.

62
Q

type 2 error

A

wrongfully concluding that there is no association between exposure and outcome, when in fact there is one

63
Q

validity of study pertains to..

A

flaws in study design and/or analysis (not affected by sample size)