BioStat Flashcards

1
Q

STEPS TO JOURNAL PUBLICATION

A
  1. Begin with research question: Write a null hypothesis
  2. Design the study: Is it randomized, placebo-controlled, a case-control or other?
  3. Enroll the subjects
  4. Collect data: prospective (going into the future) or retrospective (back in time)
  5. Analyze data
  6. Publish!
2
Q

What are the two main types of study data?

A
  • Continuous Data
  • Discrete (Categorical) Data
3
Q

Continuous Data: what is it? What are the two types?

A
4
Q

Discrete (Categorical) Data: what is it? What are the two types?

A
5
Q

Measures of Central Tendency

Mean, Median, Mode

A

Mean: the average value; it is calculated by adding up the values and dividing the sum by the number of values. The mean is preferred for continuous data that is normally distributed.
Median: the value in the middle when the values are arranged from lowest to highest. When there are two center values (as with an even number of values), take the average of the two center values. The median is preferred for ordinal data or continuous data that is skewed (not normally distributed).
Mode: the value that occurs most frequently. The mode is preferred for nominal data.
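A minimal sketch of the three measures, using Python's standard `statistics` module. The values are hypothetical and were chosen only so that the count is even and one value repeats.

```python
from statistics import mean, median, mode

# Hypothetical values, used only to illustrate the three measures.
values = [2, 3, 3, 5, 7, 10]

average = mean(values)    # sum of the values divided by the number of values
middle = median(values)   # even count, so the two center values (3 and 5) are averaged
most_common = mode(values)  # the value that occurs most frequently

print(average, middle, most_common)
```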

6
Q

Gaussian (normal) Distribution: characteristics

A

When the distribution of data is normal, the curve is symmetrical (even on both sides), with most of the values closer to the middle. Half of the values are on the left side of the curve, and half of the values are on the right side. When data is normally distributed:
■ The mean, median and mode are the same value, and are at the center point of the curve.
■ 68% of the values fall within 1 SD of the mean and 95% of the values fall within 2 SDs of the mean.
The examples show how the curve of normally distributed data changes based on the spread (or range) of the data. The curve gets taller and skinnier as the range of data narrows. The curve gets shorter and wider as the range of data widens (or is more spread out).
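The 68% and 95% figures can be checked against a standard normal curve. This is a sketch using `statistics.NormalDist` from the Python standard library; the exact proportions are slightly different from the rounded values on the card.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

# Proportion of values within 1 SD and within 2 SDs of the mean.
within_1_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)
within_2_sd = standard_normal.cdf(2) - standard_normal.cdf(-2)

print(f"within 1 SD: {within_1_sd:.4f}")
print(f"within 2 SDs: {within_2_sd:.4f}")
```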

7
Q

SKEWED DISTRIBUTIONS: Data that are skewed do not have the characteristics of a normal distribution; the curve is not symmetrical. Outliers (Extreme Values) and Skew Refers to the Direction of the Tail

A

An outlier is an extreme value, either very low or very high, compared to the norm. For example, if a study reports the mean weight of included adult patients as 90 kg, then a patient in the same study with a weight of 40 kg or 186 kg is an outlier. When there are a small number of values, an outlier has a large impact on the mean and the data becomes skewed. In this case, the median is a better measure of central tendency.
Skew:
Data is skewed towards outliers. When there are more low values in a data set and the outliers are the high values, data is skewed to the right (positive skew). When there are more high values in the data set and the outliers are the low values, the data is skewed to the left (negative skew).
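A small sketch of why the median holds up better than the mean when an outlier is present. The weights are hypothetical; only the 186 kg outlier is borrowed from the example above.

```python
from statistics import mean, median

# Hypothetical adult weights in kg, clustered around 90 kg.
weights_kg = [82, 85, 88, 90, 92, 95]

# Add one high outlier (186 kg), which pulls the tail to the right.
with_outlier = weights_kg + [186]

print(mean(weights_kg), median(weights_kg))      # mean and median are close
print(mean(with_outlier), median(with_outlier))  # mean jumps, median barely moves

# Mean greater than median is the usual sign of a right (positive) skew.
right_skewed = mean(with_outlier) > median(with_outlier)
```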

8
Q

DEPENDENT AND INDEPENDENT VARIABLES

A
9
Q

THE NULL HYPOTHESIS (H0) AND ALTERNATIVE HYPOTHESIS (HA)

A

The null hypothesis states that there is no statistically significant difference between groups. In a study comparing a drug to a placebo, the null hypothesis would assert that there is no difference in efficacy between them (drug efficacy = placebo efficacy). The researcher aims to disprove or reject this hypothesis.

The alternative hypothesis, on the other hand, posits that there is a statistically significant difference between the groups (drug efficacy ≠ placebo efficacy). This is what the researcher hopes to prove or accept.

10
Q

ALPHA LEVEL: THE STANDARD FOR SIGNIFICANCE

A

When investigators design a study, they select a maximum permissible error margin, called alpha (α). Alpha is the threshold for rejecting the null hypothesis. In medical research, alpha is commonly set at 5% (or 0.05).
The p-value is compared to alpha. If alpha is set at 0.05 and the p-value is less than alpha (p < 0.05), the null hypothesis is rejected, and the result is termed statistically significant.
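The decision rule can be written as a one-line comparison. This is only a sketch; the function name is made up for illustration, and the p-values in the usage lines are hypothetical.

```python
ALPHA = 0.05  # common threshold in medical research

def is_statistically_significant(p_value, alpha=ALPHA):
    """Return True when the null hypothesis would be rejected (p < alpha)."""
    return p_value < alpha

print(is_statistically_significant(0.03))  # p below alpha: reject the null hypothesis
print(is_statistically_significant(0.20))  # p above alpha: fail to reject
```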

11
Q

Interpreting CI

A

The values in the CI range are used to determine whether significance has been reached.
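One way to sketch the check: a result is statistically significant when the CI does not include the "no effect" value, which is 0 for difference data and 1 for ratio data (RR, OR, HR). The function name and the last interval are hypothetical; the first two intervals are the ones quoted elsewhere in this deck.

```python
def ci_is_significant(lower, upper, data_type):
    """Significant when the CI excludes the no-effect value."""
    # No-effect value: 0 for difference data, 1 for ratio data.
    no_effect = {"difference": 0.0, "ratio": 1.0}[data_type]
    return not (lower <= no_effect <= upper)

print(ci_is_significant(0.06, 0.35, "difference"))  # ARR 95% CI 6%-35%: excludes 0
print(ci_is_significant(0.24, 0.38, "ratio"))       # RR 95% CI 0.24-0.38: excludes 1
print(ci_is_significant(0.85, 1.10, "ratio"))       # hypothetical CI that crosses 1
```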

12
Q

Interpreting CI

Comparing ratio data (relative risk, odds ratio, hazard ratio)

A
13
Q

CI and estimation…narrow vs wide CI/ meaning

A

A narrow CI range implies high precision, while a wide CI range implies poor precision. For example, a study comparing metoprolol to placebo finds a 12% absolute risk reduction (ARR) in heart failure progression, with a 95% CI range of 6-35%. This can be written as ARR 12% (95% CI 6%-35%) or as ARR 0.12 (95% CI 0.06, 0.35). The CI indicates 95% confidence that the true ARR for the population lies between 6% and 35%. A wider range, such as 4%-68%, indicates less precision, making it unclear where within that range the true value lies.

14
Q

Type I Errors: False-Positive

A

A Type I error occurs when the null hypothesis is rejected in error and the alternative hypothesis is accepted. The probability of making a Type I error is determined by alpha, which is related to the confidence interval. When alpha is 0.05 and a study result reports p < 0.05, it is statistically significant, and the probability of a Type I error is less than 5%. This means you are 95% confident (0.95 = 1 - 0.05) that the result is correct and not due to chance.
CI = 1 - α (Type I error)

15
Q

Type II Errors: False-Negative

A

A Type II error occurs when the null hypothesis is accepted when it should have been rejected. The probability of a Type II error is denoted as beta (β). Beta is typically set at 0.1 or 0.2, indicating a 10% or 20% risk of a Type II error. This risk increases with a small sample size. To decrease this risk, a power analysis is performed to determine the necessary sample size to detect a true difference between groups.

16
Q

False positive, false negative in H0 relationships

A
17
Q

Risk and Relative Risk (Risk Ratio) calculation/ formula

A
18
Q

A placebo-controlled study was performed to evaluate whether metoprolol reduces disease progression in patients with heart failure (HF). A total of 10,111 patients were enrolled and followed for 12 months. What is the relative risk of HF progression in the metoprolol-treated group versus the placebo group? Calculate the risk of HF progression in each group. Then calculate the RR and interpret it.

A
19
Q

RELATIVE RISK REDUCTION (RRR): Interpretation and formula

A

The RR calculation determines whether there is less risk (RR < 1) or more risk (RR > 1). The relative risk reduction (RRR) is calculated after the RR and indicates how much the risk is reduced in the treatment group, compared to the control group.

20
Q

Using the risks previously calculated for HF progression in the treatment and control groups (metoprolol: 16% and placebo: 28%), calculate the RRR of HF progression.

A
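One way to work this card, as a sketch. It assumes the standard formulas RR = risk in treatment group / risk in control group and RRR = 1 - RR (equivalently, (control risk - treatment risk) / control risk), applied to the two risks given in the question.

```python
risk_treatment = 0.16  # metoprolol group, from the question
risk_control = 0.28    # placebo group, from the question

# Relative risk: treatment risk as a fraction of control risk.
rr = risk_treatment / risk_control

# Relative risk reduction: how much of the control-group risk was removed.
rrr = 1 - rr

print(f"RR  = {rr:.2f}")   # RR below 1 means less risk with treatment
print(f"RRR = {rrr:.0%}")
```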
21
Q

ABSOLUTE RISK REDUCTION: Interpretation and formula

A

Absolute risk reduction is more useful because it includes the reduction in risk and the incidence rate of the outcome.

22
Q

ARR Calculation
Using the risks previously calculated for HF progression in the metoprolol study, calculate the ARR of HF progression.

A
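A sketch of the subtraction, assuming the standard formula ARR = risk in control group - risk in treatment group and the risks stated earlier in this deck (placebo 28%, metoprolol 16%).

```python
risk_control = 0.28    # placebo group
risk_treatment = 0.16  # metoprolol group

# Absolute risk reduction: the plain difference between the two risks.
arr = risk_control - risk_treatment

print(f"ARR = {arr:.0%}")
```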
23
Q

NUMBER NEEDED TO TREAT (NNT): interpretation and formula

A

NNT is the number of patients who need to be treated for a certain period of time (e.g., one year) in order for one patient to benefit (e.g., avoid HF progression).

24
Q

NNT Calculation
The ARR in the metoprolol study was 12%. The duration of the study period was one year. Calculate the number of patients that need to be treated with metoprolol for one year in order to prevent one case of HF progression.

A
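A sketch of the calculation, assuming the standard formula NNT = 1 / ARR with ARR expressed as a decimal, and rounding up to the next whole patient.

```python
import math

arr = 0.12  # absolute risk reduction from the question, as a decimal

# NNT is always rounded up to the next whole number.
nnt = math.ceil(1 / arr)

print(f"NNT = {nnt} patients treated for one year to prevent one case of HF progression")
```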
25
Q

NUMBER NEEDED TO HARM (NNH): interpretation/ formula

A

NNH is the number of patients who need to be treated for a certain period of time in order for one patient to experience harm.
NNT and NNH are calculated with the same formula. The difference is in the rounding: NNT is rounded up and NNH is rounded down.

26
Q

NNH Calculation
A study evaluated the efficacy of clopidogrel versus placebo, both given in addition to aspirin, in reducing the risk of cardiovascular death, MI and stroke. The study reported a 3.9% risk of major bleeding in the treatment group and a 2.8% risk of major bleeding in the control group.

A
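A sketch of the calculation, assuming NNH = 1 / (risk in treatment group - risk in control group), with the result rounded down.

```python
import math

risk_treatment = 0.039  # major bleeding, treatment group
risk_control = 0.028    # major bleeding, control group

# Absolute risk increase: how much more often harm occurs with treatment.
ari = risk_treatment - risk_control

# NNH is rounded down to the whole number below.
nnh = math.floor(1 / ari)

print(f"NNH = {nnh}")
```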
27
Q

Odds Ratio Formula

A
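A sketch of the usual 2x2-table formula, OR = (A x D) / (B x C). The counts below are hypothetical and do not come from any study in this deck.

```python
# Hypothetical 2x2 table (counts are illustrative only):
#                 outcome present   outcome absent
# exposed               a = 40           b = 20
# not exposed           c = 60           d = 80
a, b, c, d = 40, 20, 60, 80

# Odds ratio: odds of the outcome with exposure divided by odds without exposure.
odds_ratio = (a * d) / (b * c)

print(f"OR = {odds_ratio:.2f}")  # OR above 1 means higher odds with exposure
```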
28
Q

OR Calculation
A case-control study was conducted to assess the risk of falls with fracture (outcome) associated with serotonergic antidepressant (AD) use (exposure) among a cohort of Chinese females >65 years old. Cases were matched with 33,000 controls (1:4, by age, sex and cohort entry date).

A
29
Q

HAZARD RATIO: Formula

A

A hazard rate is the rate at which an unfavorable event occurs within a short period of time.

30
Q

HR Calculation
A placebo-controlled study was performed to evaluate whether niacin, when added to intensive statin therapy, reduces cardiovascular risk in patients with established cardiovascular disease. The primary endpoint was the first event of the composite endpoint (death from coronary heart disease, nonfatal myocardial infarction, ischemic stroke, hospitalization for an acute coronary syndrome or coronary or cerebral revascularization). A total of 3,414 patients were enrolled and followed for three years. Calculate the hazard ratio.

A
31
Q

OR AND HR INTERPRETATION

A
32
Q

PRIMARY AND COMPOSITE ENDPOINTS

A

The primary endpoint is the main result measured to determine if the treatment had a significant benefit. In the metoprolol trial, the primary endpoint was heart failure progression.

A composite endpoint combines multiple individual endpoints into one measurement. This approach is attractive to researchers because it increases the likelihood of achieving a statistically significant benefit with a smaller, less costly trial.

33
Q

TYPES OF STATISTICAL TESTS

A
34
Q

CORRELATION AND REGRESSION

CORRELATION: Define

A

Correlation is a statistical technique used to determine if one variable (e.g., number of days hospitalized) is related to another (e.g., incidence of hospital-acquired infection). If the independent variable (hospital days) increases the dependent variable (infections), the correlation is positive. If it decreases the dependent variable, the correlation is negative.

Different tests are used for different data types. Spearman’s rank-order correlation (Rho) is used for ordinal data, while Pearson’s correlation coefficient (r) is used for continuous data, indicating the strength and direction of the relationship on a scale from -1 to +1. However, correlation does not imply causation.
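A minimal sketch of Pearson's r computed by hand, to show how the sign tracks the direction of the relationship. The data are hypothetical and perfectly linear, so r lands at the two ends of the scale.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from each mean.
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / sqrt(spread_x * spread_y)

# Hypothetical data: days hospitalized vs. number of infections.
hospital_days = [1, 2, 3, 4, 5]
infections_rising = [2, 4, 6, 8, 10]
infections_falling = [10, 8, 6, 4, 2]

r_positive = pearson_r(hospital_days, infections_rising)   # positive correlation
r_negative = pearson_r(hospital_days, infections_falling)  # negative correlation
print(r_positive, r_negative)
```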

35
Q

CORRELATION AND REGRESSION

REGRESSION: Define

A

Regression describes the relationship between a dependent variable and one or more independent variables, showing how much the dependent variable changes with the independent variable. It is common in observational studies to assess multiple variables and control for confounding factors. The three main types of regression are: 1) linear, for continuous data, 2) logistic, for categorical data, and 3) Cox regression, for categorical data in survival analysis.

36
Q

Sensitivity vs Specificity

A

Sensitivity (True Positive)

Sensitivity measures how effectively a test identifies patients with the condition. Higher sensitivity is better; a test with 100% sensitivity will be positive for all patients with the condition. Sensitivity is the percentage of true positive results, calculated from the number who test positive out of those who actually have the condition.

Specificity (True Negative)

Specificity measures how effectively a test identifies patients without the condition. Higher specificity is better; a test with 100% specificity will be negative for all patients without the condition. Specificity is the percentage of true negative results, calculated from the number who test negative out of those who actually do not have the condition.
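A sketch of both calculations, assuming the standard formulas sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). The counts are hypothetical.

```python
# Hypothetical test results (counts are illustrative only).
true_positive = 90    # have the condition, test positive
false_negative = 10   # have the condition, test negative
true_negative = 160   # do not have the condition, test negative
false_positive = 40   # do not have the condition, test positive

# Sensitivity: positives among those who actually have the condition.
sensitivity = true_positive / (true_positive + false_negative)

# Specificity: negatives among those who actually do not have the condition.
specificity = true_negative / (true_negative + false_positive)

print(f"sensitivity = {sensitivity:.0%}, specificity = {specificity:.0%}")
```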

37
Q

Sensitivity and Specificity Formula

A
38
Q

INTENTION TO TREAT AND PER PROTOCOL ANALYSIS

A

Intention-to-Treat Analysis: Includes all patients originally allocated to each treatment group, regardless of whether they completed the trial according to the study protocol. This method provides a conservative (real-world) estimate of the treatment effect.

Per Protocol Analysis: Includes only the subset of patients who completed the study according to the protocol, without major violations. This method can provide an optimistic estimate of the treatment effect since it includes only those who adhered to the protocol.

39
Q

NONINFERIORITY AND EQUIVALENCE TRIAL DESIGNS

A

Equivalence Trials: Demonstrate that the new treatment has roughly the same effect as the reference treatment, testing for both higher and lower effectiveness (two-way margin).

Non-Inferiority Trials: Demonstrate that the new treatment is no worse than the current standard, based on a predefined non-inferiority (delta) margin, which is the minimal clinically acceptable difference between the two groups.

40
Q

Forest Plots

A

Forest plots are graphs used in single studies, where individual endpoints are pooled into a composite endpoint, or in meta-analyses combining results from multiple studies. They provide confidence intervals (CIs) for difference or ratio data and aid in interpreting statistical significance.

When interpreting statistical significance:
- Boxes represent effect estimates; in meta-analyses, larger boxes indicate studies that carry more weight in the pooled result (typically the larger studies).
- Horizontal lines through boxes show the length of the confidence interval, with wider intervals indicating less reliable results.
- Diamonds at the bottom represent pooled results from multiple studies.
- A vertical solid line represents no effect; significant benefit is to the left and significant harm to the right. This line is set at zero for difference data and at one for ratio data.

41
Q

TYPES OF MEDICAL STUDIES: From most reliable to unreliable

A
42
Q

Briefly define term

CASE-CONTROL STUDY

A

Compare patients with a disease (cases) to those without (controls). The outcome is known, but the researcher retrospectively examines the relationship between the disease and various risk factors.
Basically: you start with the disease (or no disease) and you look back in time (retrospective only) to see the exposure.

43
Q

Briefly define term

COHORT STUDY

A

Cohort studies compare outcomes between patients exposed and not exposed to a treatment. Researchers follow both groups prospectively (in the future) or less commonly, retrospectively, to observe outcomes.

Benefits: Suitable for assessing outcomes when intervention would be unethical.

Cons: More time-consuming and expensive than retrospective studies. Prone to influence by confounders, which are other factors affecting the outcome (e.g., smoking, lipid levels).
Basically: you start with the exposure and you look retrospectively OR prospectively to observe the disease/outcome.

44
Q

Briefly define term

CASE REPORT AND CASE SERIES

A

Describes an adverse reaction or unique condition in either a single patient (case report) or a few patients (case series), where the outcome is known. Case series are more reliable than case reports.

45
Q

Briefly define term

RANDOMIZED CONTROLLED TRIAL (RCT)

A

In a randomized controlled trial (RCT), an experimental treatment is compared to a control (placebo or existing treatment) to determine superiority. Subjects meeting specific criteria (inclusion criteria) are carefully selected, while those with characteristics potentially influencing the outcome are excluded (exclusion criteria). Patients are randomized (equal chance of treatment or control) and sometimes blinded (unaware of treatment). A double-blind design means both patient and investigator are unaware of treatment assignments.

46
Q

The different subtypes of RCT

A

Randomized controlled trials compare patients who were randomly assigned to study groups.
In a parallel trial, the patients remain in the same group (i.e., treatment or control group) throughout the study.
Crossover design means that patients are initially assigned to one group (e.g., the treatment arm) but are switched to the other group (e.g., the placebo arm) during the trial.
Factorial design is a trial in which there are more than two groups randomized.

47
Q

Briefly define term

META-ANALYSIS

A

Combines results from multiple studies in order to develop a conclusion that has greater statistical power than is possible from the individual smaller studies.

48
Q

Briefly define term

SYSTEMATIC REVIEW ARTICLE

A

Summary of the clinical literature that focuses on a specific topic or question.

49
Q

How do pharmacoeconomic analyses contribute to optimal healthcare resource allocation, and what are the key methods used to evaluate the costs and outcomes of pharmaceutical interventions?

A

Pharmacoeconomics evaluates pharmaceutical interventions using techniques that measure and compare costs (direct, indirect, intangible) and outcomes (clinical, economic, humanistic). Key methods include cost-effectiveness, cost-minimization, cost-utility, and cost-benefit analyses.

Distinct from broader outcomes research, pharmacoeconomics focuses on pharmaceutical products and services. Healthcare providers and payers use these methods to assess total costs and outcomes, with study perspectives influencing the results. For example, lost productivity costs matter to patients and employers but less to health plans.

These analyses complement traditional efficacy and safety data, translating clinical benefits into economic and patient-centered terms. They guide optimal healthcare resource allocation in a standardized, evidence-based manner.

50
Q

The ECHO model

A

The ECHO model (Economic, Clinical and Humanistic Outcomes) provides a broad evaluative framework to assess the outcomes associated with diseases and treatments.
Economic outcomes: include direct, indirect and intangible costs of the drug compared to a medical intervention.
Clinical outcomes: include medical events that occur as a result of the treatment or intervention.
Humanistic Outcomes: include consequences of the disease or treatment as reported by the patient or caregiver (e.g., patient satisfaction, quality of life).

51
Q

MEDICAL COST CATEGORIES: DIRECT, INDIRECT AND INTANGIBLE

A
52
Q

Incremental Cost-Effectiveness Ratios: Define and Formula

A

Incremental cost-effectiveness ratios (ICERs) represent the change in costs and outcomes when comparing two treatment alternatives. ICERs are calculated to show the additional cost required to produce an additional unit of effect, using the formula: ICER = (C1 - C2) / (E1 - E2), where C represents costs and E represents effects.

53
Q

If spending $200 on Drug A results in 5 treatment successes while spending $300 on Drug B results in 7 treatment successes, what is the incremental cost ratio?

A
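A sketch that applies the ICER formula from the previous card, ICER = (C1 - C2) / (E1 - E2), to the numbers in the question. Drug B is taken as alternative 1 and Drug A as alternative 2.

```python
cost_a, successes_a = 200, 5  # Drug A: dollars spent, treatment successes
cost_b, successes_b = 300, 7  # Drug B: dollars spent, treatment successes

# Extra cost divided by extra effect when moving from Drug A to Drug B.
icer = (cost_b - cost_a) / (successes_b - successes_a)

print(f"ICER = ${icer:.0f} per additional treatment success")
```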
54
Q

Four Basic Pharmacoeconomic Methodologies

A
  • Cost-benefit analysis can be used to compare programs with similar or unrelated outcomes, as long as the outcome measures can be converted to dollars.
  • Cost-minimization analysis (CMA) is used when two or more interventions have demonstrated equivalent outcomes and the costs of each intervention are being compared. CMA measures and compares the input costs and assumes outcomes are equivalent.
  • Cost-effectiveness analysis compares the clinical effects of treatment (e.g., mortality, BP lowering effects, A1C lowering effects) to the costs of treatment.
  • Cost-utility analysis includes a quality of life component, generally expressed in quality-adjusted life years (QALYs).
55
Q

Interpretation

Treatment with zoledronic acid reduced the risk of morphometric vertebral fractures by 70% during a 3-year period compared with placebo (3.3% in the zoledronic-acid group vs. 10.9% in the placebo group; relative risk, 0.30; 95% CI, 0.24 to 0.38). What is the interpretation of relative risk?

A

The relative risk for morphometric vertebral fractures is 0.30. The correct interpretation of the RR is that the patients who received zoledronic acid were 30% as likely to experience vertebral fractures as those taking placebo (not 30% less likely since 30% less than 10.9% is 7.63%).
Key word is “as likely!” It’s the risk of something!
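The numbers in the question can be checked with a short sketch, assuming RR = risk in treatment group / risk in control group and RRR = 1 - RR.

```python
risk_zoledronic_acid = 0.033  # 3.3% in the zoledronic-acid group
risk_placebo = 0.109          # 10.9% in the placebo group

# Relative risk: treated patients were this fraction "as likely" to fracture.
rr = risk_zoledronic_acid / risk_placebo

# Relative risk reduction: the "reduced the risk by" figure in the question.
rrr = 1 - rr

print(f"RR  = {rr:.2f}")
print(f"RRR = {rrr:.0%}")
```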

56
Q

Interpretation

What is the interpretation of relative risk reduction (RRR)/ key word?

A
57
Q

Interpretation

What is the interpretation of absolute risk reduction?/ key word?

A
58
Q

RRR vs. ARR?

A
59
Q

Interpretation

NNT

A
60
Q

Interpretation

NNH

A
61
Q

Interpretation

Odds ratio

A
62
Q
A