Biostatistics Flashcards

1
Q

Continuous data and types

A

Data w/ logical order with values that continuously increase or decrease by the SAME amount

-Interval data: NO meaningful zero (zero does NOT equal none - ex. C or F degrees)
-Ratio data: meaningful zero (ex. HR, age, weight, BP)

2
Q

Discrete data and types

A

Data with categories

-Nominal: categories sorted in arbitrary order (“Yes/No Data”: gender, ethnicity, mortality)

-Ordinal: ranked in logical order, but the difference between categories is NOT equal (ex. NYHA functional classes, 0-10 pain scale)

3
Q

Methods of Describing Data:
-Measures of Central Tendency (mean, median, mode), which types of data are most preferred with each method?
-Spread (Variability) of data: range and standard deviation

A

Measures of Central Tendency:
-Mean: average value (preferred for continuous data that is normally distributed)
-Median: value in middle (preferred for ordinal data or continuous data that is skewed)
-Mode: most frequent value (preferred for nominal data)

Variability of Data:
-Range: difference between highest and lowest values
-Standard deviation (SD): indicates how spread out data is from the mean (highly dispersed = larger SD)
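
As a rough illustration of the measures above, the sketch below uses Python's standard `statistics` module. The heart-rate values are hypothetical (they are not from the flashcards) and include one outlier to show why the median is preferred for skewed data.

```python
import statistics

# Hypothetical heart-rate sample; 120 acts as an outlier that skews the data.
heart_rates = [60, 62, 62, 65, 70, 120]

mean_hr = statistics.mean(heart_rates)            # average value
median_hr = statistics.median(heart_rates)        # value in the middle
mode_hr = statistics.mode(heart_rates)            # most frequent value
data_range = max(heart_rates) - min(heart_rates)  # highest minus lowest value
sd_hr = statistics.stdev(heart_rates)             # sample standard deviation

print(f"mean={mean_hr:.1f} median={median_hr} mode={mode_hr}")
print(f"range={data_range} SD={sd_hr:.1f}")
```

The outlier pulls the mean above the median, which is the pattern to expect with skewed data.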

4
Q

Gaussian Distribution:

  1. Large samples of _________ data
  2. Normal distribution will appear symmetrical (bell shaped) when ______ is the center of the curve.
  3. Being within 1 standard deviation (SD) of the mean covers ____% of values, and being within 2 SDs covers _____% of values.
  4. When the curve becomes more narrow or more wide, what happens to the range?
  5. How can curves become skewed, and what is the best way to analyze the data?
A
  1. Continuous
  2. Mean, median, and Mode –> all of them should be equal within normal distribution
  3. 68%; 95%
  4. -More narrow: smaller range (more PRECISE)
    -More wide: higher range
  5. Can occur w/ smaller amounts of data or outliers in data –> MEDIAN is a better measure of central tendency
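
The 68% and 95% figures can be checked numerically. This is a minimal sketch, assuming a perfectly normal distribution; the function name `fraction_within` is made up for illustration.

```python
import math

def fraction_within(k_sd: float) -> float:
    """Fraction of a normal distribution lying within k_sd SDs of the mean."""
    return math.erf(k_sd / math.sqrt(2))

for k in (1, 2):
    print(f"within {k} SD: {fraction_within(k):.1%}")
```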
5
Q

Dependent versus Independent Variables

A

Independent variable: changed (manipulated) by the researcher to determine whether it has an effect on the DEPENDENT variable (outcome)

Independent variables can also be set through the inclusion criteria (which is why age, though it can’t be “changed”, is still an independent variable)

6
Q

Hypothesis and Determining Significance:
-Null hypothesis (H0)
-Alternative hypothesis (Ha)
-What is used to determine significance of study/how to analyze?

A

H0: no difference between groups
-What researcher tries to reject

Ha: there is a statistical difference between the groups
-What the researcher hopes to prove

Alpha level: error margin to determine significance of study (usually 5% or 0.05 –> can be smaller, but typically requires more cost due to more subjects and data needed)
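-P-value >/= alpha: fail to reject H0 (H0 “accepted”, Ha rejected); result is NOT statistically significant
-P-value < alpha: H0 rejected and Ha accepted; result is statistically significant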

7
Q

Confidence Intervals (CI)
  1. CI = __________________ How to interpret?
  2. Explain what this means: Absolute risk reduction 12% (95% CI 6-35%)
  3. How to interpret CI based on ranges given with difference data versus ratio data?
A
  1. CI = 1 - alpha
    -If alpha = 0.05 and p <0.05, there is 95% confidence that the conclusion is correct
  2. There is 95% likelihood that the true value of ARR is within the range of 6 to 35%.
    -A larger range = less precise data
  3. -Comparing difference (means) data: significant if CI does NOT include zero
    -Comparing ratio data (relative risk, odds ratio, hazard ratio): significant if CI does NOT include one (a ratio of 1, ex. 100/100 = 1, means there is no difference between groups)
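
The zero-versus-one rule can be sketched as a small check. The function name `ci_is_significant` and the `"difference"` / `"ratio"` labels are assumptions made for this illustration; the ratio interval in the example is hypothetical.

```python
def ci_is_significant(lower: float, upper: float, data_type: str) -> bool:
    """Return True when the CI excludes the 'no difference' value.

    data_type: "difference" (no-effect value 0) or "ratio" (no-effect value 1).
    Any other label raises KeyError.
    """
    no_effect = {"difference": 0.0, "ratio": 1.0}[data_type]
    return not (lower <= no_effect <= upper)

# ARR of 12% with 95% CI 6-35% (difference data): interval excludes 0.
print(ci_is_significant(0.06, 0.35, "difference"))  # True
# Hypothetical RR with 95% CI 0.85-1.10 (ratio data): interval includes 1.
print(ci_is_significant(0.85, 1.10, "ratio"))       # False
```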

8
Q

Errors:
  1. What are type I and II errors?
  2. How is the probability of these errors occurring determined?
  3. Which error is considered worse to make?
A

Type I Error: null hypothesis rejected in error (false positive)
-Probability of making this error is based on alpha (if alpha is 0.05, risk is 5%)
-WORSE error to make of the two

Type II Error: null hypothesis is accepted in error (false negative)
-Probability of making this error is based on beta (typically 0.1 or 0.2 aka 10% or 20%)
-Power: probability the test will reject null hypothesis correctly (determined by # of outcome values collected, difference in outcome rates between groups, and the significance - alpha level)
-Power = 1 - beta (as power increases, beta decreases)

9
Q

Risk and Relative Risk:
-Risk (R) = ___________________________
-Relative Risk (RR) = ________________
-Relative Risk Reduction (RRR) = ____________
-Interpretation of values

A

R = [# of subjects w/ unfavorable event] / [# of subjects in study arm]

RR = [R in TX group] / [R in control group]
-RR = 1 (100%): NO difference in risk between groups
-RR >1: HIGHER risk in TX group (>1.5: means increased risk from TX)
-RR<1: LOWER risk in TX group (ideally: <0.5: means reduced risk overall)
-Ex. if RR = 0.57: TX group was 57% AS LIKELY to have disease progression as placebo treated pts

RRR = [R in control group - R in TX group] / [R in control group]
-RRR = 1 - RR (how much the risk is reduced)
-Ex. if RRR = 0.43: TX group was 43% LESS LIKELY to have disease progression than placebo treated pts
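
A worked sketch of the three formulas follows. The event counts (16 of 100 in the TX arm, 28 of 100 in the control arm) are hypothetical numbers chosen only because they reproduce the RR = 0.57 and RRR = 0.43 examples on this card.

```python
def risk(events: int, subjects: int) -> float:
    """R = subjects with the unfavorable event / subjects in the study arm."""
    return events / subjects

def relative_risk(risk_tx: float, risk_control: float) -> float:
    """RR = risk in TX group / risk in control group."""
    return risk_tx / risk_control

def relative_risk_reduction(risk_tx: float, risk_control: float) -> float:
    """RRR = (risk in control - risk in TX) / risk in control."""
    return (risk_control - risk_tx) / risk_control

r_tx = risk(16, 100)       # hypothetical TX arm
r_control = risk(28, 100)  # hypothetical control arm

rr = relative_risk(r_tx, r_control)
rrr = relative_risk_reduction(r_tx, r_control)
print(f"RR={rr:.2f} RRR={rrr:.2f}")  # RRR equals 1 - RR
```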

10
Q

Calculate and Interpret the following: Absolute Risk Reduction (ARR)

A

ARR = [R in control group] - [R in TX group]
-MORE useful than RRR because it includes RISK reduction and INCIDENCE rate of outcome (ex. if risk was reduced, but risk was small to begin with, then RRR can provide skewed perspective)
-Ex. if ARR = 0.12, this means that for every 100 patients treated, 12 fewer will have disease progression compared to placebo treated pts
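
To illustrate why ARR can be more informative than RRR, the sketch below compares two hypothetical trials (the risks are made up for illustration) that share the same RRR but start from very different baseline risks.

```python
def absolute_risk_reduction(risk_control: float, risk_tx: float) -> float:
    """ARR = risk in control group - risk in TX group."""
    return risk_control - risk_tx

def relative_risk_reduction(risk_control: float, risk_tx: float) -> float:
    """RRR = (risk in control - risk in TX) / risk in control."""
    return (risk_control - risk_tx) / risk_control

# Hypothetical trial 1: common outcome (40% vs 20%).
arr_common = absolute_risk_reduction(0.40, 0.20)
rrr_common = relative_risk_reduction(0.40, 0.20)

# Hypothetical trial 2: rare outcome (2% vs 1%).
arr_rare = absolute_risk_reduction(0.02, 0.01)
rrr_rare = relative_risk_reduction(0.02, 0.01)

print(f"common outcome: RRR={rrr_common:.0%} ARR={arr_common:.0%}")
print(f"rare outcome:   RRR={rrr_rare:.0%} ARR={arr_rare:.0%}")
```

Both trials show a 50% RRR, but the ARR drops from 20% to 1% when the outcome is rare.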

11
Q

Calculate and Interpret the following:
-Number Needed to Treat (NNT)
-Number Needed to Harm (NNH)

A

NNT = 1 / [R in control group - R in TX group] = 1 / ARR
-The number of patients who need to be treated for a certain period of time in order for ONE patient to benefit
-Value is rounded UP regardless of decimal value
-Ex. NNT = 9: for every 9 pts that receive the drug, one case of the disease will be prevented

NNH: 1 / ARR (when calculating ARR for NNH, number will be NEGATIVE - take absolute value aka use positive version)
-The number of patients who need to be treated for a certain period of time in order for ONE patient to experience an adverse event
-Value is rounded DOWN regardless of decimal value
-Ex. NNH = 90: for every 90 patients that receive the drug, one patient will experience the adverse event
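
The rounding rules can be sketched as follows. The event counts are hypothetical, chosen to reproduce the NNT = 9 and NNH = 90 examples; `Fraction` is used so that rounding is not affected by floating-point error.

```python
import math
from fractions import Fraction

def number_needed_to_treat(control_events: int, control_n: int,
                           tx_events: int, tx_n: int) -> int:
    """NNT = 1 / ARR, always rounded UP."""
    arr = Fraction(control_events, control_n) - Fraction(tx_events, tx_n)
    if arr <= 0:
        raise ValueError("no risk reduction in the TX group")
    return math.ceil(1 / arr)

def number_needed_to_harm(control_events: int, control_n: int,
                          tx_events: int, tx_n: int) -> int:
    """NNH = 1 / |ARR|, always rounded DOWN (ARR is negative for harms)."""
    arr = Fraction(control_events, control_n) - Fraction(tx_events, tx_n)
    if arr >= 0:
        raise ValueError("no excess adverse events in the TX group")
    return math.floor(1 / abs(arr))

# Hypothetical benefit data: 28/100 events on control vs 16/100 on TX (ARR = 0.12).
print(number_needed_to_treat(28, 100, 16, 100))   # 1 / 0.12 = 8.33, rounded up
# Hypothetical harm data: 21/1000 adverse events on control vs 32/1000 on TX.
print(number_needed_to_harm(21, 1000, 32, 1000))  # 1 / 0.011 = 90.9, rounded down
```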

12
Q

Odds and Hazard Ratio: When to use these ratios

A

OR: case-control studies (cannot use relative risk calculations; medical charts reviewed retrospectively); also can be used in cohort and cross-sectional studies

HR: in survival analysis

13
Q

Odds and Hazard Ratio: How to calculate and interpret these values

A

Odds Ratio (OR): odds that an event will occur WITH an exposure versus WITHOUT the exposure = [AD / BC]
-A: pts w/ outcome w/ exposure
-D: pts w/o outcome w/o exposure
-B: pts w/o outcome w/ exposure
-C: pts w/ outcome w/o exposure

Hazard Ratio (HR): rate at which an unfavorable event occurs within a SHORT PERIOD OF TIME (similar to RR)
-HR = [hazard rate in TX group] / [hazard rate in control group]
-ASSUMES ratio is constant over time

Interpretation:
-OR or HR = 1: event rate is the same between groups
-OR or HR >1: HIGHER event rate in TX group
-OR or HR <1: LOWER event rate in TX group

Ex. OR of 1.23 means there was a 23% increase in the odds of the outcome when given drug
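
The AD / BC calculation can be sketched from a 2x2 table. The counts below are hypothetical and only illustrate the arithmetic.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """OR = (A * D) / (B * C).

    a: outcome present, exposed      b: outcome absent, exposed
    c: outcome present, not exposed  d: outcome absent, not exposed
    """
    return (a * d) / (b * c)

# Hypothetical 2x2 table: 20 of 100 exposed pts and 10 of 100 unexposed pts
# have the outcome.
print(odds_ratio(a=20, b=80, c=10, d=90))  # (20 * 90) / (80 * 10) = 2.25
```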

14
Q

Primary versus composite endpoints

A

Primary endpoint: the main result measured to see if TX has significant benefit (distinct and separate)
-Ex. metoprolol study: HF progression

Composite endpoint: combines multiple individual endpoints into one measurement
-MUST use measurements with relatively similar magnitude or value
-Individual endpoint values will NOT add up
-Ex. metoprolol study: death from CV events + nonfatal stroke + nonfatal MI

15
Q

Selecting a test to analyze data: continuous data (parametric data: normally distributed)
-1 group
-2 groups
-3 groups or more

**Have to know?

A

1 group:
-One sample: one-sample t-test
-Single sample w/ pre and post measurements (group is “their own control”): dependent/paired t-test

2 groups: independent/unpaired Student's t-test

3 groups or more: ANOVA (F-test)

16
Q

Selecting a test to analyze data: discrete data
-1 group
-2 groups
-3 groups or more

**Have to know?

A

1 group: Chi-square test
-If one sample w/ before and after measures: Wilcoxon Signed-Rank test

2 groups: Chi-square test or Fisher’s exact test (same as Chi-square test, but just used for smaller sample sizes)
-Mann-Whitney (Wilcoxon Rank-Sum) test may be preferred for ordinal data

3 groups or more: Kruskal-Wallis test
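
The test-selection rules for continuous (normally distributed) and discrete data can be summarized as a lookup. This is only a study-aid sketch: the function name `select_test` and its arguments are made up for illustration, and it encodes nothing beyond what the two test-selection cards state.

```python
def select_test(data_type: str, groups: int, paired: bool = False) -> str:
    """Suggest a test from the data type, number of groups, and pairing.

    data_type: "continuous" (normally distributed) or "discrete".
    paired: True for one sample with before/after (pre/post) measurements.
    """
    if groups < 1:
        raise ValueError("groups must be at least 1")
    if data_type == "continuous":
        if groups == 1:
            return "paired t-test" if paired else "one-sample t-test"
        if groups == 2:
            return "unpaired Student's t-test"
        return "ANOVA (F-test)"
    if data_type == "discrete":
        if groups == 1:
            return "Wilcoxon signed-rank test" if paired else "chi-square test"
        if groups == 2:
            return "chi-square test or Fisher's exact test"
        return "Kruskal-Wallis test"
    raise ValueError("data_type must be 'continuous' or 'discrete'")

print(select_test("continuous", 3))            # ANOVA (F-test)
print(select_test("discrete", 1, paired=True)) # Wilcoxon signed-rank test
```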

17
Q

Correlation:
-Define
-Positive versus negative
-Spearman’s rank order correlation versus Pearson’s correlation coefficient

A

Correlation: used to determine if one variable is related to (changes with) another variable: DOES NOT PROVE A CAUSAL RELATIONSHIP
-Positive correlation: as one variable increases, the other increases (line slopes upward to the right)
-Negative correlation: as one variable increases, the other decreases (line slopes downward to the right)

Spearman’s rank-order correlation: used for ordinal, ranked data

Pearson’s correlation coefficient: used for continuous data
-Scatterplot used to determine r which is always between -1 and +1 (-1 = perfect negative correlation, 0 = no correlation, +1 = perfect positive correlation)
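
As a minimal sketch of how r behaves at its extremes, the function below computes Pearson's r from its standard definition. The data points are hypothetical and perfectly linear so the results land exactly on +1 and -1.

```python
import math

def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson's correlation coefficient for paired continuous data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

dose = [1, 2, 3, 4, 5]            # hypothetical independent variable
rising = [2, 4, 6, 8, 10]         # increases with dose
falling = [10, 8, 6, 4, 2]        # decreases with dose

print(pearson_r(dose, rising))    # perfect positive correlation
print(pearson_r(dose, falling))   # perfect negative correlation
```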

18
Q

What is regression?

A

Used to describe the relationship between a dependent variable and one or more independent variables

-Used when researchers in observational studies need to assess multiple independent variables or need to control many confounding factors

-Types: linear (continuous data), logistic (categorical data), Cox (categorical data in survival analysis)

19
Q

Labs and Diagnostic Testing: Sensitivity vs Specificity

A

Sensitivity - how effectively the test identifies a pt with the condition (true positives)
-Higher sensitivity is preferred
-Sensitivity = pts who test positive and have the condition / all pts with the condition

Specificity - how effectively the test identifies a pt without the condition (true negatives)
-Higher specificity is preferred
-Specificity = pts who test negative and do NOT have the condition / all pts who do NOT have the condition
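
Both ratios can be sketched from the counts of true and false results. The counts are hypothetical, and the parameter names (`true_pos`, `false_neg`, and so on) are illustrative only.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Pts who test positive and have the condition / all pts with the condition."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg: int, false_pos: int) -> float:
    """Pts who test negative without the condition / all pts without the condition."""
    return true_neg / (true_neg + false_pos)

# Hypothetical test results: 100 pts with the condition, 200 without.
print(sensitivity(true_pos=90, false_neg=10))   # 0.9
print(specificity(true_neg=160, false_pos=40))  # 0.8
```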

20
Q

Intention-to-Treat versus Per Protocol Analysis

A

Intention-to-Treat (ITT): all data is included for pts originally allocated to each treatment group even if the pt did NOT complete the trial according to study protocol (ex. non-compliance, protocol deviations, study withdrawal)
-MORE REALISTIC, but UNDERESTIMATES benefits of TX

Per Protocol Analysis: data used from subjects who completed the study according to the protocol
-Can provide optimistic estimate of treatment effects for those adherent

21
Q

Equivalence trials versus non-inferiority trials

A

Equivalence trials: designed to demonstrate that a new TX has roughly the same effect as reference TX
-Two way margin test: tests effects in two directions (higher or lower effectiveness)

Non-inferiority trials: designed to demonstrate that a new TX is NOT worse than a reference TX
-Based on delta margin (the minimal difference in effect between two groups that is considered clinically acceptable based on previous research)

22
Q

Forest Plots:
-When to use
-What does the line and the boxes/diamond in the middle of the line mean?
-What is the vertical line?

A

When to use: individual endpoints are pooled together, therefore helpful in meta-analysis where multiple studies are pooled into one

Line: confidence interval
-Difference data: if line crosses 0, NOT significant
-Ratio data: if line crosses 1, NOT significant

Box in middle of line: effect estimate (in meta-analysis: size correlates to size of effect shown)
-Diamond (at the very bottom): represents pooled results from multiple studies

Vertical line: “line of no effect”
-Difference data: vertical line is 0
-Ratio data: vertical line is 1

23
Q

Evidence-based medicine: list study designs from most reliable to least

A

Most Reliable: systematic Reviews and Meta-analyses

-Randomized controlled trials
-Cohort studies
-Case-control studies
-Case series and case reports

Least Reliable: expert opinion

24
Q

Case-control studies:
-Study Design
-Benefits
-Limitations

A

Study design: retrospective comparisons of cases (pts w/ disease) and controls (pts w/o disease)
-Outcomes of the case and controls are already known, but researcher looks back in time to see if a relationship exists
-Start with disease, look for exposure

Benefits:
-Data easy to get from medical records
-Less expensive than RCTs
-Good for looking at outcomes when intervention is unethical

Limitations: cause and effect cannot reliably be determined

25
Q

Cohort Study:
-Study design
-Benefits
-Limitations

A

Study Design: compares outcomes of a group of pts exposed and not exposed to TREATMENT
-Can be prospective or retrospective
-Start with exposure and look for disease

Benefits: good for looking at outcomes when intervention would be unethical

Limitations:
-More time consuming and expensive than a retrospective study
-Can be influenced by confounders which are other factors that can influence an outcome (ex. smoking, lipid levels)

26
Q

Case Report and Case Series:
-Study Design
-Benefits
-Limitations

A

Study design: describes an adverse reaction or unique condition that appears in a SINGLE PATIENT (case report) or a FEW PATIENTS (case series)

Benefits:
-Can identify new diseases, drug side effects, or potential uses
-Generates hypotheses that can be tested w/ other study designs

Limitations: conclusions CANNOT be drawn from a single or few cases

27
Q

Randomized Control Trial (RCT):
-Study Design
-Single vs Double Blinded
-Open label

A

Study Design: compares experimental TX to a control (placebo or existing TX) to determine which is better
-Inclusion and exclusion criteria for patients to be included
-RANDOMIZATION of pts between groups with equal chance of assignment

Single blinded: pt is unaware of what they are receiving, but investigator knows

Double blinded: pt and investigator do not know what pt is receiving

Open label: unblinded; all parties know what TX is being given

28
Q

Randomized Control Trial (RCT):
-Parallel vs Crossover
-Benefits
-Limitations

A

Parallel: subjects randomized to TX or control arm for ENTIRE study

Cross-over: randomized to one group then crossover to second group (pts serve as their own control)

Benefits: preferred study type to determine cause and effect or superiority; less potential for bias

Limitations: time-consuming, expensive, and may not reflect real-life scenarios when rigorous exclusion criteria are used

29
Q

Meta-Analysis:
-Study Design
-Benefits
-Limitations

A

Study Design: combines results from MULTIPLE STUDIES in order to develop a conclusion that has greater statistical power

Benefits: smaller studies can be pooled instead of performing a large, expensive study

Limitations: studies may not be UNIFORM (size, inclusion, exclusion criteria) and validity can be compromised if lower quality studies are weighted equally to higher quality studies

30
Q

Explain what the following studies are:
1. Cross-sectional survey
2. Factorial Design
3. Systematic Review Article

A

Cross-sectional survey: estimates relationship between variables and outcomes (prevalence) at one particular time (cross-section) in a defined population

Factorial Design: randomizes to more than the usual two groups to test a number of experimental conditions - can evaluate multiple interventions

Systematic Review Article: summary of clinical literature that focuses on a specific topic or question
-Begins w/ question followed by literature search then info summarization and sometimes includes a meta-analysis to synthesize results

31
Q

Pharmacoeconomics:
-Define
-ECHO Model: discuss Economic Outcomes, Clinical Outcomes, Humanistic Outcomes

A

Pharmacoeconomics: assessment of cost consequences of pharmaceutical products and services (NOT outcome research) with goals to optimize resources

ECHO Model:
-Economic outcomes: includes direct (ex. medical and non-medical such as travel, household costs, drug prep/administration), indirect (ex. lost work time/productivity, mortality/morbidity), and intangible costs (ex. pain, suffering, anxiety, fatigue) of the drug compared to medical intervention

-Clinical outcomes: medical events that occur as a result of the TX or intervention

-Humanistic Outcomes: consequences of the disease or TX as reported by the pt or caregiver (pt QOL/satisfaction)

32
Q

Average cost effectiveness ratio and incremental cost effectiveness ratio: define and calculate

A

Average cost effectiveness ratio: cost of outcome per one TX independent of other TX alternatives
-Average cost effectiveness ratio = cost of outcomes / patients treated

Incremental cost effectiveness ratio: change in costs and outcomes when comparing two TXs
-Incremental cost effectiveness ratio = (C2 - C1) / (E2 - E1)
-C = cost
-E = effects
-Ex. Incremental cost effectiveness ratio = $50, then drug B costs $50 more than drug A for each additional TX success
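
A worked sketch of the incremental formula follows. The costs and success counts for drugs A and B are hypothetical, chosen so the result matches the $50 example above.

```python
def incremental_cost_effectiveness_ratio(cost_1: float, effect_1: float,
                                         cost_2: float, effect_2: float) -> float:
    """(C2 - C1) / (E2 - E1): extra cost per extra unit of effect."""
    if effect_2 == effect_1:
        raise ValueError("effects are equal; the ratio is undefined")
    return (cost_2 - cost_1) / (effect_2 - effect_1)

# Hypothetical: drug A costs $1000 for 10 TX successes,
# drug B costs $1500 for 20 TX successes.
icer = incremental_cost_effectiveness_ratio(1000, 10, 1500, 20)
print(f"${icer:.0f} per additional TX success")
```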

33
Q

Pharmacoeconomic Methodologies: define (what is the cost measurement unit and the outcome unit?)
-Cost-minimization analysis (CMA)
-Cost-benefit analysis (CBA)

A

Cost-minimization analysis (CMA): compares cost of intervention w/ demonstrated equivalence (between 2 equal outcomes, what is cheaper?)
-Cost measurement unit: dollars
-Outcome unit: equivalent

Cost-benefit analysis (CBA): compares benefits and costs in monetary units
-Cost measurement unit: dollars
-Outcome unit: dollars (difficult to assign)

34
Q

Pharmacoeconomic Methodologies: define (what is the cost measurement unit and the outcome unit?)
-Cost-effectiveness analysis (CEA)
-Cost-utility analysis (CUA)

A

Cost-effectiveness analysis (CEA): MOST COMMON
-Cost measurement unit: dollars
-Outcome unit: clinical units (easy to quantify)

Cost-utility analysis (CUA): outcomes based on QOL assessments
-Cost measurement unit: dollars
-Outcome unit: QALY (Quality Adjusted Life Year)

35
Q

What is HRQOL?

A

Health-related quality of life: commonly included under the broad umbrella of assessments known as patient-reported outcomes (PROs)