Statistics Flashcards

1
Q

Confidence Intervals

A

A range of values, calculated from sample data, constructed so that there is a SPECIFIED PROBABILITY (the confidence level, e.g. 95%) that it contains the true value of a parameter.
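
A minimal sketch of how such a range might be computed for a sample mean. It assumes a normal approximation (for small samples a t-based interval is usually preferred), and the function name and the pain scores are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, level=0.95):
    """Normal-approximation confidence interval for a sample mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% level
    centre = mean(sample)
    margin = z * stdev(sample) / sqrt(len(sample))  # z times the standard error
    return centre - margin, centre + margin

# Hypothetical pain scores from 8 patients
low, high = mean_confidence_interval([4, 5, 6, 5, 7, 3, 5, 5])
print(f"95% CI for the mean: {low:.2f} to {high:.2f}")
```

A higher confidence level (e.g. 99%) gives a wider interval for the same data.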

2
Q

Effect Size

A
  • Magnitude of an intervention reflected by an index value.
  • Can be calculated from data in a clinical trial.
  • It is mostly INDEPENDENT of sample size.
  • Most interventions have small to moderate effect sizes.
3
Q

Effectiveness

A

How well an intervention performs under “real-world” circumstances.

4
Q

Efficacy

A

How well an intervention performs under IDEAL and CONTROLLED circumstances.

5
Q

Fidelity

A

(1) Extent to which delivery of an intervention ADHERES to the protocol or program model originally developed and…
(2) How CLOSELY the intervention REFLECTS the appropriateness of the care that should be provided.

6
Q

Minimal Clinically Important Difference (MCID)

A

Smallest difference in score in the domain of interest which patients perceive as BENEFICIAL and which would mandate (barring troublesome side effects and excessive cost) a CHANGE in the patient's management.

7
Q

P value

A

The probability of obtaining a result EQUAL to or MORE EXTREME than what was actually observed, assuming there is truly no difference between groups (the null hypothesis). The cutoff for significance is usually set at p < 0.05 (5%).
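
One way to make the "equal to or more extreme" idea concrete is an exact permutation test. This is only a sketch: the function name is hypothetical, and it assumes a two-sided comparison of group means on a small sample, where every possible regrouping can be enumerated.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    total = 0
    # Under "no difference", every split of the pooled data is equally likely.
    for idx in combinations(range(len(pooled)), len(group_a)):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1  # this split is at least as extreme as the real one
    return extreme / total

print(permutation_p_value([1, 2, 3], [4, 5, 6]))
```

With only three values per group, even the most extreme split cannot reach p < 0.05, which illustrates why very small samples have little power.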

8
Q

Personalized Medicine vs Precision Medicine

A

PERSONALIZED = study of tailoring of medical treatment to the individual CHARACTERISTICS of each patient

PRECISION = uses information about a person’s genes, proteins, & environment to prevent, diagnose, and treat disease

9
Q

Reliability

A

Degree to which the result of a measurement, calculation, or specification can be depended on to be PRECISE.

10
Q

Statistical Significance

A

Claim that a result from data generated by testing or experimentation is NOT likely to occur RANDOMLY or by CHANCE, but is instead likely to be attributable to a specific cause.

11
Q

Validity

A

Extent to which the instrument measures what it was designed to measure.
(multiple types of validity, each representing a different construct)

12
Q

Types of data (4)

A

Nominal, Ordinal, Interval, Ratio

13
Q

Nominal Data

A

Named categories with NO inherent order (two or more categories), e.g. yes/no; boy/girl; blood type

14
Q

Ordinal Data

A

Categories have a meaningful ORDER (rank), but the distances between them are not necessarily equal.

E.g. strongly agree, agree, disagree, and strongly disagree

15
Q

Interval Data

A

Has order AND equal distances between values, but no true zero.

E.g. temperature in °C or °F, calendar year

16
Q

Ratio Data

A

Has order, equal distances between values, AND a true zero (so ratios are meaningful).

E.g. weight, age, temperature in kelvin

17
Q

Parametric vs Non-parametric Tests

A

Parametric tests: test group MEANS

  • Used when data are normally distributed
  • Data from multiple groups have the same variance
  • Data have a linear relationship

Non-parametric tests: test group MEDIANS

  • “distribution-free tests” - they don’t assume that data follow a specific distribution
  • Can be used with smaller sample sizes, & when you want to be more conservative with your analyses.
18
Q

Means are used with [parametric / non-parametric ] tests.

Medians are used with [parametric / non-parametric ] tests.

A

Means are used with PARAMETRIC tests.

Medians are used with NON-PARAMETRIC tests.

19
Q

Parametric tests are used when data have…
- normal or non-normal distribution?
- groups have same or different variance?
- data are linearly or non-linearly related?
…so, you’ll be comparing [means / medians]

A

Parametric tests are used when data have…

  • NORMAL distribution (though can also be used when assuming a particular [though non-normal] distribution). Typically requires a LARGE sample size to get a normal distribution.
  • groups have SAME VARIANCE
  • data are LINEARLY related

…so, you’ll be comparing MEANS

(otherwise, use non-parametric tests!)

20
Q

You need a statistical test for 2 samples.

What are your options, depending on if your data are parametric vs non-parametric?

A

2-samples
Parametric = t-test
Non-parametric = Mann-Whitney U test
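
To see what each test actually computes, here is a rough sketch of the two test statistics using only the standard library. The function names are hypothetical, the t statistic assumes equal variances (pooled), and turning either statistic into a p-value would normally be left to a statistics package.

```python
from math import sqrt
from statistics import mean, variance

def student_t(a, b):
    """Two-sample t statistic with pooled variance: compares group MEANS."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))

def mann_whitney_u(a, b):
    """U statistic for sample a: counts pairs where a beats b (ties count 0.5)."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)

# Hypothetical scores for two independent groups
control, treated = [1, 2, 3], [4, 5, 6]
print("t =", round(student_t(control, treated), 3))
print("U =", mann_whitney_u(control, treated))
```

The t statistic depends on the actual values, while U depends only on their relative order, which is why the Mann-Whitney test does not need a normal distribution.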

21
Q

When is it appropriate to use a Mann-Whitney U test?

What if you have >2 samples?

A

2 samples, NON-parametric data

> 2 samples = Kruskal-Wallis test (“ANOVA by ranks” for use with non-parametric data)

22
Q

When is it appropriate to use a t-test?

What if you have >2 samples?

A

2 samples, parametric data

> 2 samples = ANOVA (aka F test, aka “ANOVA sum of squares” for use with parametric data)

23
Q

You need a statistical test for 2 samples that are paired.

What are your options, depending on if your data are parametric vs non-parametric?

A

Paired 2-samples
Parametric = Paired t-test
Non-parametric = Wilcoxon signed-rank test
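
Both paired tests work on the within-pair differences. The sketch below computes the two statistics; the function names and the before/after scores are hypothetical, and zero differences are simply dropped in the signed-rank version (one common convention).

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Paired t statistic: mean difference divided by its standard error."""
    diffs = [a - b for a, b in zip(after, before)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

def wilcoxon_signed_rank(before, after):
    """Sum of the ranks of the positive differences (ties share average ranks)."""
    diffs = [a - b for a, b in zip(after, before) if a != b]
    ordered = sorted(diffs, key=abs)
    w_plus = 0.0
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        avg_rank = (i + 1 + j) / 2  # average of ranks i+1 .. j
        w_plus += avg_rank * sum(1 for d in ordered[i:j] if d > 0)
        i = j
    return w_plus

# Hypothetical scores for 4 patients measured before and after treatment
before, after = [10, 12, 9, 11], [12, 15, 10, 10]
print("paired t =", round(paired_t(before, after), 3))
print("W+ =", wilcoxon_signed_rank(before, after))
```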

24
Q

When is it appropriate to use a paired t-test?

A

Paired 2 samples, parametric data

25
Q

When is it appropriate to use a Wilcoxon test?

A

Paired 2 samples, non-parametric data

26
Q

You need a statistical test to analyze distribution.

What are your options, depending on if your data are parametric vs non-parametric?

A

Distribution
Parametric = Chi squared (for large samples, at least n>20; Fisher exact test for smaller samples)
Non-parametric = Kolmogorov-Smirnov (can also use the Fisher exact test for non-parametric data or small parametric samples; with small samples it is hard to assume the data are parametric, since they are often not perfectly “normal”)
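
As a rough illustration of what the Chi squared test measures, the sketch below computes the statistic for a table of observed counts by comparing them with the counts expected if rows and columns were unrelated. The function name and the 2 x 2 counts are hypothetical.

```python
def chi_squared_statistic(table):
    """Pearson chi-squared statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical counts: rows = treatment/control, columns = improved/not improved
print(round(chi_squared_statistic([[10, 20], [20, 10]]), 3))
```

When expected counts are small, the approximation behind this statistic becomes unreliable, which is the usual reason for switching to the Fisher exact test.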

27
Q

When is it appropriate to use a Chi squared test?

What about a Fisher exact test?

What about a Kolmogorov-Smirnov test?

A

All used with distribution

Chi squared best with PARAMETRIC data, bigger samples (n>20)

Fisher exact test can be used with smaller-sample (n<20) parametric data, or with non-parametric data

Kolmogorov-Smirnov can be used to look at distribution in non-parametric data

28
Q

You need a statistical test that can deal with >2 samples, +/- several dependent variables.

What are your options, depending on if your data are parametric vs non-parametric?

A

> 2 samples

Parametric = ANOVA; with several dependent variables, use a MANOVA

Non-Parametric = Kruskal Wallis
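
A minimal sketch of the two statistics for more than 2 samples: the ANOVA F ratio (built from sums of squares) and the Kruskal-Wallis H (built from ranks). Function names and scores are hypothetical, and the H statistic here applies no correction for ties.

```python
from statistics import mean

def anova_f(*groups):
    """One-way ANOVA F: between-group variance over within-group variance."""
    values = [x for g in groups for x in g]
    grand = mean(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic computed on pooled ranks (no tie correction)."""
    pooled = sorted(x for g in groups for x in g)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # tied values share an average rank
        i = j
    n = len(pooled)
    total = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)

# Hypothetical scores for three treatment groups
groups = ([1, 2, 3], [2, 3, 4], [5, 6, 7])
print("F =", round(anova_f(*groups), 2))
print("H =", round(kruskal_wallis_h(*groups), 2))
```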

29
Q

When is it appropriate to use a Kruskal Wallis test?

A

> 2 samples, non-parametric data

30
Q

What is a test of association?

A

Association = is there a RELATIONSHIP between 2 or more variables

The type of test, of course, varies based on if data are parametric vs non-parametric

31
Q

I want to look at the association between two groups. Yay!
What test do I use for parametric data?
For non-parametric data?

(Bonus: what if you have >2 groups??)

A

I want to look at the association between two groups. Yay!
Parametric data = Pearson product-moment correlation
Non-parametric data = Kendall tau correlation

> 2 groups? Use the correlation matrix version of each of the above tests
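
For a feel of how the two coefficients differ, here is a small sketch computing both on the same data. The function names and the data are hypothetical, and the Kendall version is the simple tau-a form, which assumes no tied values.

```python
from itertools import combinations
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation: strength of the LINEAR relationship."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def kendall_tau(x, y):
    """Kendall tau-a: (concordant pairs - discordant pairs) / all pairs."""
    pairs = list(combinations(range(len(x)), 2))
    score = 0
    for i, j in pairs:
        product = (x[i] - x[j]) * (y[i] - y[j])
        score += (product > 0) - (product < 0)  # +1 concordant, -1 discordant
    return score / len(pairs)

# Hypothetical paired measurements
x, y = [1, 2, 3, 4, 5], [1, 3, 2, 4, 5]
print("r =", round(pearson_r(x, y), 2), "tau =", round(kendall_tau(x, y), 2))
```

Pearson uses the actual values, while Kendall only asks whether each pair of observations is ranked in the same order on both variables.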

32
Q

I want to look at association between groups, with one dependent variable (and multiple independent variables).

What test do I use for parametric data?
For non-parametric data?

A

I want to look at the association between groups, with one dependent variable (and multiple independent variables).
Parametric data = linear regression
Non-parametric data = logistic regression (used when the dependent variable is categorical, e.g. yes/no)
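
A simplified sketch of the core idea behind each model. It assumes a single independent variable for brevity (real analyses with multiple independent variables would use a statistics package), and the function names and data are hypothetical. Linear regression fits a straight line to a continuous outcome; logistic regression passes a linear predictor through the logistic function to get a probability.

```python
from math import exp
from statistics import mean

def fit_line(x, y):
    """Least-squares slope and intercept for one predictor."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx

def logistic(log_odds):
    """Map a linear predictor (log-odds) to a probability between 0 and 1."""
    return 1 / (1 + exp(-log_odds))

# Hypothetical data lying exactly on the line y = 2x + 1
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print("slope =", slope, "intercept =", intercept)
print("probability at log-odds 0:", logistic(0))
```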

33
Q

When would it be appropriate to use a LINEAR regression?

  • Number of dependent vs independent variables?
  • Parametric vs non-parametric data
A

When would it be appropriate to use a linear regression?

  • ONE dependent variable, MULTIPLE independent variables
  • PARAMETRIC data
34
Q

When would it be appropriate to use a LOGISTIC regression?

  • Number of dependent vs independent variables?
  • Parametric vs non-parametric data
A

When would it be appropriate to use a LOGISTIC regression?

  • ONE dependent variable, MULTIPLE independent variables
  • NON-PARAMETRIC data
35
Q

I want to look at association between groups, with multiple dependent variables.

What test do I use for parametric data?
For non-parametric data?

A

I want to look at the association between groups, with multiple dependent variables.
Parametric data = Ologit (ordered logit) regression (categories are ORDERED)
Non-parametric data = Discriminant Analysis (categories are NOT ORDERED) or use Multinomial regression

36
Q

When would it be appropriate to use an OLOGIT regression?

  • Number of dependent vs independent variables?
  • Parametric vs non-parametric data
  • Categories that are ordered vs non-ordered?
A

When would it be appropriate to use an OLOGIT regression?

  • MULTIPLE dependent variables
  • PARAMETRIC data
  • Categories that are ORDERED
37
Q

When would it be appropriate to use a DISCRIMINANT ANALYSIS or MULTINOMIAL regression?

  • Number of dependent vs independent variables?
  • Parametric vs non-parametric data
  • Categories that are ordered vs non-ordered?
A

When would it be appropriate to use a DISCRIMINANT ANALYSIS or MULTINOMIAL regression?

  • MULTIPLE dependent variables
  • NON-PARAMETRIC data
  • Categories that are NOT ORDERED
38
Q

Sensitivity

A

Proportion of TRUE POSITIVES

(proportion, as a percentage, of patients who DO have the disease of interest who register a POSITIVE test finding)

SnNOut (with high Sensitivity, a NEGATIVE test helps you more confidently rule OUT the dx)

39
Q

Specificity

A

Proportion of TRUE NEGATIVES

(proportion, as a percentage, of patients who do NOT have the disease of interest who register a NEGATIVE test finding)

SpPIN (with high Specificity, a POSITIVE test helps you more confidently rule IN the dx)
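
Sensitivity and specificity can both be read off a 2 x 2 table of test results against true disease status. A minimal sketch, with hypothetical function names and counts:

```python
def sensitivity(true_pos, false_neg):
    """Of the patients who HAVE the disease, the fraction who test positive."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Of the patients who do NOT have the disease, the fraction who test negative."""
    return true_neg / (true_neg + false_pos)

# Hypothetical study: 100 diseased patients (90 test positive),
# 200 healthy patients (160 test negative)
print("sensitivity =", sensitivity(true_pos=90, false_neg=10))
print("specificity =", specificity(true_neg=160, false_pos=40))
```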

40
Q

Positive Predictive Value

A

Probability that subjects with a POSITIVE TEST truly DO have the disease

41
Q

Negative Predictive Value

A

Probability that subjects with a NEGATIVE TEST truly DO NOT have the disease
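
Unlike sensitivity and specificity, the predictive values depend on how common the disease is. The sketch below derives both from sensitivity, specificity, and prevalence (all given as proportions between 0 and 1); the function name and the numbers are hypothetical.

```python
def predictive_values(sens, spec, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sens * prevalence
    false_neg = (1 - sens) * prevalence
    true_neg = spec * (1 - prevalence)
    false_pos = (1 - spec) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)  # positives that are truly diseased
    npv = true_neg / (true_neg + false_neg)  # negatives that are truly healthy
    return ppv, npv

# Hypothetical test: 90% sensitive, 90% specific, disease prevalence 10%
ppv, npv = predictive_values(0.9, 0.9, 0.1)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

Even a fairly accurate test can have a modest PPV when the condition is uncommon, because false positives from the large healthy group add up.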

42
Q

Positive Likelihood Ratio

A

LR+ Commonly used to rule in a condition

(probability of a true positive) / (probability of a false positive)

Positive LR = sensitivity / (100 – specificity), with both expressed as percentages.

AKA:
(Probability of a patient WITH the disease having a POSITIVE test) divided by the (probability of a patient WITHOUT the disease having a positive test).
43
Q

Negative Likelihood Ratio

A

LR- Commonly used to rule out a condition

(probability of a false negative) / (probability of a true negative)

Negative LR = (100 – sensitivity) / specificity, with both expressed as percentages.

AKA:
(Probability of a person who HAS the disease testing negative) divided by the (probability of a person who does NOT have the disease testing negative)
44
Q

Effect size

A

Measures the strength of a treatment effect (magnitude of the intervention), largely INDEPENDENT of sample size. Very useful when evaluating data from under- or over-powered studies.

In randomized trials (comparative studies), effect sizes are often reported as “trivial, small, moderate, or large”
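
The card does not name a specific index, so the sketch below assumes Cohen's d (difference in means divided by the pooled standard deviation), one widely used effect size. The cutoffs of 0.2, 0.5, and 0.8 are Cohen's conventional benchmarks and are also an assumption here, as are the function names and scores.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(treatment, control):
    """Cohen's d: difference in means in units of the pooled standard deviation."""
    n1, n2 = len(treatment), len(control)
    pooled_var = ((n1 - 1) * variance(treatment) + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)

def describe_effect(d):
    """Label an effect size using Cohen's conventional cutoffs (an assumption)."""
    size = abs(d)
    if size < 0.2:
        return "trivial"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "moderate"
    return "large"

# Hypothetical outcome scores
d = cohens_d([4, 5, 6], [1, 2, 3])
print(round(d, 2), describe_effect(d))
```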

45
Q

Odds ratio

A

Odds that an outcome will occur given a particular exposure, compared with the odds of the outcome occurring in the ABSENCE of that exposure
OR >1 = outcome is more likely with the exposure
OR <1 = outcome is less likely with the exposure
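
A minimal sketch of the calculation from a 2 x 2 table of exposure against outcome. The function name, parameter names, and counts are hypothetical.

```python
def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds of the outcome with the exposure divided by the odds without it."""
    odds_exposed = exposed_cases / exposed_noncases
    odds_unexposed = unexposed_cases / unexposed_noncases
    return odds_exposed / odds_unexposed

# Hypothetical cohort: 20 of 100 exposed and 10 of 100 unexposed had the outcome
print(round(odds_ratio(20, 80, 10, 90), 2))
```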

46
Q

What is the relationship between sensitivity, specificity & likelihood ratios?

A

Positive LR = sensitivity / (100 – specificity).

Negative LR = (100 – sensitivity) / specificity.

Recall:

  • Sensitivity = TRUE POSITIVE rate
  • Specificity = TRUE NEGATIVE rate
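
The two formulas above can be sketched directly in code, assuming sensitivity and specificity are entered as percentages (0 to 100). The function name and example values are hypothetical.

```python
def likelihood_ratios(sensitivity_pct, specificity_pct):
    """Return (LR+, LR-) from sensitivity and specificity given as percentages."""
    lr_positive = sensitivity_pct / (100 - specificity_pct)
    lr_negative = (100 - sensitivity_pct) / specificity_pct
    return lr_positive, lr_negative

# Hypothetical test: 90% sensitive, 80% specific
lr_pos, lr_neg = likelihood_ratios(90, 80)
print("LR+ =", lr_pos, "LR- =", lr_neg)
```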
47
Q

How do you interpret positive likelihood ratios?

0-1 ?
1 ?
1 - infinity?

A

Tell you likelihood of a disease/condition/result.

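0-1: Decreased likelihood of disease (approximate change in probability).
+LR 1/2 (0.5) = about 15 percentage points less likely
+LR 1/5 (0.2) = about 30 percentage points less likely
+LR 1/10 (0.1) = about 45 percentage points less likely

1: Null, no diagnostic value.

> 1: increased evidence for disease. High +LR helps you rule IN a disease.
+LR 2 = about 15 percentage points more likely
+LR 5 = about 30 percentage points more likely
+LR 10 = about 45 percentage points more likely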

An LR over 10 is very strong evidence to rule in a disease.
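
The exact shift depends on the starting (pre-test) probability. A minimal sketch of the standard conversion, going from probability to odds, multiplying by the LR, and converting back; the function name and example probabilities are hypothetical.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a pre-test probability with a likelihood ratio via odds."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)

# Hypothetical patient with a 30% pre-test probability of disease
for lr in (0.1, 1, 2, 10):
    print(f"LR {lr}: {post_test_probability(0.30, lr):.0%}")
```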

48
Q

Journal Impact Factor

A

“How often has this journal been cited during the most recent X (often 2 or 5) years?”

Reflects the number of citations made in the current year to articles published in the previous two (or five) years, divided by the total number of citable articles published in that same period
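
As a quick arithmetic sketch of that ratio (the function name and the citation counts are made up for illustration):

```python
def impact_factor(citations_this_year, citable_articles_in_window):
    """Citations received this year to articles from the window, per citable article."""
    return citations_this_year / citable_articles_in_window

# Hypothetical journal: 500 citations this year to 200 articles from the prior two years
print(impact_factor(500, 200))
```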

49
Q

Describe the 4 clinical phases in research trials

A

Phase I: assess the SAFETY of an intervention.
Phase II: test EFFICACY of the intervention in a tightly controlled environment.
Phase III: EFFECTIVENESS (randomized and blinded testing in a REAL WORLD environment)
Phase IV: tests the impact of the intervention for costs, overall long-term care, etc.