Statistics Flashcards

1
Q

Confidence Intervals

A

A range of values, calculated from sample data, so defined that there is a SPECIFIED PROBABILITY
(the confidence level, e.g. 95%) that intervals built this way contain the true value of a parameter.
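
As a rough illustration of the definition above, a 95% confidence interval for a mean can be sketched with a normal approximation. The sample values and the function name `mean_confidence_interval` are made up for this example, and the normal (z) approximation is an assumption; small samples would more commonly use a t distribution.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(values, level=0.95):
    """Normal-approximation confidence interval for the mean of `values`."""
    n = len(values)
    m = mean(values)
    se = stdev(values) / sqrt(n)               # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 when level is 0.95
    return m - z * se, m + z * se


# Hypothetical measurements, for illustration only
scores = [12, 15, 11, 14, 13, 16, 12, 15]
low, high = mean_confidence_interval(scores)
print(f"95% CI: {low:.2f} to {high:.2f}")
```

Raising `level` (for example to 0.99) widens the interval, which is the trade-off between confidence and precision.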

2
Q

Effect Size

A
  • Magnitude of an intervention reflected by an index value.
  • Can be calculated from data in a clinical trial.
  • It is mostly INDEPENDENT of sample size.
  • Most interventions have small to moderate effect sizes.

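
One common index of effect size is Cohen's d, the difference between group means divided by the pooled standard deviation. The card does not name a specific index, so the choice of Cohen's d is an assumption here, and the two groups below are invented data.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(treatment, control):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n1, n2 = len(treatment), len(control)
    pooled_var = (
        (n1 - 1) * variance(treatment) + (n2 - 1) * variance(control)
    ) / (n1 + n2 - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)


# Hypothetical outcome scores, for illustration only
treatment = [24, 27, 25, 29, 30]
control = [22, 25, 23, 27, 28]
d = cohens_d(treatment, control)
print(f"Cohen's d = {d:.2f}")
```

Because d is scaled by the standard deviation rather than the standard error, it does not grow just because more participants are enrolled.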
3
Q

Effectiveness

A

How well an intervention performs under “real-world” circumstances.

4
Q

Efficacy

A

How well an intervention performs under IDEAL and CONTROLLED circumstances.

5
Q

Fidelity

A

(1) Extent to which delivery of an intervention ADHERES to the protocol or program model originally developed and…
(2) How CLOSELY the intervention REFLECTS the appropriateness of the care that should be provided.

6
Q

Minimally Clinically Important Difference (MCID)

A

Smallest difference in score in the domain of interest which patients perceive as BENEFICIAL and which would mandate (barring troublesome side
effects or excessive cost) a CHANGE in the patient's management.

7
Q

P value

A

The probability of obtaining a result EQUAL to or MORE EXTREME than what was actually observed, assuming the null hypothesis is true (no difference between groups). The significance threshold is usually set at 0.05 (5%).
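
A minimal sketch of the "equal to or more extreme" idea, using an exact two-sided binomial calculation. The scenario is hypothetical: 9 of 10 patients do better on one of two treatments, and the null hypothesis is that either treatment is equally likely to come out ahead (probability 0.5). The function name is illustrative only.

```python
from math import comb


def two_sided_binomial_p(successes, trials):
    """Exact two-sided p-value under a null of 50/50 outcomes."""
    observed = abs(successes - trials / 2)
    # Count every outcome at least as far from trials/2 as the observed one
    extreme = sum(
        comb(trials, k)
        for k in range(trials + 1)
        if abs(k - trials / 2) >= observed
    )
    return extreme / 2 ** trials


p = two_sided_binomial_p(9, 10)
print(round(p, 4))  # 0.0215
```

Here the outcomes counted as "at least as extreme" are 0, 1, 9, or 10 successes, giving 22/1024, which falls below the conventional 0.05 threshold.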

8
Q

Personalized Medicine vs Precision Medicine

A

PERSONALIZED = study of tailoring of medical treatment to the individual CHARACTERISTICS of each patient

PRECISION = uses information about a person’s genes, proteins, &
environment to prevent, diagnose, and treat disease

9
Q

Reliability

A

Degree to which the result of a measurement, calculation, or specification can be depended on to be PRECISE, i.e. CONSISTENT when the measurement is repeated.

10
Q

Statistical Significance

A

Claim that a result from data generated by testing or experimentation is NOT likely to occur RANDOMLY or by CHANCE, but is instead likely to be attributable to a
specific cause.

11
Q

Validity

A

Extent to which the instrument measures what it was designed to measure.
(multiple types of validity, each representing a different construct)

12
Q

Types of data (4)

A

Nominal, Ordinal, Interval, Ratio

13
Q

Nominal Data

A

Categories with no inherent order (2 or more), e.g. Yes/no; boy/girl

14
Q

Ordinal Data

A

Has order (rank), but the distance between categories is not necessarily equal.

E.g. strongly agree, agree, disagree, and strongly disagree

15
Q

Interval Data

A

Has order AND equal intervals between values, but no true zero.

E.g. temperature in °C or °F, calendar year

16
Q

Ratio Data

A

Has order, equal intervals, and a TRUE ZERO, so ratios are meaningful (e.g. twice as heavy).

E.g. weight, age, temperature in kelvin

17
Q

Parametric vs Non-parametric Tests

A

Parametric tests: test group MEANS

  • Used when data are normally distributed
  • Data from multiple groups have the same variance
  • Data have a linear relationship

Nonparametric tests: test group MEDIANS

  • “distribution-free tests” - they don’t assume that data follow a specific distribution
  • Can be used with smaller sample sizes, & when you want to be more conservative with your analyses.
18
Q

Means are used with [parametric / non-parametric ] tests.

Medians are used with [parametric / non-parametric ] tests.

A

Means are used with PARAMETRIC tests.

Medians are used with NON-PARAMETRIC tests.

19
Q

Parametric tests are used when data have…
- normal or non-normal distribution?
- groups have same or different variance?
- data are linearly or non-linearly related?
…so, you’ll be comparing [means / medians]

A

Parametric tests are used when data have…

  • NORMAL distribution (though can also be used when assuming a particular [though non-normal] distribution). Typically requires a LARGE sample size to get a normal distribution.
  • groups have SAME VARIANCE
  • data are LINEARLY related

…so, you’ll be comparing MEANS

(otherwise, use non-parametric tests!)

20
Q

You need a statistical test for 2 samples.

What are your options, depending on if your data are parametric vs non-parametric?

A

2-samples
Parametric = t-test
Non-parametric = Mann-Whitney U test

21
Q

When is it appropriate to use a Mann-Whitney U test?

What if you have >2 samples?

A

2 samples, NON-parametric data

> 2 samples = Kruskal-Wallis test (“ANOVA by rank”, for use with non-parametric data)

22
Q

When is it appropriate to use a t-test?

What if you have >2 samples?

A

2 samples, parametric data

> 2 samples = ANOVA (aka F test, aka “ANOVA sum of squares” for use with parametric data)

23
Q

You need a statistical test for 2 samples that are paired.

What are your options, depending on if your data are parametric vs non-parametric?

A

Paired 2-samples
Parametric = Paired t-test
Non-parametric = Wilcoxon signed-rank test

24
Q

When is it appropriate to use a paired t-test?

A

Paired 2 samples, parametric data

25
Q

When is it appropriate to use a Wilcoxon test?

A

Paired 2 samples, non-parametric data

26
Q

You need a statistical test to analyze distribution.

What are your options, depending on if your data are parametric vs non-parametric?

A

Distribution
Parametric = Chi squared (for large samples, with n > 20 at least; use the Fisher exact test for smaller samples)
Non-parametric = Kolmogorov-Smirnov

The Fisher exact test can also be used for non-parametric data or for small parametric samples. In small samples the data are often not perfectly "normal", so it is hard to assume they are parametric.

27
Q

When is it appropriate to use a Chi squared test?

What about a Fisher exact test?

What about a Kolmogorov-Smirnov test?

A

All are used with distribution.

  • Chi squared: best with PARAMETRIC data and bigger samples (n > 20)
  • Fisher exact test: can be used with smaller samples (n < 20) of parametric data, or with non-parametric data
  • Kolmogorov-Smirnov: can be used to look at distribution in non-parametric data

28
Q

You need a statistical test that can deal with >2 samples, +/- several dependent variables.

What are your options, depending on if your data are parametric vs non-parametric?

A

>2 samples
Parametric = ANOVA; with several dependent variables, use a MANOVA
Non-parametric = Kruskal-Wallis

29
Q

When is it appropriate to use a Kruskal-Wallis test?

A

>2 samples, non-parametric data

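The group-comparison cards above can be collected into a small lookup helper. This is only a sketch that restates the deck's own mapping; the function name `choose_test` and its parameters are hypothetical, and it does not run any statistics itself.

```python
def choose_test(n_samples, parametric, paired=False):
    """Return the test name the cards suggest for comparing groups.

    n_samples:  number of samples (groups) being compared
    parametric: True if the data meet the parametric assumptions
    paired:     True for paired samples (the cards only cover 2 paired samples)
    """
    if n_samples < 2:
        raise ValueError("need at least 2 samples to compare")
    if n_samples == 2:
        if paired:
            return "paired t-test" if parametric else "Wilcoxon"
        return "t-test" if parametric else "Mann-Whitney U"
    # More than 2 samples: the cards do not distinguish paired designs here
    return "ANOVA" if parametric else "Kruskal-Wallis"


print(choose_test(2, parametric=True))                # t-test
print(choose_test(2, parametric=False, paired=True))  # Wilcoxon
print(choose_test(3, parametric=False))               # Kruskal-Wallis
```
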
30
Q

What is a test of association?

A

Association = is there a RELATIONSHIP between 2 or more variables?

The type of test, of course, varies based on whether the data are parametric or non-parametric.

31
Q

I want to look at the association between two groups. Yay!

What test do I use for parametric data?

For non-parametric data?

(Bonus: what if you have >2 groups??)

A

I want to look at the association between two groups. Yay!

Parametric data = Pearson product-moment correlation
Non-parametric data = Kendall tau

>2 groups? Use the correlation matrix version of each of the above tests.

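To make the two coefficients concrete, here is a small pure-Python sketch of both. The data are invented, the function names are illustrative, and the Kendall version is the simple tau-a form, which assumes there are no tied values.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) pairs / all pairs (no ties)."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical paired observations, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson_r(x, y)
tau = kendall_tau(x, y)
print(f"Pearson r = {r:.2f}, Kendall tau = {tau:.2f}")
```

Pearson works on the raw values, while Kendall only looks at whether each pair of observations is ordered the same way in both variables, which is why it suits ranked or non-normal data.
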
32
Q

I want to look at association between groups, with one dependent variable (and multiple independent variables).

What test do I use for parametric data?

For non-parametric data?

A

I want to look at the association between groups, with one dependent variable (and multiple independent variables).

Parametric data = linear regression
Non-parametric data = logistic regression (the dependent variable is categorical, typically binary)

33
Q

When would it be appropriate to use a LINEAR regression?
- Number of dependent vs independent variables?
- Parametric vs non-parametric data

A

When would it be appropriate to use a linear regression?
- ONE dependent variable, MULTIPLE independent variables
- PARAMETRIC data

34
Q

When would it be appropriate to use a LOGISTIC regression?
- Number of dependent vs independent variables?
- Parametric vs non-parametric data

A

When would it be appropriate to use a LOGISTIC regression?
- ONE dependent variable, MULTIPLE independent variables
- NON-PARAMETRIC data (the dependent variable is categorical, typically binary)

35
Q

I want to look at association between groups, with multiple dependent variables.

What test do I use for parametric data?

For non-parametric data?

A

I want to look at the association between groups, with multiple dependent variables.

Parametric data = Ologit (ordered logit) regression (categories are ORDERED)
Non-parametric data = Discriminant analysis (categories are NOT ORDERED), or use multinomial regression

36
Q

When would it be appropriate to use an OLOGIT regression?
- Number of dependent vs independent variables?
- Parametric vs non-parametric data
- Categories that are ordered vs non-ordered?

A

When would it be appropriate to use an OLOGIT regression?
- MULTIPLE dependent variables
- PARAMETRIC data
- Categories that are ORDERED

37
Q

When would it be appropriate to use a DISCRIMINANT ANALYSIS or MULTINOMIAL regression?
- Number of dependent vs independent variables?
- Parametric vs non-parametric data
- Categories that are ordered vs non-ordered?

A

When would it be appropriate to use a DISCRIMINANT ANALYSIS or MULTINOMIAL regression?
- MULTIPLE dependent variables
- NON-PARAMETRIC data
- Categories that are NOT ORDERED

38
Q

Sensitivity

A

Proportion of TRUE POSITIVES (the percentage of patients who DO have the disease of interest who register a POSITIVE test finding).

SnNOut (therefore, with high sensitivity, a NEGATIVE test helps you more confidently rule OUT the dx)

39
Q

Specificity

A

Proportion of TRUE NEGATIVES (the percentage of patients who do NOT have the disease of interest who register a NEGATIVE test finding).

SpPIN (therefore, with high specificity, a POSITIVE test helps you more confidently rule IN the dx)

40
Q

Positive Predictive Value

A

Probability that subjects with a POSITIVE TEST truly DO have the disease

41
Q

Negative Predictive Value

A

Probability that subjects with a NEGATIVE TEST truly DO NOT have the disease

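Sensitivity, specificity, and the two predictive values all come from the same 2x2 table of test results against true disease status. The sketch below uses invented counts, and the function name `diagnostic_metrics` is illustrative only.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute the four measures from 2x2 counts.

    tp: diseased, test positive     fp: not diseased, test positive
    fn: diseased, test negative     tn: not diseased, test negative
    """
    return {
        "sensitivity": tp / (tp + fn),  # positives among those WITH disease
        "specificity": tn / (tn + fp),  # negatives among those WITHOUT disease
        "ppv": tp / (tp + fp),          # diseased among positive tests
        "npv": tn / (tn + fn),          # non-diseased among negative tests
    }


# Hypothetical counts: 100 diseased and 200 non-diseased patients
metrics = diagnostic_metrics(tp=90, fp=20, fn=10, tn=180)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are read down the disease-status columns, whereas the predictive values are read across the test-result rows, so the predictive values shift with how common the disease is in the tested group.
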
42
Q

Positive Likelihood Ratio

A

LR+

Commonly used to rule in a condition.

(probability of a true positive) / (probability of a false positive)

Positive LR = sensitivity / (100 – specificity), with both expressed as percentages.

AKA: (Probability that a patient WITH the disease has a POSITIVE test) divided by the (probability that a patient without the disease has a positive test).

43
Q

Negative Likelihood Ratio

A

LR-

Commonly used to rule out a condition.

(probability of a false negative) / (probability of a true negative)

Negative LR = (100 – sensitivity) / specificity, with both expressed as percentages.

AKA: (Probability of a person who has the disease testing negative) divided by the (probability of a person who does not have the disease testing negative).

44
Q

Effect size

A

Measures the strength of the treatment effect (magnitude of the intervention).

INDEPENDENT of sample size - very useful when evaluating data from under- or over-powered studies.

In randomized trials (comparative studies), effect sizes are often reported as "trivial, small, moderate, or large".

45
Q

Odds ratio

A

Odds that an outcome will occur given a particular exposure, compared to the odds of the outcome occurring in the ABSENCE of that exposure.

OR >1 = outcome is more likely with the exposure
OR <1 = outcome is less likely with the exposure

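As a worked illustration, the odds ratio can be computed from the four cells of an exposure-by-outcome table. The counts and the function name `odds_ratio` are hypothetical.

```python
def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds of the outcome with exposure divided by the odds without it."""
    odds_exposed = exposed_cases / exposed_noncases
    odds_unexposed = unexposed_cases / unexposed_noncases
    return odds_exposed / odds_unexposed


# Hypothetical counts: 30 of 100 exposed and 10 of 100 unexposed had the outcome
result = odds_ratio(30, 70, 10, 90)
print(f"OR = {result:.2f}")
```

With these counts the odds are 30/70 in the exposed group and 10/90 in the unexposed group, so the ratio is above 1 and the outcome is more likely with the exposure.
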
46
Q

What is the relationship between sensitivity, specificity & likelihood ratios?

A

Positive LR = sensitivity / (100 – specificity).
Negative LR = (100 – sensitivity) / specificity.

Recall:
- Sensitivity = TRUE POSITIVE rate
- Specificity = TRUE NEGATIVE rate

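The two formulas can be sketched directly in code. Note that the card writes them with percentages (100 – specificity), while this sketch assumes proportions between 0 and 1, so it uses 1 - specificity instead; the two forms are equivalent. The input values and the function name are made up for illustration.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity given as proportions."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


# Hypothetical test: 90% sensitive, 80% specific
lr_pos, lr_neg = likelihood_ratios(0.90, 0.80)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.3f}")
```
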
47
Q

How do you interpret positive likelihood ratios?
- 0-1 ?
- 1 ?
- 1 - infinity?

A

Tells you the likelihood of a disease/condition/result. The percentages below are approximate changes in the probability of disease.

0-1: Decreased likelihood of disease.
  • +LR 1/2 (0.5) = 15% less likely
  • +LR 1/5 (0.2) = 30% less likely
  • +LR 1/10 (0.1) = 45% less likely

1: Null, no diagnostic value.

> 1: Increased evidence for disease. A high +LR helps you rule IN a disease.
  • +LR 2 = 15% more likely
  • +LR 5 = 30% more likely
  • +LR 10 = 45% more likely

An LR over 10 is very strong evidence to rule in a disease.

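The 15% / 30% / 45% figures are rules of thumb. The exact shift can be sketched by converting the pre-test probability to odds, multiplying by the likelihood ratio, and converting back. The 50% pre-test probability below is an assumption chosen for illustration, and the function name is hypothetical; the exact change depends on the starting probability.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Apply a likelihood ratio to a pre-test probability via odds."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Assumed pre-test probability of 50%, for illustration only
pre = 0.5
for lr in (0.1, 0.2, 0.5, 1, 2, 5, 10):
    post = post_test_probability(pre, lr)
    print(f"LR {lr}: {pre:.0%} -> {post:.0%}")
```

Starting from 50%, an LR of 2 gives roughly 67%, an LR of 5 roughly 83%, and an LR of 10 roughly 91%, which is close to the rule-of-thumb shifts listed on the card.
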
48
Q

Journal Impact Factor

A

"How often is this journal being cited during the most recent X (often 2 or 5) years?"

Reflects the number of citations made in the current year to articles published in the previous two (or five) years, divided by the total number of citable articles published in those same two (or five) years.

49
Q

Describe the 4 clinical phases in research trials

A

Phase I: assess the SAFETY of an intervention.
Phase II: test EFFICACY of the intervention in a tightly controlled environment.
Phase III: EFFECTIVENESS (randomized and blinded testing in a REAL WORLD environment).
Phase IV: tests the impact of the intervention for costs, overall long-term care, etc.