Lecture 31-33: More Biostats Flashcards

1
Q

What are the 4ish measures of central tendency and dispersion?

A
  • MODE/MEDIAN/MEAN
  • OUTLIERS
  • MINIMUM / MAXIMUM / RANGE
  • INTERQUARTILE RANGE (IQR)
  • See Slide 20
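A minimal sketch of these summary measures using Python's standard `statistics` module. The `values` list is made-up illustration data, the quartiles use the module's default method (other software may give slightly different cut points), and the 1.5 × IQR outlier rule is a common convention assumed here, not something stated on the card.

```python
import statistics

# Hypothetical measurements (illustration only, not from the lecture)
values = [2, 3, 3, 4, 5, 6, 7, 8, 30]

# Central tendency
mean = statistics.mean(values)
median = statistics.median(values)
mode = statistics.mode(values)

# Dispersion
minimum, maximum = min(values), max(values)
value_range = maximum - minimum

# quantiles(n=4) returns the three quartile cut points Q1, Q2, Q3
q1, q2, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1

# Assumed rule of thumb: flag values more than 1.5 * IQR beyond the quartiles
outliers = [v for v in values if v < q1 - 1.5 * iqr or v > q3 + 1.5 * iqr]

print(mean, median, mode, value_range, iqr, outliers)
```

Note how the single extreme value pulls the mean well above the median, while the median and IQR barely move.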
2
Q

Define Variance and Standard Deviation

A
  • VARIANCE (from Mean)
  • The average of the squared differences between each individual measurement value and the group’s mean
    $\sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}$
  • STANDARD DEVIATION (SD)
  • square root of variance value (restores units of mean)
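The definitions above can be sketched directly in Python. The data are hypothetical, and the divisor is n as written on the card; many textbooks and software packages divide by n - 1 for a sample, which is what `statistics.variance` does.

```python
import math
import statistics

values = [4, 8, 6, 5, 7]  # hypothetical measurements

mean = sum(values) / len(values)

# Variance: average of the squared deviations from the mean (divides by n)
variance = sum((x - mean) ** 2 for x in values) / len(values)

# Standard deviation: square root of the variance, back in the units of the mean
sd = math.sqrt(variance)

# Library equivalents: pvariance divides by n, variance divides by n - 1
population_variance = statistics.pvariance(values)
sample_variance = statistics.variance(values)

print(variance, sd, population_variance, sample_variance)
```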
3
Q

Describe a normally distributed graphical representation of data

A
  • Graphical representation shows SHAPE of data
  • NORMALLY DISTRIBUTED = Symmetrical
    • When a dataset is normally-distributed the following values (PARAMETERS) are EQUAL/NEAR EQUAL: Mean / Median / Mode (Stats tests useful for normally-distributed data are called “PARAMETRIC” tests)
    • Equal dispersion of curve “tails” to both sides of mean, median, & mode
4
Q

Describe a positively skewed graphical representation

A

POSITIVELY SKEWED

  • Asymmetrical distribution with one “tail” longer than another
  • A distribution is skewed anytime the Median Differs From The Mean
  • When mean is higher than median, “positive skew”.
  • Tail pointing to the right
  • Positive skew (skew to right): mean > median
5
Q

Describe a negatively skewed graphical representation

A

NEGATIVELY SKEWED

  • Asymmetrical distribution with one “tail” longer than another
  • A distribution is skewed anytime the Median Differs From the Mean
  • When mean is LOWER than MEDIAN, “negative skew”.
  • Tail pointing to the left
  • negative skew (skew to left): mean < median
6
Q

Define skewness

A
  • A measure of the asymmetry of a distribution

  • The perfectly normal distribution is symmetric and has a SKEWNESS VALUE OF 0
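One common way to put a number on skewness is the moment coefficient (third central moment divided by the variance raised to the 3/2 power). The sketch below assumes that definition, which the card does not specify, and uses made-up data to show the sign matching the mean-versus-median rule from the skew cards.

```python
import statistics

def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

# Hypothetical datasets
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 2, 2, 3, 3, 3, 4, 20]   # long tail to the right
left_tailed = [-x for x in right_tailed]   # mirror image, tail to the left

for name, data in [("symmetric", symmetric),
                   ("right_tailed", right_tailed),
                   ("left_tailed", left_tailed)]:
    print(name, skewness(data), statistics.mean(data), statistics.median(data))
```

The symmetric set gives 0, the right-tailed set gives a positive value with mean > median, and the left-tailed set gives a negative value with mean < median.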

7
Q

Define kurtosis

A
  • A measure of the extent to which observations cluster around the mean. For a normal distribution, the value of the kurtosis statistic (reported as excess kurtosis, i.e. kurtosis minus 3) is 0
  • Positive kurtosis – more cluster
  • Negative kurtosis - less cluster
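A rough sketch of the kurtosis statistic, assuming the "excess kurtosis" convention (fourth central moment over squared variance, minus 3) so that a normal distribution scores 0. Both datasets are invented for illustration.

```python
def excess_kurtosis(values):
    """Population excess kurtosis: m4 / m2**2 - 3 (0 for a normal distribution)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return m4 / m2 ** 2 - 3

# Hypothetical datasets
peaked = [0, 0, 0, 0, 0, 0, 0, 0, -5, 5]  # most values sit on the mean
flat = [1, 2, 3, 4, 5]                    # values spread evenly, no clustering

print(excess_kurtosis(peaked))  # positive: more clustering than a normal curve
print(excess_kurtosis(flat))    # negative: less clustering than a normal curve
```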
8
Q

What are the required assumptions of interval data for proper selection of a parametric test?

A

REQUIRED ASSUMPTIONS OF INTERVAL DATA (for proper selection of a PARAMETRIC Test):

  1. NORMALLY-DISTRIBUTED
  2. Equal variances
    * Multiple tests available to assess for equal variances between groups
    - LEVENE’S TEST**
    - Kolmogorov-Smirnov
    - Bartlett’s or F-Test
  3. RANDOMLY-DERIVED & INDEPENDENT
9
Q

What is the protocol for handling data that is Not normally distributed?

A
  • HANDLING INTERVAL DATA NOT NORMALLY-DISTRIBUTED
  • Use a statistical test that Does Not Require the data to be normally-distributed (NON-PARAMETRIC TESTS), or
  • Transform data to a standardized value (Z-SCORE OR LOG)
    • hoping transformation allows data to be normally-distributed
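The two transformations can be sketched as follows, on made-up right-skewed data. The `skewness` helper is the moment coefficient and is an assumption of this example. One caveat worth seeing in the output: a z-score is a linear rescaling (mean 0, SD 1), so it leaves the shape of the distribution unchanged, whereas a log transform can actually pull in a long right tail.

```python
import math
import statistics

def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

raw = [1, 10, 100, 1000, 10000]  # hypothetical, strongly right-skewed

# Z-score: subtract the mean, divide by the standard deviation
mean, sd = statistics.fmean(raw), statistics.pstdev(raw)
z_scores = [(x - mean) / sd for x in raw]

# Log transform (base 10); only valid for positive values
logged = [math.log10(x) for x in raw]

print(skewness(raw), skewness(z_scores), skewness(logged))
```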
10
Q

Describe the 2 factors that impact statistical significance

A
  • POWER (1-β)
  • The ability of a study design to detect a true difference if one truly exists between group-comparisons, and therefore…
  • The level of accuracy in correctly accepting/rejecting the Null Hypothesis (analogous to Sensitivity in screenings)
  • Sample Size
  • The larger the sample size, the greater the likelihood (ability) of detecting a difference if one truly exists
  • Increase in Power
11
Q

What are the 3 statistical elements to consider when determining a sample size?

A
  1. MINIMUM DIFFERENCE BETWEEN THE GROUPS DEEMED SIGNIFICANT
    - The smaller the difference between groups necessary to be considered “significant” (important), the greater sample size (number; or ‘N’) needed
  2. EXPECTED VARIATION OF MEASUREMENT (KNOWN OR ESTIMATED)
  3. ALPHA (TYPE 1) & BETA (TYPE 2) ERROR RATES & CONFIDENCE INTERVAL
    - Add in anticipated drop-outs or loss to follow-ups ***
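To see how these elements interact, here is a sketch using one common textbook approximation for comparing two group means with equal group sizes: n per group = 2 × (SD × (z for alpha + z for power) / minimum difference)². The formula, the function name `n_per_group`, and all input numbers are assumptions for illustration; the lecture does not give a specific formula.

```python
import math
from statistics import NormalDist

def n_per_group(min_difference, sd, alpha=0.05, power=0.80, dropout=0.0):
    """Approximate sample size per group for a two-sided comparison of two means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # Type I error rate (two-sided)
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    n = math.ceil(2 * (sd * (z_alpha + z_beta) / min_difference) ** 2)
    # Inflate for anticipated drop-outs / loss to follow-up
    return math.ceil(n / (1 - dropout))

base = n_per_group(min_difference=5, sd=10)
smaller_difference = n_per_group(min_difference=2.5, sd=10)
more_variation = n_per_group(min_difference=5, sd=20)
with_dropout = n_per_group(min_difference=5, sd=10, dropout=0.2)

print(base, smaller_difference, more_variation, with_dropout)
```

Halving the minimum difference or doubling the expected variation roughly quadruples the required N, and the drop-out allowance is added on top.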
12
Q

Describe Null Hypothesis (H0)

A
  • A research perspective which states there will be No (true) difference between the groups being compared
  • Most conservative and commonly utilized
  • Various statistical-perspectives can be taken by the researcher:
    • Superiority
    • Noninferiority
    • Equivalency
    • Researchers either ACCEPT or REJECT this perspective, based on STATISTICAL ANALYSIS
13
Q

Describe alternate hypothesis (H1)

A

A research perspective which states there Will Be a (true) difference between the groups being compared

14
Q

Describe a Type I Error

A

Type I Error (a.k.a. Alpha Error)

  • REJECTING the Null Hypothesis when it is actually TRUE, and you Should Have Accepted It!
  • There really is no true difference between the groups being compared but you (in error) reject the Null Hypothesis thereby ultimately stating that you believe there is a difference between groups (when there really is NOT!)
    • Somewhat analogous to the concept of a FALSE POSITIVE in medical screenings
15
Q

Describe a Type II Error

A

Type II Error = Beta Error

  • NOT REJECTING the Null Hypothesis when it is actually FALSE, and you Should Have Rejected It!
  • There really IS a true difference between the groups being compared but you (in error) do NOT reject the Null Hypothesis thereby ultimately stating that you believe there is no difference between groups (when there really IS!)
    • Somewhat analogous to the concept of a FALSE NEGATIVE in medical screenings
  • See table on slide 50
16
Q

** Describe the p-value **

A
  • Statistical tests determine possible error rate or chance in compared difference or relationship between variables
    1. A test statistic Value is calculated, then
    2. The test statistic value is compared to the appropriate Table Of Probabilities for that test, then
    3. A PROBABILITY(p) value is obtained; based on the probability of observing, Due To Chance Alone, a test statistic value as extreme or more extreme than actually observed if groups were similar (not different)
  • The cutoff that the calculated p value is compared against (alpha) is selected by investigators before the study starts (a priori)
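The three steps can be sketched in code, with the table lookup replaced by a computed probability. This assumes the test statistic follows a standard normal distribution when the groups are similar (a z-test); other tests use other distributions. The observed statistic is a made-up number.

```python
from statistics import NormalDist

def two_sided_p(z):
    """Probability, by chance alone, of a z statistic at least this extreme
    (in either direction) if the groups were similar."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05      # cutoff chosen a priori
z_observed = 2.5  # hypothetical test statistic from step 1

p = two_sided_p(z_observed)
reject_null = p < alpha

print(round(p, 4), reject_null)
```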
17
Q

What occurs when the p-value is lower than the predetermined alpha value (usually 0.05)?

A
  • If the p value is LOWER than the pre-selected alpha value (customarily 5% (0.05))* then we say it’s Statistically Significant
  • Based on an acceptably-low probability (less than 5%) that the value of the test statistic could be as large as it is BY CHANCE ALONE if the groups were similar
  • if < the alpha percentage-risk of error, we REJECT the Null Hypothesis
    • THE RISK OF EXPERIENCING A TYPE I ERROR IS ACCEPTABLY LOW (LESS THAN 5%)
18
Q

** Describe the interpretation of a pre-set p-value **

A
  • The PROBABILITY of making a Type 1 error if the Null Hypothesis is rejected
  • The PROBABILITY of erroneously claiming a difference between groups when one does not really exist
  • The PROBABILITY of obtaining group differences as great or greater if the groups were actually the same/equal
  • The PROBABILITY of obtaining a test statistic as high/higher if the groups were actually the same/equal
19
Q

Describe the Confidence Interval (CI)

A
  • CONFIDENCE INTERVAL (CI) (most common selections are 90%, 95%, or 99%)
  • CIs (a low and a high value) are calculated at an a priori percentage of confidence and give the range within which, statistically, the real (yet unknown) difference or relationship resides
  • BASED ON:
    • Variation in sample (V/SD), and
    • Sample size (N)
  • Journals are moving away from solely reporting p values; or showing them at all
20
Q

Describe a Point Estimate

A
  • Comparisons of groups generates only a single-POINT ESTIMATE of the “true” yet unknown difference (0) or relationship (1) between groups
21
Q

Describe the interpretation of a 95% confidence interval

With and without a p-value

A
  • We are 95% confident that the “true” difference (0) or relationship (1) between the groups is contained within the confidence interval range.
  • Without a p-value:
  • If the CI CROSSES 1.0 (for RATIOS: OR/RR/HR) or 0.0 (for other comparisons, e.g., interval variables) = NOT SIGNIFICANT (p > 0.05)
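A small sketch of the "does the CI cross the null value" check, assuming a normal-approximation interval for a difference in two means built from summary statistics. The function names and every number are hypothetical. It also shows the interval narrowing as sample size grows.

```python
import math
from statistics import NormalDist

def diff_ci(mean1, mean2, sd1, sd2, n1, n2, confidence=0.95):
    """Normal-approximation CI for the difference between two group means."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = mean1 - mean2                          # the point estimate
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)  # depends on variation and N
    return diff - z * se, diff + z * se

def crosses_null(low, high, null_value):
    """True if the interval contains the 'no difference' value (0 or 1)."""
    return low <= null_value <= high

large_study = diff_ci(120, 115, 10, 10, n1=50, n2=50)
small_study = diff_ci(120, 115, 10, 10, n1=10, n2=10)

print(large_study, crosses_null(*large_study, 0.0))  # excludes 0: significant
print(small_study, crosses_null(*small_study, 0.0))  # includes 0: not significant

# For ratios (OR/RR/HR) the null value is 1.0; this interval is invented
print(crosses_null(0.8, 1.3, 1.0))
```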
22
Q

What is the First Key Question to Selecting the Correct Statistical Test? **

A
  1. What DATA LEVEL is being recorded?
    a. Does the data have MAGNITUDE? (yes/no)
    b. Does the data have a fixed, measurable INTERVAL along the entire scale? (yes/no)
23
Q

What is the Second Key Question to Selecting the Correct Statistical Test? **

A
  2. What TYPE OF COMPARISON/ASSESSMENT is desired?
    • Correlation -> CORRELATION TEST
      - Nominal Correlation test = Contingency Coefficient
      - Ordinal Correlation test = Spearman Correlation
      - Interval Correlation test = Pearson Correlation
      - ** p>0.05 for a Pearson Correlation just means there is no LINEAR correlation; there may still be NON-LINEAR correlations present!
      - All Correlations Can Be Run As A “Partial Correlation” To Control For Confounding
    • Event-Occurrence / Time-to-Event -> SURVIVAL TEST
24
Q

Describe Correlation

A

Correlation (r)

  • Provides a QUANTITATIVE measure of the STRENGTH & DIRECTION of a relationship between variables
  • VALUES RANGE FROM -1.0 TO +1.0
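As an illustration, the Pearson correlation coefficient can be computed by hand as the covariance scaled by both variables' spread. The data below are made up. The last case shows a perfect but non-linear (U-shaped) relationship scoring r = 0, because Pearson's r only measures linear association.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]     # perfect positive linear relationship
falling = [10, 8, 6, 4, 2]    # perfect negative linear relationship

u_x = [-2, -1, 0, 1, 2]
u_shaped = [v ** 2 for v in u_x]  # strong relationship, but not linear

print(pearson_r(x, rising), pearson_r(x, falling), pearson_r(u_x, u_shaped))
```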

Partial Correlation
  • A correlation that controls for confounding variables