29‐34 Biostatistics Flashcards

1
Q

What is a population?

A

All individuals who make up the group of interest
• Not to be confused with the “study population”, which is simply the final group of individuals selected for a study

2
Q

What is a sample?

A

A subset or portion of the full population (“representatives”)
• Useful when studying the complete population is not feasible
• Random processes are commonly used to draw the sample

3
Q

What are statistical analyses?

A

Comparisons made in relation to Null Hypothesis

4
Q

What inferences will be made about the results?

A

o Inferences will be made about the sample‐derived measurements and their comparisons (in relation to the Null Hypothesis)
o Inferences will also be made to the full population of similar subjects (generalizability)

5
Q

On what variables will data be collected?

A

• Dependent variable(s) [outcome variables]
• Independent variables

6
Q

What is a null hypothesis?

A

A research perspective which states there will be no (true) difference between the groups being compared
• Most conservative and most commonly used

7
Q

What statistical perspectives can be taken by the researcher?

A

• Superiority • Noninferiority • Equivalency

8
Q

What is an alternate hypothesis?

A

A research perspective which states there will be a (true) difference between the groups being compared

9
Q

What are 2 key attributes of data measurement?

A
  1. Magnitude (or Dimensionality)
  2. Consistency of scale (or Fixed Interval): equal, measurable spacing between units
10
Q

What is another attribute of data measurement?

A
  1. Rational/Absolute Zero
  • Each attribute can be assessed with a “Yes” or “No” response
11
Q

What are the 3 categories for data (variables) measurement that ultimately determine the statistical test?

A

Nominal, Ordinal, Interval/Ratio

12
Q

Define nominal

A

o Dichotomous/Binary; Non‐Ranked Named Categories

o No Magnitude / No Consistency of scale / No Rational Zero

o Nominal variables are simply labeled variables without quantitative characteristics

13
Q

True or false, ALL data that is categorized into two categories is instantly nominal.

A

True

14
Q

Define ordinal

A

Ranked Categories; Non‐Equal‐Distance

o Yes Magnitude / No Consistency of scale / No Rational Zero

15
Q

Define interval/ratio

A

Order; Magnitude; Equal Intervals‐of‐scale (units)

o Yes Magnitude / Yes Consistency of scale / No or Yes Rational Zero (No‐Interval; Yes‐Ratio)

o Examples: Number of Living Siblings & Personal Age (in years)

16
Q

True or false, after data is collected, we can appropriately go down in specificity/detail of data measurement (levels), but never up.

A

True

17
Q

Does the ratio level have an absolute zero?

A

yes

18
Q

In interval data measurement, what is meaningful?

A

distance

19
Q

In what level can attributes be ordered?

A

ordinal

20
Q

Which data measurement level is the weakest?

A

nominal

21
Q

At which level are the attributes only named?

A

nominal

22
Q

Which levels are discrete?

A

nominal and ordinal

23
Q

Which levels are continuous?

A

interval/ratio

24
Q

True or false, all statistical tests are selected based on level of data being compared.

A

true

25
Q

What are the measures of central tendency & dispersion?

A

• Central tendency: Mode / Median / Mean
• Dispersion: Outliers; Minimum / Maximum / Range; Interquartile Range (IQR)

26
Q

Define variance

A

The average of the squared differences between each individual measurement value and the group’s mean

27
Q

What is standard deviation?

A

square root of variance value (restores units of mean)
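
As a rough illustration of the two cards above, the sketch below computes a variance and its square root by hand. It assumes the sample form of variance (dividing by n - 1), which the cards do not specify, and the `readings` values are hypothetical.

```python
import math
import statistics


def sample_variance(values):
    """Average squared deviation from the mean (n - 1 denominator, an assumption)."""
    mean = sum(values) / len(values)
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / (len(values) - 1)


def sample_sd(values):
    """Square root of the variance, which restores the units of the mean."""
    return math.sqrt(sample_variance(values))


readings = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements
print(sample_variance(readings))
print(sample_sd(readings))
# Cross-check against the standard library implementation.
print(statistics.variance(readings))
```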

28
Q

What does the graphical representation depict?

A

SHAPE of data

29
Q

What is normal distribution?

A

Symmetrical
• When a dataset is normally‐distributed, the following values (PARAMETERS) are equal or nearly equal: Mean / Median / Mode
• Equal dispersion of curve “tails” to both sides of the mean, median, & mode

30
Q

What are statistical tests suited to normally‐distributed data called?

A

Parametric tests

31
Q

What is a positively skewed distribution?

A

Asymmetrical distribution with one “tail” longer than the other
• A distribution is skewed anytime the median differs from the mean
• When the mean is higher than the median: “positive skew”
• Tail pointing to the right = positive skew (skew to right): mean > median

32
Q

What is a negatively skewed distribution?

A

Asymmetrical distribution with one “tail” longer than the other
• A distribution is skewed anytime the median differs from the mean
• When the mean is lower than the median: “negative skew”
• Tail pointing to the left = negative skew (skew to left): mean < median

33
Q

Define skewness

A

A measure of the asymmetry of a distribution
o The perfectly‐normal distribution is symmetric and has a skewness value of 0
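
To make the skew cards concrete, here is a minimal sketch of one common skewness formula (the mean cubed z-score, using the population standard deviation). Statistical packages may apply small-sample corrections, so treat the formula choice as an assumption; both datasets are hypothetical.

```python
import statistics


def skewness(values):
    """Moment-based skewness: the mean of the cubed z-scores."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population SD, an assumption of this sketch
    return sum(((x - mean) / sd) ** 3 for x in values) / n


right_tailed = [1, 2, 2, 3, 3, 3, 4, 20]  # hypothetical data with a long right tail
symmetric = [1, 2, 3, 4, 5]               # hypothetical symmetric data

# Positive skew: the mean is pulled above the median by the right tail.
print(statistics.fmean(right_tailed), statistics.median(right_tailed))
print(skewness(right_tailed))
# Symmetric data: skewness is approximately 0.
print(skewness(symmetric))
```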

34
Q

What is kurtosis?

A

A measure of the extent to which observations cluster around the mean. For a normal distribution, the value of the kurtosis statistic is 0.
• Positive kurtosis – more cluster
• Negative kurtosis – less cluster

35
Q

How do you handle interval data not normally‐distributed?

A

o Use a statistical test that does not require the data to be normally‐distributed (non‐parametric tests), or
o Transform data to a standardized value (z‐score or log), hoping the transformation allows the data to be normally‐distributed

36
Q

What are the three required assumptions of interval data?

A
  1. Normally‐distributed
  2. Equal variances (multiple tests are available to assess for equal variances between groups)
  3. Randomly‐derived & Independent
37
Q

True or false, ALWAYS run Descriptive Statistics & Graphs for not normally-distributed interval data.

A

True

38
Q

Define Power (1‐β)

A

The ability of a study design, its methodology, and the selected test statistic to detect a true difference if one truly exists between group‐comparisons, and therefore…
• The level of accuracy in correctly accepting/rejecting the Null Hypothesis (analogous to Sensitivity in screenings)

39
Q

True or false, the larger the sample size, the greater the likelihood (ability) of detecting a difference if one truly exists.

A

true

40
Q

Sample Size Determination

A
  1. Minimum difference between groups deemed significant
     o The smaller the difference between groups necessary to be considered “significant” (important), the greater the number needed (“N”)
  2. Expected variation of measurement (known or estimated)
  3. Alpha (Type 1) & Beta (Type 2) Error Rates (Power)
  • Add in anticipated drop‐outs or losses to follow‐up
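
The inputs listed above can be combined in a standard textbook approximation for comparing two group means, n per group = 2(z for alpha + z for power)² × SD² / difference². The sketch below assumes a two-sided test, equal group sizes, and a normal approximation; the function name, parameter names, and example numbers are illustrative, not taken from the cards.

```python
import math
from statistics import NormalDist


def n_per_group(min_difference, sd, alpha=0.05, power=0.80, dropout=0.0):
    """Approximate subjects per group for a two-sided comparison of two means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # Type 1 error rate
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - Type 2 error rate
    n = 2 * (z_alpha + z_beta) ** 2 * sd ** 2 / min_difference ** 2
    # Inflate for anticipated drop-outs or losses to follow-up.
    return math.ceil(n / (1 - dropout))


print(n_per_group(min_difference=5, sd=10))               # baseline
print(n_per_group(min_difference=2.5, sd=10))             # smaller difference, larger N
print(n_per_group(min_difference=5, sd=10, dropout=0.1))  # with 10% drop-out
```

Halving the minimum difference roughly quadruples the required N, which is the point of the first item in the list.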
41
Q

Two Basic Questions for Stats

A
  1. What is the single measurement value most likely to represent the true (yet unknown) difference or relationship between the groups being compared, and what is the probability the difference has occurred by chance?
     A. Addressed by the p value derived from a statistical test
  2. What is the plausible range of possible difference or relationship within which we believe the true difference or relationship may lie?
     A. Addressed by the confidence interval (CI)
42
Q

Define p value.

A

Statistical tests determine possible differences or relationships between variables
  1. A test statistic value is calculated, then
  2. The test statistic value is compared to the appropriate table of probabilities for that test, then
  3. A probability (p) value is obtained, based on the probability of observing, due to chance alone, a test statistic value as extreme or more extreme than actually observed if the groups were similar (not different)
  • The threshold probability for significance is selected by investigators before the study starts (a priori)
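
The table lookup in step 2 can be sketched in code for one specific case. This example assumes the test statistic follows a standard normal distribution (a z test) and that the test is two-sided; other tests use other reference distributions.

```python
from statistics import NormalDist


def two_sided_p_from_z(z):
    """Probability of a z statistic at least this extreme if the groups were similar."""
    upper_tail = 1 - NormalDist().cdf(abs(z))
    return 2 * upper_tail  # both tails count as "as extreme or more extreme"


alpha = 0.05  # a priori threshold chosen before the study starts
for z in (0.5, 1.96, 3.0):  # hypothetical test statistic values
    p = two_sided_p_from_z(z)
    print(z, round(p, 4), "significant" if p < alpha else "not significant")
```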

43
Q

What is statistical significance?

A

If the p value is lower than the pre‐selected a priori value (customarily 5% (0.05)), then we say it’s statistically significant
o Based on an acceptably‐low probability (less than 5%) that the value of the test statistic could be as large as it is by chance alone if the groups were similar

44
Q

What is a type I error?

A

Rejecting the Null Hypothesis when it is actually TRUE, and you should have accepted it!
• There really is no true difference between the groups being compared, but you (in error) reject the Null Hypothesis, thereby ultimately stating that you believe there is a difference between groups (when there really is NOT!)
• Analogous to a false positive in medical screenings

45
Q

What is a type II error?

A

Not rejecting the Null Hypothesis when it is actually FALSE, and you should have rejected it!
• There really IS a true difference between the groups being compared, but you (in error) do NOT reject the Null Hypothesis, thereby ultimately stating that you believe there is no difference between groups (when there really IS!)
• Analogous to a false negative in medical screenings

46
Q

What are the possible interpretations of a pre‐set (a priori) p value?

A

o The probability of making a Type 1 error if the Null Hypothesis is rejected
o The probability of erroneously claiming a difference between groups when one does not really exist
o The probability of the outcome of the group’s differences occurring by chance
o The probability of obtaining group differences as great or greater if the groups were actually the same/equal
o The probability of obtaining a test statistic as high/higher if the groups were actually the same/equal

47
Q

Describe confidence intervals.

A

o Most common selections are 90%, 95%, or 99%
o CIs (a high and a low value) are calculated at an a priori percentage of confidence that, statistically, the real (yet unknown) difference or relationship resides within them
o Based on:
  • Variation in sample (variance/SD), and
  • Sample size (N)

48
Q

True or false, journals are moving away from reporting p values alone, or from showing them at all.

A

true

49
Q

True or false, comparisons of groups generate only a single‐point estimate of the “true” yet unknown difference (0) or relationship (1) between groups.

A

true

50
Q

Describe the interpretation of a 95% CI.

A

We are 95% confident that the “true” difference (0) or relationship (1) between the groups is contained within the confidence interval range.

51
Q

Describe the interpretation of a 95% CI without a p value.

A

If the CI crosses 1.0 (for RATIOS: OR/RR/HR) or 0.0 (for other comparisons, e.g., interval variables) = Not Significant (p>0.05)
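
The rule on this card reduces to checking whether the interval contains the "no effect" value. A minimal sketch, with a hypothetical function name and made-up intervals:

```python
def ci_is_significant(lower, upper, is_ratio):
    """Return True when the CI excludes the 'no effect' value.

    The no-effect value is 1.0 for ratios (OR/RR/HR) and 0.0 for differences.
    """
    null_value = 1.0 if is_ratio else 0.0
    crosses_null = lower <= null_value <= upper
    return not crosses_null


print(ci_is_significant(0.80, 1.30, is_ratio=True))   # ratio CI crosses 1.0
print(ci_is_significant(1.20, 2.50, is_ratio=True))   # ratio CI excludes 1.0
print(ci_is_significant(-0.4, 1.1, is_ratio=False))   # difference CI crosses 0.0
```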

52
Q

What is the depicted graph called?

A

forest plot

53
Q

When do you ask the following question?

Does “statistical” significance confer meaningful, “clinical” significance?

A

ALWAYS ask this question when reviewing the findings of a study

54
Q

What are the 4 KEY QUESTIONS to Selecting the Correct Statistical Test?

A
  1. What TYPE OF DATA is being collected/evaluated?
  2. What TYPE OF COMPARISON/ASSESSMENT is desired?
  3. *HOW MANY GROUPS are being compared?
  4. *Is the data INDEPENDENT or RELATED (PAIRED)?
55
Q

What is an example of an ordinal data measurement?

A

pain scale with faces or numbers

56
Q

What are accompanying questions to the 4 key questions?

A

Does the data have MAGNITUDE? (yes/no)

Does the data have a fixed, measurable INTERVAL along the entire scale? (yes/no)

Is data from the same (paired) or different groups (independent)?

57
Q

Describe the types of correlation tests.

A

– Nominal Correlation test = Contingency Coefficient

– Ordinal Correlation test = Spearman Correlation

– Interval Correlation test = Pearson Correlation

– p>0.05 for a Pearson Correlation just means there is no linear correlation; there may still be non‐linear correlations present!

– All Correlations can be run as a “partial correlation” to control for confounding

58
Q

Describe correlation (r).

A

Provides a quantitative measure of the strength & direction of a relationship between variables

• Values range from ‐1.0 to +1.0
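
As an illustration of strength, direction, and the -1.0 to +1.0 range, the sketch below computes a Pearson correlation by hand on hypothetical data. The last example shows a perfectly curved relationship that still gives r = 0, since Pearson's r only captures linear association.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))          # perfect positive relationship
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))          # perfect negative relationship
print(pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4]))  # y = x squared: no linear correlation
```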

59
Q

Describe survival tests.

A

Compares the proportion of, or time‐to, event occurrences between groups
• Commonly represented by a Kaplan‐Meier curve

60
Q

Describe the types of survival tests.

A

– Nominal Survival test = Log‐Rank test

– Ordinal Survival test = Cox‐Proportional Hazards test

– Interval Survival test = Kaplan‐Meier test

• All can be represented by a Kaplan‐Meier curve

61
Q

What are regressions?

A

o Provide a measure of the relationship between variables by allowing the prediction about the dependent, or outcome, variable (DV) knowing the value/category of independent variables (IV’s)

o Also able to calculate OR for a Measure of Association

62
Q

Describe the types of regression.

A

– Nominal Regression test = Logistic Regression

– Ordinal Regression test = Multinomial Logistic Regression

– Interval Regression test = Linear Regression

63
Q

What test is used for nominal data involving 2 groups of independent data?

A

(Pearson’s) Chi‐square test (χ²)

64
Q

What test is used for nominal data involving ≥3 groups of independent data?

A

Chi‐square test of Independence (χ²)

65
Q

What are the assumptions of the Chi‐square test?

A
  • Usual chi‐square (binomial) distribution for nominal‐type data
  • No cell with expected count of <5
66
Q

What do the (Pearson’s) Chi‐square test and Chi‐square test of Independence compare?

A

Both tests compare group proportions and assess whether they differ from those expected by chance
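
To show what "different from that expected by chance" means numerically, here is a minimal sketch of a Pearson chi-square on a 2×2 table. It assumes no continuity correction and uses the fact that, with 1 degree of freedom, the upper-tail probability can be written with the complementary error function. The table counts are hypothetical, and the smallest expected count is returned so the "<5" assumption can be checked.

```python
import math


def chi_square_2x2(table):
    """Return (statistic, p_value, smallest expected count) for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    statistic = 0.0
    min_expected = float("inf")
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if group and outcome were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            min_expected = min(min_expected, expected)
            statistic += (observed - expected) ** 2 / expected
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value, min_expected


# Hypothetical counts: rows are treatment groups, columns are event / no event.
stat, p, smallest = chi_square_2x2([[20, 30], [30, 20]])
print(stat, p, smallest)
```

If `smallest` falls below 5, the deck's own rule points to Fisher's Exact test instead.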

67
Q

What test is used for nominal data involving ≥2 Groups with EXPECTED cell count of <5?

A

Fisher’s Exact test

68
Q

Which of the following tests would be most appropriate if the researchers wished to compare the within‐subjects HgbA1c from baseline to end‐of‐study (assume normal distribution & equal variances) for inhaled technosphere insulin compared to subcutaneous regular human insulin?

A. ANOVA

B. Paired t‐test

C. Wilcoxon signed rank test

D. Kendall test

E. Student‐Newman‐Keuls test

A

B. Paired t test

69
Q

Which of the following tests would be most appropriate if the researchers wished to compare the mean blood sugar between treatment groups (assume normal distribution & equal variances) for inhaled technosphere insulin compared to subcutaneous regular human insulin?

A. Cochran test

B. Fisher’s exact test

C. Kruskal‐Wallis test

D. Student t‐test

E. Mann‐Whitney test

A

D. Student t test

70
Q

Which of the following tests would be most appropriate if the researchers wished to compare, between the 2 treatment groups, the number of days the patient was on therapy before they had a VTE recurrence?

A. ANOVA

B. Chi‐square test

C. Kruskal‐Wallis test

D. Multinomial logistic regression

E. Friedman test

A

A. ANOVA

71
Q

Which of the following tests would be most appropriate if the researchers wished to compare the proportion of patients in each of the 2 treatment groups who developed (or didn’t) a recurrent VTE?

A. ANOVA

B. Chi‐square test

C. Kruskal‐Wallis test

D. Multinomial logistic regression

E. Friedman test

A

B. Chi-square test

72
Q

If the researchers wished to assess for differences in the time‐to‐event (survival); the event being diagnosis of depression & onset of suicidal ideations, which of the following tests would be most appropriate (assume the data was not normally distributed)?

A. ANOVA

B. Cox proportional hazards test

C. Kaplan‐Meier product‐limit estimate

D. Multinomial logistic regression

E. Friedman test

A

B. Cox proportional hazards test

73
Q

Researchers want to conduct a study to identify early predictors of which young children with ADHD are at greatest risk for depression and suicide ideations, judged as present/absent. They performed 7 different psychometric assessments of depression and suicidal behavior. Which of the following tests would be most appropriate for the researchers’ primary purpose?

A. Student t‐test

B. Linear regression

C. ANOVA

D. Logistic regression

E. McNemar test

A

D. Logistic regression

74
Q

What is a Validation/Assessment Committee?

A

o Kappa statistic – agreement between evaluators (consistency of “decisions”, “determinations”)

o Kappa Interpretation:
  • +1 = The observers perfectly “classify” everyone exactly the same way
  • 0 = There is no relationship at all between the observers’ “classifications”, above the agreement that would be expected by chance
  • ‐1 = The observers “classify” everyone exactly the opposite of each other
  • Kappa (K) value can be + or ‐; + = good agreement; ‐ = poor agreement
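
The three landmark values can be reproduced with a small sketch of Cohen's kappa for two raters, kappa = (observed agreement - chance agreement) / (1 - chance agreement). The rater lists are hypothetical, and the function does not handle the edge case where chance agreement equals 1.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters beyond what chance alone would produce."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of each rater's category proportions, summed.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    return (observed - expected) / (1 - expected)


same = ["y", "n", "y", "n"]
print(cohens_kappa(same, same))                  # identical classifications
print(cohens_kappa(same, ["n", "y", "n", "y"]))  # exactly opposite classifications
print(cohens_kappa(["y", "y", "n", "n"], ["y", "n", "y", "n"]))  # chance-level agreement
```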

75
Q

List and describe the interval data post‐hoc tests for 3 or more Group Comparisons.

A

o Student‐Newman‐Keuls test
  • Compares all pairwise comparisons possible
  • All groups must be equal in size

o Dunnett test
  • Compares pairwise comparisons against a single control
  • All groups must be equal in size

o Dunn test
  • Compares all pairwise comparisons possible
  • Useful when all groups are not of equal size

o Tukey or Scheffe tests
  • Compares all pairwise comparisons possible
  • All groups must be equal in size
  • Tukey test slightly more conservative than the Student‐Newman‐Keuls test
  • Scheffe test less affected by violations in normality and homogeneity of variances – most conservative

o Bonferroni correction
  • Adjusts the p value for # of comparisons being made
  • Very conservative
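
One common way to apply the Bonferroni correction is to divide the a priori alpha by the number of comparisons and judge each p value against that stricter threshold. The sketch below assumes all pairwise comparisons are made; the function names and p values are hypothetical.

```python
def pairwise_comparisons(n_groups):
    """Number of distinct pairs among n_groups groups."""
    return n_groups * (n_groups - 1) // 2


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance threshold after Bonferroni correction."""
    return alpha / n_comparisons


m = pairwise_comparisons(3)          # 3 groups: A-B, A-C, B-C
threshold = bonferroni_alpha(0.05, m)
p_values = {"A vs B": 0.010, "A vs C": 0.030, "B vs C": 0.400}  # hypothetical results
for pair, p in p_values.items():
    print(pair, "significant" if p < threshold else "not significant")
```

Note that the A vs C comparison would pass an uncorrected 0.05 threshold but fails the corrected one, which is why the cards call the method very conservative.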

76
Q

List and describe the interval data tests for ≥3 groups of paired/related data.

A

o Repeated Measures ANOVA (1 DV)
  • Compares the means of all groups (along with intra‐ and inter‐group variations) of related data against a single DV

o Repeated Measures MANOVA (≥2 DVs)
  • Compares the means of all groups (along with intra‐ and inter‐group variations) of related data against multiple DVs

• If the 3+ group comparison is significant, a post‐hoc test must be performed to determine where the differences are

77
Q

List and describe the interval data tests for ≥3 Groups of Paired/Related Data w/ Confounders?

A

o Repeated Measures ANCOVA
  • Compares the means of all groups (along with intra‐ and inter‐group variations) against a single DV while also controlling for the co‐variance of confounders

o Repeated Measures MANCOVA (≥2 DVs)
  • Compares the means of all groups (along with intra‐ and inter‐group variations) against multiple DVs while also controlling for the co‐variance of confounders

78
Q

List and describe the interval data tests for 2 Groups of Paired/Related Data?

A

o Paired t‐test

  • Compares the mean values between groups that are related
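
Because the two sets of measurements come from the same subjects, the paired t statistic is computed on the within-subject differences: the mean difference divided by its standard error. The sketch below stops at the statistic and degrees of freedom, since the p value lookup requires a t distribution that the standard library does not provide. The data are hypothetical.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (t, degrees of freedom) for paired before/after measurements."""
    differences = [a - b for a, b in zip(after, before)]
    n = len(differences)
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)  # sample SD of the differences
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


baseline = [10, 12, 9, 11]     # hypothetical baseline values
end_of_study = [8, 11, 8, 8]   # hypothetical end-of-study values, same subjects
print(paired_t_statistic(baseline, end_of_study))
```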

79
Q

List and describe the interval data tests for ≥3 Groups of Independent Data w/ Confounders?

A

o Analysis of Co‐Variance (ANCOVA)
  • Compares the means of all groups (along with intra‐ and inter‐group variations) against a single DV while also controlling for the co‐variance of confounders

o Multivariate Analysis of Co‐Variance (MANCOVA) (≥2 DVs)
  • Compares the means of all groups (along with intra‐ and inter‐group variations) against multiple DVs while also controlling for the co‐variance of confounders

80
Q

List and describe the interval data tests for ≥3 Groups of Independent Data?

A

o Analysis of Variance (ANOVA) (1 DV)
  • Compares the means of all groups (along with intra‐ and inter‐group variations) against a single DV

o Multivariate Analysis of Variance (MANOVA) (≥2 DVs)
  • Compares the means of all groups (along with intra‐ and inter‐group variations) against multiple DVs

• If the 3+ group comparison is significant, a post‐hoc test must be performed to determine where the differences are

81
Q

List the interval data tests for 2 Groups of Independent Data?

A

o Student t‐test

82
Q

List and describe the ordinal Post‐hoc Tests for 3 or more Group Comparisons.

A

o Student‐Newman‐Keuls test
  • Compares all pairwise comparisons possible
  • All groups must be equal in size

o Dunnett test
  • Compares all pairwise comparisons against a single control
  • All groups must be equal in size

o Dunn test
  • Compares all pairwise comparisons possible
  • Useful when all groups are not of equal size

83
Q

List and describe the ordinal data tests for ≥3 Groups of Paired/Related Data.

A

o Friedman test
  • Compares the median values between groups
  • Also effective for Interval data that is not normally distributed or does not meet all parametric requirements
  • If the 3+ group comparison is significant, a post‐hoc test must be performed to determine where the differences are

84
Q

What are the KEY WORDS FOR “PAIRED” or “RELATED” DATA?

A

“Pre‐ vs. Post‐”, “Before vs. After”, “Baseline vs. End”, etc…

85
Q

List the ordinal data test for 2 Groups of Paired/Related Data.

A

o Wilcoxon Signed Rank test

86
Q

List and describe the ordinal data tests for ≥3 Groups of Independent Data.

A

o Kruskal‐Wallis test
  • Compares the median values between groups
  • Also used for Interval data not meeting parametric requirements
  • If the 3+ group comparison is significant, a post‐hoc test must be performed to determine where the difference(s) is (are)

87
Q

List the ordinal data test for 2 Groups of Independent Data

A

o Mann‐Whitney test

88
Q

List and describe the nominal data tests for ≥3 Groups of Paired/Related Data.

A

o Cochran
  • Same principle and assumptions as χ², yet mathematically factors in the concept of paired, or related, data
  • Bonferroni test of Inequality (Bonferroni correction)
    – Adjusts the p value for # of comparisons being made
    – Very conservative
89
Q

List the nominal data test for 2 Groups of Paired/Related Data.

A

o McNemar test

90
Q

List and describe the nominal data tests for ≥3 Groups of Independent Data.

A

o For statistically significant findings (p<0.05) in 3 or more comparisons, one must perform subsequent analysis (post‐hoc testing) to determine which groups are different:
  • Multiple χ² tests are NEVER acceptable
    – Risk of Type 1 error increases with each additional test! (almost guaranteed after 4‐5 tests)
  • Bonferroni test of Inequality (Bonferroni correction)
    – Adjusts the p value for # of comparisons being made
    – Very conservative