Intro to Biostatistics Flashcards
Dependent Variable
Outcome you are measuring or looking for
Independent Variable
What is manipulated/changed during an experiment or study
Null Hypothesis (H0)
States there will be no true difference between the groups being compared
Alternative Hypothesis (H1)
States there will be a true difference between the groups being compared
Nominal Grouping
Non-ordered, named categories (dichotomous/binary when there are only two); no order or magnitude and no consistent scale or equal distances; simply labeled variables without quantitative characteristics
Ordinal Grouping
Ordered, rankable categories; they have order/magnitude but NO consistency of scale or equal distances between ranks
Interval/Ratio Grouping
Ordered values that have order/magnitude AND a consistent scale (equal distances/units between values)
Ex: Living siblings (number) and personal age (in years)
Interval: Arbitrary zero value (but 0 doesn’t mean absence)
Ratio: Absolute rational zero value (0 DOES mean absence of measurement value)
Which groups are considered “discrete” data?
Nominal and Ordinal
Which groups are considered “continuous” data?
Interval/Ratio
Mode
Most frequently occurring value
Median
Middle number after numbers are placed in order
Mean
Average of all numbers
Minimum, Maximum, Range
Minimum = lowest value; Maximum = highest value; Range = difference between the maximum and the minimum
Variance
Average of the squared differences between each individual measurement value (x) and the group's mean
Standard Deviation
Square root of the variance
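The descriptive statistics above can be sketched with Python's standard library. The sample values are invented for illustration. The card's wording (an average of squared differences) corresponds to the population variance, which divides by n; many statistics packages report the sample variance instead, which divides by n - 1.

```python
import statistics

# Hypothetical sample, invented for illustration only.
values = [2, 4, 4, 4, 5, 5, 7, 9]

mode = statistics.mode(values)          # most frequently occurring value
median = statistics.median(values)      # middle value of the sorted data
mean = statistics.mean(values)          # arithmetic average
data_range = max(values) - min(values)  # maximum minus minimum

# Population variance: average squared distance from the mean (divide by n).
variance = statistics.pvariance(values)
# Standard deviation: square root of the variance.
std_dev = statistics.pstdev(values)

# Sample variance divides by n - 1, so it is slightly larger.
sample_variance = statistics.variance(values)

print(mode, median, mean, data_range, variance, std_dev, sample_variance)
```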
When a dataset is normally distributed, which values are equal or near equal?
Mean and Median
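What percentage of the area under a normal curve falls within 1 standard deviation of the mean?
68%
What percentage of the area under a normal curve falls within 2 standard deviations of the mean?
95%
What percentage of the area under a normal curve falls within 3 standard deviations of the mean?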
99.7%
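These three percentages can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `area_within` is made up for this example.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {area_within(k):.1%}")
```

The flashcard values are these areas rounded to the conventional 68%, 95%, and 99.7%.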
Positively Skewed
The mean is higher than the median; the tail points to the right
Negatively Skewed
The mean is lower than the median; the tail points to the left
Kurtosis
Measure of the extent to which observations cluster around the mean; for a normal distribution, the value of the (excess) kurtosis statistic is 0
Positive Kurtosis = more cluster
Negative Kurtosis = less cluster
Skewness
Measure of the asymmetry of a distribution
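A rough sketch of how skewness and kurtosis can be computed from moments, assuming the simple (uncorrected) formulas; statistical packages typically apply small-sample corrections, so their output may differ slightly. The function names and the data are hypothetical.

```python
import statistics

def skewness(data):
    """Moment-based skewness: third central moment divided by the cubed SD."""
    n = len(data)
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)
    return sum((x - mean) ** 3 for x in data) / n / sd ** 3

def excess_kurtosis(data):
    """Fourth central moment over SD^4, minus 3 so a normal curve scores 0."""
    n = len(data)
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)
    return sum((x - mean) ** 4 for x in data) / n / sd ** 4 - 3

# Invented data with one large value pulling the tail to the right.
right_tailed = [1, 2, 2, 3, 3, 3, 4, 4, 15]
# Mirror image of the same data, so the tail points to the left.
left_tailed = [16 - v for v in right_tailed]

print(statistics.fmean(right_tailed) > statistics.median(right_tailed))  # mean above median
print(skewness(right_tailed) > 0, skewness(left_tailed) < 0)
```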
Required assumptions of Interval/Ratio data for the proper selection of a parametric test
- Normally distributed
- Equal variances (multiple tests available for equal variances between groups)
- Randomly derived and independent
Levene’s Test
Test that tells us whether groups have equal variances (homogeneity of variance); it does not assess normality, which requires a separate test such as Shapiro-Wilk or Kolmogorov-Smirnov
How do you handle data that is NOT normally distributed?
Use a statistical test that does not require normally distributed data (a non-parametric test, such as the ordinal or nominal tests), or transform the data in the hope that the transformed values are normally distributed (may not work)
Type 1 Error
Rejecting the null hypothesis when it is actually true and should NOT have been rejected (a false positive); there really is no true difference between the groups; the probability of this error is called “alpha”
Type 2 Error
Failing to reject (accepting) the null hypothesis when it is actually false and should have been rejected (a false negative); there really IS a true difference between the groups being compared; the probability of this error is called “beta”
Power
1 - beta; the statistical ability of a study to detect a true difference between the groups being compared IF one truly exists, i.e., the probability of correctly rejecting a false null hypothesis
Sample Size
The larger the sample size, the greater the likelihood (ability) of detecting a difference if one truly exists; also increases power
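The link between sample size and power can be illustrated with a simplified calculation. This is only a sketch: it assumes a two-sided z-test comparing two equal-sized groups with a known common standard deviation, and it ignores the negligible chance of rejecting in the wrong direction. The function name and the numbers are hypothetical.

```python
from statistics import NormalDist

def approx_power(effect, sd, n_per_group, alpha=0.05):
    """Approximate power (1 - beta) of a two-sided, two-group z-test."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)               # critical value for the chosen alpha
    standard_error = sd * (2 / n_per_group) ** 0.5  # SE of the difference in means
    return z.cdf(abs(effect) / standard_error - z_crit)

# True difference of 5 units, SD of 10 units, increasing group sizes.
for n in (20, 64, 100):
    power = approx_power(effect=5, sd=10, n_per_group=n)
    print(f"n per group = {n}: power = {power:.2f}, beta = {1 - power:.2f}")
```

With these made-up inputs, power rises as the group size grows, which is the point of the Sample Size card.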
P Value
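Probability of obtaining a difference or relationship at least as extreme as the one observed if chance alone were operating (i.e., if the null hypothesis were true); calculated by the statistical test and compared against the a priori alpha (commonly 0.05)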
Confidence Interval (CI)
Range of values, calculated at an a priori percentage of confidence (most commonly 90, 95, or 99%), that statistically includes the real (yet unknown) difference or relationship being compared; based on the variation in the sample and the sample size
Interpretation of a 95% CI
We are 95% confident that the “true” difference or relationship between the groups is contained within the confidence interval range
What does it mean when a CI crosses 1.0 for ratios or 0.0 for absolute differences?
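The result is NOT statistically significant, because the interval includes the value of no difference (1.0 for ratios, 0.0 for absolute differences)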
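A minimal sketch of a confidence interval for a difference in means, and of the "does it cross the null value" check. It assumes a normal (z) approximation; for small samples like the invented ones below, a t critical value would be more appropriate. All names and data are hypothetical.

```python
from statistics import NormalDist, fmean, variance

def diff_ci(group_a, group_b, confidence=0.95):
    """Approximate CI for mean(group_a) - mean(group_b) using a z critical value."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = fmean(group_a) - fmean(group_b)
    se = (variance(group_a) / len(group_a) + variance(group_b) / len(group_b)) ** 0.5
    return diff - z_crit * se, diff + z_crit * se

def crosses_null(ci, null_value):
    """True when the interval contains the value of no difference."""
    low, high = ci
    return low <= null_value <= high

# Invented measurements for two groups.
treated = [120, 122, 118, 125, 121, 119, 123, 124]
control = [130, 128, 133, 129, 131, 127, 132, 134]

ci = diff_ci(treated, control)
print(ci, crosses_null(ci, 0.0))       # absolute difference: null value is 0.0
print(crosses_null((0.85, 1.20), 1.0)) # a ratio CI that includes 1.0
```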
Does “statistical” significance always confer meaningful, “clinical” significance?
No
Correlation (r)
Provides a quantitative measure of the strength and direction of a relationship between variables; values range from -1.0 to +1.0
Partial Correlation
A correlation that controls for confounding variables
What is the name of the nominal correlation test?
Contingency coefficient
What is the name of the ordinal correlation test?
Spearman correlation
What is the name of the interval correlation test?
Pearson correlation; for a Pearson correlation, a p value of >0.05 means no statistically significant linear correlation was detected, but there MAY still be a non-linear correlation present
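The caveat about non-linear relationships can be shown with a small example. The sketch below computes Pearson's r by hand (the helper `pearson_r` and the data are made up): a perfectly linear relationship gives r = 1, while a perfect U-shaped relationship gives r = 0 even though y is fully determined by x.

```python
def pearson_r(x, y):
    """Pearson correlation coefficient computed from its definition."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (ss_x * ss_y) ** 0.5

x = [-2, -1, 0, 1, 2]
linear = [2 * v + 1 for v in x]  # straight-line relationship
curved = [v ** 2 for v in x]     # U-shaped relationship

print(pearson_r(x, linear))  # strong positive linear correlation
print(pearson_r(x, curved))  # no linear correlation despite a clear pattern
```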
Survival Tests
Compare the proportion of events over time, or time to event, between groups; commonly represented by a Kaplan-Meier curve
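To show what a Kaplan-Meier curve represents, here is a minimal sketch of the product-limit estimate for a single group. The function name, the follow-up times, and the event flags are all invented; a real analysis would use a statistics package and would also report confidence bands.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each time an event occurred.

    times  : follow-up time for each subject
    events : 1 if the event occurred at that time, 0 if the subject was censored
    """
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        at_risk = sum(1 for u in times if u >= t)  # still under observation at t
        died = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        if died:
            survival *= 1 - died / at_risk         # step down at each event time
            curve.append((t, survival))
    return curve

# Six hypothetical subjects; two are censored (event flag 0).
times = [2, 3, 3, 5, 8, 8]
events = [1, 1, 0, 1, 0, 1]

curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t = {t}: estimated survival = {s:.3f}")
```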
What is the name of the nominal survival test?
Log-Rank Test
What is the name of the ordinal survival test?
Cox Proportional Hazards Test
What is the name of the interval survival test?
Kaplan-Meier Test
Regressions
Provide a measure of the relationship between variables by allowing prediction of the dependent, or outcome, variable (DV) from the value/category of the independent variables (IVs); can also calculate an odds ratio (OR) as a measure of association
What is the name of the nominal regression test?
Logistic Regression
What is the name of the ordinal regression test?
Multinomial Logistic Regression
What is the name of the interval regression test?
Linear Regression
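As an illustration of prediction with linear regression, the sketch below fits a least-squares line with one independent variable and uses it to predict the dependent variable. The helper `fit_line` and the dose/response numbers are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one independent variable."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Invented data: IV = dose, DV = measured response.
dose = [1, 2, 3, 4, 5]
response = [2.1, 3.9, 6.2, 7.8, 10.0]

slope, intercept = fit_line(dose, response)
predicted = intercept + slope * 6  # predicted DV for a new IV value
print(slope, intercept, predicted)
```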
What are the 4 questions you should ask when selecting the correct statistical test?
- What data level is being recorded?
- What type of comparison/assessment is desired?
- How many groups are being compared?
- Is the data independent or related/paired?
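For the case where the desired assessment is a comparison between groups, the remaining three questions can be turned into a simple lookup. This is only a study-aid sketch that mirrors the test names used in this deck; the table and function names are hypothetical, and it leaves out special cases such as small expected cell counts or confounders.

```python
# Keys: (data level, number of groups, independent or related data).
TEST_TABLE = {
    ("nominal", "2", "independent"): "Pearson's Chi-Square Test",
    ("nominal", "3+", "independent"): "Chi-Square Test of Independence",
    ("nominal", "2", "related"): "McNemar Test",
    ("nominal", "3+", "related"): "Cochran's Q Test",
    ("ordinal", "2", "independent"): "Mann-Whitney Test",
    ("ordinal", "3+", "independent"): "Kruskal-Wallis Test",
    ("ordinal", "2", "related"): "Wilcoxon Signed Rank Test",
    ("ordinal", "3+", "related"): "Friedman Test",
    ("interval", "2", "independent"): "Student's t-Test",
    ("interval", "3+", "independent"): "Analysis of Variance (ANOVA)",
    ("interval", "2", "related"): "Paired t-Test",
    ("interval", "3+", "related"): "Repeated Measures ANOVA",
}

def select_test(level, n_groups, paired):
    """Look up the group-comparison test for a data level, group count, and design."""
    if n_groups < 2:
        raise ValueError("a group comparison needs at least 2 groups")
    groups = "2" if n_groups == 2 else "3+"
    design = "related" if paired else "independent"
    return TEST_TABLE[(level, groups, design)]

print(select_test("ordinal", 3, paired=False))
print(select_test("interval", 2, paired=True))
```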
What is the name of the nominal test when comparing 2 groups of independent data?
Pearson’s Chi-Square Test
What is the name of the nominal test when comparing 3 or more groups of independent data?
Chi-Square Test of Independence
What is the name of the nominal test when comparing 2 or more groups of independent data that have an expected cell count of less than 5?
Fisher’s Exact Test
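A hand-rolled sketch of the chi-square calculation for a 2x2 table, including the expected-cell-count check that points toward Fisher's Exact Test. It assumes no continuity correction, and the function name and counts are invented. The p value uses the fact that, for 1 degree of freedom, the upper-tail chi-square probability equals erfc(sqrt(statistic / 2)).

```python
import math

def chi_square_2x2(table):
    """Chi-square statistic, p value (df = 1), and expected counts for [[a, b], [c, d]]."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected count = row total * column total / grand total.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    statistic = sum(
        (table[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(2)
        for j in range(2)
    )
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value, expected

# Invented counts: rows = two groups, columns = outcome yes / no.
observed = [[30, 10], [20, 20]]

statistic, p_value, expected = chi_square_2x2(observed)
# Any expected count below 5 suggests using Fisher's Exact Test instead.
use_fisher = any(e < 5 for row in expected for e in row)
print(statistic, p_value, use_fisher)
```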
What is the name of the nominal post-hoc test?
Bonferroni Test of Inequality (Bonferroni Correction); adjusts the p value for the number of comparisons being made
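The Bonferroni adjustment is simple arithmetic, sketched below in two equivalent forms: divide alpha by the number of comparisons, or multiply each p value by the number of comparisons (capped at 1.0). The function names and p values are hypothetical.

```python
def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance threshold after Bonferroni correction."""
    return alpha / n_comparisons

def bonferroni_adjust(p_values):
    """Adjusted p values: each raw p multiplied by the number of comparisons."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Invented raw p values from three pairwise comparisons.
raw_p = [0.01, 0.02, 0.04]

threshold = bonferroni_alpha(0.05, len(raw_p))
adjusted = bonferroni_adjust(raw_p)
significant = [p <= 0.05 for p in adjusted]
print(threshold, adjusted, significant)
```

Only the first comparison stays significant here, even though all three raw p values are below 0.05.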
What is the name of the nominal test when comparing 2 groups of related data?
McNemar Test
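A brief sketch of the McNemar statistic for paired yes/no data, assuming the uncorrected form (no continuity correction). Only the discordant pairs, where the two paired measurements disagree, enter the calculation. The function name and counts are invented.

```python
import math

def mcnemar(b, c):
    """McNemar chi-square statistic and p value (df = 1) from discordant pair counts.

    b : pairs that changed from yes to no
    c : pairs that changed from no to yes
    """
    statistic = (b - c) ** 2 / (b + c)
    # Upper-tail chi-square probability for 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

mcnemar_stat, mcnemar_p = mcnemar(b=15, c=5)
print(mcnemar_stat, mcnemar_p)
```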
What is the name of the nominal test when comparing 3 or more groups of related data?
Cochran's Q Test
What is the name of the ordinal test when comparing 2 groups of independent data?
Mann-Whitney Test
What is the name of the ordinal test when comparing 3 or more groups of independent data?
Kruskal-Wallis Test
What is the name of the ordinal test when comparing 2 groups of related data?
Wilcoxon Signed Rank Test
What is the name of the ordinal test when comparing 3 or more groups of related data?
Friedman Test
What are the names of the ordinal post-hoc tests?
Student-Newman-Keuls, Dunnett, Dunn
Student-Newman-Keuls Test
Makes all possible pairwise comparisons; all groups must be equal in size
Dunnett Test
Makes all pairwise comparisons against a single control; all groups must be equal in size
Dunn Test
Makes all possible pairwise comparisons; useful when the groups are not of equal size
What is the name of the interval test when comparing 2 groups of independent data?
Student's t-Test
What is the name of the interval test when comparing 3 or more groups of independent data?
Analysis of Variance (ANOVA)
What is the name of the interval test when comparing 3 or more groups of independent data with confounders?
Analysis of Co-Variance (ANCOVA); compares the group means of the dependent variable while also controlling for the co-variance of confounders
What is the name of the interval test when comparing 2 groups of related data?
Paired t-Test
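A sketch of the paired test statistic: the test works on the within-pair differences, dividing their mean by its standard error. The function name and the before/after values are hypothetical. The standard library has no t distribution, so the result is compared against a tabulated critical value rather than converted to a p value.

```python
import statistics

def paired_t_statistic(before, after):
    """t statistic and degrees of freedom for paired (related) measurements."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_diff = statistics.fmean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD of the differences (n - 1)
    t = mean_diff / (sd_diff / n ** 0.5)
    return t, n - 1

# Invented paired measurements on the same five subjects.
before = [140, 135, 150, 145, 138]
after = [132, 130, 141, 140, 135]

t_stat, df = paired_t_statistic(before, after)
# Two-sided critical value for df = 4 at alpha = 0.05 is 2.776 (from a t table).
print(t_stat, df, abs(t_stat) > 2.776)
```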
What is the name of the interval test when comparing 3 or more groups of related data?
Repeated Measures ANOVA with 1 dependent variable
What is the name of the interval test when comparing 3 or more groups of related data with confounders?
Repeated Measures ANCOVA; compares the group means of the dependent variable while also controlling for the co-variance of confounders
What are the names of the interval post-hoc tests?
Student-Newman-Keuls, Dunnett, Dunn, Tukey or Scheffe, and Bonferroni Correction
Tukey/Scheffe Tests
Make all possible pairwise comparisons; all groups must be equal in size; the Tukey test is slightly more conservative than the SNK; the Scheffe test is less affected by violations of normality and homogeneity of variances and is the most conservative
Kappa Statistic
Correlation test showing relationship or agreement between evaluators (consistency of “decisions” or “determinations”)
Interpreting a Kappa Statistic
+1 - observers perfectly “classify” everyone exactly the same way
0 - there is no relationship at all between the observers' “classifications” beyond the agreement that would be expected by chance
-1 - observers “classify” everyone exactly the opposite of each other
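The three reference points can be reproduced with a small calculation. The sketch below assumes Cohen's kappa for two raters, kappa = (observed agreement - chance agreement) / (1 - chance agreement); the function name and ratings are invented. It would divide by zero if both raters put every subject in the same single category.

```python
def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters classifying the same subjects."""
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    # Proportion of subjects on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's category frequencies.
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)

rater_1 = ["yes"] * 5 + ["no"] * 5
rater_2 = ["yes"] * 4 + ["no"] * 5 + ["yes"]  # agrees on 8 of 10 subjects
opposite = ["no"] * 5 + ["yes"] * 5           # disagrees on every subject

print(cohen_kappa(rater_1, rater_1))   # perfect agreement
print(cohen_kappa(rater_1, rater_2))   # partial agreement
print(cohen_kappa(rater_1, opposite))  # complete disagreement
```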