Miller-Stats Flashcards

Question

What statistical test do you use for categorical data?

Answer 1

• Chi-square (χ²) test
  • Used for two or more groups of categorical data
  • Example: to compare treatment A versus B when the outcome is either “satisfied or unsatisfied,” the chi-square test can be used to identify relationships between “treatment condition” and “outcome category.”
  • If the result of the test is statistically significant, frequencies of each outcome in the two treatment groups can be visually compared to describe which treatment is superior.
• Fisher exact test
  • Similar to the chi-square test but better for small sample sizes or when the number of occurrences in one of the categories is low (e.g., if only one patient in treatment group A had an unsatisfactory outcome, this test is preferred)
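As a rough illustration of the two tests on this card, the sketch below computes the Pearson chi-square statistic and a two-sided Fisher exact P value for a 2 × 2 table using only the Python standard library. The counts are invented for illustration (treatment A: 9 satisfied, 1 unsatisfied; treatment B: 4 satisfied, 6 unsatisfied), the function names are hypothetical, and the chi-square version assumes no continuity correction.

```python
from math import comb

def chi_square_2x2(table):
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if treatment and outcome were unrelated
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

def fisher_exact_2x2(table):
    """Two-sided Fisher exact P value: total probability of all tables with
    the same margins that are no more likely than the observed table."""
    (a, b), (c, d) = table
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_observed = prob(a)
    lowest, highest = max(0, c1 - r2), min(r1, c1)
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))

# Invented counts: rows are treatments A and B, columns are satisfied / unsatisfied
table = [[9, 1], [4, 6]]
chi2 = chi_square_2x2(table)      # about 5.49 (critical value for 1 df at 0.05 is 3.84)
p_fisher = fisher_exact_2x2(table)  # about 0.057
print(round(chi2, 2), round(p_fisher, 3))
```

With these small counts the chi-square statistic exceeds its 0.05 critical value while the Fisher exact P value does not fall below 0.05, which is one way to see why the exact test is preferred when a cell count is low.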

Answer 2

When two groups of data are compared, the t-test is used; there are two variations:
• Dependent (paired) samples t-test
  • Appropriate for comparing continuous, normally distributed data collected two times on the same subjects
  • Example: two time points measured in the same patient (e.g., before/after intervention)
  • Also appropriate for side-by-side comparison within the same subject or in matched pairs of subjects
  • Nonparametric equivalent: Wilcoxon signed rank test
• Independent samples t-test
  • Appropriate for comparing continuous, normally distributed data from two separate groups
  • Example: two groups of patients who received different treatments
  • Nonparametric equivalent: Mann-Whitney U test
• ANOVA is appropriate to compare three or more groups of continuous, normally distributed data.
  • Nonparametric equivalent: Kruskal-Wallis test
• Repeated measures ANOVA is a variation of the ANOVA test that is appropriate for sequential measurements recorded on the same subjects.
  • For example, this test would be used to compare a dependent variable (outcome measure) recorded at three or more time points (baseline, 1 month post intervention, 2 months post intervention).
  • Nonparametric alternative: Friedman test
• Multivariate ANOVA (MANOVA): variation of the ANOVA test that is used when multiple dependent variables are compared among three or more groups
• Analysis of covariance (ANCOVA) is an appropriate test when confounding factors must be accounted for in the statistical test.
• Post hoc testing is needed after a statistically significant ANOVA test to determine the exact location of differences among groups.
  • ANOVA tests describe only whether or not a statistically significant difference exists somewhere among the study groups.
  • For example, in a comparison of three levels of the independent variable treatment condition (A, B, or C), post hoc testing will specifically compare A vs. B, B vs. C, and A vs. C to determine the exact locations of group differences.
  • Post hoc testing is appropriate only if the ANOVA test is statistically significant (see later section).
  • Common post hoc tests: Tukey HSD, Šidák, Dunnett, Scheffé
• Factorial designs for multiple independent variables
  • A hypothesis about an interaction between time (pre/post intervention) and three different treatment groups calls for a 2 × 3 factorial design.
  • “2 × 3” indicates two independent variables; for example, the first (time) has two levels, pretest and posttest, and the second (treatment condition) has three levels, treatments A, B, and C.
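To make the ANOVA and post hoc ideas on this card concrete, here is a minimal sketch of the one-way ANOVA F statistic (between-group variance divided by within-group variance) and of the pairwise comparisons a post hoc test would examine. The scores for treatments A, B, and C are invented, the function name is hypothetical, and the sketch stops at the F statistic: turning it into a P value requires the F distribution, which is not computed here.

```python
from itertools import combinations
from statistics import mean

def one_way_anova_f(groups):
    """Return (F statistic, df between, df within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Invented outcome scores for three treatment conditions
scores = {"A": [1, 2, 3], "B": [2, 3, 4], "C": [5, 6, 7]}
f_stat, df_between, df_within = one_way_anova_f(list(scores.values()))
print(f_stat, df_between, df_within)

# Pairs a post hoc test would compare if the ANOVA were significant
post_hoc_pairs = list(combinations(scores, 2))
print(post_hoc_pairs)
```

The F statistic alone says only that a difference exists somewhere among the groups; the listed pairs are where a post hoc procedure such as Tukey HSD would look for it.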

Answer 3

Correlation coefficients
• Describe the strength of a relationship between two variables
• Pearson product-moment correlation coefficient (r) is used for continuous, normally distributed data.
• Spearman rho correlation coefficient (ρ) is the nonparametric equivalent.
• Values range from −1.0 to 1.0; absolute values less than 0.33 are “weak,” those between 0.33 and 0.66 are “moderate,” and those more than 0.66 are “strong.”
• Positive correlation coefficients indicate direct relationships, suggesting that patients who score high on one scale also score high on the other.
• Negative correlation coefficients indicate inverse/indirect relationships, suggesting that patients who score high on one scale score low on the other.
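The sketch below shows one way to compute both coefficients from this card with the standard library: Pearson r from the raw values and Spearman ρ as Pearson r applied to the ranks. The data are invented, the function names are hypothetical, and the strength labels simply encode the 0.33 and 0.66 cutoffs quoted above.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_rho(x, y):
    """Spearman rho: Pearson r computed on the ranks of the data."""
    return pearson_r(ranks(x), ranks(y))

def strength(coefficient):
    """Label a coefficient using the cutoffs quoted on the card."""
    size = abs(coefficient)
    if size < 0.33:
        return "weak"
    if size <= 0.66:
        return "moderate"
    return "strong"

# Invented scores on two scales; the relationship is monotonic but not linear
scale_1 = [1, 2, 3, 4, 5]
scale_2 = [1, 4, 9, 16, 25]
r = pearson_r(scale_1, scale_2)
rho = spearman_rho(scale_1, scale_2)
print(round(r, 3), round(rho, 3), strength(r))
```

Because the second scale rises steadily but not in a straight line, ρ reaches 1.0 while r stays slightly below it.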

Answer 4

Simple linear regression
• Describes the ability of one independent (predictor) variable to predict a dependent (outcome) variable
• The coefficient of determination (R²) is the square of r (Pearson product-moment correlation coefficient) and indicates the proportion of variance explained in one variable by another.
• R² ranges from 0 to 1.0, in which higher values indicate more variance explained.
• Multiple linear regression (often called multivariate linear regression) describes the ability of several independent variables to predict a dependent variable.
• Logistic regression is used when the outcome is categorical; the predictor variables can be categorical or continuous and need not be normally distributed.
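A short sketch of simple linear regression by ordinary least squares follows, mainly to show that R² computed from the residuals matches the square of Pearson r. The data points and the function name are invented for illustration.

```python
from math import sqrt

def simple_linear_regression(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared, r)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Proportion of variance in y explained by the fitted line
    ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    r_squared = 1 - ss_residual / syy
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r_squared, r

# Invented predictor and outcome values
predictor = [1, 2, 3, 4, 5]
outcome = [2, 4, 5, 4, 5]
slope, intercept, r_squared, r = simple_linear_regression(predictor, outcome)
print(round(slope, 2), round(intercept, 2), round(r_squared, 2), round(r ** 2, 2))
```

For these values the fitted line is roughly outcome = 2.2 + 0.6 × predictor, and both R² and r² come out near 0.6, meaning about 60% of the variance in the outcome is explained by the predictor.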

Answer 5

Agreement between measurements can be assessed using statistical techniques similar to correlation coefficients.
  □ The intraclass correlation coefficient evaluates agreement between two measures on the same scale.
▪ Accuracy/validity
  □ An instrument or test with the ability to accurately describe truth/reality is said to be valid.
  □ A validation study is designed to compare measures recorded from a gold-standard method with a new or experimental method. The data should be on the same measurement scale to determine agreement between the two instruments or techniques.
▪ Precision/reliability
  □ The ability to precisely describe a characteristic with repeated measurements can be tested statistically.
  □ The precision of an instrument or technique can be tested for interobserver reliability (measures taken by different examiners on the same patient) or intraobserver reliability (measures recorded by the same examiner at consecutive times). Measures should be on the same scale to determine agreement.
▪ The intraclass correlation coefficient (ICC) is a common statistical method for testing the agreement between two sets of data. Values range from 0 to 1.0 (1.0 = perfect accuracy/precision).
▪ For binary or categorical data, a κ (kappa) statistic can be used to determine agreement. The κ statistic is read on the same scale as the ICC (1.0 = perfect agreement); a value of 0 means agreement no better than chance, and negative values are possible when agreement is worse than chance.
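As an illustration of the κ statistic for categorical agreement, the sketch below assumes the card means Cohen's kappa for two raters: observed agreement corrected for the agreement expected by chance. The ratings and the function name are invented.

```python
def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters classifying the same subjects."""
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    # Proportion of subjects on which the raters gave the same category
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's category frequencies
    p_expected = sum((rater_a.count(c) / n) * (rater_b.count(c) / n)
                     for c in categories)
    return (p_observed - p_expected) / (1 - p_expected)

# Invented ratings: two examiners judging the same 10 patients
examiner_1 = ["yes"] * 5 + ["no"] * 5
examiner_2 = ["yes", "yes", "yes", "yes", "no", "no", "no", "no", "no", "yes"]
kappa = cohen_kappa(examiner_1, examiner_2)
print(round(kappa, 2))
```

Here the examiners agree on 8 of 10 patients (80%), but half of that agreement would be expected by chance alone, so κ is about 0.6 rather than 0.8.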

Answer 6

In the interpretation of a statistical test result, it is important to establish whether or not your findings (e.g., a difference or relationship) were due to chance. It is also extremely important to determine whether your findings have clinical importance.
□ Probability values (P values)
  • Inferential test statistics (t-statistic, F-statistic, r coefficient, etc.) are accompanied by a probability (P) value. These values range from 0 to 1.0 (0% to 100%) and indicate the probability that a difference/relationship at least as large as the one observed would occur by chance alone if the null hypothesis were true.
  • A P value less than 0.05 means that, if there were truly no effect, a difference/relationship this large would arise by chance alone less than 5% of the time.
  • A test is identified as statistically significant if the P value is 0.05 or less (willing to commit type I error 5/100 times).
  • Note: the decision regarding the threshold for defining statistical significance is arbitrary, but this amount of error (alpha or type I error [see later]) is generally accepted.
  • Therefore, on the basis of the P value, we either reject the null hypothesis, which stated that there were no differences or that no association existed (i.e., P ≤ 0.05), or fail to reject the null hypothesis (P > 0.05).
□ Bonferroni correction to the P value
  • Adjusted threshold for statistical significance when performing multiple t-tests for each of several dependent (outcome) variables (used to protect against type I error that may occur)
  • Calculated as 0.05/k, where k is the number of comparisons being made
  • For example, when two groups are compared using a t-test for each of three outcome variables, a t-test is statistically significant only if its P value is less than or equal to 0.05/3 = 0.017.
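The Bonferroni arithmetic on this card can be sketched in a few lines. The three P values are invented, and the function names are hypothetical; only the 0.05/k rule comes from the card.

```python
def bonferroni_threshold(alpha, k):
    """Adjusted significance threshold for k comparisons."""
    return alpha / k

def significant_after_bonferroni(p_values, alpha=0.05):
    """Flag each P value that is at or below the adjusted threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]

# Invented P values from t-tests on three outcome variables
p_values = [0.010, 0.030, 0.200]
threshold = bonferroni_threshold(0.05, len(p_values))
flags = significant_after_bonferroni(p_values)
print(round(threshold, 3), flags)
```

Note that the second P value (0.030) would count as significant against the usual 0.05 threshold but not against the adjusted threshold of about 0.017.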

Answer 7

The minimal clinically important difference (MCID) is a method to describe the importance of a difference observed in a statistical test.
□ The MCID describes the smallest change in a patient-oriented outcome measure that would be perceived as being beneficial to the patient or would necessitate treatment.
□ Many of the more commonly used patient-oriented outcome instruments have research-established MCID values, that is, a change in outcome that would change the course of a disease or its treatment.
□ Expert and experienced clinicians should also consider whether observed differences are important enough to change practice.

Answer 8

Effect size (e.g., Cohen’s d) is a standardized method of expressing the magnitude of differences between study groups, or in subjects before and after treatment, in units of the standard deviation (SD). (Effect size = 1 means that the mean difference equals the SD.) The larger the value, the greater the effect (e.g., of treatment).
□ Calculated as the mean difference (e.g., between two treatment groups or from pre- to posttreatment) divided by the SD (typically the SD pooled between groups or the SD of the reference/control group): effect size = mean difference / SD
  • Interpretation of effect size: effect sizes greater than 0.8 are “large”; those less than 0.2 are “small” (values between these can be interpreted as “medium”).
  • Effect sizes are similar to percentage differences, except the denominator is the SD. Therefore datasets that are highly variable may have lower effect sizes even if the mean difference is high.
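A minimal sketch of the effect size calculation follows, assuming Cohen's d with the SD pooled between two groups (one of the denominators the card mentions). The scores and function names are invented, and the labels encode the 0.2 and 0.8 cutoffs quoted above.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Cohen's d: mean difference divided by the pooled sample SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = ((n_a - 1) * variance(group_a)
                       + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)

def effect_label(d):
    """Label an effect size using the cutoffs quoted on the card."""
    size = abs(d)
    if size > 0.8:
        return "large"
    if size < 0.2:
        return "small"
    return "medium"

# Invented outcome scores for a treatment group and a control group
treatment = [6, 7, 8, 9, 10]
control = [4, 5, 6, 7, 8]
d = cohens_d(treatment, control)
print(round(d, 2), effect_label(d))
```

The mean difference here is 2 points and the pooled SD is about 1.58, so d is about 1.26; the same 2-point difference in a more variable dataset would give a smaller d.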

Answer 9

□ Type I error (alpha [α] error)
  • Probability that a statistical test is wrong when the null hypothesis is rejected (i.e., claiming that groups are different when they actually are not)
  • It is accepted that this may occur 5 times out of 100, so the probability value threshold for statistical significance is 0.05 or 5%.
□ Type II error (beta [β] error)
  • Probability that a statistical test is wrong when failing to reject the null hypothesis (i.e., claiming that two groups are NOT different when they actually are)
  • It is accepted that this may occur up to 20% of the time.

(35 cards)