Wk 11: Comparing Multiple Groups Flashcards

1
Q

What is a parametric test?

A

A test, such as the t-test, that is based on estimating parameters from the sample (e.g. the sample mean).

2
Q

What is a non-parametric test?

A

A test that compares the ranks of values instead of the values themselves.

  • Comparing ranks can be a more robust approach, just like the median is less affected by outliers than the mean.
  • Ordinal data (e.g. survey results) should use nonparametric tests.
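The robustness of ranks can be seen in a minimal sketch. The scores below are made up for illustration, and `rankdata` from SciPy is just one convenient way to compute ranks.

```python
import numpy as np
from scipy.stats import rankdata

# Hypothetical scores; the last value is an extreme outlier.
scores = np.array([3.0, 5.0, 4.0, 6.0, 100.0])

ranks = rankdata(scores)          # rank 1 = smallest, rank n = largest
mean_score = scores.mean()        # pulled upward by the outlier
median_score = np.median(scores)  # depends only on the middle value

# Making the outlier even more extreme changes the mean but not the ranks.
more_extreme = scores.copy()
more_extreme[-1] = 1000.0
ranks_unchanged = np.array_equal(rankdata(more_extreme), ranks)

print(ranks, mean_score, median_score, ranks_unchanged)
```

A rank-based test only sees the ordering, so it does not matter how far the largest value sits from the rest.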
3
Q

What is the Mann-Whitney U Test?

A

nonparametric version of the independent-samples T test

4
Q

What does the Mann-Whitney U Test test?

A

whether two distributions are the same.

5
Q

What is the null hypothesis in the Mann-Whitney U Test?

A

No difference between the distributions of the two groups (often summarised as no difference in location, e.g. medians, rather than means).

6
Q

What are the assumptions in the Mann-Whitney U Test?

A

The two distributions have the same shape and scale. It does not assume that the two distributions have the same location.

7
Q

What can affect the Mann-Whitney U Test? What can be done?

A

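Outliers. They strongly affect means, and although a rank-based test is less sensitive to them, they can still influence the result. One option is to remove the outliers and run the Mann-Whitney U test again to see whether the conclusion changes.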

8
Q

What is the interpretation of the Mann-Whitney U Test?

  • Sig.
  • P >0.05
  • P <0.05
A
  • Sig. is the P value.
  • P >0.05 is not significant: there is no evidence against the null hypothesis.
  • P <0.05 is significant: there is evidence against the null hypothesis, which supports the “effect” hypothesis.
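As a hedged illustration of reading the P value, here is a minimal sketch using `scipy.stats.mannwhitneyu`. The two groups and the 0.05 threshold are assumptions chosen for the example, not data from the course.

```python
from scipy.stats import mannwhitneyu

# Hypothetical independent samples (not from the course material).
group_a = [12, 15, 11, 18, 14, 13]
group_b = [22, 25, 19, 24, 21, 27]

# Two-sided test: are the two distributions the same?
result = mannwhitneyu(group_a, group_b, alternative="two-sided")
u_statistic, p_value = result.statistic, result.pvalue

alpha = 0.05  # conventional threshold, assumed here
significant = p_value < alpha

print(f"U = {u_statistic}, p = {p_value:.4f}, significant: {significant}")
```

Statistics packages such as SPSS report the same P value under the label "Sig.".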
9
Q

What is ANOVA?

A

uses F-tests to test the equality of means.

10
Q

What are 4 “steps” in ANOVA?

A
  1. First, measure the total variability in the response.
  2. Second, look at the variability within each group.
  3. If the within-groups variability is less than the total variability, then it suggests that knowing which group a person belonged to has given some information about them.
  4. This reduction in variability is called the between-groups variability.
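The four steps above can be sketched with sums of squares. The three groups are made-up numbers, and the variable names are illustrative only.

```python
import numpy as np

# Hypothetical responses for three groups.
groups = {
    "a": np.array([4.0, 5.0, 6.0]),
    "b": np.array([7.0, 8.0, 9.0]),
    "c": np.array([10.0, 11.0, 12.0]),
}

all_values = np.concatenate(list(groups.values()))
grand_mean = all_values.mean()

# Step 1: total variability around the overall mean.
ss_total = ((all_values - grand_mean) ** 2).sum()

# Step 2: variability within each group, around that group's own mean.
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups.values())

# Steps 3 and 4: the reduction is the between-groups variability.
ss_between = ss_total - ss_within

print(ss_total, ss_within, ss_between)
```

Here most of the total variability disappears once group membership is known, so the between-groups part is large.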
11
Q

What does ANOVA stand for?

A

Analysis of Variance

12
Q

What are 3 assumptions in ANOVA?

A
  1. Independent groups: Check study design.
  2. Normal variability within groups (normally distributed residuals): Check data, especially important for small samples.
  3. Equal variance across groups: Check Levene’s test, boxplots.
13
Q

What are _____ in F tests?

  • null hypothesis
  • F statistic
  • P-value
A
  1. Null hypothesis: All groups have the same mean.
  2. F statistic measures how different the groups are, relative to their variability.
  3. P-value is the probability of getting an F statistic at least as large as the one observed if the null hypothesis is true.
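A minimal sketch of an F test, assuming SciPy's `f_oneway` for a one-way ANOVA and using made-up data for three groups:

```python
from scipy.stats import f_oneway

# Hypothetical responses for three independent groups.
group_a = [4, 5, 6]
group_b = [7, 8, 9]
group_c = [10, 11, 12]

# Null hypothesis: all three groups have the same mean.
f_statistic, p_value = f_oneway(group_a, group_b, group_c)

print(f"F = {f_statistic:.2f}, p = {p_value:.4f}")
```

A large F means the group means differ a lot relative to the spread inside each group, which gives a small P value.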
14
Q

What is R2?

A
  • Variability is measured as the sum of squared deviations, compared under the proposed model (e.g. separate group means) vs. under the default explanation (one overall mean).
  • R2 (R-squared) tells you the percentage of total variability that is explained by the proposed model.
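A small worked sketch of R2, assuming the proposed model is "each group has its own mean" and the default explanation is "one overall mean". The data are invented for illustration.

```python
import numpy as np

# Hypothetical responses for two groups.
group_a = np.array([2.0, 4.0, 6.0])
group_b = np.array([6.0, 8.0, 10.0])
all_values = np.concatenate([group_a, group_b])

# Default explanation: everyone is predicted by the overall mean.
ss_total = ((all_values - all_values.mean()) ** 2).sum()

# Proposed model: everyone is predicted by their own group mean.
ss_within = ((group_a - group_a.mean()) ** 2).sum() + (
    (group_b - group_b.mean()) ** 2
).sum()

# Proportion of variability explained by the proposed model.
r_squared = 1 - ss_within / ss_total

print(f"R2 = {r_squared:.2f}")
```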
15
Q

What is the interpretation of ANOVA?

A
  1. First, look at the test of homogeneity of variance to check whether the groups have equal variance.
    • If P >0.05, there is no evidence against the null hypothesis (equal variance), so you can use ANOVA.
    • If P <0.05, there is evidence against the null hypothesis (unequal variance), so you need to use Welch’s ANOVA rather than normal ANOVA.
      • Or you can transform the data (e.g. using log) to try to get equal variance.
  2. Second, look at the ANOVA table.
    • Sum of squares measures variability (between groups, within groups, and total).
    • df is degrees of freedom (k-1 between groups, n-k within groups, n-1 total, for k groups and n observations).
    • Mean square = sum of squares / df
    • The F statistic is the ratio of sample variances (mean square between groups / mean square within groups).
    • P >0.05 suggests no significant difference between group means.
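The two-step reading above can be sketched as follows. This assumes SciPy's `levene` with `center="mean"` for the homogeneity check and rebuilds the ANOVA table by hand. The data and variable names are hypothetical.

```python
import numpy as np
from scipy.stats import levene, f

# Hypothetical responses for three independent groups.
groups = [
    np.array([23.0, 25.0, 21.0, 27.0, 24.0]),
    np.array([30.0, 28.0, 33.0, 29.0, 31.0]),
    np.array([22.0, 26.0, 24.0, 25.0, 23.0]),
]

# Step 1: test of homogeneity of variance (null: equal variances).
levene_stat, levene_p = levene(*groups, center="mean")

# Step 2: ANOVA table, built from sums of squares.
all_values = np.concatenate(groups)
n, k = all_values.size, len(groups)
ss_total = ((all_values - all_values.mean()) ** 2).sum()
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
ss_between = ss_total - ss_within

df_between, df_within = k - 1, n - k
ms_between = ss_between / df_between   # mean square = sum of squares / df
ms_within = ss_within / df_within
f_statistic = ms_between / ms_within   # ratio of the two mean squares
p_value = f.sf(f_statistic, df_between, df_within)

print(f"Levene p = {levene_p:.3f}")
print(f"F({df_between}, {df_within}) = {f_statistic:.2f}, p = {p_value:.5f}")
```

If the Levene P value were below 0.05, the F test in step 2 would not be appropriate and Welch's ANOVA would be used instead.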
16
Q

What do you first look at in interpretation of ANOVA?

A

First, look at the test of homogeneity of variance to check whether the groups have equal variance.

  • If P >0.05, there is no evidence against the null hypothesis (equal variance), so you can use ANOVA.
  • If P <0.05, there is evidence against the null hypothesis (unequal variance), so you need to use Welch’s ANOVA rather than normal ANOVA.
    • Or you can transform the data (e.g. using log) to try to get equal variance.
17
Q

What do you look at second in interpretation of ANOVA?

A

Second, look at the ANOVA table.

  • Sum of squares measures variability (between groups, within groups, and total).
  • df is degrees of freedom (k-1 between groups, n-k within groups, n-1 total, for k groups and n observations).
  • Mean square = sum of squares / df
  • The F statistic is the ratio of sample variances (mean square between groups / mean square within groups).
  • P >0.05 suggests no significant difference between group means.
18
Q

What are 2 things that these bars show?

A
  • Difference between bars (1st set of bars) suggests significant variability
  • Similar bars (3rd set of bars) suggests no significant variability
19
Q

What do these box plots show compared to bars?

A

Side by side boxplots show the variability even better.

20
Q

What are 3 steps in the interpretation of Welch’s ANOVA?

A
  1. First, look at the test of homogeneity of variance to check whether the groups have equal variance.
    • P <0.001 is very strong evidence of unequal variance, so you use Welch’s ANOVA.
  2. Second, skip the ANOVA table because you cannot use it.
  3. Third, look at Welch’s ANOVA.
    • P <0.001 is very strong evidence of a difference in means between the groups.
21
Q

What is normal variability? Why is it important? Small vs large sample?

A
  1. Checking normal variability is especially important for small samples.
  2. If the populations are non-normal, then the type 1 error rate might be inflated, meaning you will incorrectly reject the null hypothesis more often than you should.
    • Address this by using a stricter threshold for evidence (a lower significance level) or by increasing the sample size.
  3. In larger samples, skew does not matter as much because the sampling distribution of the mean approaches a normal distribution (central limit theorem).
22
Q

What is a residual?

A

A residual is a prediction error, or in other words, the difference between the observed response and the response predicted by our model (e.g. mean response).

  • e.g. One participant had a score of 9. The mean score is 5. So the residual is 9-5 = 4.
23
Q

What are 2 reasons for using residuals?

A
  1. check assumptions
  2. estimate within-group variability
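Both uses can be sketched in a few lines. The group names and scores are hypothetical, and the model here is simply "predict each person by their group mean".

```python
import numpy as np

# Hypothetical scores for two groups.
groups = {
    "control": np.array([3.0, 5.0, 7.0]),
    "treatment": np.array([8.0, 9.0, 13.0]),
}

# Residual = observed response - response predicted by the model (group mean).
residuals = {name: values - values.mean() for name, values in groups.items()}

# Use 1: check assumptions, e.g. compare the spread of residuals per group.
residual_sd = {name: r.std(ddof=1) for name, r in residuals.items()}

# Use 2: estimate within-group variability (the mean square within groups).
n_total = sum(v.size for v in groups.values())
k = len(groups)
ss_within = sum((r ** 2).sum() for r in residuals.values())
ms_within = ss_within / (n_total - k)

print(residuals, residual_sd, ms_within)
```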
24
Q

What do the black line and blue bubbles mean?

A
  • Black line = mean
  • Blue bubbles = residuals
25
Q

What does this mean?

A

Clear difference in variability between residuals

26
Q

The assumption of “normal variability” means the same as the _______. How can we check this?

A
  • Residuals having a normal distribution (with common standard deviation)
  • We can check the normality by making a normal Q-Q plot
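A minimal sketch of the Q-Q check, assuming SciPy's `probplot`. The residuals are simulated from a normal distribution, so this only illustrates what a "good" result looks like.

```python
import numpy as np
from scipy.stats import probplot

# Simulated residuals (an assumption for illustration, not real data).
rng = np.random.default_rng(0)
residuals = rng.normal(loc=0.0, scale=2.0, size=200)

# probplot pairs each ordered residual with its theoretical normal quantile
# and fits a straight line through the points.
(theoretical_q, ordered_residuals), (slope, intercept, r) = probplot(
    residuals, dist="norm"
)

# r close to 1 means the points lie close to a straight line.
print(f"Q-Q correlation r = {r:.3f}")
```

To see the plot itself, the two returned arrays can be drawn as a scatter plot with any plotting library.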
27
Q

What does this show?

A

Points falling along a straight line = the residuals are approximately normal.

28
Q

What is the purpose of the demographics table?

A

The two groups should ideally have identical characteristics before the intervention, so that we can conclude any difference in post-intervention measurements is due to the intervention, not due to underlying characteristics.