Statistics exam 4 Agresti Flashcards
Regression, non-parametrics, ANOVA
What is ANOVA and when is it used?
Analysis of variance
Comparing the means of a quantitative response variable across the groups of a categorical explanatory variable
What is the difference between a one-, two-, and three-way ANOVA?
One: 1 independent variable in a between-groups design
Two: 2 independent variables in a factorial design (e.g. 2x2)
Three: 3 independent variables in a factorial design (e.g. 2x3x3)
What is the difference between variability between and within?
Between: how far the group means (the tops of the distributions) lie from each other
Within: how far the observations within one group spread around their own group mean
What does var. between > var. within mean?
F is larger than 1, which is evidence of a true difference between the group means (how strong depends on the p-value)
What type of distribution is used for ANOVA and how does it look?
F-distribution
- One right tail
- Large F = small p-value
What are the assumptions for an ANOVA test?
- Quantitative response variable measured in more than 2 groups
- Independent random sampling
- Equal standard deviations (largest sd < 2x smallest sd)
- Normally distributed population in each group
- Equal n (for now)
What do the hypotheses for ANOVA look like?
H0: mu1 = mu2 = … = mu g
HA: at least 2 population means are different
What are the steps for calculating F statistic in ANOVA test?
- Calculate within variability
- Calculate between variability
- Fill both into the F statistic formula: F = between variability / within variability
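The three steps above can be sketched in Python. This is a minimal illustration, assuming a one-way design; the function name `one_way_anova_f` and the scores are made up for the example and are not from the course material.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df1, df2) for a list of groups of numeric observations."""
    g = len(groups)                                  # number of groups
    n_total = sum(len(grp) for grp in groups)        # total sample size N
    grand_mean = mean(x for grp in groups for x in grp)

    # Between-groups sum of squares: group means around the grand mean
    ss_between = sum(len(grp) * (mean(grp) - grand_mean) ** 2 for grp in groups)
    # Within-groups sum of squares: observations around their own group mean
    ss_within = sum((x - mean(grp)) ** 2 for grp in groups for x in grp)

    df1, df2 = g - 1, n_total - g
    ms_between = ss_between / df1                    # between variability
    ms_within = ss_within / df2                      # within variability
    return ms_between / ms_within, df1, df2

# Hypothetical scores for three groups of four people each
scores = [[4, 5, 6, 5], [7, 8, 6, 7], [9, 10, 8, 9]]
f_stat, df1, df2 = one_way_anova_f(scores)
print(f_stat, df1, df2)
```

For these made-up scores the between mean square is 16 and the within mean square is 6/9, so F is about 24 with df1 = 2 and df2 = 9.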
How do you calculate the p-value in ANOVA testing?
1 - F.DIST(F; df1; df2; TRUE)
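The spreadsheet formula above returns the right-tail area of the F distribution. As a rough stdlib-only sketch of the same quantity, the tail can be approximated by numerically integrating a beta density. The function name `f_right_tail_p` and the `steps` parameter are illustrative assumptions, and the sketch assumes F > 0 and df2 >= 2. In practice a statistics library (for example `scipy.stats.f.sf`) would normally be used instead.

```python
import math

def f_right_tail_p(f_stat, df1, df2, steps=2000):
    """Approximate P(F > f_stat) for an F distribution with (df1, df2) df.

    Assumes f_stat > 0 and df2 >= 2 so the integrand stays finite.
    """
    a, b = df1 / 2, df2 / 2
    # The F right tail equals the area of a Beta(a, b) density from x to 1
    x = df1 * f_stat / (df1 * f_stat + df2)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    def density(t):
        # Unnormalized Beta(a, b) density
        return t ** (a - 1) * (1 - t) ** (b - 1)

    # Simpson's rule on [x, 1]; steps must be even
    h = (1 - x) / steps
    total = density(x) + density(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(x + i * h)
    return total * h / 3 / math.exp(log_beta)

# Hypothetical result: F = 3.0 with df1 = 2 and df2 = 12
print(f_right_tail_p(3.0, 2, 12))
```

For df1 = 2 the tail has the closed form (1 + 2F/df2)^(-df2/2), which gives a quick way to check the approximation.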
What is the conclusion if p < alpha in ANOVA test?
At least 2 groups differ, but you don’t know which ones
What are MS and SS?
MS: mean square = SS / df. MSg is the variability between groups, MSe the variability within groups
SS: sum of squares = MSg times df1, or MSe times df2
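What is the Fisher method in ANOVA?
The confidence intervals for the difference between each pair of group means after an ANOVA. If you have 3 groups, you have 3 intervals
These intervals tend to be narrower than separate two-sample t intervals, because the pooled standard deviation from all groups gives more degrees of freedom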
Why would you use the Fisher method instead of running three separate t-tests?
Running separate t-tests capitalizes on chance. With every extra test, the chance of at least one type I error rises above alpha
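How quickly the type I error risk grows can be sketched with a small calculation. This assumes the tests are independent, which is a simplification (pairwise t-tests on the same groups are not fully independent); the function name `familywise_error` is made up for the illustration.

```python
def familywise_error(alpha, n_tests):
    """Chance of at least one type I error across n_tests independent tests."""
    return 1 - (1 - alpha) ** n_tests

# Risk of at least one false positive at alpha = 0.05 for 1, 3, 6 and 10 tests
for k in (1, 3, 6, 10):
    print(k, round(familywise_error(0.05, k), 3))
```

With three tests at alpha = 0.05 the risk is already about 0.143 under this assumption.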
What is the Bonferroni method?
Adjusted alpha = overall alpha / number of tests (K)
It corrects for the capitalization on chance that comes from running t-tests over and over again
What is an alternative to the Bonferroni method?
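The Bonferroni division can be written out as a short sketch. It assumes every pair of groups is compared exactly once, so K is the number of pairs; the function name `bonferroni_alpha` is hypothetical.

```python
from math import comb

def bonferroni_alpha(overall_alpha, n_groups):
    """Per-test alpha when every pair of groups is compared once."""
    k = comb(n_groups, 2)              # number of pairwise tests, K
    return overall_alpha / k, k

# Three groups give three pairwise comparisons
per_test_alpha, k = bonferroni_alpha(0.05, 3)
print(k, per_test_alpha)
```

With 3 groups and an overall alpha of 0.05, each of the 3 comparisons is tested at roughly 0.0167.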
Tukey method
When do you use non-parametric tests?
When the conditions for the central limit theorem aren’t met because the groups are too small and the data are not normally distributed
How do you deal with ties in non-parametric tests?
Average the ranks the ties would get
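Averaging the ranks of tied values can be sketched as follows. The helper name `average_ranks` is made up for this illustration; rank 1 goes to the smallest value.

```python
def average_ranks(values):
    """Rank values from 1 (smallest), giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value equals the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + 1 + end + 1) / 2   # mean of positions start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks

print(average_ranks([10, 20, 20, 30]))  # [1.0, 2.5, 2.5, 4.0]
```

The two tied values would have taken ranks 2 and 3, so each receives 2.5.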
What are the three types of non-parametric tests and when do you use them?
- Wilcoxon: non-parametric t-test for comparing 2 independent groups
- Kruskal-Wallis: non-parametric ANOVA for comparing more than 2 independent groups (between-groups design)
- Sign test: non-parametric paired t-test for paired observations (dependent samples, pre/posttest design, matched individuals)
What are the assumptions for the Wilcoxon test?
- Observations can be rank ordered
- 2 independent samples
- No assumptions regarding the distribution
What do the hypotheses for the Wilcoxon test look like?
H0: equal expected values for sample mean ranks and identical population distribution
H1: different expected values for sample mean ranks (two sided)
H1: higher/lower expected values for sample mean ranks (one sided)
What distribution can you use for samples larger than 20 in a Wilcoxon test? What do you have to do in other cases?
Use the z distribution if n > 20
In other cases: W = mean rank (treatment) - mean rank (control). Read the P-value from the exact sampling distribution of W
What is the sample space in the Wilcoxon test? What is assumed about these possibilities under H0?
All possible rank combinations.
All these possibilities are equally likely under H0
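For small samples, the sample space can be enumerated directly. The sketch below lists every way the ranks could be split over the two groups, treats each split as equally likely under H0, and counts how often the difference in mean ranks is at least as large as the observed one (a one-sided P-value). The function name `wilcoxon_exact_p` and the ranks are hypothetical.

```python
from itertools import combinations
from statistics import mean

def wilcoxon_exact_p(treatment_ranks, control_ranks):
    """One-sided exact P-value: P(difference in mean ranks >= observed) under H0."""
    all_ranks = list(treatment_ranks) + list(control_ranks)
    n_treat = len(treatment_ranks)
    observed = mean(treatment_ranks) - mean(control_ranks)

    count = total = 0
    # Sample space: every way to pick which ranks belong to the treatment group
    for picked in combinations(range(len(all_ranks)), n_treat):
        treat = [all_ranks[i] for i in picked]
        ctrl = [all_ranks[i] for i in range(len(all_ranks)) if i not in picked]
        total += 1
        # Small tolerance guards against floating-point rounding in the means
        if mean(treat) - mean(ctrl) >= observed - 1e-12:
            count += 1
    return count / total

# Hypothetical ranks: the treatment group holds the three highest of six ranks
print(wilcoxon_exact_p([4, 5, 6], [1, 2, 3]))  # 0.05
```

There are 20 ways to choose 3 ranks out of 6, and only one of them is as extreme as the observed split, so the P-value is 1/20.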
What distribution does the Kruskal-Wallis test use?
Chi-square distribution (df = number of groups - 1)
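A minimal sketch of the Kruskal-Wallis statistic, using the standard formula H = 12 / (N(N + 1)) * sum(R_i^2 / n_i) - 3(N + 1), where R_i is the rank sum of group i. The sketch assumes there are no tied values; the function name `kruskal_wallis_h` and the data are made up. With exactly 3 groups the chi-square distribution has 2 degrees of freedom, and its right tail is simply exp(-H / 2).

```python
import math

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H for groups of observations, assuming no tied values."""
    pooled = sorted(x for grp in groups for x in grp)
    rank_of = {value: i + 1 for i, value in enumerate(pooled)}  # valid without ties
    n_total = len(pooled)
    # Sum over groups of (rank sum squared / group size)
    weighted = sum(sum(rank_of[x] for x in grp) ** 2 / len(grp) for grp in groups)
    return 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

# Hypothetical data: three groups of three observations, no ties
data = [[1.2, 2.3, 3.1], [4.0, 5.5, 6.1], [7.4, 8.8, 9.9]]
h = kruskal_wallis_h(data)
# Three groups, so df = 2 and the right-tail area is exp(-h / 2)
p_value = math.exp(-h / 2)
print(h, p_value)
```

For these made-up data the rank sums are 6, 15 and 24, which gives H of about 7.2 and a P-value of about 0.027.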
What are the assumptions for the sign test?
- Small n, not normally distributed
- Random sampling
- Unequal values within each pair (no equal pre/posttest values)
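A sketch of an exact two-sided sign test for pre/posttest pairs. Under H0 each difference is positive with probability 0.5, so the count of positive differences follows a binomial distribution. Dropping pairs with equal values is an assumption made here to satisfy the last condition above; the function name `sign_test_p` and the scores are hypothetical.

```python
from math import comb

def sign_test_p(pre, post):
    """Two-sided exact sign test P-value for paired observations."""
    # Keep only pairs whose values differ (assumption: tied pairs are dropped)
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    n = len(diffs)
    n_pos = sum(d > 0 for d in diffs)
    # Binomial(n, 0.5) tail for the more frequent sign, doubled for two sides
    k = max(n_pos, n - n_pos)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical pre/posttest scores for 8 matched individuals
pre = [10, 12, 9, 14, 11, 13, 8, 10]
post = [12, 15, 10, 13, 14, 16, 11, 12]
print(sign_test_p(pre, post))  # 0.0703125
```

Seven of the eight differences are positive, so the two-sided P-value is 2 * 9/256.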