Midterms material Flashcards
What are the null hypothesis and alternative hypothesis for a one-way ANOVA?
H0 :μ1 =μ2 =μ3
H1 : Not all μ’s are the same
What’s a factor in a one-way ANOVA?
the independent variable
What are the levels in a one-way ANOVA?
The different groups/treatment and control conditions
What are the assumptions of a one-way ANOVA?
- The population distribution of the DV is normal within each group
- The variance of the population distributions are equal for each group (homogeneity of variance assumption)
- Independence of observations
What’s the familywise Type 1 error rate?
The probability of making at least one Type 1 error in the family of tests if the null hypotheses are true
What’s a family of tests?
a set of related hypotheses
What does the Overall F-test or first test of ANOVA tell us?
- The overall F-test evaluates whether H0 is false
- If the overall F-test is significant, then we use post-hoc tests to look at pairs of groups
What kind of ratio does ANOVA give us?
- F ratio
- ANOVA gives us a ratio of variance due to group membership over variance that is not explained by group membership (MSm divided by MSr)
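The MSm/MSr ratio above can be sketched in a few lines of Python. This is a minimal illustration with made-up data for three groups; scipy's f_oneway serves as a check on the hand computation.

```python
import numpy as np
from scipy import stats

# Made-up illustration data: three groups of four scores each
groups = [np.array([4.0, 5.0, 6.0, 5.0]),
          np.array([7.0, 8.0, 6.0, 7.0]),
          np.array([9.0, 10.0, 8.0, 9.0])]

all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()
k = len(groups)                      # number of groups
N = all_scores.size                  # total observations

# SS_M: between-group variation (variance explained by group membership)
ss_m = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# SS_R: within-group (residual) variation
ss_r = sum(((g - g.mean()) ** 2).sum() for g in groups)

ms_m = ss_m / (k - 1)                # MS_M, with df_M = k - 1
ms_r = ss_r / (N - k)                # MS_R, with df_R = N - k
f_by_hand = ms_m / ms_r

f_scipy, p = stats.f_oneway(*groups)
print(round(f_by_hand, 4), round(f_scipy, 4))  # the two F values agree
```

The hand-computed ratio and scipy's result match, which is a useful sanity check when learning the formulas.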
What is variance explained by the model (MSm)?
Between-group variance that is due to the IV, or different treatments/levels of a factor -> variance accounted for by group membership
What is residual variance (MSr)?
- Within-group variance that can’t be accounted for by group membership
- Within each group, there is some random variation in the scores for the subjects
How are the F statistic and degrees of freedom presented?
F (dfM, dfR) = x
What kind of distribution is the F distribution?
A right-skewed distribution used most commonly in ANOVA
When can you reject the null hypothesis in an ANOVA test?
If your F value is greater than or equal to the critical value, you may reject the null hypothesis
How does the F ratio relate to the t statistic?
- With only two groups, either a t test or an F test can be used for testing for a significant difference between means
- Both procedures lead to the same conclusion
- When the number of groups is 2, then F = t^2
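The F = t^2 relationship with two groups can be verified directly. The two samples below are made up for illustration; the equal-variances t test (scipy's default) is the one that matches the ANOVA.

```python
import numpy as np
from scipy import stats

# Made-up scores for two groups
g1 = np.array([4.4, 5.1, 3.9, 4.8, 5.0])
g2 = np.array([6.2, 5.9, 6.5, 6.1, 5.8])

t, p_t = stats.ttest_ind(g1, g2)     # assumes equal variances by default
f, p_f = stats.f_oneway(g1, g2)

print(round(t ** 2, 6), round(f, 6))   # t squared equals F
print(round(p_t, 6), round(p_f, 6))    # identical p-values
```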
In ANOVA formula, what does X-bar stand for?
The grand mean (across all observations)
In ANOVA formula, what does i stand for?
An observation (coming from N total observations)
In ANOVA formula, what does g stand for?
A group
In ANOVA formula, what does k stand for?
Total number of groups
In ANOVA formula, what does Ng stand for?
Size of group g
In ANOVA formula, what does Xbar-g stand for?
Group mean
In ANOVA formula, what does Xig stand for?
Observation i in group g
What does SSt stand for?
The aggregate variation/dispersion of individual observations across groups
What are MST , MSM , and MSR often called?
the total, model (between-group), and residual (within-group) Mean Squares, respectively
Which effect size is more commonly reported in ANOVA?
η2 (eta squared)
What do the effect sizes (pearson R, eta squared and omega squared) all look for?
Proportion of variance in the DV that is explained by the IVs
What’s the difference between
eta squared and omega squared?
- η2 is positively biased (overestimates the amount of variance explained in the DV by the IVs)
- ω2 is unbiased
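The bias difference can be seen numerically. A sketch with made-up data, using the standard one-way formulas η2 = SS_M / SS_T and ω2 = (SS_M − df_M · MS_R) / (SS_T + MS_R):

```python
import numpy as np

# Made-up illustration data: three groups of four scores each
groups = [np.array([4.0, 5.0, 6.0, 5.0]),
          np.array([7.0, 8.0, 6.0, 7.0]),
          np.array([9.0, 10.0, 8.0, 9.0])]
all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()
k, N = len(groups), all_scores.size

ss_m = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_r = sum(((g - g.mean()) ** 2).sum() for g in groups)
ss_t = ss_m + ss_r
ms_r = ss_r / (N - k)

eta_sq = ss_m / ss_t                                   # positively biased
omega_sq = (ss_m - (k - 1) * ms_r) / (ss_t + ms_r)     # bias-corrected

print(round(eta_sq, 3), round(omega_sq, 3))  # omega_sq < eta_sq
```

ω2 comes out smaller than η2 on the same data, which is the bias correction at work.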
What are the cut-offs for the effect size of
omega squared?
- Small ≈ .01
- Medium ≈ .06
- Large ≈ .14
- Report ω2, even if it’s negative
What does fully-crossed mean in a factorial design?
That the factor levels are multiplied by each other (ex: factor 1 has 3 levels and factor 2 has 3 levels then it’s a 3x3 factorial design with 9 treatment conditions)
What elements should be included in the APA style analysis conclusion (in order)?
- 1-2 sentence overview of analyses that includes the independent and dependent variable, stated conceptually.
- Description of overall results of the F-test, in a particular format, including effect size measure
- Description of the pattern of mean differences among groups, including whether significant differences were found (M for mean and SD for standard dev) -> when working with 3 groups ANOVA test, we’ll have to conduct post-hoc tests to evaluate which pairs of groups have significant mean differences
- A conceptual conclusion
Provide an example of what elements should be included in the APA style analysis conclusion (in order)?
- To investigate whether level of fitness (low versus high) had an effect on ego strength (with higher scores indicating more ego strength), we conducted a one-way between-subjects ANOVA
- This analysis revealed a significant effect of fitness on ego strength, F(1, 8) = 5.32, p < .05, ω2 = .61
- Participants in the low fitness group (M = 4.40, SD = 0.92) had significantly lower ego strength than those in the high fitness group (M = 6.36, SD = 0.55)
- We conclude that having high as opposed to low fitness may increase ego strength
How to report numbers in APA format?
- 2 decimal places
- 3 decimal places for p-values
True or False: with two groups the results of an independent samples t-test and a between-subjects ANOVA on the same data set will always agree
FALSE: they could disagree if they use different values of α
What are assumptions of a single mean z-test?
- The variable, X, in the population is normally distributed
- The sample must be a simple random sample of the population (independence of observations)
- The population standard deviation, σ, must be known
What are the effect size cut-offs for r?
0.10 -> small effect
0.30 -> medium effect
0.50 -> large effect
What does a 95% Confidence interval mean?
If we repeated our experiment many times, 95% of the resulting 95% CIs would contain the true effect
What does the p-value represent?
The p-value represents the proportion of data sets that would yield a result as extreme or more extreme than the observed result if H0 is true
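This "proportion of data sets" idea can be made concrete by simulation. A sketch for a one-sample z test (the null mean 0, σ = 1, n = 25, and the observed z of 2.0 are all made-up values): generate many data sets under H0 and count how often the simulated statistic is at least as extreme as the observed one.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n, n_sims = 25, 20000
z_obs = 2.0                                   # a made-up observed z statistic

# Simulate many data sets under H0 (mu = 0, sigma = 1) and compute z for each
sims = rng.normal(0.0, 1.0, size=(n_sims, n))
z_sim = sims.mean(axis=1) / (1.0 / np.sqrt(n))

p_sim = np.mean(np.abs(z_sim) >= z_obs)       # two-tailed proportion
p_exact = 2 * (1 - stats.norm.cdf(z_obs))     # theoretical p-value
print(round(p_sim, 3), round(p_exact, 3))     # close to each other
```

The simulated proportion approximates the theoretical two-tailed p-value.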
What are the effect size cut-offs for r squared?
0.01 -> small
0.09 -> medium
0.25 -> large
What are the effect size cut-offs for cohen’s d?
0.2 -> small
0.5 -> medium
0.8 -> large
What are the assumptions in between subjects ANOVA?
- Independence of observations
- Identical distribution (within group)
- Identical distribution (between groups)
- Homogeneity of variance
- Normal Distribution
Describe the formula Yij =μ+αj +Eij
- Formula describing the linear model underlying everything we do in ANOVA
- Yij = person i’s score on the outcome Y, where person i belongs to group j -> Y is the dependent variable
- Eij -> experimental error - something that allows individual scores of people in that population to vary from this group mean (assumed to be normal)
- Eij is random, but mu + alpha-j is fixed for every member of that population
- In this equation, mu + alpha-j is constant for every person in the population (one population = one mean)
The assumptions about normality and equal variances are assumptions about what?
- The population
- Usually we can examine the sample for evidence about whether these assumptions hold
What are some methods for Assessing Normality?
Descriptive and Inferential Statistics:
- Looking at the mean, median, mode
- Tests for skewness (testing whether skewness is significant -> normal distribution has skew of 0, any type of skewness means that the distribution isn’t perfectly normal)
- Kolmogorov-Smirnov and Shapiro-Wilk tests
Visual methods:
- Histograms
- Normal Quantile (Q-Q) Plot
Describe tests for skewness when assessing normality
- Skewness represents symmetry and whether the distribution has a long tail in one direction
- Left (negative) skew = Mean < Median
- Symmetric (normal) = Mean = Median
- Right (positive) skew = Median < Mean
- Skewness should be ~0
- Skewness > 0: positive/right skew (longer right-hand tail)
- Skewness < 0: negative/left skew (longer left-hand tail)
- Also look at standard errors (SE skewness)
- Conducting a significance test for whether skewness is significantly different from 0
- To compute this, we divide the estimate of skewness of our variable by its standard error, and compare the resulting ratio against a value of 3.2 in absolute value
- Reject the null hypothesis that skew is 0 in the population if the ratio tskewness is greater than 3.2 in absolute value
- Here we don’t want to reject the null hypothesis because rejecting it would mean we have found evidence that our scores aren’t normally distributed
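The skewness ratio described above can be sketched with scipy. The data are made-up right-skewed scores, and the standard-error formula used here is the common large-sample one, SE = sqrt(6n(n − 1) / ((n − 2)(n + 1)(n + 3))):

```python
import numpy as np
from scipy import stats

# Made-up, visibly right-skewed scores
x = np.array([1, 1, 2, 2, 2, 3, 3, 4, 5, 9, 12, 15], dtype=float)
n = x.size

skew = stats.skew(x, bias=False)                      # sample skewness (G1)
se_skew = np.sqrt(6 * n * (n - 1) /
                  ((n - 2) * (n + 1) * (n + 3)))      # SE of skewness
t_skew = skew / se_skew                               # ratio to compare to 3.2

print(round(skew, 2), round(se_skew, 2), round(t_skew, 2))
print("evidence against normality" if abs(t_skew) > 3.2
      else "retain the null of zero skew")
```

Here the skewness is clearly positive, but with only n = 12 the standard error is large, which illustrates why small samples rarely produce a significant skewness test.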
What’s the more unbiased estimate of central tendency?
Median, rather than the mean
What are the statistical tests of normality?
- The Kolmogorov-Smirnov (K-S) test
- The Shapiro-Wilk (S-W) test
- If a test is significant, reject the null hypothesis that the distribution of the variable is normal
What’s the Kolmogorov-Smirnov (K-S) test?
- Very general, but usually less power than Shapiro-Wilk (S-W) test
- Conceptually, compares sample scores to a set of scores generated from e.g., a normal distribution with the sample mean and standard deviation
- Used to see if the scores on your variable follow any distribution you think they follow
- Conceptually, this test takes your observed scores on the variable and it compares them to quantiles from this reference distribution you’re trying to assess whether it’s appropriate for your data
- If there are large departures from the quantiles from the reference distribution and your observed scores -> this would be evidence against your scores following the distribution you think they follow
What’s the Shapiro-Wilk (S-W) test?
- Usually more powerful, but only for normal distributions
- Follows a similar logic to the Kolmogorov-Smirnov (K-S) test
What are limitations of the normality tests and solutions to overcome these?
- It’s easy to find significant results (reject null hypothesis that data is normal) when sample size is large
- Same with skewness tests -> as the sample size gets larger, SE gets smaller and with smaller SE, you’re more likely to get a t ratio value larger than 3.2, even with small values of skewness
- Solution: do the tests, but plot data as well and examine the histogram for evidence of multimodality, extreme scores (outliers), and asymmetry
- More than one mode is evidence of deviation from normality
Describe the use of histograms to assess normality
- Create separate histograms for each group to assess normality
- Look for obvious signs of non-normality
- Doesn’t have to be perfect, just roughly symmetric
- Multiple modes may suggest that there are different subpopulations in the sample
- If that’s the case, include a classification variable as an additional factor in the ANOVA
Describe the use of normal quantile plot (or normal probability plot or Normal Q-Q plot) to assess normality
1. Sort the observations from smallest to largest and compute the percentile rank for each score (what percentage of scores are below score X?)
2. Calculate the theoretical (expected) z-scores from the percentile ranks (if the scores were normal, what would the z-score be?)
3. Calculate the actual (observed) z-scores
4. Plot the observed vs. theoretical z-scores
- We get percentiles from the z-distribution and see how much our observed z-scores deviate from the percentiles of the normal distribution
- If the data are close to normal, the points will lie close to a straight line
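The Q-Q computation (minus the plot itself) can be sketched as follows. The data are made-up draws from a normal distribution, and (i − 0.5)/n is one common plotting-position convention for the percentile ranks:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = np.sort(rng.normal(loc=50, scale=10, size=40))   # made-up normal data
n = x.size

# Percentile rank of each sorted score, using the (i - 0.5)/n convention
perc = (np.arange(1, n + 1) - 0.5) / n
theoretical_z = stats.norm.ppf(perc)                 # expected z if normal
observed_z = (x - x.mean()) / x.std(ddof=1)          # actual z-scores

# For roughly normal data, observed and theoretical z-scores line up
r = np.corrcoef(observed_z, theoretical_z)[0, 1]
print(round(r, 3))
```

Plotting observed_z against theoretical_z would give the straight-line pattern described above; the correlation near 1 is the numerical version of that.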
What do violations of the assumption of normality lead to?
- Non-normality tends to produce Type I error rates that are lower than the nominal value
- Depending on the context of the research study, this may be less concerning than an assumption violation that results in excessive Type I error rates (above the nominal value α)
- When we select an alpha of say .05, we’re saying that if the null hypothesis is true, 5% of our findings in the long run will be false positives
- If you don’t meet the assumption of normality and you pick an alpha level of .05 -> less than 5% of your results in the long run will be false positives if the null hypothesis is true
- This means you have lower power to detect differences if there is an effect in the population
- A consequence of the violation of the assumption of normality is that you might miss some effects (not inflating type 1 error rate but you are decreasing your power)
Type 1 error rate and what go hand in hand?
Type 1 error rate and power go hand in hand (as one increases so does the other)
What’s the assumption of homogeneity of variance?
Assuming that all of the group variances are equal
What does violation of the assumption of homogeneity of variance lead to?
- Serious violation of this assumption tends to inflate the observed value of the F statistic
- Too many rejections of H0 = high Type I error
- This is a more problematic assumption because if you violate this assumption, you will inflate your type 1 error rates
- If you select an alpha of .05, but your assumption of homogeneity of variance is not met, you may end up with more than 5% of false positives if the null hypothesis is true
What are the different tests that assess homogeneity of variance?
- The Fmax test of Hartley
- Levene’s test
- Brown and Forsythe test
What’s the Fmax test of Hartley?
- Fmax = ratio of largest group variance to the smallest group variance
- Calculate the sample variance for each group, and find the largest and smallest variances
- Compute Fmax = max(s2g) / min(s2g′)
- The observed Fmax value is compared against a critical value of this statistic
- If the assumption of homogeneity of variance is satisfied, Fmax ratio would be close to 1
- If the observed value of Fmax exceeds the critical value, we conclude that we have to reject the null hypothesis and the assumption is not met
- Easy to compute, but assumes that each group has an equal number of observations
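The Fmax computation itself is a one-liner; the comparison against Hartley's tabled critical value is left as a lookup. Data below are made up, with equal group sizes as the test requires:

```python
import numpy as np

# Made-up data: three groups of equal size (required by Hartley's test)
groups = [np.array([4.0, 5.0, 6.0, 5.0]),
          np.array([7.0, 9.0, 5.0, 7.0]),
          np.array([9.0, 10.0, 8.0, 9.0])]

variances = [g.var(ddof=1) for g in groups]   # sample variance per group
f_max = max(variances) / min(variances)       # largest over smallest

print([round(v, 3) for v in variances], round(f_max, 3))
# Compare f_max against the critical value from Hartley's table
# (k = 3 groups, df = n - 1 = 3 per group); a ratio near 1 supports
# homogeneity of variance
```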
What’s Levene’s test?
- Measures how much each score deviates from its group mean
- Zij = |Yij − Ybarj|
- Instead of using the original scores Yij to run the ANOVA, you use the absolute deviation scores Zij
- If we retain the null hypothesis, we can conclude that the assumption of homogeneity of variance is met
- The downside of this test is that it’s easier to obtain a significant F-ratio for this ANOVA when your sample size is large
What’s the Brown-Forsythe test?
- It measures how much each score deviates from its group median
- The median is less influenced by outliers than the mean and isn’t pulled by a skewed variable
- Zij =|Yij −Mdj|
- Instead of using the original scores Yij to run the ANOVA, you use the absolute deviation scores Zij
- For both the Levene and Brown-Forsythe tests a statistically significant finding (e.g., p ≤ .05) leads to the conclusion that the variances are significantly different across groups (i.e., the assumption of homogeneity of variance is not met)
- The Brown-Forsythe test is slightly more robust than Levene’s test
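In scipy, both tests are available through the same function: stats.levene with center='mean' is Levene's original test, and center='median' is the Brown-Forsythe variant. A sketch with made-up data:

```python
import numpy as np
from scipy import stats

# Made-up data for three groups
g1 = np.array([4.0, 5.0, 6.0, 5.0, 4.5])
g2 = np.array([7.0, 9.0, 5.0, 7.0, 11.0])
g3 = np.array([9.0, 10.0, 8.0, 9.0, 9.5])

w_lev, p_lev = stats.levene(g1, g2, g3, center='mean')     # Levene
w_bf, p_bf = stats.levene(g1, g2, g3, center='median')     # Brown-Forsythe

print(round(p_lev, 3), round(p_bf, 3))
# p <= .05 would mean the homogeneity-of-variance assumption is not met
```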
For both the Levene and Brown-Forsythe tests a statistically significant finding (e.g., p ≤ .05) leads to what conclusion?
That the variances are significantly different across groups (i.e., the assumption of homogeneity of variance is not met)
Which test is recommended more than the other: Brown-Forsythe test or Levene’s test?
Brown-Forsythe test is recommended over the Levene’s test
What are the 5 assumptions in ANOVA?
- Independence of observations (random sampling)
- Identical distribution (within groups) (random sampling)
- Identical distribution (between groups)
- Homogeneity of variance
- Normal distribution
What kind of statistical test only has one mean?
- z-test
- One-sample t-test
What kind of statistical test has 2 means and one factor?
Independent samples t-test
What kind of statistical test has more than 2 means and one factor?
One-way ANOVA
What kind of statistical test has more than 2 means and 2 factors?
Two-way ANOVA
What are the null hypotheses we need to find with a 2-way ANOVA?
Main effect of Factor A
Main effect of Factor B
Interaction between Factor A and B
What would a 3x4 set-up mean for a 2-way ANOVA?
3 levels in Factor A
4 levels in Factor B
What are factorial designs?
- Factorial designs are those in which factors are completely crossed
- They contain all possible combinations of the levels of factors
Ex: when each factor has 3 levels, it is called a 3 × 3 factorial design, resulting in 9 treatment combinations
What does it mean for a design to be fully crossed?
Every level of factor A is combined with every level of factor B
What’s a balanced design?
When sample sizes are equal in each condition
What do Factors represent?
The independent variables
What’s represented in the cells of the 2 × 2 factorial design of a 2-way ANOVA?
Means of all subjects within each cell are displayed
How many effects comprise a two-way factorial experiment?
2 main effects
An interaction effect
What are main effects?
- The effect of one factor when the other factor is ignored (by averaging the means over all levels of the other factor)
- Consists of the differences among marginal means for a factor
What’s the interaction effect?
- The extent which the effect of one factor depends on the level of the other factor
- An interaction is present when the effects of one factor on the DV change at different levels of the other factor
- The presence of an interaction indicates that the main effects alone do not fully describe the outcome of a factorial experiment
- Sometimes called a crossover effect
- Considers pattern of results for all cell means
How could you visualize a main effect for Factor A on a plot?
There’s a main effect of Factor A if, averaging over the levels of Factor B, the mean of the points at one level of A differs from the mean at the other level (i.e., the marginal means for A differ)
How could you visualize a main effect for Factor B on a plot?
There’s a main effect of Factor B if the average heights of the lines differ (i.e., the marginal means for B differ)
How could you visualize an interaction effect
between Factor A and Factor B on a plot?
There’s no interaction if the lines are parallel
There’s an interaction if the lines aren’t parallel (they converge, diverge, or cross)
The 2-way ANOVA statistically examines the effects of what?
- The 2 factors of interest on the DV (main effects)
- The interaction between the different levels of these 2 factors (interaction effect)
What are assumptions of the 2-way ANOVA?
- The population distribution of the DV is normal within each group
- The variance of the population distributions are equal for each group (homogeneity of variance assumption)
- Independence of observations
State the hypotheses for main effects
- Main effect of Factor A
H0A : μA1 = μA2 = ··· = μAa (equal row marginal means)
H1A : Not all μAg are the same
- Main effect of Factor B
H0B : μB1 = μB2 = · · · = μBb (equal column marginal means)
H1B : Not all μBj are the same
State the hypotheses for interaction effect
- Hypotheses for interaction effect
H0: A×B : All μAgBj are the same OR The interaction between Factor A and Factor B is equal to zero
H1: A×B : Not all μAgBj are the same OR The interaction between Factor A and Factor B is NOT equal to zero
How is total sums of squares (or total variation) partitioned in 2-way ANOVA?
- It’s divided into 2 parts:
SST = SSM +SSR
SSM = Model (Between-group) variation
SSR = Residual (Within-group) variation
How is sums of squares of the model (SSM) partitioned in 2-way ANOVA?
SSA: Variation between means for Factor A
SSB: Variation between means for Factor B
SSA×B: Variation between cell means
What’s the formula for the F ratio for Factor A in 2-way ANOVA?
FA = MSA / MSR
What’s the formula for the F ratio for Factor B in 2-way ANOVA?
FB = MSB / MSR
What’s the formula for the F ratio for the interaction of Factor A and Factor B in 2-way ANOVA?
FA×B = MSA×B / MSR
When should you reject the null hypothesis in a 2-way ANOVA?
If each observed F value is greater than or equal to its critical value
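The three F ratios above can be sketched for a balanced design by partitioning the sums of squares directly. The 2×3 data below (4 scores per cell) are made up for illustration:

```python
import numpy as np

# y[i, j] holds the scores for level i of Factor A, level j of Factor B
y = np.array([[[3., 4., 4., 5.], [5., 6., 5., 6.], [7., 8., 7., 8.]],
              [[4., 5., 5., 6.], [7., 8., 7., 8.], [6., 7., 6., 7.]]])
a, b, n = y.shape
grand = y.mean()

ss_a = b * n * ((y.mean(axis=(1, 2)) - grand) ** 2).sum()   # Factor A
ss_b = a * n * ((y.mean(axis=(0, 2)) - grand) ** 2).sum()   # Factor B
cell = y.mean(axis=2)                                       # cell means
ss_cells = n * ((cell - grand) ** 2).sum()
ss_axb = ss_cells - ss_a - ss_b                             # interaction
ss_r = ((y - cell[..., None]) ** 2).sum()                   # residual

ms_r = ss_r / (a * b * (n - 1))                             # MSR
f_a = (ss_a / (a - 1)) / ms_r                               # FA
f_b = (ss_b / (b - 1)) / ms_r                               # FB
f_axb = (ss_axb / ((a - 1) * (b - 1))) / ms_r               # FA×B

print(round(f_a, 3), round(f_b, 3), round(f_axb, 3))
```

Each observed F is then compared to its own critical value, as the card above describes.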
What does “a” stand for in 2-way ANOVA?
Number of levels for Factor A
What does “b” stand for in 2-way ANOVA?
Number of levels for Factor B
What are treatment sums in 3-way ANOVA?
Sum of raw scores in each treatment group
What’s the grand sum (T) in 3-way ANOVA?
The sum of all the scores in the experiment
How many total effects do we have to compute/find with a 3-way ANOVA?
- 7 effects
- 3 main effects (A, B, C)
- 3 simple (two-way) interactions (AxB, AxC, BxC)
- 1 three-way interaction (AxBxC)
Describe the within-subjects/repeated- measures design
- An experimental design in which the DV is measured several times within the same subject
- Subjects are crossed with at least one experimental factor
- The simplest design of this kind may be a before and after-treatment design (2 conditions)
What is a one-way repeated-measures design comprised of?
- Levels of Factor A (only has 1 factor)
- Subjects (participants)
Describe the one-way repeated measures design
n subjects are measured on the DV under k conditions (or levels) of a single IV or factor
What are possible research questions with the one-way repeated measures design?
- Are there differences in the mean scores of the DV across groups/conditions?
- Within-subject effect of the independent variable (each subject is measured at each time point) -> Variation due to the model
- Are there differences across subjects?
- The variability of subjects (between-subject effect)
- Treat each participant as a different level in an experimental design
What’s the null hypothesis for the between-subject effect in the one-way repeated measures design?
H0: Vs = 0
- this effect represents the variance between subjects
What’s the null hypothesis for the within-subject effect of treatment (IV) in the one-way repeated measures design?
H0: μ1 = μ2 · · · = μk
What are we interested in in a One-way repeated measures ANOVA?
- Usually we are NOT interested in the effect of ‘subjects’ or subject-level variability
- If this effect is significant, it would simply tell us that subjects differ on the dependent variable, which has nothing to do with our treatment (IV), so it’s irrelevant
- What we are really interested in is whether the IV has an effect on the subjects, regardless of whether differences existed naturally among the subjects
What’s referred to as SS error in one-way repeated measures ANOVA?
SS(AxS), the treatment-by-subject interaction
SS within is composed of what 2 types of SS in one-way repeated measures ANOVA?
SS(S) and SS(AxS)
What are the assumptions in one-way repeated measures ANOVA?
- Normality
- Homogeneity of variance
- Homogeneity of covariance
Describe the normality assumption in one-way repeated measures ANOVA
The distribution of observations on the dependent variable is normal within each level of the factor
Describe the homogeneity of variance assumption in one-way repeated measures ANOVA
The population variance of the observations is equal at each level of the factor
Describe the homogeneity of covariance assumption in one-way repeated measures ANOVA
The population covariance between any pair of repeated measurements is equal (homogenous covariance)
What 2 assumptions in the one-way repeated measures ANOVA are considered compound symmetry?
- Homogeneity of variance
- Homogeneity of covariance
Describe the assumption of compound symmetry in one-way repeated measures ANOVA
We assume that the variation within experimental conditions is fairly similar and that no two conditions are any more dependent than any other two
Describe the assumption of sphericity in one-way repeated measures ANOVA
- Given that hypotheses about treatment effects are tested on differences between scores, the assumption of compound symmetry can be replaced by the assumption of sphericity (or circularity)
- Sphericity means that the variance of differences of a pair of observations is the same across all pairs
- In the assumption of sphericity, we assume that the relationship between pairs of experimental conditions is similar
- This assumption is tested in practice, and it is a necessary condition for validity of the F test in repeated measures ANOVA
What happens when there’s a violation of sphericity in one-way repeated-measures ANOVA and how do we deal with it?
- Use tests for violations of sphericity, such as Mauchly’s (1940) test of the W statistic
- When compound symmetry is violated, the omnibus F tests in one-way repeated measures ANOVA tend to be inflated, leading to more false rejections of H0
- Violations of CS require adjustments to the F test
- We can use a conservative critical value based on the possible violation of sphericity (conservative F test)
- The inflation of the F statistic that occurs when sphericity is violated can be adjusted by evaluating the observed F value against a greater critical value, obtained by reducing the degrees of freedom
- Some of the most popular approaches involve:
1. measuring the degree of violation of sphericity
2. using the critical value equal to the value of the F distribution that corresponds to εdf (the adjustment is made for both df numerator and df denominator)
What’s the formula for finding the conservative critical value?
DF(B) = epsilon x (k-1)
DF(BS) = epsilon x (k-1)(n-1)
What’s the function of epsilon?
It measures the extent to which sphericity was violated
What’s the criteria for determining a violation of sphericity in one-way repeated-measures ANOVA?
- When sphericity holds, epsilon = 1 (i.e., no correction is needed).
- When sphericity is violated, 0 < epsilon < 1
- This reduces both DF(B) and DF(BS), and gives a larger critical value for F
- The further the epsilon value is from 1, the worse the violation
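The df-shrinking effect of epsilon can be sketched with scipy. The values here (k = 4 levels, n = 10 subjects, epsilon = 0.70) are made up; epsilon = 1.0 represents sphericity holding:

```python
from scipy import stats

k, n = 4, 10          # made-up: 4 levels of the factor, 10 subjects
alpha = 0.05

crits = []
for eps in (1.0, 0.70):                    # 1.0 = sphericity holds
    df_b = eps * (k - 1)                   # DF(B)  = epsilon x (k - 1)
    df_bs = eps * (k - 1) * (n - 1)        # DF(BS) = epsilon x (k - 1)(n - 1)
    crits.append(stats.f.ppf(1 - alpha, df_b, df_bs))

print([round(c, 3) for c in crits])
# The smaller epsilon gives fewer df and a larger (more conservative)
# critical value for F
```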
How do we decide the value of epsilon?
- Epsilon ≥ 1/(k-1)
- JASP provides two estimates of epsilon: Greenhouse- Geisser & Huynh-Feldt estimates
What’s the difference between the Greenhouse-Geisser & Huynh-Feldt estimates?
Greenhouse-Geisser is smaller (more conservative)
What’s the effect size for one-way repeated-measures ANOVA?
Partial Omega squared (ω2) that excludes the variability due to differences between subjects (MSS)
What’s the formula for within-participant variation in one-way repeated-measures ANOVA?
SSW =SSA+SSAxS
What’s the F-ratio for the treatment effect in one-way repeated measures ANOVA?
F = MSA / MSAxS
When should you reject the null hypothesis from an F value?
If the observed F value is greater than or equal to its critical value, reject the corresponding null hypothesis
If there’s a significant result p is < or > than .05?
p < .05
If there isn’t a significant result p is < or > than .05?
p > .05
Describe sphericity in one-way repeated-measures ANOVA
Variance of difference scores is equal for all pair-wise comparisons
What do the null and alternative hypotheses indicate in Mauchly’s test
- The null hypothesis in Mauchly’s test is that the assumption of sphericity is met
- Rejecting the null hypothesis indicates that the assumption of sphericity is violated
When sphericity is violated, using the F ratio with unadjusted degrees of freedom leads to what?
- An increase in Type I error rates (false positives)
- When sphericity is violated, the type 1 error rate is no longer .05 but it is greater
What are the 3 possible ways to calculate epsilon?
- Greenhouse-Geisser approach
- Huynh-Feldt approach
- Minimum possible value epsilon can attain which is ε = 1/(a − 1)
What’s the preferred and most generally used adjustment for violation of sphericity
- Huynh-Feldt
- Because it tends to have the highest power
- The adjustment we usually use when reporting the results in APA format
What values change when we use the adjustments for violation of sphericity?
- df values
- MS values (since they’re calculated with df)
What values don’t change when we use the adjustments for violation of sphericity?
- SS values
- Observed F ratios -> because if you multiply both sets of degrees of freedom by epsilon, those 2 adjustments cancel each other out so you end up with the same observed F ratio
What test results should always be reported first in an APA summary for a one-way repeated-measures ANOVA?
- (when applicable) Mauchly’s test should always be reported first in APA summary
Why is epsilon denoted as 0 < epsilon < 1?
Epsilon can’t be zero because the formula always includes a minimum of a=2 (2 levels since repeated measures needs a minimum of 2 levels) so the epsilon formula can’t give 0
What measure of effect size do we use for 2-way ANOVA?
Omega-squared (ω2)
In One-way ANOVA, SSR stands for what?
It’s the sum of the squared difference between a group mean and group observations, across all k groups
In Two-way ANOVA, SSR stands for what?
It’s the sum of squared differences between
a cell mean and cell observations, across all (a × b) cells
Describe Mauchly’s Test of Sphericity
- H0 in this test is “variances of differences between conditions are equal”
- If p < .05, the assumption of sphericity (and CS) is violated
- Available in JASP
How are the Greenhouse-Geisser and Huynh-Feldt values obtained?
- Both are estimates of epsilon; the adjusted degrees of freedom are obtained by multiplying the unadjusted df by the corresponding estimate (e.g., epsilon x (a − 1) for the numerator df)
- The Huynh-Feldt estimate is larger than the Greenhouse-Geisser estimate, so its adjustment is less conservative