RMC, W3 Flashcards
What is ANOVA?
• ANOVA = Analysis of variance
- What is variance? A measure of spread or dispersion > how spread out the data is in a dataset/population > statistically, variance is the standard deviation squared
• In summary, ANOVA compares naturally occurring variation, called error, to the overall variation across groups (between-groups variance)
• If the variance observed within one group performing in the same context is the same size as the variance between different groups (or between different contexts for the same group), then there is no significant effect of the factor we are investigating
- If the variance between conditions is greater than the error, then there may be an effect; otherwise the data come from the same population + are not affected by anything in particular
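A minimal numpy sketch of these two kinds of variation, using made-up scores (the numbers and group sizes are assumptions for illustration only):

```python
import numpy as np

# Made-up scores for three groups of 5 ppts each (illustrative values only)
group1 = np.array([4., 5, 6, 5, 5])
group2 = np.array([5., 6, 7, 6, 6])
group3 = np.array([8., 9, 10, 9, 9])

# Variance is the standard deviation squared (ddof=1 -> sample variance)
for i, g in enumerate((group1, group2, group3), start=1):
    print(f"group {i}: sd = {g.std(ddof=1):.2f}, variance = {g.var(ddof=1):.2f}")

# Error variance: spread of scores within each group (printed above)
# Between-groups variance: spread of the group means around each other
means = np.array([g.mean() for g in (group1, group2, group3)])
print("variance of the group means:", means.var(ddof=1))
```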
What does ANOVA test?
• The central question ANOVA asks is “are 2 or more groups from the same population of scores?” > basically, are the differences between individual scores in each subset of data similar in size + are differences across the whole dataset a similar size too?
○ Are the groups equally as varied as each other > is group 2 as varied as group 1, is group 3 as varied as group 2? Is the variance across all these groups similar in size? If it is, then we have no evidence of an effect
Variance
• A dataset can be described by its typical value aka measures of central tendency and can also be described by measures of dispersion aka how spread out it is
• For instance, looking at the diagram, imagine you have two cricket bowlers: the line in the middle represents the target whilst the circles represent other places the ball ends up
• Both bowlers have the same average of getting the ball at the target, but you are more likely to choose the bowler on the left to be on your team because there is a smaller spread (low dispersion) compared to the other bowler > can use a normal distribution to show this
- The right bowler has more dispersion, so the data are more widely spread even though the means are the same
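A quick Python sketch of the bowler analogy (the shot positions are invented): two datasets with the same mean but very different dispersion.

```python
import numpy as np

# Distance of each ball from the target line (invented values)
bowler_left = np.array([0.1, -0.2, 0.0, 0.2, -0.1])   # low dispersion
bowler_right = np.array([1.5, -2.0, 0.3, 2.2, -2.0])  # high dispersion

for name, shots in (("left", bowler_left), ("right", bowler_right)):
    print(f"{name} bowler: mean = {shots.mean():.2f}, sd = {shots.std(ddof=1):.2f}")
# Both means are 0.00, but the right bowler's sd is roughly 10x larger
```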
Refer to example ANOVA in OneNote
What about within groups?
• Same experiment, but all ppts take part in each condition, acting as their own control
• There will still be variance between the participants, but this is called within-groups error variance
• There will also be between-groups variance because there will be some kind of difference between each condition
• The error, however, will be slightly different to between-groups > there will be less error within-groups, because if you compare a person to themselves, they are more similar to themselves on different occasions than they would be to a random person
○ This means when we represent this error, the curve will be narrower: there is less error variance in a within-groups design than in a between-groups design, because people are tested against themselves
- This means a within-groups analysis is more sensitive than a between-groups analysis because it can detect smaller effects (see the sketch below)
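A simulated sketch of why within-groups designs have less error (all numbers here are assumptions): each person's two scores share their stable individual differences, so comparing a person to themselves removes most of the spread.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
baseline = rng.normal(50, 10, n)          # stable individual differences
cond_a = baseline + rng.normal(0, 2, n)   # same ppts, occasion-to-occasion noise
cond_b = baseline + rng.normal(1, 2, n)   # same ppts in a second condition

# Within-groups: compare each person to themselves -> narrow error curve
print("sd of within-person differences:", (cond_b - cond_a).std(ddof=1))
# Between-groups: compare different people -> the individual-differences
# spread stays in, so the error curve is much wider
print("sd of between-person differences:",
      (cond_b - rng.permutation(cond_a)).std(ddof=1))
```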
Differences between groups are made up of 2 distinct sources of variance
- Error: natural variation between people > the first person in conditions 1, 2 and 3 will naturally be different from the others > these people are as different to each other across groups as they are within each group
i. Can represent between-groups error variance using the bell curve again + it should be a similar size, because the people are just as different to each other across groups as they are within groups
- Second source of variance is the effect of the factor we are studying
i. If we represent the between-groups variance (not the error variance) as a curve, it will be more spread out than the error-only curves seen so far > we are looking at the difference between the three conditions, so the variance takes into account the lowest scores in conditions 1 & 2 + the highest scores in condition 3
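These two sources add up exactly: the total variation splits into a between-groups part (effect) and a within-groups part (error). A sketch using the made-up groups from earlier:

```python
import numpy as np

groups = [np.array([4., 5, 6, 5, 5]),
          np.array([5., 6, 7, 6, 6]),
          np.array([8., 9, 10, 9, 9])]
all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()

# Error: deviations of each score from its own group mean
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
# Effect: deviations of the group means from the grand mean
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_total = ((all_scores - grand_mean) ** 2).sum()

print(ss_within, "+", ss_between, "=", ss_total)  # the two sources partition the total
```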
Visualising variance comparisons
• If you have three distributions on one line, you can compare how big the variance is and whether there is a significant effect
• When comparing mean 1 to mean 2, we can see that the within-groups variance of group 1 is smaller than the between-groups variance between mean 1 and mean 2, which means there is a significant effect from the factor, because this is the difference between the groups
- In contrast, when comparing mean 2 to mean 3, the within-groups variance is similar in size to (or larger than) the between-groups variance, which means there is no significant effect from the factor, so they are likely part of the same population
Where do differences in individual scores come from?
• When using ANOVA, you should think about the source of variance > the sources of variance depend on the ANOVA design
• One-way ANOVA: where you look at a single factor w/ a number of different conditions > in a one-way ANOVA there are 2 sources of variance: 1. the effect of the IV/factor 2. error, aka how much people’s individual scores vary from the mean due to naturally occurring differences
• In statistical terms, the comparison of within group variance and between groups variance is captured in the F ratio.
• F ratio is a test statistic for each effect in ANOVA
• In a one-way ANOVA there is one effect, which is due to the IV we are studying, so we report one F ratio
F = between-groups variance / within-groups variance
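A sketch of the F ratio computed by hand and checked against scipy's one-way ANOVA (same made-up groups as in the earlier sketches):

```python
import numpy as np
from scipy import stats

groups = [np.array([4., 5, 6, 5, 5]),
          np.array([5., 6, 7, 6, 6]),
          np.array([8., 9, 10, 9, 9])]
k = len(groups)                            # number of conditions
n_total = sum(len(g) for g in groups)      # total number of ppts
grand_mean = np.concatenate(groups).mean()

ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

ms_between = ss_between / (k - 1)          # between-groups variance estimate
ms_within = ss_within / (n_total - k)      # within-groups (error) variance estimate
print("F =", ms_between / ms_within)       # ~43.3 for these invented numbers

print(stats.f_oneway(*groups))             # scipy agrees: same F, plus a p value
```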
F-ratios
• F ratio smaller than 1 = no significant difference between conditions + F ratio bigger than 1 = there may be a significant difference between groups
○ Whether it reaches significance depends on the design of the study, the no. of ppts and the size of the effect observed
○ Essentially, F ratio tests the likelihood of two variance estimates being from the same population
• The chance of any value of ‘F’ occurring has been calculated by statisticians > it can be looked up in a table, or SPSS can provide an associated p value
○ A sampling distribution of ‘F’
○ F must be > 1 for a significant effect (but F > 1 on its own does not guarantee significance)
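A sketch of that lookup with scipy, using the F and degrees of freedom assumed from the earlier invented example (F ≈ 43.3 with df = 2 between, 12 within):

```python
from scipy import stats

F, df_between, df_within = 43.3, 2, 12
# Survival function = probability of an F at least this large
# in the sampling distribution of F, if the null hypothesis is true
p = stats.f.sf(F, df_between, df_within)
print(f"p = {p:.2g}")
```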
Looking at ANOVA results
• SPSS also allows us to generate an estimate of effect size > this is more important in making sense of ANOVA results than just the likelihood of a particular F ratio occurring by chance
○ The p value is useful, but not as useful as the effect size
○ The effect size provides us with an estimate about how much of the variance observed in the data is actually due to the effect of the independent variable.
- The more variance is explained, the more important an effect is in scientific terms.
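A sketch of one common effect-size estimate, eta squared, using the sums of squares assumed from the earlier invented example (for a one-way design this matches the partial eta squared that SPSS reports):

```python
# Eta squared = proportion of total variance explained by the IV
ss_between, ss_within = 43.33, 6.0   # values assumed from the earlier sketch
eta_squared = ss_between / (ss_between + ss_within)
print(f"eta squared = {eta_squared:.2f}")  # ~0.88 of the variance due to the factor
```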
What does a significant ANOVA tell us?
• A significant F ratio only tells us that there is a significant difference somewhere but doesn’t specify where > to interpret an ANOVA fully, we need descriptive stats (means and standard deviations)
• We need to look at which condition has the highest mean and which has the lowest, and whether the differences between the conditions match the hypothesis we tested (are the differences in the direction that was predicted?) > this is why we need descriptives (see the sketch after this list)
• However, even if the descriptives appear to support our hypothesis, we still have to check what differences between the pairs of conditions are significant.
- Need to carry out post hoc tests to follow up our analyses
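A sketch of those descriptives per condition (the condition names follow the laptop/pen + paper example used in the next section; the scores are invented):

```python
import numpy as np

conditions = {"laptop": np.array([4., 5, 6, 5, 5]),
              "laptop + review": np.array([5., 6, 7, 6, 6]),
              "pen + paper": np.array([8., 9, 10, 9, 9])}

# Means show the direction of the differences; SDs show the spread
for name, scores in conditions.items():
    print(f"{name}: M = {scores.mean():.2f}, SD = {scores.std(ddof=1):.2f}")
```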
Post-hoc tests
• A post hoc comparison is a statistical test that allows us to identify which groups in an ANOVA are significantly different from each other.
• One way to do this is by comparing pairs of conditions > e.g. comparing the laptop group to the laptop + review group, the laptop group to the pen + paper group, and the laptop + review group to the pen + paper group > this means you may do 3 comparisons, which increases the familywise error rate (see the sketch after this list)
• What does ‘p = .05’ actually mean?
○ 1 chance in 20 (5% chance) that we observe a difference of the size we have found even though the null hypothesis is true > in other words, even though there is no effect
○ The p-value tells us nothing about how big or important an effect is
• A p-value gives no information about the size or scientific importance of an observed effect > it just measures the probability of our conclusion being wrong > it measures the chance of making a type I error + we want this chance to be as small as possible if we want to make claims with our findings
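A sketch of the pairwise follow-up comparisons from the bullet above, using uncorrected independent-samples t-tests in scipy (data invented). Three groups give three tests, which is exactly what inflates the familywise error rate:

```python
from itertools import combinations
import numpy as np
from scipy import stats

conditions = {"laptop": np.array([4., 5, 6, 5, 5]),
              "laptop + review": np.array([5., 6, 7, 6, 6]),
              "pen + paper": np.array([8., 9, 10, 9, 9])}

# Every pair of conditions -> 3 comparisons on the same dataset
for (name_a, a), (name_b, b) in combinations(conditions.items(), 2):
    t, p = stats.ttest_ind(a, b)
    print(f"{name_a} vs {name_b}: t = {t:.2f}, p = {p:.4f}")
```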
Type I & Type II errors
• A type I error is more of a concern
• e.g. we may give an uncomfortable treatment to someone when it actually doesn’t help their condition
• A type II error, if we stick to the ethos of the hypothetico-deductive approach, is less of an issue, as we would not assume that the absence of evidence for the hypothesis means that the effect doesn’t exist
When type II errors can be a problem
Refer to example in OneNote
• Why are type I + II errors relevant to post hoc testing? Because the more tests we carry out on the same dataset, the greater the chance of making a type I error > in fact, with each comparison, the chance of drawing the wrong conclusion about our data increases by the level of probability that we set as the criterion for significance
How does this relate to ANOVA?
• Imagine you have 2 groups you want to compare ( = 1 comparison) > because p = .05 aka 1 chance in 20, imagine you have 20 little boxes of chances; by making 1 comparison, you use one box, because you take 1 chance of making a type I error when using p = .05
• If we look at 3 groups, that means there will be 3 comparisons > because p = .05, there is 1 chance in 20 for each comparison, so overall 3 chances of a type I error are taken because there are 3 comparisons
○ This makes the familywise error rate roughly .15 (because 0.05 x 3 = 0.15) > this is bigger than our cut-off of .05 > now we shouldn’t really claim our findings are significant
○ When we do all three comparisons at the same time, this is called the family of comparisons, leading to the familywise error
- If multiple testing unacceptably increases our chances of making a type I error + of saying there is an effect when there is not, then we have a problem (see the sketch below)
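A sketch of the familywise error arithmetic above, plus the exact version and the usual Bonferroni fix (dividing alpha by the number of comparisons; Bonferroni is one common correction, not the only one):

```python
alpha, n_comparisons = 0.05, 3

# The notes' approximation: chances simply add up across the family
print("approximate familywise error:", alpha * n_comparisons)        # 0.15

# Exact: 1 - P(no type I error on any of the 3 comparisons)
print("exact familywise error:", 1 - (1 - alpha) ** n_comparisons)   # ~0.143

# Bonferroni correction: test each comparison at alpha / n instead
print("Bonferroni per-comparison alpha:", alpha / n_comparisons)     # ~0.0167
```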