Week 5: Comparing Means - One-way ANOVA Flashcards by Gitanjali Sharma

What does ANOVA stand for?

Analysis of Variance

What is the decision tree for choosing a one-way ANOVA? - (5)

Q: What sort of measurement? A: Continuous
Q: How many predictor variables? A: One
Q: What type of predictor variable? A: Categorical
Q: How many levels of the categorical predictor? A: More than two
Q: Same or Different participants for each predictor level? A: Different

When is ANOVA used?

When you are comparing more than 2 groups (levels) of the IV

Example of an ANOVA research question (RQ)

Which is the fastest animal in a maze experiment - cats, dogs or rats?

We can’t do three separate t-tests to find, for example, the fastest animal in a maze experiment (cats, dogs or rats) because - (2)

Doing separate t-tests inflates the Type I error rate (false positive, e.g., telling a man he is pregnant)

Repeating the test multiple times adds multiple chances of error, which may result in a larger α error level than the pre-set α level: the familywise error

What is the familywise or experimentwise error rate?

The error rate across all statistical tests conducted on the same experimental data

Familywise error is related to

Type I error

What is the alpha level?

The probability of wrongly accepting the alternative hypothesis (i.e., rejecting H0 when it is true) = the Type I error rate

If we conduct 3 separate t-tests to find the fastest animal in an experiment (cats, dogs or rats), each with an alpha level of 0.05 - (4)

There is a 5% chance of a Type I error (falsely rejecting H0) on each test
The probability of no Type I error is 95% for a single test
However, across multiple tests the probability of no Type I error decreases; for 3 tests together: 0.95 × 0.95 × 0.95 = 0.857
This means the probability of at least one Type I error increases: 1 - 0.857 = 0.143 (a 14.3% chance of making a Type I error)
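
The arithmetic on this card can be checked with a short Python sketch. The function name `familywise_error_rate` is an illustrative assumption, and the formula assumes the tests are independent of one another.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one Type I error across n independent tests."""
    # (1 - alpha) is the chance of no Type I error on a single test;
    # raising it to n_tests gives the chance of no error on any test.
    return 1 - (1 - alpha) ** n_tests


# Three t-tests (cats vs dogs, cats vs rats, dogs vs rats) at alpha = 0.05
print(round(familywise_error_rate(0.05, 3), 3))  # 0.143
```

With a single test the function simply returns alpha, which is why one t-test on its own does not inflate the error rate.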

Much like the model for t-tests, we can write a general linear model for

ANOVA, representing the 3 levels of the categorical variable with dummy variables

When we perform a t-test, we test the hypothesis that the two samples have the same

mean

ANOVA tells us whether three or more means are the same so tests H0 that

all group means are equal

An ANOVA produces an

F statistic or F ratio

The F ratio produced in ANOVA is similar to the t-statistic in that it compares the

amount of systematic variance in data to the amount of unsystematic variance i.e., ratio of model to its error

ANOVA is an omnibus test which means it tests for and tells us - (2)

overall experimental effect

tells whether experimental manipulation was successful

An ANOVA is an omnibus test and its F ratio does not provide specific information about which

groups were affected due to experimental manipulation

Just like a t-test can be represented by a linear regression equation, ANOVA can be represented by a

multiple regression equation for three means, in which the model accounts for the 3 levels of the categorical variable with dummy variables
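
A minimal Python sketch of such a dummy-coded model, using made-up maze times. The data, the choice of cats as the baseline group, and all variable names are illustrative assumptions, not taken from the course. With dummy coding, the intercept b0 is the baseline group's mean and each remaining b is the difference between one other group's mean and the baseline.

```python
from statistics import mean

# Hypothetical maze completion times in seconds (made-up data)
times = {
    "cats": [30.0, 34.0, 32.0],
    "dogs": [40.0, 42.0, 44.0],
    "rats": [25.0, 27.0, 23.0],
}

# Two dummy variables code three groups; cats is the baseline, coded (0, 0)
dummies = {"cats": (0, 0), "dogs": (1, 0), "rats": (0, 1)}

b0 = mean(times["cats"])        # intercept = baseline group mean
b1 = mean(times["dogs"]) - b0   # dogs minus baseline
b2 = mean(times["rats"]) - b0   # rats minus baseline


def predict(group: str) -> float:
    """Model prediction: outcome = b0 + b1*dummy1 + b2*dummy2."""
    d1, d2 = dummies[group]
    return b0 + b1 * d1 + b2 * d2


# The model's predicted score for each group is that group's mean
for group in times:
    print(group, predict(group), mean(times[group]))
```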

As compared to independent samples t-test that compares means of two groups, one-way ANOVA compares means of

3 or more independent groups

In one-way ANOVA we use … … to test the assumption of equal variances across groups

Levene’s test

What does this one-way ANOVA output show?

Levene’s test is non-significant, so equal variances are assumed

What does this SPSS output show in one-way ANOVA?

F(2,42) = 5.94, p = 0.005, eta-squared = 0.22

How is effect size (eta-squared) calculated in one-way ANOVA?

Between groups sum of squares divided by total sum of squares

What is the eta-squared/effect size for this SPSS output and what does this value mean? - (2)

830.207/3763.632 = 0.22
22% of the variance in exam scores is accounted for by the model
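
The same calculation as a small Python sketch. The function names are illustrative assumptions; the sums of squares are the two values quoted on this card, and the labels use the 0.01 / 0.06 / 0.14 benchmarks from the deck.

```python
def eta_squared(ss_between: float, ss_total: float) -> float:
    """Proportion of total variance accounted for by the model."""
    return ss_between / ss_total


def describe_effect(eta_sq: float) -> str:
    """Label an effect size using the 0.01 / 0.06 / 0.14 benchmarks."""
    if eta_sq >= 0.14:
        return "large"
    if eta_sq >= 0.06:
        return "medium"
    if eta_sq >= 0.01:
        return "small"
    return "negligible"  # label for values below 0.01 is an assumption


effect = eta_squared(830.207, 3763.632)
print(round(effect, 2), describe_effect(effect))  # 0.22 large
```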

When interpreting eta-squared in one-way ANOVA, what do values of 0.01, 0.06 and 0.14 mean? - (3)

0.01 = small effect
0.06 = medium effect
0.14 = large effect

What happens if Levene's test is significant in the one-way ANOVA?

Then use the statistics from the Welch or Brown-Forsythe test

If Levene's test is significant in one-way ANOVA, the Welch or Brown-Forsythe tests make adjustments to the DF, which affects the

statistics you get and whether the p value is significant or not

What does this post-hoc table of Bonferroni tests show in one-way ANOVA? - (3)

* Full sleep vs partial sleep, p = 1.00, so not sig
* Full sleep vs no sleep, p = 0.007, so sig
* Partial sleep vs no sleep, p = 0.032, so sig

Diagram showing an example of the grand mean

The mean of all scores, regardless of the participant's condition

What is the total sum of squares (SST) in one-way ANOVA?

difference of the participant’s score from the grand mean squared and summed over all participants

What is model sum of squares (SSM) in one-way ANOVA?

difference of the model score from the grand mean squared and summed over all participants

What is residual sum of squares (SSR) in one-way ANOVA?

difference of the participant’s score from the model score squared and summed over all participants

The residuals sum of squares (SSR) tells us how much of the variation cannot be

explained by the model and amount of variation caused by extraneous factors

We divide each sum of squares by its

degrees of freedom (DF) to calculate the mean squares

For SST, the DF we divide by in one-way ANOVA is

N-1

For SSM, the DF we divide by in one-way ANOVA is the

number of groups (parameters), k, minus 1

If we have three groups then the DF for SSM in one-way ANOVA will be

3-1 = 2

For SSR, the DF we divide by in one-way ANOVA is the

total sample size, N, minus the number of groups, k

Formulas of dividing each sum of squares by its DF to calculate it in one way ANOVA- (3)

* MST = SST / (N-1)
* MSR = SSR / (N-k)
* MSM = SSM / (k-1)
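
The sums of squares and mean squares can be worked through by hand in Python. This is a sketch with made-up scores for three groups; the data and variable names are illustrative assumptions. It uses the degrees of freedom N-1 for SST, k-1 for SSM and N-k for SSR.

```python
from statistics import mean

# Hypothetical scores for three independent groups (made-up data)
groups = {
    "cats": [4.0, 5.0, 6.0],
    "dogs": [6.0, 7.0, 8.0],
    "rats": [8.0, 9.0, 10.0],
}

all_scores = [x for scores in groups.values() for x in scores]
grand_mean = mean(all_scores)   # mean of every score, ignoring group
n_total = len(all_scores)       # N
k = len(groups)                 # number of groups

# SST: each score's squared distance from the grand mean
sst = sum((x - grand_mean) ** 2 for x in all_scores)
# SSM: each participant's model score (their group mean) vs the grand mean
ssm = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in groups.values())
# SSR: each score's squared distance from its own group mean
ssr = sum((x - mean(s)) ** 2 for s in groups.values() for x in s)

mst = sst / (n_total - 1)
msm = ssm / (k - 1)
msr = ssr / (n_total - k)

print(sst, ssm, ssr)  # 30.0 24.0 6.0
print(msm, msr)       # 12.0 1.0
```

Note that SST = SSM + SSR here (30 = 24 + 6): the total variation splits exactly into the part the model explains and the part it does not.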

SSM tells us the total variation that the

experimental manipulation explains

What does MSM represent?

average amount of variation explained by the model (i.e., the systematic variation)

What does MSR represent?

average amount of variation explained by extraneous variables (the unsystematic variation)

The F ratio in one-way ANOVA can be calculated by

F = MSM / MSR

If F ratio in one-way ANOVA is less than 1 then it represents a

non-significant effect

Why does an F less than 1 in one-way ANOVA represent a non-significant effect?

An F ratio less than 1 means that MSR is greater than MSM, i.e., there is more unsystematic than systematic variation
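
A short Python sketch of the F ratio as the model mean square divided by the residual mean square, F = MSM / MSR. The function name is an illustrative assumption. The first call reuses the sums of squares from the eta-squared card together with the degrees of freedom from F(2, 42), assuming those two cards describe the same SPSS output (k = 3 groups, N = 45); the second call uses made-up numbers to show an F below 1.

```python
def f_ratio(ss_model: float, ss_total: float, n_total: int, n_groups: int) -> float:
    """F = MSM / MSR for a one-way independent ANOVA."""
    ss_residual = ss_total - ss_model                 # SSR = SST - SSM
    ms_model = ss_model / (n_groups - 1)              # MSM, df = k - 1
    ms_residual = ss_residual / (n_total - n_groups)  # MSR, df = N - k
    return ms_model / ms_residual


# Values from the deck (assumed to come from one output): reproduces F(2, 42)
print(round(f_ratio(830.207, 3763.632, n_total=45, n_groups=3), 2))  # 5.94

# Made-up case where the model explains very little: MSR > MSM, so F < 1
print(f_ratio(2.0, 100.0, n_total=45, n_groups=3) < 1)  # True
```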

If F is greater than 1 in one-way ANOVA, then it indicates ... but doesn't tell us - (2)

It indicates that the experimental manipulation had some effect above and beyond the effect of individual differences in performance
It does not tell us whether the F-ratio is large enough to not be a chance result

When F statistic is large in one-way ANOVA then it tells us that the

MSM is greater than MSR

To discover whether the F statistic is large enough not to be a chance result in one-way ANOVA, we

compare the obtained value of F against the maximum value we would expect to get by chance if the group means were equal in an F-distribution with the same degrees of freedom

In one-way ANOVA, high values of F are rare - (3)

by chance. Low degrees of freedom result in long tails of the distribution, so, much like other statistics, large values of F are more likely to crop up by chance in studies with low numbers of participants.

The F-ratio tells us in one-way ANOVA whether the model fitted to the data accounts for more variation than extraneous factors, and does not tell us where

differences between groups lie

If F-ratio in one-way ANOVA is large enough to be statistically significant then we know

that one or more of the differences between means is statistically significant (e.g. either b2 or b1 is statistically significant)

It is necessary after conducting a one-way ANOVA to carry out further analysis to find out

which groups differ

The power of F statistic is relatively unaffected by

non-normality

When group sizes are not equal, the accuracy of F is

affected by skew, and non-normality also affects the power of F in quite unpredictable ways

When group sizes are equal, the F statistic can be quite robust to

violations of normality

What tests do you do after performing a one-way ANOVA and finding significant F test? - (2)

* Planned contrasts
* Post-hoc tests

What do post-hoc tests do? - (2)

* Compare all pairwise differences in means
* Used if no specific hypotheses concerning differences have been made

What is the issue with post-hoc tests?

Because every pairwise combination is considered, the Type I error rate increases, so normally the Type I error rate is reduced by modifying the critical value of p

Are post-hoc tests like a one- or two-tailed hypothesis?

two-tailed

Are planned contrasts like a one- or two-tailed hypothesis?

One-tailed hypothesis

What is the most common modification of the critical value for p in post-hoc in one-way ANOVA?

Bonferroni correction, which divides the standard critical value of p=0.05 by the number of pairwise comparisons performed
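
A minimal Python sketch of the Bonferroni correction described on this card. The function name is an illustrative assumption; with k groups there are k(k-1)/2 pairwise comparisons.

```python
from math import comb


def bonferroni_alpha(alpha: float, n_groups: int) -> float:
    """Per-comparison critical p when every pair of groups is compared."""
    n_comparisons = comb(n_groups, 2)  # k choose 2 = k(k-1)/2 pairs
    return alpha / n_comparisons


# Three groups -> 3 pairwise comparisons -> 0.05 / 3
print(round(bonferroni_alpha(0.05, 3), 4))  # 0.0167
# Five groups -> 10 pairwise comparisons -> 0.05 / 10
print(round(bonferroni_alpha(0.05, 5), 4))  # 0.005
```

The second call shows why the correction becomes very strict as the number of groups grows: each comparison has to clear a much smaller threshold.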

Planned contrasts are used to investigate a specific

hypothesis

Planned contrasts do not test for every

pairwise difference so are not penalized as heavily as post hoc tests that do test for every difference

With planned contrasts you derive the hypotheses before the

data is collected

Diagram of planned contrasts

Contrast 1 = Treatment vs control Contrast 2 = Treatment 1 vs Treatment 2

In planned contrasts, once a condition has been singled out in a contrast it is

never used again in later contrasts

In one-way ANOVA, the number of independent planned contrasts you can make is

k (number of groups) minus 1

How do planned contrasts work in SPSS?

Coefficients add to 0 for each contrast (e.g., -2 + 1 + 1), and once a group has been used alone in a contrast, its coefficient is set to 0 in the next contrasts (e.g., -2 becomes 0)
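
The rule that the coefficients sum to zero can be checked in a few lines of Python. This sketch uses the weights (-2, 1, 1) from the card plus a second contrast (0, -1, 1) in which the first group's coefficient is set to 0; the group order (control, treatment 1, treatment 2) and the function names are illustrative assumptions. The orthogonality check (cross-products of the weights summing to zero) is a standard property of independent contrasts with equal group sizes and is added here for illustration; it is not stated on the card.

```python
# Order of groups assumed: (control, treatment 1, treatment 2)
contrast_1 = [-2, 1, 1]   # control vs both treatments
contrast_2 = [0, -1, 1]   # treatment 1 vs treatment 2; control now weighted 0


def sums_to_zero(weights: list[int]) -> bool:
    """A valid contrast has weights that add up to 0."""
    return sum(weights) == 0


def are_orthogonal(a: list[int], b: list[int]) -> bool:
    """Two contrasts are independent if their cross-products sum to 0."""
    return sum(x * y for x, y in zip(a, b)) == 0


print(sums_to_zero(contrast_1), sums_to_zero(contrast_2))  # True True
print(are_orthogonal(contrast_1, contrast_2))              # True
```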

SPSS has a lot of built-in contrasts, but it is helpful if

you know what these contrasts do before entering the data, as they depend on the order in which you coded your variables

What are weights?

Values we assign to the dummy variables e.g., -2 in the box

One type of planned contrast in one-way ANOVA is a polynomial contrast, which

tests for trends in the data and in its most basic form looks for a linear trend (i.e., group means increase proportionately)

In one-way ANOVA, polynomial contrasts can also look at more complex trends than linear, such as

quadratic, cubic and quartic

What does a linear trend represent?

simply proportionate change in the value of the dependent variable across ordered categories

What is a quadratic trend?

one change in the direction of the line (e.g. the line is curved in one place)

What is a cubic trend?

two changes in the direction of the trend

What is a quartic trend?

has three changes of direction

In one-way ANOVA, the Bonferroni post-hoc test ensures that the Type I error rate stays below

0.05

In one-way ANOVA, the Bonferroni correction reduces Type I errors (it is conservative about the Type I error rate for each comparison), but it also

lacks statistical power (the probability of a Type II error [false negative] will be high), so it increases the chance of missing a genuine difference in the data

What post-hoc tests should you use in one-way ANOVA if you have equal sample sizes and are confident that your group variances are similar?

Use REGWQ or Tukey as good power and tight control over Type 1 error rate

What post-hoc tests should you use in one-way ANOVA if your sample sizes are slightly different?

Gabriel’s procedure, because it has greater power

What post-hoc tests should you use in one-way ANOVA if your sample sizes are very different?

if sample sizes are very different use Hochberg’s GT2

What post-hoc test should you run in one-way ANOVA if Levene's test of homogeneity of variance is significant?

Games-Howell

What post-hoc test should you use in one-way ANOVA if you want guaranteed control over the Type I error rate?

Bonferroni

What does this ANOVA error line graph show? - (2)

* Linear trend: as the dose of Viagra increases, so does the mean level of libido
* Error bars overlap, indicating no between-group differences

What does the within-groups row of the ANOVA table give details of?

SSR (unsystematic variation)

The between groups label in ANOVA table tells us

SSM (systematic variation)

What does this ANOVA table demonstrate? - (2)

* Linear trend is significant (p = 0.008) * Quadratic trend is not significant (p = 0.612)

When we do planned contrasts we arrange the weights such that we compare any group with a positive weight

against any group with a negative weight

What does this output show if we conduct two planned comparisons of: one to test whether the control group was different to the two groups which received Viagra, and one to see whether the two doses of Viagra made a difference to libido - (2)

the table of weights shows that contrast 1 compares the placebo group against the two experimental groups, contrast 2 compares the low-dose group to the high-dose group

What does this table show in one-way ANOVA if Levene's test is non-significant (equal variances assumed)? Hypothesis 1 (one-tailed): the experimental groups would increase libido above the levels seen in the placebo group. Hypothesis 2: a high dose of Viagra would increase libido significantly more than a low dose - (3)

The significance value given in the table is two-tailed, and since the hypotheses are one-tailed we divide it by 2
For contrast 1, we can say that taking Viagra significantly increased libido compared to the control group (p = .029/2 = .0145)
The significance of contrast 2 tells us that a high dose of Viagra increased libido significantly more than a low dose (p (one-tailed) = .065/2 = .0325)

In one-way ANOVA, if making a few pairwise comparisons with an equal number of participants in each condition then use ...; if making a lot then use ... - (2)

Bonferroni
Tukey

Assumptions of ANOVA - (5)

* Independence of data
* DV is continuous; IV is categorical (3 or more groups)
* No significant outliers
* DV approximately normally distributed for each category of the IV
* Homogeneity of variance = Levene's test not significant

ANOVA compares many means without increasing the chance of

type 1 error

In one-way ANOVA, we partition the total variance into

variance explained by the IV (between groups, SSM) and variance not explained by it (within groups, SSR)

Formula of effect size for one-way ANOVA

eta-squared = SSM / SST (between-groups sum of squares divided by total sum of squares)

Formula for effect size of contrasts for one-way ANOVA - (4)

Less commonly, but no less importantly, we can report effect sizes for contrasts
It follows the same logic as r², but in this case we can use a formula that uses the value of t, which is given when contrasts are tested
r² = t² / (t² + df)
Whether we are computing the effect size for the model as a whole or for contrasts, the same intuitive feature of the r² statistic exists: it shows what proportion of the variance is explained by the model
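
The contrast effect size formula r² = t² / (t² + df) can be sketched in Python. The function names and the example values of t and df are illustrative assumptions, not taken from any output in this deck.

```python
from math import sqrt


def contrast_r_squared(t: float, df: int) -> float:
    """Proportion of variance explained by a contrast: t^2 / (t^2 + df)."""
    return t ** 2 / (t ** 2 + df)


def contrast_r(t: float, df: int) -> float:
    """Effect size r is the square root of r squared."""
    return sqrt(contrast_r_squared(t, df))


# Hypothetical contrast with t = 2.0 on 12 degrees of freedom
print(contrast_r_squared(2.0, 12))  # 0.25
print(contrast_r(2.0, 12))          # 0.5
```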

What happens if Levene's test is significant (no homogeneity of variance)?

If it is significant, there are ways to modify the F test to account for it

An independent t-test is used to test for:
A. Differences between means of groups containing different participants when the sampling distribution is normal, the groups have equal variances and data are at least interval.
B. Differences between means of groups containing different participants when the data are not normally distributed or have unequal variances.
C. Differences between means of groups containing the same participants when the data are normally distributed, have equal variances and data are at least interval.
D. Differences between means of groups containing the same participants when the sampling distribution is not normally distributed and the data do not have unequal variances.

A. Differences between means of groups containing different participants when the sampling distribution is normal, the groups have equal variances and data are at least interval.

If you use a paired-samples t-test:
A. The same participants take part in both experimental conditions.
B. There ought to be less unsystematic variance compared to the independent t-test.
C. Other things being equal, you do not need as many participants as you would for an independent samples design.
D. All of these are correct.

D. All of these are correct.

Which of the following statements about the t distribution is correct?
A. It is skewed
B. In small samples it is narrower than the normal distribution
C. As the degrees of freedom increase, the distribution becomes closer to normal
D. It follows an exponential curve

C. As the degrees of freedom increase, the distribution becomes closer to normal

Which of the following sentences is an accurate description of the standard error?
A. It is the same as the standard deviation
B. It is the observed difference between sample means minus the expected difference between population means (if the null hypothesis is true)
C. It is the standard deviation of the sampling distribution of a statistic
D. It is the standard deviation squared

C. It is the standard deviation of the sampling distribution of a statistic

A psychologist was interested in whether there was a gender difference in the use of email. She hypothesized that because women are generally better communicators than men, they would spend longer using email than their male counterparts. To test this hypothesis, the researcher sat by the computers in her research methods laboratory and when someone started using email, she noted whether they were male or female and then timed how long they spent using email (in minutes). Based on the output, what should she report? (NOTE: Check for the assumption of equality of variances).
A. Females spent significantly longer using email than males, t(14) = –1.90, p = .079
B. Females and males did not significantly differ in the time spent using email, t(7.18) = –1.90, p = .099
C. Females and males did not significantly differ in the time spent using email, t(7.18) = –1.90, p < .003
D. Females and males did not significantly differ in the time spent using email, t(14) = –1.90, p = .079

B. Females and males did not significantly differ in the time spent using email, t(7.18) = –1.90, p = .099

Other things being equal, compared to the paired-samples (or dependent) t-test, the independent t-test:
A. Has more power to find an effect.
B. Has the same amount of power, the data are just collected differently.
C. Has less power to find an effect.
D. Is less robust.

C. Has less power to find an effect.

Differences between group means can be characterized as a regression (linear) model if:
A. The outcome variable is categorical.
B. The groups have equal sample size.
C. The experimental groups are represented by a binary variable (i.e. code 1 and 0).
D. The difference between group means cannot be characterized as a linear model, they must be analyzed as an independent t-test.

C. The experimental groups are represented by a binary variable (i.e. code 1 and 0).

An experiment was done to look at whether different relaxation techniques could predict sleep quality better than nothing. A sample of 400 participants were randomly allocated to one of four groups: massage, hot bath, reading or nothing. For one month each participant received one of these relaxation techniques for 30 minutes before going to bed each night. A special device was attached to the participant’s wrist that recorded their quality of sleep, providing them with a score out of 100. The outcome was the average quality of sleep score over the course of the month. Which test could we use to analyse these data?
A. Regression only
B. ANOVA only
C. Regression or ANOVA
D. Chi-square

C. (Multiple) regression or (independent) ANOVA, as regression and ANOVA are the same
The question did not mention a hypothesis of prediction, or it would be regression
Chi-square is only used when you have one categorical predictor and the outcome is categorical

A researcher testing the effects of two treatments for anxiety computed a 95% confidence interval for the difference between the mean of treatment 1 and the mean of treatment 2. If this confidence interval includes the value of zero, then she cannot conclude that there is a significant difference in the treatment means: true or false?

TRUE

The student welfare office was interested in trying to enhance students’ exam performance by investigating the effects of various interventions. They took five groups of students before their statistics exams and gave them one of five interventions: (1) a control group just sat in a room contemplating the task ahead; (2) the second group had a yoga class to relax them; (3) the third group were told they would get monetary rewards contingent upon the grade they received in the exam; (4) the fourth group were given beta-blockers to calm their nerves; and (5) the fifth group were encouraged to sit around winding each other up about how much revision they had/hadn’t done (a bit like what usually happens). The final percentage obtained in the exam was the dependent variable. Using the critical values for F, how would you report the result in the table below?
A. Type of intervention did not have a significant effect on levels of exam performance, F(4, 29) = 12.43, p > .05.
B. Type of intervention had a significant effect on levels of exam performance, F(4, 29) = 12.43, p < .01.
C. Type of intervention did not have a significant effect on levels of exam performance, F(4, 33) = 12.43, p > .01.
D. Type of intervention had a significant effect on levels of exam performance, F(4, 33) = 12.43, p < .01.

B. Type of intervention had a significant effect on levels of exam performance, F(4, 29) = 12.43, p < .01.

Imagine you compare the effectiveness of four different types of stimulant to keep you awake while revising statistics using a one-way ANOVA. The null hypothesis would be that all four treatments have the same effect on the mean time kept awake. How would you interpret the alternative hypothesis?
A. All four stimulants have different effects on the mean time spent awake
B. All stimulants will increase mean time spent awake compared to taking nothing
C. At least two of the stimulants will have different effects on the mean time spent awake
D. None of the above

C. At least two of the stimulants will have different effects on the mean time spent awake

When the between-groups variance is a lot larger than the within-groups variance, the F-value is ____ and the likelihood of such a result occurring because of sampling error is _____
A. small; high
B. small; low
C. large; high
D. large; low

D. large; low

Subsequent to obtaining a significant result from an exploratory one-way independent ANOVA, a researcher decided to conduct three post hoc t-tests to investigate where the differences between groups lie. Which of the following statements is correct?
A. The researcher should accept as statistically significant tests with a probability value of less than 0.016 to avoid making a Type I error
B. The researcher should have conducted orthogonal contrasts instead of t-tests to avoid making a Type I error
C. This is the wrong method to use. The researcher did not make any predictions about which groups will differ before running the experiment, therefore contrasts and post hoc tests cannot be used
D. None of these options are correct

A. The researcher should accept as statistically significant tests with a probability value of less than 0.016 to avoid making a Type I error

A psychologist was looking at the effects of an intervention on depression levels. Three groups were used: waiting list control, treatment and post-treatment (a group who had had the treatment 6 months before). The SPSS output is below. Based on this output, what should the researcher report?
A. The treatment groups had a significant effect on depression levels, F(2, 45) = 5.11.
B. The treatment groups did not have a significant effect on the change in depression levels, F(2, 35.10) = 5.11.
C. The treatment groups did not have a significant effect on depression levels, F(2, 26.44) = 4.35.
D. The treatment groups had a significant effect on the depression levels, F(2, 26.44) = 4.35.

D. The treatment groups had a significant effect on the depression levels, F(2, 26.44) = 4.35.

Imagine we conduct a one-way independent ANOVA with four levels on our independent variable and obtain a significant result. Given that we had equal sample sizes, we did not make any predictions about which groups would differ before the experiment and we want guaranteed control over the Type I error rate, which would be the best test to investigate which groups differ?
A. Orthogonal contrasts
B. Helmert
C. Bonferroni
D. Hochberg’s GT2

C. Bonferroni

The student welfare office was interested in trying to enhance students’ exam performance by investigating the effects of various interventions. They took five groups of students before their statistics exams and gave them one of five interventions: (1) a control group just sat in a room contemplating the task ahead (Control); (2) the second group had a yoga class to relax them (Yoga); (3) the third group were told they would get monetary rewards contingent upon the grade they received in the exam (Bribes); (4) the fourth group were given beta-blockers to calm their nerves (Beta-Blockers); and (5) the fifth group were encouraged to sit around winding each other up about how much revision they had/hadn’t done (You’re all going to fail). The student welfare office made four predictions: (1) all interventions should be different from the control; (2) yoga, bribery and beta-blockers should lead to higher exam scores than panic; (3) yoga and bribery should have different effects than the beta-blocker drugs; and (4) yoga and bribery should also differ. Which of the following planned contrasts (with the appropriate group codings) are correct to test these hypotheses?
ANSWER 1
ANSWER 2
ANSWER 3
ANSWER 4

ANSWER 1 - sum of all weights should be 0

Deciding what post hoc tests to run

Example of RQ for one-way ANOVA - (3)

Is there a statistically significant difference in Frisbee throwing distance with respect to education status?
IV = Education with 3 levels = high school, graduate, postgrad
DV = Frisbee throwing distance

What does this one-way ANOVA output show? Research question: Is there a statistically significant difference in Frisbee throwing distance with respect to education status? Variables: IV - Education, which has three levels: High School, Graduate and PostGrad; DV - Frisbee Throwing Distance

There was homogeneity of variance as assessed by Levene's Test for Equality of Variances (F (2,47) = 1.94, p = .155)

What does the results of one-way ANOVA show? Research question: Is there a statistically significant difference in Frisbee throwing distance with respect to education status? Variables: IV - Education, which has three levels: High School, Graduate and PostGrad; DV - Frisbee Throwing Distance

There was a statistically significant difference between groups as demonstrated by one-way ANOVA (F(2, 47) = 3.50, p = .038).

What do the post hoc results of the one-way ANOVA show? Research question: Is there a statistically significant difference in Frisbee throwing distance with respect to education status? Variables: IV - Education, which has three levels: High School, Graduate and PostGrad; DV - Frisbee Throwing Distance

A Tukey post hoc test shows that the PostGrad group was able to throw the frisbee statistically significantly further than the High School group (p = .034). There was no statistically significant difference between the Graduate and High School groups (p = .691) nor between the Graduate and PostGrad groups (p = .099).

What are the IV and DV of a one-way ANOVA?

IV = 1 categorical predictor with more than 2 levels
DV = 1 continuous variable

A one-way ANOVA is also called a

between-subjects ANOVA

(122 cards)