Chapter 12: General Linear Model, Comparing Several Different Means Flashcards
What does an ANOVA measure?
analysis of variance
when you are interested in comparing differences between the means in 3 or more independent groups
are population means the same or different?
want to know what's generally true about the populations (μ), not just the means of the sample
Why wouldn’t we use multiple independent samples t-tests instead?
increased chance of type 1 error (declaring there is an effect when there really is not)
familywise alpha increases above .05
FW alpha is the probability of making one or more type 1 errors across a set (family) of tests, given that the nulls are true
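The growth of familywise alpha can be seen directly from the formula 1 − (1 − α)^c for c independent tests (the per-test α of .05 and independence are assumptions of this sketch):

```python
# Familywise alpha across c independent tests, each at alpha = .05:
# P(at least one Type I error) = 1 - (1 - alpha)^c
alpha = 0.05
for c in (1, 3, 6):  # e.g., comparing 4 groups pairwise takes 6 t-tests
    fw = 1 - (1 - alpha) ** c
    print(f"{c} tests: familywise alpha = {fw:.3f}")
```

With just 3 tests FW alpha is already about .14, and with 6 it passes .26 — far above the nominal .05.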
Are an ANOVA and regressions separate things?
No. Regression is a more general form of an ANOVA. Anything you can do with an ANOVA you can do with a regression.
Regression can handle both categorical and continuous predictors, while ANOVA can only handle categorical
the idea they are different is partially historical (regression = applied research; ANOVA = experimental)
What are the methods for comparing independent means?
goal: keep FW alpha < .05 while maintaining solid power
the more liberal the alpha, the greater the power
- Confidence intervals
- standard linear model w/dummy coding (comparing groups to a base condition)
- One way ANOVA
- Welch or Brown-Forsythe F (Fbf)
- Planned Contrasts
Confidence Intervals
Positives:
-simple/straightforward
-makes you think about magnitude/focuses on estimation and avoids black-and-white thinking
Negatives:
- risk finding a difference when one doesn’t exist (type 1 error), increased with sample size
- with a small number of groups, the CIs are not going to be as sensitive to differences that do exist
- not as sensitive/powerful as other analyses
- if CIs overlap too much, you can't really draw conclusions
- error bars represent an estimate of where the population mean is
- larger bars = more uncertainty in the data. smaller number of cases means bigger error bars
interpretation: we are 95% confident that the population mean for X is between A & B
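The interpretation above comes from the usual mean ± t·SE construction. A minimal sketch (the scores are hypothetical and the helper name `mean_ci` is ours):

```python
import numpy as np
from scipy import stats

def mean_ci(scores, conf=0.95):
    """CI for a population mean from sample scores: mean +/- t_crit * SE."""
    scores = np.asarray(scores, dtype=float)
    n = scores.size
    m = scores.mean()
    se = scores.std(ddof=1) / np.sqrt(n)              # standard error of the mean
    t_crit = stats.t.ppf(1 - (1 - conf) / 2, df=n - 1)
    return m - t_crit * se, m + t_crit * se

lo, hi = mean_ci([4, 5, 6, 7, 8])  # hypothetical group scores
print(f"95% CI: [{lo:.2f}, {hi:.2f}]")
```

Note how the width depends on n through the standard error: fewer cases means bigger error bars.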
Standard linear model with dummy coding
comparing groups to base condition
involve the use of a regression model
useful when you want to compare groups to base/control group
use k-1 dummy variables (number of groups minus 1); the left-out group is the base
check the magnitude of R^2 change and whether that change is significant
negatives:
- compare groups only with the control group
- since they are both comparing against some standard (control), they are not independent tests
we are interested in whether the population means (μ) differ and, if so, by how much
R^2 change tells us group membership accounts for about X% of the variance in the outcome (DV)
a significant F change means the means are significantly different from each other
In standard linear model with dummy coding on SPSS…
Start by looking at the F test. If it is significant, then look at the regression coefficients to see where those differences are. If it is not significant, you do not have to look at the coefficients.
allows you to see which means are different
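The dummy-coding idea above can be sketched as a small regression (the three groups and their scores are hypothetical):

```python
import numpy as np

# Hypothetical scores for 3 groups; "control" is the base (reference) group.
control = np.array([3., 4., 5.])
treat_a = np.array([6., 7., 8.])
treat_b = np.array([5., 6., 7.])
y = np.concatenate([control, treat_a, treat_b])

# k - 1 = 2 dummy variables: each compares one treatment with the base group.
d1 = np.concatenate([np.zeros(3), np.ones(3), np.zeros(3)])  # treat_a vs control
d2 = np.concatenate([np.zeros(3), np.zeros(3), np.ones(3)])  # treat_b vs control
X = np.column_stack([np.ones(9), d1, d2])                    # intercept + dummies

b, *_ = np.linalg.lstsq(X, y, rcond=None)
# b[0] = control mean; b[1], b[2] = mean differences from control
print(b)  # [4. 3. 2.]
```

The intercept recovers the base-group mean, and each dummy coefficient is exactly that group's mean difference from control — which is why the coefficients tell you where the differences are.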
Logic of F statistic
- tests overall fit of a linear model to a set of data
- when model is based on group means, our predictions from the model are group means
- different group means? good prediction, F will be high
- similar group means? not a good prediction, F will be low (~1), fail to reject null
compares the improvement in fit from using the model rather than the grand mean
if the differences between group means are large enough, then the resulting model will be a better fit to the data than the grand mean (null)
Linear Model: Overview/Summary
- model of "no effect" or "no relationship between the predictor and outcome" is one where the predicted value of the outcome is always the grand mean
- we can fit a different model to the data that represents an alternative hypothesis. we compare the fit of this model to the fit of the null (i.e., using the grand mean)
- the intercept and one or more parameters (b) describe the model
- the parameters determine the shape of the model that we have fitted. therefore, the bigger the coefficients, the greater the deviation between the model and the null model (grand mean)
Group means: Overview/Summary
- in experimental research the parameters (b) represent the differences between group means
- if the differences between group means are large enough, then the resulting model will be a better fit to the data than the null model (grand mean)
- if this is the case: predicting scores from group membership is better than simply using the grand mean. in other words, the group means are not all the same
calculating an f statistic
- quantify the amount of variability in the scores
- separate variability into 2 parts: the part that can be accounted for by group membership and the part that cannot be accounted for by group membership
more people = larger residual sum of squares
more groups = larger model sum of squares
SStotal equation
total amount of variability in the scores
SSt = s^2grand(N-1)
or, equivalently: SSt = sum of (Xi - Xgrand)^2
square each score's difference from the grand mean, then add them all together
SSmodel equation
how much variability accounted for by the model/group membership
SSm = n(Xgroupmean-Xgrandmean)^2
do for each group and add them all together
SSresidual equation
how much variability isn’t accounted for by the model/group membership
SSr = s^2group(n-1)
or
SSr = SSt - SSm
do for each group then add them together
n is number of people in each group
MSm equation
remove the effect of the # of groups/people
important because the SS get bigger the more groups/people you have
MSm = SSm/DFm
DFm = number of groups -1
k-1
MSr equation
remove the effect of the # of groups/people
important because the SS get bigger the more groups/people you have
MSr = SSr/DFr
DFr = total number of people - number of groups
N-k
F statistic equation
signal to noise
F = MSm/MSr
look at the associated p and make a conclusion: "Given the null, the probability of obtaining an F of 5.12 or higher is .025"
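The SS, MS, and F steps above can be worked through end to end (the three groups of scores are hypothetical):

```python
import numpy as np
from scipy import stats

# Hypothetical scores for k = 3 groups of n = 5 each.
groups = [np.array([2., 3., 4., 5., 6.]),
          np.array([4., 5., 6., 7., 8.]),
          np.array([6., 7., 8., 9., 10.])]
all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()
N, k = all_scores.size, len(groups)

# SSt: squared deviation of every score from the grand mean
ss_t = np.sum((all_scores - grand_mean) ** 2)
# SSm: for each group, n * (group mean - grand mean)^2, summed
ss_m = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
# SSr: for each group, s^2 * (n - 1), summed (equivalently SSt - SSm)
ss_r = sum(g.var(ddof=1) * (g.size - 1) for g in groups)

ms_m = ss_m / (k - 1)   # MSm = SSm / dfM, with dfM = k - 1
ms_r = ss_r / (N - k)   # MSr = SSr / dfR, with dfR = N - k
F = ms_m / ms_r         # signal-to-noise ratio
p = stats.f.sf(F, k - 1, N - k)
print(F, p)

# sanity check against scipy's built-in one-way ANOVA
F_sp, p_sp = stats.f_oneway(*groups)
```

For these scores SSt = 70 splits into SSm = 40 and SSr = 30, giving F(2, 12) = 8 — the group means predict noticeably better than the grand mean.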
Assumptions when comparing means
1) normality: assessed within individual group, not set of scores as a whole
2) independence: errors from each individual case are unrelated to each other
***3) homogeneity: all comparison groups have the same variance
Homogeneity of variance importance/tests
- if group sizes are unequal, violations of the assumption can have serious consequences (affects alpha and power) - the F test is not robust under these conditions
- Levene’s test or Brown-Forsythe F/Welch’s F, robust version of F that doesn’t assume homogeneity
Levene’s Test
not recommended to use
if Levene’s test is significant, then we conclude that the variances are significantly different and try to rectify the situation
if the sample size is small, this isn’t really powerful
if the sample size is large, even small violations will be statistically significant
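A sketch of running Levene's test (the groups are hypothetical; `center="median"` gives the median-centered, Brown-Forsythe-style variant, which is the default in scipy):

```python
import numpy as np
from scipy import stats

g1 = np.linspace(-1, 1, 30)   # small spread
g2 = np.linspace(-1, 1, 30)   # small spread
g3 = np.linspace(-3, 3, 30)   # three times the spread: clearly unequal variance

W, p = stats.levene(g1, g2, g3, center="median")
print(f"W = {W:.2f}, p = {p:.4f}")  # small p -> variances significantly differ
```

Keep the sample-size caveats above in mind when reading the p-value: with small n this test misses real violations, and with large n it flags trivial ones.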
Brown-Forsythe F and Welch’s F
control type 1 error rate, Welch’s has more power
not sure if the assumption of homogeneity is met
versions of the F statistic designed to be accurate when the assumption of homogeneity is violated
always use the adjusted F; if the assumption isn't violated, it'll be the same as the unadjusted F
estimate the amount/degree to which homogeneity is violated, then adjusts F in accordance/proportionally (F is lower)
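A sketch of that adjustment for Welch's F specifically, assuming the standard Welch (1951) formulas (the function name `welch_f` is ours; scipy has no direct one-way Welch ANOVA):

```python
import numpy as np
from scipy import stats

def welch_f(*groups):
    """Welch's F: one-way comparison of means that does not assume equal variances."""
    k = len(groups)
    n = np.array([len(g) for g in groups], dtype=float)
    m = np.array([np.mean(g) for g in groups])
    v = np.array([np.var(g, ddof=1) for g in groups])
    w = n / v                                   # weight each group by its precision
    mw = np.sum(w * m) / np.sum(w)              # weighted grand mean
    a = np.sum(w * (m - mw) ** 2) / (k - 1)     # between-groups "signal"
    lam = np.sum((1 - w / np.sum(w)) ** 2 / (n - 1))
    b = 1 + 2 * (k - 2) / (k ** 2 - 1) * lam    # adjustment for heterogeneity
    F = a / b
    df1, df2 = k - 1, (k ** 2 - 1) / (3 * lam)  # df2 shrinks as violation grows
    p = stats.f.sf(F, df1, df2)
    return F, df1, df2, p
```

One way to check the sketch: with exactly two groups, Welch's F equals the square of Welch's t (scipy's `ttest_ind` with `equal_var=False`).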
Is ANOVA robust to violations of assumptions?
robust: if you violate assumptions, it doesn’t matter bc alpha and power only change a little
- yes, when group sizes are equal (based on old research)
- more recent research shows that it is more complicated; these studies look at a wider range of conditions
- skewness, non-normality (heavy-tailed distributions), and heteroscedasticity (unequal variance across groups) interact in complex ways, for ex:
- with non-normality plus violations of homoscedasticity, alpha can rise from .05 to .18 (not good)
- with power set at .9, contaminating 10% of scores with draws from a normal distribution with greater variance drops power to .28
- when scores are made to correlate moderately (r = .50) with n = 10 per group, alpha rises from .05 to .74
Alternatives when assumptions aren’t met include:
1) Welch’s F: if homogeneity has been violated; adjusts for the amount that the assumption has been violated (always use it)
2) robust tests: comparing different means. if assumptions are met, ANOVA has more power than robust tests
3) Kruskal-Wallis: nonparametric test that doesn’t make any assumption about distribution
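Option 3 above can be sketched with scipy's built-in Kruskal-Wallis test (the three groups of scores are hypothetical):

```python
from scipy import stats

# Kruskal-Wallis works on ranks, so it makes no assumption about the
# shape of the score distributions.
g1 = [2.1, 2.3, 2.5, 2.8]
g2 = [3.5, 3.9, 4.1, 4.4]
g3 = [2.9, 3.0, 3.1, 3.3]
H, p = stats.kruskal(g1, g2, g3)
print(f"H = {H:.2f}, p = {p:.4f}")  # small p -> at least one group differs
```

The trade-off stated above applies: when the parametric assumptions actually hold, the ANOVA on the raw scores has more power than this rank-based test.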
Post Hoc Tests
used when we determine the means are different (sig ANOVA), but there is not a hypothesis on which groups are different
- follow up tests to an ANOVA to determine which groups are different
- kinda like p hacking, but not really because you control familywise alpha
- many post hoc tests, best one depends on the situation (keep FW alpha .05, allow power to be as high as possible) - dependent on assumptions and whether it is more important to not make Type 1 or Type 2 error.
- common ones: Least significant difference (LSD), Bonferroni, Tukey, REGWQ
perform badly when groups/variances are unequal
consist of pairwise comparisons designed to compare all different combinations of the treatment groups (takes every pair of groups and performs a separate test on each)
Types of post hoc tests, explained
1) LSD: no attempt to control type 1 error rate, like multiple t tests, but better because you start with an ANOVA
2) Bonferroni: when you absolutely have to control FW alpha, conservative, use when you really don’t want to make Type 1 error
3) Tukey: conservative like Bonferroni but more powerful when testing large number of means (large number of groups)
4) REGWQ: good power and control over type 1 errors, probably best when n’s (sample sizes) are equal
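The Bonferroni approach (option 2) is simple enough to sketch by hand: run every pairwise t-test and multiply each p by the number of comparisons (the group scores are hypothetical):

```python
from itertools import combinations
from scipy import stats

groups = {"ctrl": [3., 4., 5., 4., 3.],
          "low":  [5., 6., 7., 6., 5.],
          "high": [8., 9., 10., 9., 8.]}

pairs = list(combinations(groups, 2))  # every pair of the 3 groups -> 3 tests
for a, b in pairs:
    t, p = stats.ttest_ind(groups[a], groups[b])
    p_adj = min(p * len(pairs), 1.0)   # Bonferroni: multiply p by # of tests
    print(f"{a} vs {b}: p = {p:.4f}, Bonferroni-adjusted p = {p_adj:.4f}")
```

Multiplying p by the number of tests is what makes Bonferroni conservative: each adjusted p must clear .05 on its own, which keeps FW alpha at .05 but costs power.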
how do we know if results have scientific and/or practical significance?
quantify the effect size
significant results do not mean they are important. it just means we are able to rule out sampling error as the sole cause of the observed effect in the sample
- R^2/eta^2: quantifies how big the overall effect is across all the group means (proportion of variance accounted for)
- omega^2: R^2 of the population’s overall effect. (Guidelines: .01 = S, .06 = M, .14 = L)
- rcontrast^2: standardized, specific to two groups (Guidelines: .1 = S, .3 = M, .5 = L)
- cohen’s d: difference between 2 groups in SD units (Guidelines: .2 = S, .5 = M, .8 = L)
- mean differences
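The first, second, and fourth measures above can be computed from the same SS quantities as the ANOVA itself (the groups of scores are hypothetical; `eta_sq`, `omega_sq`, and `d` are our names):

```python
import numpy as np

groups = [np.array([2., 3., 4., 5., 6.]),
          np.array([4., 5., 6., 7., 8.]),
          np.array([6., 7., 8., 9., 10.])]
scores = np.concatenate(groups)
N, k = scores.size, len(groups)

ss_t = np.sum((scores - scores.mean()) ** 2)
ss_m = sum(g.size * (g.mean() - scores.mean()) ** 2 for g in groups)
ss_r = ss_t - ss_m
ms_r = ss_r / (N - k)

eta_sq = ss_m / ss_t                                 # R^2 / eta^2 (sample estimate)
omega_sq = (ss_m - (k - 1) * ms_r) / (ss_t + ms_r)   # omega^2 (population estimate)

# Cohen's d for two specific groups: mean difference in pooled-SD units
m1, m2 = groups[0].mean(), groups[2].mean()
sp = np.sqrt((groups[0].var(ddof=1) + groups[2].var(ddof=1)) / 2)
d = (m2 - m1) / sp
print(eta_sq, omega_sq, d)
```

Note omega^2 comes out smaller than eta^2: it corrects for the fact that the sample R^2 overestimates the population effect.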
planned contrasts
used when you think you know which groups are different
enter variables into the regression equation; if a regression coefficient is significant, then there’s a significant difference between the 2 groups in that contrast
involve use of a regression model
fewer tests keeps FW alpha at .05, which means more power
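A sketch of one planned contrast using the standard t = L/SE construction, where L is the weighted combination of group means (the groups, the weights, and the hypothesis — control vs the average of two treatments — are hypothetical):

```python
import numpy as np
from scipy import stats

groups = [np.array([3., 4., 5., 4., 3.]),    # control
          np.array([5., 6., 7., 6., 5.]),    # treatment A
          np.array([6., 7., 8., 7., 6.])]    # treatment B
c = np.array([-2., 1., 1.])                  # contrast weights (must sum to zero)

n = np.array([g.size for g in groups], dtype=float)
m = np.array([g.mean() for g in groups])
N, k = int(n.sum()), len(groups)

# MSr from the overall ANOVA supplies the error term for the contrast
ms_r = sum(g.var(ddof=1) * (g.size - 1) for g in groups) / (N - k)
L = np.sum(c * m)                            # contrast value
se = np.sqrt(ms_r * np.sum(c ** 2 / n))      # its standard error
t = L / se
p = 2 * stats.t.sf(abs(t), df=N - k)
print(f"t({N - k}) = {t:.2f}, p = {p:.4f}")
```

Because this is one focused test instead of every pairwise comparison, no familywise correction is needed — which is exactly where the power advantage comes from.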