Lecture 10: ANOVA: F distribution and one way independent Flashcards by Megan H

What is the general question in an ANOVA?

Do the population means of more than two groups differ significantly from one another?

What is the more detailed question in an ANOVA?

Do you gain predictive value by using the group means instead of the grand mean of the whole sample, or does it make little difference?

Difference between ANOVA and the independent t-test

With ANOVA you can compare more than two groups; with the independent t-test you cannot.

What do the standard deviation and the variance do; what is their purpose?

They quantify the difference between the observations and the grand mean.

Total sum of squares, general description =

Take the distance between each observation and the grand mean, square it, and sum the squared distances.

What does it mean that the sum of squares is minimized?

It will never get lower! The mean is the single value that lies closest to all observations in the squared-distance sense, so no other value gives a smaller sum of squares.
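
The minimization claim can be checked numerically. The sketch below assumes a small made-up set of scores (not from the lecture) and compares the sum of squares around the mean with the sum of squares around a few other candidate values; the names `scores`, `sum_of_squares` and `candidates` are illustrative.

```r
# Hypothetical scores, made up for illustration
scores <- c(2, 3, 4, 4, 5, 6, 7, 8, 9)

# Sum of squared distances between the observations and a candidate value
sum_of_squares <- function(center) sum((scores - center)^2)

# Try the mean next to a few other candidate values
candidates <- c(4, 5, mean(scores), 6, 7)
ss_values <- sapply(candidates, sum_of_squares)

# The candidate with the smallest sum of squares
best_candidate <- candidates[which.min(ss_values)]
print(data.frame(candidate = candidates, ss = ss_values))
```

In this sketch `best_candidate` comes out as the mean of the scores; every other candidate gives a larger sum of squares.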

error sum of squares =

The sum of the squared differences between each observed value and its predicted value, here the group mean. It is also called the residual sum of squares (SSR), because it sums the squared residuals, that is, the deviations of the actual values from the predicted values.

So which sum of squares looks at what?

total sum of squares = distances to the grand mean
error sum of squares = distances to the group means

What is total sum of squares - error sum of squares?

The model sum of squares. It shows how much the model contributes. If the total and error sums of squares differ little, the model sum of squares is small and the model has little to add.

adding an extra parameter to your model is ….

always going to lead to a better approximation of the data!
But is it worth it?

If the TSS and ESS are the same…

then the MSS = 0, so the new model adds nothing: the group means are all equal to the grand mean.

Which model do you always want to choose?

The simplest one. If the group means do not make a large difference, stick with the grand mean, because that is the simpler model.

One-way independent ANOVA: what do you measure?

You compare the means of 2 or more independent groups.

Assumptions of ANOVA

continuous variable
random sample
normally distributed: Shapiro-Wilk test / Q-Q plots
equal variance within the groups: Levene's test
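
A minimal base-R sketch of these checks, assuming made-up scores in three groups. `shapiro.test()` is part of base R; Levene's test is not, so the sketch uses the classic mean-centred form (an ANOVA on the absolute deviations from each group mean) as a stand-in. The `car` package offers a ready-made `leveneTest()`, which is not used here. The names `scores`, `group`, `abs_dev` and `levene_p` are illustrative.

```r
# Hypothetical data: three groups of three participants
scores <- c(2, 3, 4, 4, 5, 6, 7, 8, 9)
group <- factor(rep(c("g1", "g2", "g3"), each = 3))

# Residuals: each score minus its own group mean
residuals_within <- scores - ave(scores, group)

# Normality: Shapiro-Wilk test on the residuals
shapiro_result <- shapiro.test(residuals_within)
# For a Q-Q plot: qqnorm(residuals_within); qqline(residuals_within)

# Equal variances: mean-centred Levene's test, written out by hand as
# a one-way ANOVA on the absolute deviations from the group means
abs_dev <- abs(residuals_within)
levene_table <- anova(lm(abs_dev ~ group))
levene_p <- levene_table[["Pr(>F)"]][1]

print(shapiro_result$p.value)
print(levene_p)
```

The three made-up groups have identical spreads by construction, so the Levene statistic here is essentially zero; real data will give less tidy values.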

Another name for SSerror =

SSwithin

Another name for SSmodel =

SSbetween

How do you calculate SSmodel in R?

Compute n: length(group1) (this is the number of participants in the group!)
Group mean: mean(group1)
Grand mean: mean(group), taking group to hold the scores of all groups together

Then put everything into the formula and compute the SS for each group. Sum these afterwards.
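
These steps can be sketched as follows, assuming three made-up groups of three participants each. The vectors `group1`, `group2` and `group3` and their values are illustrative, not lecture data.

```r
# Hypothetical scores per group
group1 <- c(2, 3, 4)
group2 <- c(4, 5, 6)
group3 <- c(7, 8, 9)
groups <- list(group1, group2, group3)

# Grand mean over all observations
grand_mean <- mean(unlist(groups))

# Per group: n * (group mean - grand mean)^2
ss_per_group <- sapply(groups, function(g) {
  length(g) * (mean(g) - grand_mean)^2
})

# SSmodel is the sum over the groups
ss_model <- sum(ss_per_group)
print(ss_model)
```

With these made-up numbers the group means are 3, 5 and 8, and `ss_model` comes out as 38.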

So what does ng stand for, or n1/n2 etc.?

The number of participants per group!

How can you visualize F in R?

visualize.f(F, df_model, df_error, section = "upper")

What does a non-significant Levene's test mean?

A non-significant p-value on Levene's test means there is no evidence that the variances of the groups differ, so the assumption of equal variances can be retained.

How are the df of an ANOVA reported?

From top to bottom: df model first, then df error.

How should you report the effects of an ANOVA?

There was a/no significant effect of x on y, F(df model, df error) = …, p = …
According/Contrary to expectations, planned contrasts revealed that ____________

F = (2 formulas)

MSmodel / MSerror, or SIGNAL / NOISE

What does MS stand for?

Mean Sum of Squares (a sum of squares divided by its df)

df model formula =

k - 1
k = number of conditions

df error formula =

N - k
N = total sample size

What does the F distribution depend on?

The F distribution is different for different sample sizes and numbers of groups, because it depends on df model (number of conditions) and df error (number of conditions and sample size).
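
To see this dependence, the critical F value at alpha = 0.05 can be computed for a fixed number of groups and growing sample sizes. This is a small sketch with made-up sample sizes; `qf()` is base R's quantile function for the F distribution, and the variable names are illustrative.

```r
# Three conditions, three hypothetical total sample sizes
k <- 3
total_n <- c(9, 30, 300)

df_model <- k - 1        # does not change with sample size
df_error <- total_n - k  # grows with sample size

# Critical F at alpha = 0.05 for each df error
critical_f <- qf(0.95, df1 = df_model, df2 = df_error)
print(data.frame(total_n, df_model, df_error, critical_f))
```

With df model held fixed, the critical value shrinks as df error grows, so the same F ratio counts as stronger evidence in a larger sample.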

SSmodel formula

SSmodel = Σ nk * (x̄k - x̄)^2
nk = number of participants in condition k
x̄k = mean of condition k
x̄ = grand mean

SSerror formula =

SSerror = Σ sk^2 * (nk - 1)
sk^2 = variance of group k
nk = number of participants in group k
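
A short sketch of this formula in R, assuming three made-up groups. It also computes the same quantity directly as the summed squared distances to each group mean, to show that the two routes agree. All names and values are illustrative.

```r
# Hypothetical scores per group
groups <- list(c(2, 3, 4), c(4, 5, 6), c(7, 8, 9))

# Formula from the card: group variance times (n - 1), summed over groups
ss_error <- sum(sapply(groups, function(g) var(g) * (length(g) - 1)))

# Same quantity computed directly from the squared residuals
ss_error_direct <- sum(sapply(groups, function(g) sum((g - mean(g))^2)))

print(c(ss_error, ss_error_direct))
```

Both routes give 6 for these numbers, because `var()` already divides the squared distances by n - 1.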

So in a regular one-way ANOVA, you only need N for...

computing df error (N - k)! In the SS formulas you simply use the number of participants per group.

How do you calculate SStotal =

SSmodel + SSerror
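
The decomposition of the total sum of squares into a model part and an error part can be verified numerically. The sketch assumes made-up scores in three groups; `ave()` is a base-R helper that returns each observation's group mean.

```r
# Hypothetical data: three groups of three participants
scores <- c(2, 3, 4, 4, 5, 6, 7, 8, 9)
group <- factor(rep(c("g1", "g2", "g3"), each = 3))

grand_mean <- mean(scores)
group_means <- ave(scores, group)  # group mean repeated per observation

ss_total <- sum((scores - grand_mean)^2)       # observations vs grand mean
ss_model <- sum((group_means - grand_mean)^2)  # group means vs grand mean
ss_error <- sum((scores - group_means)^2)      # observations vs group means

print(c(total = ss_total, model = ss_model, error = ss_error))
```

For these numbers the three values are 44, 38 and 6, and the model and error parts add up to the total.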

Mean squares of the model =

SSmodel / dfmodel

Mean squares of the error =

SSerror / dferror

Mean squares of the total =

SStotal / dftotal
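
Putting the pieces together, the mean squares lead to the F ratio and its p-value. This is a sketch on made-up data; the p-value is taken from the upper tail of the F distribution with base R's `pf()`, and all variable names are illustrative.

```r
# Hypothetical data: three groups of three participants
scores <- c(2, 3, 4, 4, 5, 6, 7, 8, 9)
group <- factor(rep(c("g1", "g2", "g3"), each = 3))

k <- nlevels(group)        # number of conditions
n_total <- length(scores)  # total sample size N
group_means <- ave(scores, group)

ss_model <- sum((group_means - mean(scores))^2)
ss_error <- sum((scores - group_means)^2)

df_model <- k - 1
df_error <- n_total - k

ms_model <- ss_model / df_model
ms_error <- ss_error / df_error
f_value <- ms_model / ms_error

# Upper-tail probability of the F distribution
p_value <- pf(f_value, df_model, df_error, lower.tail = FALSE)
print(c(F = f_value, p = p_value))
```

Here the result would be reported as F(2, 6) = 19, with a p-value below .01.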

What is the interpretation of the F value?

How much better is your model at predicting the values than the grand mean?

What if F = 1?

Then MSmodel = MSerror: the model explains no more variation than is left unexplained, so there is no evidence of a difference.

ANOVA is the same as....

a regression!
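
One way to see the equivalence is to fit the same made-up data with `lm()` and with `aov()` and compare the F statistics. This is a sketch under the assumption that the group is entered as a factor predictor; the data frame `dat` and its column names are illustrative.

```r
# Hypothetical data: three groups of three participants
dat <- data.frame(
  score = c(2, 3, 4, 4, 5, 6, 7, 8, 9),
  group = factor(rep(c("g1", "g2", "g3"), each = 3))
)

# The same model fitted as a regression and as an ANOVA
fit_lm <- lm(score ~ group, data = dat)
fit_aov <- aov(score ~ group, data = dat)

# Overall F from the regression summary and from the ANOVA table
f_lm <- summary(fit_lm)$fstatistic[["value"]]
f_aov <- summary(fit_aov)[[1]][["F value"]][1]

print(c(regression = f_lm, anova = f_aov))
```

Both fits report the same F, because the regression with dummy-coded groups predicts each observation by its group mean.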

the larger the F statistic, the ...

greater the variation between the sample means relative to the variation within the samples -> more evidence for a difference between the group means

What is the H0 in an ANOVA?

The group means are not significantly different from the grand mean.

What does an F distribution look like?

It only starts at 0 (because of the squares); 0 means no difference. From there it declines toward a long right tail (for a small df model it drops right from 0).

What is important for contrasts?

That the weights sum to 0, e.g. 2, -1, -1 or 1, -0.5, -0.5 etc.
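
A quick check of the sum-to-zero rule in R, with made-up group means. The weight vectors and the means are illustrative; the last vector is included as a counterexample whose weights do not sum to zero.

```r
# Hypothetical group means
group_means <- c(g1 = 3, g2 = 5, g3 = 8)

# Two valid sets of contrast weights and one invalid set
weights_a <- c(2, -1, -1)
weights_b <- c(1, -0.5, -0.5)
weights_bad <- c(1, 0.5, 0.5)

sums <- c(a = sum(weights_a), b = sum(weights_b), bad = sum(weights_bad))
print(sums)

# Contrast estimate: group 1 against the other two groups combined
contrast_estimate <- sum(weights_a * group_means)
print(contrast_estimate)
```

With `weights_a` the estimate compares group 1 with groups 2 and 3 together; the negative result reflects that group 1 has the lowest mean in this made-up example.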

What is the relationship between MS and variance?

MStotal = sx^2 (the variance of all observations)
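
This identity is easy to confirm in R: dividing the total sum of squares by N - 1 gives the same number as `var()`. The scores below are made up for illustration.

```r
# Hypothetical scores from all groups together
scores <- c(2, 3, 4, 4, 5, 6, 7, 8, 9)

ss_total <- sum((scores - mean(scores))^2)
df_total <- length(scores) - 1

ms_total <- ss_total / df_total
sample_variance <- var(scores)

print(c(ms_total = ms_total, variance = sample_variance))
```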

What are contrasts?

Planned comparisons of subgroups in the data (also called follow-up tests)

3 functions of contrasts

Exploring differences of theoretical interest
Higher precision
Higher power

What are post hoc analyses?

Unplanned comparisons

2 goals of post hoc analyses

Exploring all possible differences
Adjusting the t value for the inflated type 1 error
Same procedure as contrasts, but: exploratory vs. confirmatory research

What is so special about the F distribution?

It is right-skewed, because it cannot be negative (squares!), so it is only positive, starting from 0.

Which df does sample size affect?

- not df 1 (because model df = k - 1)
- but df 2 (because error df = N - k)

What is the H0 of ANOVA?

μ1 = μ2 = μ3 etc. (all population means are equal)

Is ANOVA robust?

No, the result can still be wrong if the assumptions are not met.

p value interpretation

measures the probability of obtaining results at least as extreme as the observed ones, assuming that the null hypothesis is true

an F test is always...

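two-sided (non-directional) in its hypothesis, even though the p-value is read from the upper tail of the F distribution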

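What did Simons et al. find about false positives vs. correct negatives?

Researchers falsely find evidence that an effect exists (false positive) more often than they correctly find evidence that it does not (correct negative).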

it is easy to accumulate statistically significant results for a false hypothesis

OK

most costly error =

type 1 error.

Why is a false positive the most costly error?

- false positives are persistent in the literature (because a failure to replicate can have many different causes)
- null findings do not get published
- false positives waste resources
- publishing false positives = losing credibility for journals

Why do false positives waste so many resources?

- investment in further research that leads nowhere
- ineffective policy changes

What makes scientists try many things and report only what works?

1. ambiguity in how to make a decision (researcher degrees of freedom)
2. desire to find a statistically significant effect

Effect size η^2 (eta squared) =

The amount of explained variance as a general effect size measure: η^2 = R^2 = SSmodel / SStotal
The square root of this is a correlation-type effect size (like r), not Cohen's d.
Only the error term is influenced by the sample size, through df error = N - k.

Omega squared ω^2 =

less biased by the sample than η^2!
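
Both effect sizes can be computed from the sums of squares. The sketch below uses made-up data. The omega squared formula, (SSmodel - dfmodel * MSerror) / (SStotal + MSerror), is the common textbook form and is an assumption here, since the cards do not state it.

```r
# Hypothetical data: three groups of three participants
scores <- c(2, 3, 4, 4, 5, 6, 7, 8, 9)
group <- factor(rep(c("g1", "g2", "g3"), each = 3))
group_means <- ave(scores, group)

ss_total <- sum((scores - mean(scores))^2)
ss_model <- sum((group_means - mean(scores))^2)
ss_error <- sum((scores - group_means)^2)

df_model <- nlevels(group) - 1
df_error <- length(scores) - nlevels(group)
ms_error <- ss_error / df_error

# Eta squared: proportion of explained variance in the sample
eta_squared <- ss_model / ss_total

# Omega squared (assumed textbook formula): corrects for sample bias
omega_squared <- (ss_model - df_model * ms_error) / (ss_total + ms_error)

print(c(eta_squared = eta_squared, omega_squared = omega_squared))
```

Omega squared comes out smaller than eta squared, in line with its role as the less optimistic estimate.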

effect size r =

A more interpretable effect size measure, gives the effect size for a specific contrast

If both df were infinite, the F distribution would be...

approximately normally distributed

Error variance means...

how much of the data the model still does not account for

What are possible interpretations of a high F?

- one group differs a great deal from the rest
- every group differs a little from the others
etc. So the F statistic alone tells you little about your hypotheses, only whether or not there is a difference somewhere!

So what does F tell you something about and what not?

F says something about the magnitude, nothing about the direction.

Purpose of the Q-Q plot

compare the distribution of the data to a normal distribution

What do you do in post hoc tests?

instead of asking specific questions (contrasts) you just do all the comparisons

So which is more exploratory: contrasts or post hoc?

Post hoc, because you do not ask specific questions there. You simply check more or less everything.

So what else must you do in post hoc tests?

Adjust for the inflated type 1 error, e.g. with Bonferroni.
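
A minimal sketch of Bonferroni-adjusted pairwise comparisons, assuming made-up data. `pairwise.t.test()` is part of base R's stats package; running it once with and once without the adjustment shows what the correction does to the p-values.

```r
# Hypothetical data: three groups of three participants
scores <- c(2, 3, 4, 4, 5, 6, 7, 8, 9)
group <- factor(rep(c("g1", "g2", "g3"), each = 3))

# All pairwise comparisons without any correction
unadjusted <- pairwise.t.test(scores, group, p.adjust.method = "none")

# The same comparisons with the Bonferroni correction
posthoc <- pairwise.t.test(scores, group, p.adjust.method = "bonferroni")

print(unadjusted$p.value)
print(posthoc$p.value)
```

Each Bonferroni p-value is the unadjusted one multiplied by the number of comparisons (capped at 1), so it is never smaller than the unadjusted value.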

Large F ratio =/= large effect

because it depends on many things, e.g. the sample size as well

What if the assumption of homogeneity is violated (Levene's test or comparing the SDs)?

Then use a correction: Brown-Forsythe or Welch.
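
For the Welch correction, base R offers `oneway.test()` with `var.equal = FALSE`. The sketch below runs it on made-up data; the Brown-Forsythe variant is not shown, and the data frame `dat` is illustrative.

```r
# Hypothetical data: three groups of three participants
dat <- data.frame(
  score = c(2, 3, 4, 4, 5, 6, 7, 8, 9),
  group = factor(rep(c("g1", "g2", "g3"), each = 3))
)

# Welch's one-way ANOVA: does not assume equal variances
welch_result <- oneway.test(score ~ group, data = dat, var.equal = FALSE)

print(welch_result$statistic)  # Welch F
print(welch_result$parameter)  # numerator and (adjusted) denominator df
print(welch_result$p.value)
```

The denominator df is usually no longer a whole number after the correction, which is worth keeping in mind when reporting F(df model, df error).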

(73 cards)