Lecture 6 Flashcards

1
Q

What are the ANOVA assumptions?

A
  1. Normality
  2. Homogeneity of variance
  3. Independence of observations
  4. DV measured on an interval or ratio scale
  5. X (IV) & Y (DV) are linearly related
2
Q

Explain the normality assumption of ANOVA.

A
  • For any value of X (the IV), the Y scores (the raw scores) are approximately normally distributed.
  • In other words, the raw scores are normally distributed within each group. Check by doing a frequency distribution of the raw scores for each group.
3
Q

What is the effect of violation of the normality assumption of ANOVA on type I and type II errors?

A

Type I error:

  • Non-normality has only a slight effect on Type I error,
  • even for very skewed or kurtotic (peaked) distributions.
    e.g. nominal alpha (what we set alpha at = Type I error when all assumptions are met) vs. actual alpha (Type I error if one or more assumptions are violated)
4
Q

In really non-normal populations, when nominal alpha = .05, actual alpha = .055 or .06. If nominal alpha ~ actual alpha, what do we say?

A

We say F is robust to violations of the assumptions.

Therefore F is robust with respect to the normality assumption.

5
Q

What are the reasons that F is robust with respect to the normality assumption?

A

The sampling distribution of the mean will be normally distributed if:

a) the raw scores are normally distributed in the population, or
b) the raw scores in the population are skewed but n is large: the sampling distribution of the mean approaches a normal distribution as n increases (n ≥ 30 or so).
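Case (b) is the Central Limit Theorem, and it can be illustrated with a quick simulation; a minimal sketch assuming Python with numpy (the lecture names no software, and the exponential population and helper names are made up for illustration):

```python
# Sketch: the sampling distribution of the mean from a skewed population
# becomes more nearly normal as n increases (Central Limit Theorem).
import numpy as np

rng = np.random.default_rng(0)

def sample_means(n, reps=5000):
    """Draw `reps` samples of size n from a skewed (exponential) population
    and return the means of those samples."""
    return rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)

def skewness(x):
    """Simple moment-based skewness estimate."""
    z = (x - x.mean()) / x.std()
    return (z ** 3).mean()

# Skew of the distribution of means shrinks as n grows (roughly 2/sqrt(n) here).
print(skewness(sample_means(5)), skewness(sample_means(50)))
```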

6
Q

Define standard error of the mean:

A

The standard deviation of the sampling distribution of the mean.

7
Q

When would you use a non-parametric test? Why?

A

When the population is very skewed, because non-parametric tests are distribution-free, which means they don’t have the normality assumption.

8
Q

What effect does lack of normality have on power?

A
  • Only a slight effect (a few hundredths).
  • Lack of normality due to platykurtosis (a flattened distribution) does affect power, especially if n is small.

9
Q

How does one check for normality?

A
  • Check via frequency distributions.
  • If there is a big violation of normality with a small n, conduct a non-parametric test, i.e. one for which the distribution doesn’t matter.

10
Q

What are some examples of non-parametric tests?

A
Chi-square
Mann-Whitney
Wilcoxon
Kruskal-Wallis
Friedman
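Assuming Python with scipy (software isn’t specified in the lecture), several of these are available as library routines; a minimal sketch on made-up skewed data:

```python
# Sketch: rank-based non-parametric tests from scipy.stats.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
g1, g2, g3 = (rng.exponential(size=20) for _ in range(3))  # skewed groups

# Kruskal-Wallis: non-parametric analogue of a one-way between-subjects ANOVA.
h, p_kw = stats.kruskal(g1, g2, g3)

# Mann-Whitney U: non-parametric analogue of an independent-samples t test.
u, p_mw = stats.mannwhitneyu(g1, g2)

print(f"Kruskal-Wallis p = {p_kw:.3f}, Mann-Whitney p = {p_mw:.3f}")
```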
11
Q

Describe the Homogeneity (homoscedasticity) of variance assumption

A

The variance (this refers to error variance, aka within-group variance) is unaffected by the treatment, i.e. the IV. This variance goes by many names:
MSerror, MSwithin, S/A, error due to chance, variability due to chance, etc.
i.e. σ²1 = σ²2 = σ²3 etc.
In other words, for every value of X, the variance of Y is the same.

12
Q

Illustration of heteroscedasticity

A

Plot scores (y axis) against the independent variable (x axis), with each group’s scores clustered above its group.

Heteroscedasticity shows up as clusters with visibly different spreads.

13
Q

Under what circumstances is F robust for unequal variances?

A

If n’s are equal or approximately equal.

14
Q

When is heterogeneity of variance an issue?

A

Only an issue if:

- n’s are sharply unequal and a test shows that the variances are sharply unequal.

15
Q

What is meant by approximately equal n?

A

largest n/smallest n < 1.5

16
Q

What is meant by approximately equal σ²? (variance)

A

Largest variance / smallest variance < 3.
If the ratio is greater than 3, we have sharply unequal σ².
i.e. if Fmax > 3, then the variances are sharply unequal.
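The two rules of thumb (largest n / smallest n < 1.5, Fmax < 3) combine into one small check; a sketch assuming Python with numpy, where `anova_rules_of_thumb` and the groups are made up for illustration:

```python
# Sketch: compute the n-ratio and Fmax rules of thumb for a set of groups.
import numpy as np

def anova_rules_of_thumb(*groups):
    """Return (n_ratio, fmax).
    n_ratio < 1.5 -> n's approximately equal
    fmax    < 3   -> variances approximately equal"""
    ns = [len(g) for g in groups]
    variances = [np.var(g, ddof=1) for g in groups]  # sample variances
    return max(ns) / min(ns), max(variances) / min(variances)

g1 = [4, 5, 6, 5, 4, 6]
g2 = [3, 7, 2, 8, 3, 7, 5]
n_ratio, fmax = anova_rules_of_thumb(g1, g2)
print(n_ratio, fmax)  # n's roughly equal here, but variances sharply unequal
```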

17
Q

When is heterogeneity an issue for type I error? (case 1)

A

Case 1: If the largest variance is associated with the group with the smallest n,
F is liberal, i.e. actual alpha will be greater than nominal alpha,
i.e. we falsely reject H0 too often.
Solution: adjust nominal alpha downwards, e.g. to .025, so that actual alpha is approximately .05.

18
Q

When is heterogeneity an issue for type I error? (case 2)

A

Case 2: If the largest variance is associated with the group with the largest n,
F is conservative,
i.e. actual alpha is less than nominal alpha.
So people usually don’t make an adjustment.

19
Q

Explain the independence of observations assumption of ANOVA

A
  • Observations within each group are independent of one another.
  • Usually satisfied if unrelated subjects are run individually and alone.
  • REALLY IMPORTANT
20
Q

Why is the independence of observations assumption of ANOVA so important?

A

Because even small violations have a substantial effect on both alpha and power.

21
Q

How is dependence measured?

A

Intraclass correlation

22
Q

Explain the DV is measured on an interval or ratio scale assumption of ANOVA

A

Check the scale definitions against the actual DV used.

If the DV is nominal or ordinal, conduct a different type of statistical test, e.g. a Chi-square test.

23
Q

Explain the X (IV) & Y (DV) are linearly related assumption of ANOVA

A

i.e. a subject’s score is composed of 3 parts:
1. general effect (grand mean)
2. an effect that is unique and constant within a given treatment.
3. An effect that is unpredictable (random error & individual differences).

24
Q

Give the linear model of the fifth assumption for ANOVA

A

Yij = μ + αj + eij
where μ = grand mean,
αj = treatment effect for the jth group,
and eij = random error for the ith subject in the jth group.
So:
score = general effect + treatment effect + error
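A sketch of data generated from this model, assuming Python with numpy; the values of μ, the αj, and the error SD are made up for illustration:

```python
# Sketch: simulate scores Yij = mu + alpha_j + e_ij.
import numpy as np

rng = np.random.default_rng(2)

mu = 50.0                     # general effect (grand mean)
alphas = [-2.0, 0.0, 2.0]     # treatment effect for each group j (sums to 0)
n_per_group = 1000

scores = {
    j: mu + alpha_j + rng.normal(0.0, 5.0, n_per_group)  # + random error e_ij
    for j, alpha_j in enumerate(alphas)
}

# Each group mean should land near mu + alpha_j.
for j, alpha_j in enumerate(alphas):
    print(j, round(scores[j].mean(), 2))
```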

25
Q

Define an outlier

A

A data point which is very different from the rest of the data.
Outliers can have a dramatic effect on results.

26
Q

When removing outliers, what must be done?

A

You must explain why the outliers were removed, and this information must be shared.

27
Q

What causes outliers?

A
  1. Human error (e.g. data entry)
  2. Instrumentation error
  3. Subjects significantly different from the rest of the sample, perhaps from a different population.

Therefore we need to detect and remove outliers.

28
Q

How do you detect outliers for small samples?

A
  • The largest possible z score in a data set is bounded by (n-1)/√n.
    e.g. for n = 10, the largest possible z score is 2.846; therefore, for small samples, scrutinize any data point with z ≥ 2.5.
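The bound is easy to verify numerically; a one-function Python sketch (`max_possible_z` is a made-up helper name):

```python
# Sketch: the largest |z| any single point can attain in a sample of size n.
import math

def max_possible_z(n):
    return (n - 1) / math.sqrt(n)

print(round(max_possible_z(10), 3))  # 2.846, matching the card's example
```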
29
Q

How do you detect outliers for large, normally distributed samples?

A
  • Approximately 99% of scores are within three standard deviations of the mean.
    Therefore z scores > 3 should be scrutinized.
    Note: if n > 100, you will get some z scores > 3 by chance.
    A criterion of z > 3 is also reasonable for non-normal distributions, but you could extend it to z > 4.
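A sketch of this rule assuming Python with numpy; `flag_outliers` is a hypothetical helper, and the data are invented, with one wild point:

```python
# Sketch: flag data points whose |z| exceeds a cutoff (3 by default).
import numpy as np

def flag_outliers(x, cutoff=3.0):
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std(ddof=1)   # z scores using the sample SD
    return x[np.abs(z) > cutoff]

data = np.concatenate([np.zeros(99) + 10.0, [100.0]])  # 99 tens, one wild point
print(flag_outliers(data))  # only the wild point is flagged
```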
30
Q

What happens when subjects are run after analyses?

A

Running subjects after analyses tends to increase variability, and therefore decreases the probability of finding significance. AND if N is really large, you tend to get statistical significance no matter what, even if there is no practical significance.

31
Q

Why does running participants after analyses, ending with a very large N, yield statistical significance almost no matter what?

A

Because as N increases, the standard error of the mean (σȲ = σ/√N) decreases.
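The shrinking standard error can be made concrete with the formula from card 6; a Python sketch with a made-up σ:

```python
# Sketch: the standard error of the mean, sigma / sqrt(N), shrinks as N grows.
import math

def standard_error(sigma, n):
    return sigma / math.sqrt(n)

for n in (25, 100, 400):
    print(n, standard_error(10.0, n))  # 2.0, then 1.0, then 0.5
```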