[L9] Analysis of Variance & [L10] The Kruskal-Wallis Test Flashcards

1
Q
  • Very flexible and general technique, and the principles
    can be applied to a wide range of statistical tests.
A

ANOVA

2
Q

ANOVA is a ___ test

A

parametric

3
Q

Has a wide range of applications.

A

ANOVA

4
Q

Many of its applications make some tricky assumptions
about the data.

A

ANOVA

5
Q

In ANOVA we measure an ___ variable (also called
a ___ variable).
* This outcome must be measured on a ___ scale.

A

outcome; dependent; continuous

6
Q

It is called dependent because it depends on one or more
__ variables

A

predictor

7
Q

___ variables can be manipulated (e.g., treatment) or
simply measured (e.g., sex).

A

Predictor

8
Q

In ANOVA, predictor variables are mostly ___,
although continuous variables can also be used in the
same framework

A

categorical

9
Q

When predictor variables are categorical, they are also
called ___.

A

FACTORS or INDEPENDENT VARIABLES.

10
Q

___ – measurement of differences

A

ANOVA

11
Q

Differences happen for two reasons: ___

A

(a) because of the
effect of predictor variables (b) because of other reasons

12
Q

In ANOVA, we want to know two things:
___

A
  1. How much of the variance (difference) between the
     two groups is due to the predictor variable.
  2. Whether this proportion of variance is statistically
     significant, that is, whether it is larger than we would
     expect by chance if the null hypothesis were true.
13
Q

We can divide (statisticians sometimes say partition)
variance into three different types:
___

A
  1. The Total Variance
  2. Variance due to treatment (differences between groups)
  3. Variance due to error (differences within groups)
14
Q

In ANOVA, the variance is conceptualized as sums of
___

A

squared deviations from the mean

15
Q

In ANOVA, the variance is conceptualized as sums of
squared deviations from the mean.
* It is usually shortened to ___ and denoted by ___.

A

sum of squares; SS.

16
Q

The 3 Sum of Squares

A
  1. Total Sum of Squares, SS total
  2. Between-groups Sum of Squares, SS between
  3. Error Sum of Squares, SS within
17
Q

___ – this is the variance
that represents the difference between the groups, and this
is called ___. Sometimes it refers to the between-groups
sum of squares for one predictor, in which case it
is called SS predictor. Sometimes it is called ___.

A

Between-groups Sum of Squares; SSbetween; SStreatment

18
Q

The ___-groups variance is the variance that we are
actually interested in.

A

between

19
Q

We are asking whether the difference between the groups
(or the effect of the predictor) is big enough that we could
say it is ___

A

not due to chance

20
Q

___ – also called the within-groups sum
of squares.

A

Error Sum of Squares

21
Q

It’s within the groups, because different people, who
have had the same treatment, have different scores.

A

Error Sum of Squares

22
Q

They have different scores because of error. So this is
called either ___

A

SSwithin, or SSerror.

23
Q

We need to calculate the three kinds of Sum of Squares,
___

A

TOTAL, WITHIN GROUPS, and BETWEEN
GROUPS.
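
Note: the three sums of squares can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scores and group labels are made-up data, not taken from the lecture. With these numbers, SS total equals SS between plus SS within.

```python
from statistics import mean

# Hypothetical scores for three treatment groups (illustration data only).
groups = {
    "A": [4.0, 5.0, 6.0],
    "B": [6.0, 7.0, 8.0],
    "C": [8.0, 9.0, 10.0],
}

all_scores = [x for scores in groups.values() for x in scores]
grand_mean = mean(all_scores)

# SS total: squared deviation of every score from the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

# SS between: squared deviation of each group mean from the grand mean,
# weighted by the number of scores in that group.
ss_between = sum(
    len(scores) * (mean(scores) - grand_mean) ** 2
    for scores in groups.values()
)

# SS within (error): squared deviation of each score from its own group mean.
ss_within = sum(
    (x - mean(scores)) ** 2
    for scores in groups.values()
    for x in scores
)

print(ss_total, ss_between, ss_within)
```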

24
Q

___ – the sum of squared differences between the mean
and each score.

A

SStotal

25
Q

___
* To know how large the effect of the treatment has been

A

Calculating the Effect Size:

26
Q

The same as asking what
___ the treatment effect
has been responsible for.

A

proportion of the Total
Variance (or Total Sum of Squares)

27
Q

Effect Size goes under two different names: these are ___.

A

R-squared or eta-squared
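
Note: as a rough sketch, the effect size is the between-groups sum of squares taken as a proportion of the total sum of squares. The function name `eta_squared` and the input values below are hypothetical, chosen only for illustration.

```python
def eta_squared(ss_between: float, ss_total: float) -> float:
    """Proportion of the total sum of squares attributable to the predictor."""
    if ss_total <= 0:
        raise ValueError("ss_total must be positive")
    return ss_between / ss_total


# Hypothetical sums of squares (illustration values only).
effect_size = eta_squared(ss_between=24.0, ss_total=30.0)
print(effect_size)  # 0.8, i.e. 80% of the total variance
```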

28
Q

Mean Squares. Often written as MS.
* These are ___

A

MSbetween, MSwithin, MStotal

29
Q

Three sets of degrees of freedom

A
  • df total, df between, and df within
30
Q

Finally we calculate the relative size of the two values,
by dividing MS between by MS within.
* This gives us the statistic for ANOVA, which is called F,
or sometimes the ___.

A

F-ratio.
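
Note: a minimal sketch of the calculation, assuming the usual one-way degrees of freedom (df between = number of groups minus 1, df within = total sample size minus number of groups). The function name `f_ratio` and the input values are hypothetical.

```python
def f_ratio(ss_between, ss_within, n_groups, n_total):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    df_between = n_groups - 1
    df_within = n_total - n_groups
    ms_between = ss_between / df_between  # mean square between groups
    ms_within = ss_within / df_within     # mean square within groups (error)
    return ms_between / ms_within, df_between, df_within


# Hypothetical sums of squares for 3 groups of 3 scores each.
f_value, df_b, df_w = f_ratio(ss_between=24.0, ss_within=6.0,
                              n_groups=3, n_total=9)
print(f_value, df_b, df_w)  # 12.0 2 6
```

The two degrees of freedom returned here are the ones needed to look up the probability value associated with F.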

31
Q

To find the probability value associated with F we need
to have two sets of degrees of freedom, the ___

A

between and
within.

32
Q

___ are exactly the same test.
* It is just a different way of thinking about the result (when we have two groups).

A

ANOVA and t-test

33
Q

In fact, if we take the ___ and square it, we get the
value of F.

A

value of t

34
Q

This is a general rule when there are 2 groups:

___

A

F = t-squared.
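
Note: the rule can be checked numerically. The sketch below computes an independent-groups t (with pooled variance) and a one-way ANOVA F by hand for two made-up groups; the data are hypothetical and only illustrate the identity.

```python
from math import sqrt
from statistics import mean

# Hypothetical scores for two groups (illustration data only).
group1 = [4.0, 5.0, 6.0]
group2 = [6.0, 7.0, 8.0]
n1, n2 = len(group1), len(group2)
m1, m2 = mean(group1), mean(group2)

# Within-group sums of squares.
ss1 = sum((x - m1) ** 2 for x in group1)
ss2 = sum((x - m2) ** 2 for x in group2)

# Independent-groups t-test using the pooled variance.
pooled_var = (ss1 + ss2) / (n1 + n2 - 2)
t = (m1 - m2) / sqrt(pooled_var * (1 / n1 + 1 / n2))

# One-way ANOVA on the same two groups.
grand_mean = mean(group1 + group2)
ss_between = n1 * (m1 - grand_mean) ** 2 + n2 * (m2 - grand_mean) ** 2
ms_between = ss_between / (2 - 1)           # df between = groups - 1
ms_within = (ss1 + ss2) / (n1 + n2 - 2)     # df within = N - groups
f = ms_between / ms_within

print(t ** 2, f)  # the two values agree up to rounding
```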

35
Q

Question: If we covered t-tests, why are we doing it
again?

A
  • t-test – restricted to comparing 2 groups.
  • ANOVA extends in a number of useful directions.
36
Q

___ extends in a number of useful directions.
* Can be used to compare 3 groups or more, to calculate
the p-value associated with the regression line, and in a
wide range of ___ situations.

A

ANOVA, other

37
Q

When there are 2 groups, ANOVA is equivalent to a
___ and it therefore makes the same assumptions as
the t-test.

A

t-test,

38
Q

ANOVA makes the same assumptions as the t-test, and it makes
these assumptions regardless of the number of ___ that are
being compared.

A

groups

39
Q

Assumptions in ANOVA

A
  1. Normal distribution within each group
  2. Homogeneity of Variance
40
Q

We do not assume that the outcome variable is normally
distributed.
* What we do assume is that __ each group are
normally distributed.

A

data within; Normal distribution within each group

41
Q

___
* As with the t-test, we assume that the standard deviation
within each group is approximately equal.

A

Homogeneity of Variance

42
Q

The variance is the square of the ___.

A

SD

43
Q
  • However, as with the t-test, we don’t need to worry
    about this assumption if we have approximately equal
    ___ in each group.

A

numbers of people

44
Q

ANOVA comparing Three Groups

A
  • Formulae are all the same.
45
Q

___ – the most elementary analysis of
variance

A

One-way ANOVA

46
Q

One-way ANOVA is also called the ___

A

simple-randomized groups design,
independent groups design, or single-factor
experiment.

47
Q

___ is being investigated; there
are two or more levels or conditions of the IV, and
subjects are randomly assigned to each condition.

A

Only one IV (one factor); one-way ANOVA

48
Q

ANOVA is not limited to ___ experiments.

A

single factor

49
Q

The effect of many different __ may be investigated
at the same time in one experiment

A

anova; factors

50
Q

___ – one in which the effects of two
or more factors or IVs are assessed in one experiment

A

Factorial experiment

51
Q

Conditions or treatments used are combinations of the
__

A

levels of factors.

52
Q

___ – more complicated; however, we get a lot more information.

A

Two-way ANOVA

53
Q

Two-way ANOVA allows us, in one experiment, to evaluate the ___ of two
IVs and the ___ between them.

A

effect; interaction

54
Q

___ – the levels of each factor were
systematically chosen by the experimenter rather than
being randomly chosen

A

Fixed effects design

55
Q

We want to determine whether factor A has a
significant effect, disregarding the effect of factor B.
This is called the __

A

main effect of factor A.

56
Q

We want to determine whether factor B has a
significant effect, without considering the effect of factor
A. This is called the ___

A

main effect of factor B.

57
Q

Finally, we want to determine whether there is an
interaction between factors A and B. This is called the
__

A

interaction effect of factors A and B.

58
Q

Three analyses in fixed effects design:

A
  1. main effect of factor A.
  2. main effect of factor B.
  3. interaction effect of factors A and B.
59
Q

In analyzing data from a two-way ANOVA, we determine
four variance estimates:

A
  1. MS within cells
  2. MS rows
  3. MS columns
  4. MS interaction
60
Q

The estimate ___ is the within cells variance
estimate and corresponds to the within groups variance
estimate used in the one-way ANOVA

A

MS within cells

61
Q

It becomes the ___ against which the other
estimates, MS rows, MS columns, and MS interaction,
are compared.

A

standard

62
Q

The other estimates are sensitive to the ___

A

effects of the IVs.

63
Q

The estimate MS rows is called the ___

A

row variance
estimate.

64
Q

The row variance
estimate is based on the variability of the row means and,
hence, is sensitive to the ___

A

effects of variable A.

65
Q

The estimate MS columns is called the _

A

column variance
estimate.

66
Q

The column variance
estimate is based on the variability of the column means and,
hence, is sensitive to the ___

A

effects of variable B.

67
Q

The estimate MS interaction is the __

A

interaction variance
estimate

68
Q

The interaction variance
estimate is based on the variability of the cell means and,
hence, is sensitive to the ___

A

interaction effects of variables
A and B.

69
Q

If variable A has no effect, MS rows is an
___ of ___.

A

independent
estimate; σ-squared.

70
Q

Finally, if there is no interaction between variables A and
B, MS interaction is also an _

A

independent estimate of σ-squared.

71
Q

Thus, the estimates MS rows, MS columns, and MS
interaction are analogous to the ___
of the one-way ANOVA design.

A

between-groups variance estimate

71
Q

Each F (or F obtained) value is evaluated against
___ (critical value), as in the one-way analysis.

A

F crit

72
Q

In a two-way ANOVA, we essentially have two one-way
experiments, plus we are able to evaluate the interaction
between the ___

A

two independent variables.

73
Q

In a 2-way ANOVA, we partition the total sum of
squares (SS total), into four components:

A
  1. the within-cells sum of squares,
  2. the row sum of squares,
  3. the column sum of squares,
  4. and the interaction sum of
    squares.
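
Note: the four-way partition can be sketched as follows. This is a minimal illustration that assumes a balanced design (the same number of scores in every cell); the 2 x 2 layout and the scores are made-up data, and the variable names are hypothetical.

```python
from statistics import mean

# cells[i][j] holds the scores for row level i and column level j.
# Hypothetical balanced 2 x 2 design with 2 scores per cell.
cells = [
    [[1.0, 3.0], [5.0, 7.0]],
    [[2.0, 4.0], [10.0, 12.0]],
]
n_rows, n_cols = len(cells), len(cells[0])
n_per_cell = len(cells[0][0])

all_scores = [x for row in cells for cell in row for x in cell]
grand = mean(all_scores)

row_means = [mean([x for cell in row for x in cell]) for row in cells]
col_means = [mean([x for row in cells for x in row[j]]) for j in range(n_cols)]
cell_means = [[mean(cell) for cell in row] for row in cells]

ss_total = sum((x - grand) ** 2 for x in all_scores)

# Within cells: each score against its own cell mean.
ss_within = sum(
    (x - cell_means[i][j]) ** 2
    for i in range(n_rows) for j in range(n_cols) for x in cells[i][j]
)

# Rows and columns: marginal means against the grand mean.
ss_rows = n_cols * n_per_cell * sum((m - grand) ** 2 for m in row_means)
ss_cols = n_rows * n_per_cell * sum((m - grand) ** 2 for m in col_means)

# Interaction: what is left of each cell mean after removing row and column effects.
ss_interaction = n_per_cell * sum(
    (cell_means[i][j] - row_means[i] - col_means[j] + grand) ** 2
    for i in range(n_rows) for j in range(n_cols)
)

print(ss_total, ss_within + ss_rows + ss_cols + ss_interaction)
```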
74
Q

When these sums of squares (SS) are divided by the
appropriate degrees of freedom, they form four variance
estimates: ___

A

MS within-cells, MS rows, MS columns, and
MS interaction

75
Q

Only difference is that with the row sum of squares we
use the __ means, whereas the between-groups sum of
squares used the __ means.

A

row; group

76
Q

In ANOVA we aim to find out if there are differences
between the groups, but not ___

A

what those differences are.

77
Q

Usually, we test the hypothesis that: μ1 = μ2 = μ3
* In the case of two groups, this is not a problem, because
if the mean of group 1 is different from the mean of
group 2, that can only happen in __

A

one way.

78
Q

However, when we reject a null hypothesis when there
are three or more groups, we aren’t really saying
enough.
* We are just saying that group 1, group 2, and group 3
(and so on, up to group k) are ___

A

not the same.

79
Q

Unlike the two-group solution, this can happen in __

A

lots of
ways.

80
Q

___ – used to answer the question of where the
differences come from.

A

Post hoc tests

81
Q

“Post hoc” is Latin, and means “___”.

A

“after this”

82
Q

Post hoc tests are tests done after ___.
* They are based on ___.

A

ANOVA; t-tests.

83
Q

It is possible to just do t-tests to compare groups, but this
would cause a problem called ___.

A

alpha inflation

84
Q

Alpha is the Type ___ error rate.

A

I

85
Q

A ___ is where we reject a null hypothesis that is
true.

A

Type I error

86
Q

The probability value that we choose as a cut-off
(usually 0.05) is the __.

A

Type I error rate

87
Q

That is, it is the probability of getting a __ result
in our test, if the population effect is actually zero.

A

significant

88
Q

When 3 tests are done, a cut-off of 0.05 is used, and most
think that the probability of a Type I error is __.

A

still 0.05

89
Q

We call 0.05 our ___ error rate, because that
is the Type I error rate we have named.

A

nominal Type I

90
Q

The problem is that the Type I error rate has __, and it is no longer our true type I error rate.

A

risen above
0.05

91
Q

When we do multiple t-tests, following an ANOVA, we
are at risk of ___.

A

capitalizing on chance

92
Q

The probability that one of those tests will be statistically
significant is not 0.05, but is actually closer to __

A

three times 0.05, or 0.15 (about 1 in 7).
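
Note: a small sketch of where "about 1 in 7" comes from. It assumes the three tests are independent, which is a simplification (pairwise comparisons on the same groups are not fully independent); the function name is hypothetical.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one Type I error across n independent tests."""
    return 1 - (1 - alpha) ** n_tests


actual_rate = familywise_error_rate(alpha=0.05, n_tests=3)
upper_bound = 3 * 0.05  # the simple "three times 0.05" figure

print(round(actual_rate, 4))  # 0.1426, roughly 1 in 7
```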

93
Q

So our actual Type I error rate is much ___ than our
nominal rate.

A

higher

94
Q

We need to perform some sort of ___,
and we can’t use our plain ordinary t-test.

A

modified test

95
Q

___
* Assumption of homogeneity of variance

A

Bonferroni Correction

96
Q

Bonferroni Correction

What is done here is to calculate the pooled standard
error, and then calculate three t-tests using this
__

A

pooled
standard error.

97
Q

However, there are 2 reasons, why we are not going to do this. (Bonferroni Correction)

A

1st: it is tricky
2nd: It is so unintuitive.

98
Q

___
* When there are two groups, we calculate the standard
error, and then calculate the confidence interval, based
on multiplying the SE by the critical value for t at the 0.05
level.

A

Bonferroni Corrected Confidence Intervals

99
Q

Bonferroni Corrected Confidence Intervals: We carry out the same procedure, except we are no
longer using the
__

A

95% level.

100
Q

We have to adjust alpha by dividing by ___, to give 0.0167.

A

3

101
Q

We then calculate the critical value for ___, using the new
value for alpha.

A

t

102
Q

To calculate the confidence intervals, we need to know
the _

A

critical value of t.

103
Q

Since we are now using the value of alpha corrected for
the number of tests (say 3), we are now going to be
doing ___, so we need to use 0.05/3 = 0.0167.

A

three tests

104
Q

Before we can determine the critical value, we need to
know the ___.
* The df are calculated in the same way as for the t-test. That
is, df = N - 2, where N is the total sample size for the two
groups we are comparing.

A

degrees of freedom (df).

105
Q
  • Calculation of statistical significance is also
    straightforward once we have the standard errors of the
    differences.
A

Bonferroni Corrected Statistical Significance

106
Q

The value for t is equal to the __

A

difference divided by the
standard error of difference.

107
Q

Bonferroni Correction
__ – to find probability value.

A

Computer

108
Q

Bonferroni Correction
Two advantages:

A
  1. It controls our Type I error rates,
    which means that we don’t go making spurious
    statements.
  2. It is easy to understand.
109
Q

Whenever we do ___, we can Bonferroni
correct by multiplying the probability value by the
number of tests, and treating this as the probability
value.

A

multiple tests

110
Q

Or equivalently, dividing our cut-off by the ___, and rejecting only null hypotheses that have
significance values lower than that cut-off.

A

number of
tests
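
Note: the two equivalent routes (multiply each probability value by the number of tests, or divide the cut-off by the number of tests) can be sketched like this. The p-values are made-up, the function name is hypothetical, and capping adjusted values at 1.0 is a common convention that is assumed here rather than stated in the cards.

```python
def bonferroni(p_values, alpha=0.05):
    """Apply the Bonferroni correction in both equivalent forms."""
    m = len(p_values)
    # Route 1: inflate each p-value, then compare against the usual cut-off.
    adjusted = [min(1.0, p * m) for p in p_values]
    reject_by_adjusted_p = [p_adj < alpha for p_adj in adjusted]
    # Route 2: shrink the cut-off, then compare the raw p-values against it.
    cutoff = alpha / m
    reject_by_cutoff = [p < cutoff for p in p_values]
    return adjusted, cutoff, reject_by_adjusted_p, reject_by_cutoff


# Hypothetical p-values from three pairwise comparisons.
raw_p = [0.010, 0.030, 0.200]
adjusted, cutoff, by_p, by_cutoff = bonferroni(raw_p)

print(by_p)       # which null hypotheses are rejected via adjusted p-values
print(by_cutoff)  # the same decisions via the corrected cut-off
```

Only the first comparison survives the correction in this example, even though the second would have been significant at the uncorrected 0.05 level.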

111
Q

Problem: Bonferroni Correction

A

it is a very unwieldy and very blunt tool.
* Not that precise.
* The p-values required for statistical significance rapidly
become very small.

112
Q
  • Non-parametric test used with independent groups design.
A

The Kruskal-Wallis Test

113
Q

Substitute for one-Way ANOVA if assumptions are
violated.

A

The Kruskal-Wallis Test

114
Q

The Kruskal-Wallis Test does not assume population ___

A

normality or homogeneity of
variance.

115
Q

The Kruskal-Wallis Test: Requires only __ scaling of
__ variable

A

ordinal; dependent

116
Q

Kruskal-Wallis Test: The statistic we compute is ___.

A

H

117
Q

Kruskal-Wallis Test
Step 1:

A

All of the scores are grouped together and rank-ordered, assigning the rank

118
Q

Kruskal-Wallis Test
Step 2:

A

When this is done, the ranks for each condition or sample are
summed, and the statistic is evaluated.
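
Note: the two steps can be sketched in Python. This assumes the standard form of the statistic, H = 12 / (N(N + 1)) * sum(R_i^2 / n_i) - 3(N + 1), with no correction for ties; tied scores simply receive the average of their ranks. The data and function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 (lowest); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of scores tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(samples):
    """Step 1: rank the pooled scores. Step 2: sum ranks per sample, form H."""
    pooled = [x for sample in samples for x in sample]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    weighted = 0.0
    start = 0
    for sample in samples:
        rank_sum = sum(ranks[start:start + len(sample)])
        weighted += rank_sum ** 2 / len(sample)
        start += len(sample)
    return 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


# Hypothetical scores, five per sample.
samples = [
    [12, 15, 11, 18, 14],
    [22, 25, 20, 19, 24],
    [30, 28, 33, 27, 35],
]
h = kruskal_wallis_h(samples)
print(h)
```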

119
Q

To use the Kruskal-Wallis test, the data must be of at least
__ scaling.

A

ordinal

120
Q

There must be at least ___ scores in each sample to
use the probabilities given in the table for chi-square.

A

five