Chpt 13 Flashcards

1
Q

What does ANOVA stand for

A

Analysis of Variance

2
Q

What is the F-ratio

A

the ratio of 2 variance estimates: the variance between the samples divided by the variance within the samples

3
Q

What does ANOVA allow us to do?

A

allows us to compare multiple pops and even subgroups of these pops
- how two groups interact with each other quantitatively

4
Q

What question does ANOVA help us answer

A

do all 3 means come from a common population
- we are not asking if they are exactly equal. we are asking if each mean likely came from the larger overall population

5
Q

What is the Null hypothesis for ANOVA

A

H0: μ1 = μ2 = μ3

6
Q

What is the problem with using pairwise comparison for 3 pop means

A

the Type I error compounds with each t-test
3 tests at 95% confidence = (.95)(.95)(.95) = .857
so the overall α (the level of significance, not the critical value) would become 1 - .857 = .143

Type I error rate went from 5% (0.05) to 14.3%
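
The arithmetic on this card can be checked with a short sketch. It assumes the pairwise tests are independent; the function name `experimentwise_error` is made up for illustration.

```python
def experimentwise_error(alpha: float, num_tests: int) -> float:
    """Probability of at least one Type I error across independent tests."""
    # (1 - alpha) is the chance a single test avoids a Type I error;
    # raising it to num_tests gives the chance that every test avoids one.
    return 1 - (1 - alpha) ** num_tests


rate = experimentwise_error(0.05, 3)
print(round(rate, 3))  # 0.143
```

With more groups the number of pairwise tests grows, so the same function shows the rate climbing further.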

7
Q

what is partitioning

A

separating total variance into its component parts

- we do this by using ANOVA

8
Q

What is the variability between the means

A

how far the sample means are from the overall mean

9
Q

if the variability between the means (distance from overall mean) in the numerator is relatively Large compared to the variance within the samples the ratio will be

A

much larger than 1

  • the samples most likely do NOT come from a common pop
  • reject H0 that the means are equal
10
Q

What is the variability within the samples called

A

internal spread (the denominator)

11
Q

If the F ratio is similar / similar (numerator and denominator about equal, so F is close to 1) what does this tell us

A
  • Fail to reject H0

- means are fairly close to overall mean and/or distributions overlap a bit

12
Q

If the F ratio is small / large (F well below 1) what does this tell us

A
  • Fail to reject H0

- the means are very close to overall mean and/or distributions melt together

13
Q

What is the formula for F ratio

A

F = variance b/w / variance w/in (or among / around)

14
Q

variance b/w + Variance w/in (error variance) =

A

Total variance

15
Q

Factor definition

A

independent variable (e.g. assembly method)

16
Q

What are the required assumptions for ANOVA

A
  1. the response variable is normally distributed for each pop
  2. observations must be independent
  3. the variance of the response variable (σ^2) is the same for all pops
17
Q

What are the steps to ANOVA

A
  1. Calculate the sample mean for each pop
  2. Calculate the overall mean for all pops (add up all means / # of means when sample sizes are equal; otherwise sum of all observations / total # of observations)
  3. Estimate the variance of the sample means: sum of (xbar j - overall mean)squared / (k - 1), where k = # of pops
  4. Compute the sum of squares b/w treatments
  5. Compute the mean square b/w treatments
  6. Calculate the sum of squares due to error
  7. Calculate the mean square due to error
  8. Set up the ANOVA table
  9. Calculate the F-ratio and p-value
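
A minimal sketch of these steps, using only the Python standard library. The data are made-up numbers, the name `one_way_anova` is illustrative, and the p-value is left out because it needs an F table or a statistics package.

```python
from statistics import mean, variance


def one_way_anova(samples):
    """Return the main entries of a one-way ANOVA table.

    samples is a list of lists, one inner list per pop (treatment).
    """
    k = len(samples)                                  # number of pops
    n_total = sum(len(s) for s in samples)            # nT
    overall_mean = sum(sum(s) for s in samples) / n_total

    # Between treatments: n_j * (sample mean - overall mean)^2, summed.
    sstr = sum(len(s) * (mean(s) - overall_mean) ** 2 for s in samples)
    # Within treatments: (n_j - 1) * sample variance, summed.
    sse = sum((len(s) - 1) * variance(s) for s in samples)

    mstr = sstr / (k - 1)
    mse = sse / (n_total - k)
    return {"SSTR": sstr, "SSE": sse, "MSTR": mstr, "MSE": mse, "F": mstr / mse}


# Made-up data: three pops with five... here three observations each.
table = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
for name, value in table.items():
    print(name, round(value, 3))
```

For these made-up numbers SSTR is 26, SSE is 6, and F works out to 13 (up to floating-point rounding).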
18
Q

What does SSTR stand for

A

sum of squares b/w treatments

19
Q

what is MSTR

A

mean square b/w treatments

20
Q

what is SSE

A

sum of squares due to error

21
Q

What is MSE

A

mean square due to error

22
Q

what is the formula for SSTR

A

sum of (# of observations in sample j)(sample mean j - overall mean)squared (do for each pop, then add up)

23
Q

What is the formula for MSTR

A

SSTR / (k - 1)

24
Q

What is the formula for SSE

A

(n1 - 1)(sample variance of pop 1) + (n2 - 1)(sample variance of pop 2) + (n3 - 1)(sample variance of pop 3), where nj = # of observations in sample j

25
What is the formula for MSE
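SSE / (nT - k)
26
What is the F-ratio formula
F = MSTR / MSE
27
what is k - 1
# of pops - 1 = between-treatments degrees of freedom (ex. 3 pops - 1 = 2)
28
What is nT - k
total # of observations across all pops minus the # of pops = within (error) degrees of freedom (ex. if each of 3 pops contains 5 observations, then nT = 5 x 3 = 15 and k = 3, so df = 15 - 3 = 12)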
29
What is Fisher's LSD
remember ANOVA tells us if at least 2 of the groups are different from each other - Fisher's LSD tests 2 specific groups against each other
30
what does LSD stand for
Least Significant Difference
31
What is the formula for Fisher's LSD
LSD = t α/2 x square root of [MSE (1/ni + 1/nj)]
32
what is t α/2
critical value from the t distribution using the within (error) degrees of freedom and α / 2
33
What do you compare LSD to
|xbar i - xbar j| - reject H0 if |xbar i - xbar j| is greater than or equal to LSD - do this for each pair of groups
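
A short sketch of the LSD comparison. The critical value is passed in by hand because the standard library has no t distribution; 2.447 is the table value for α/2 = .025 with 6 degrees of freedom. The MSE, sample sizes, and sample means are made-up numbers, and `fisher_lsd` is an illustrative name.

```python
from math import sqrt


def fisher_lsd(t_crit, mse, n_i, n_j):
    """Least significant difference for one pair of samples."""
    return t_crit * sqrt(mse * (1 / n_i + 1 / n_j))


# Made-up inputs: MSE = 1.0 with 6 error df, three observations per sample.
lsd = fisher_lsd(t_crit=2.447, mse=1.0, n_i=3, n_j=3)

sample_means = {"pop1": 2.0, "pop2": 3.0, "pop3": 6.0}
pairs = [("pop1", "pop2"), ("pop1", "pop3"), ("pop2", "pop3")]

# Reject H0 for a pair when the absolute difference in means reaches LSD.
significant = {
    pair: abs(sample_means[pair[0]] - sample_means[pair[1]]) >= lsd
    for pair in pairs
}
print(round(lsd, 3), significant)
```

Because the sample sizes are equal here, one LSD value serves every pair; with unequal sizes the function would be called once per pair.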
34
LSD is used to determine
where the differences occur
35
what is the null hypothesis for LSD
H0: μi = μj
36
what is the test statistic for LSD
t = (xbar i - xbar j) / square root of [MSE (1/ni + 1/nj)]
37
What is the rejection rule for LSD - p-value approach
reject H0 if p-value is less than or equal to α (the level of significance)
38
what is the rejection rule for LSD - critical value approach
reject H0 if t is less than or equal to -t α/2 or t is greater than or equal to t α/2
40
what are the degrees of freedom for LSD and the t-distribution
t α/2 is based on a t-distribution with nT - k degrees of freedom, where nT is the total sample size and k is the # of pops
41
LSD and Confidence intervals - if the confidence interval includes the value 0
we cannot reject Ho, that the pop means are equal
42
If the LSD confidence interval does not include the value 0
we can conclude there is a difference in pop means | - reject H0
43
what is a Comparisonwise Type I Error rate
the level of significance associated with a single pairwise comparison: α = 1 - .95 = 0.05
44
What is an experimentwise Type I error rate
Prob we will make a Type I error on at least one of the tests | for 3 tests: 1 - (.95)(.95)(.95) = .143 - this gets larger the more groups you have
45
What is the experimentwise Type I error rate denoted as
αEW
47
How do we control the overall experimentwise error rate
use Bonferroni Adjustment
48
What is the Bonferroni Adjustment
we use smaller comparisonwise error rate for each test
49
What is the formula for Bonferroni adj
α = αEW / C, where C is the # of pairwise comparisons being tested | ex. 3 pops give 3 pairwise comparisons, so α = 0.05 / 3 = 0.017
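
A sketch of the adjustment, assuming every pair of pops is compared so that C = k(k - 1)/2. The function name `bonferroni_alpha` is made up for illustration.

```python
from math import comb


def bonferroni_alpha(alpha_ew, k):
    """Comparisonwise alpha when all pairs of k pops are compared."""
    c = comb(k, 2)            # number of pairwise comparisons
    return alpha_ew / c, c


alpha_3, c_3 = bonferroni_alpha(0.05, 3)
alpha_4, c_4 = bonferroni_alpha(0.05, 4)
print(c_3, round(alpha_3, 3))  # 3 0.017
print(c_4, round(alpha_4, 4))  # 6 0.0083
```

The second call shows how quickly the per-test α shrinks once a fourth pop is added.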
50
What are some other procedures we could use to control the overall experimentwise error rate
1. Tukey's procedure | 2. Duncan's multiple range test
51
When is randomized block design used
useful when the experimental units are heterogeneous (a completely randomized design is useful when they are homogeneous)
52
What do we use if experimental units are heterogeneous
Blocking is often used to form homogeneous groups
53
What problem with a completely randomized design does the randomized block design address?
a problem can arise whenever differences due to extraneous factors (ones not considered in the experiment) cause the MSE term to become too LARGE - this can cause the F value to be small, signaling no difference among treatment means when in fact a difference exists
54
How do you compute the F-ratio for randomized block design
F = MSTR/MSE
55
In our example what would the workstation be
the factor of interest
56
in our example of randomized block design what would the controllers be
the blocks
57
what would the treatments be in a randomized block design
the pops | - 3 treatments (or pops) associated with workstation factor correspond to the 3 workstation alternatives
59
What is the randomized aspect
the random order in which the treatments (systems) are assigned to controllers - 6 controllers were selected at random and assigned to operate each of the systems - a follow-up interview and a medical exam of each controller in the study provided a measure of stress for each controller on each system
60
What does SST equal for randomized block design
SST = SSTR + SSBL + SSE
61
What does k represent in randomized block design
the # of treatments
62
What does b represent in randomized block design
the # of blocks
63
What does nT represent in randomized block design
total sample size (nT = kb)
64
What are the steps in randomized block design
1. Compute SST (total sum of squares) | 2. Compute SSTR (sum of squares due to treatments) | 3. Compute SSBL (sum of squares due to blocks) | 4. Compute SSE (sum of squares due to error)
65
What is the formula in randomized block design for SSE
SSE = SST - SSTR - SSBL
66
What is the formula in randomized block design for SST
sum over all observations of (observation - overall mean) squared
67
What is the formula in randomized block design for SSTR
(# of blocks) x sum of (treatment mean - overall mean) squared
68
What is the formula for SSBL
(# of treatments) x sum of (block mean - overall mean) squared
69
What does SSBL mean
sum of squares due to blocks
70
What is SST
Total sum of squares
71
what is the degrees of freedom for SSTR
k- 1 ( # of pops - 1)
72
what is the degrees of freedom for SSBL in randomized block design
b-1 (# of blocks -1)
73
What is the degrees of freedom for SSE in randomized block design
(k-1)(b-1)
74
What is the degrees of freedom for SST in randomized block design
nT-1
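
The randomized block computations on the preceding cards can be sketched as follows. The stress scores are illustrative numbers, not taken from these cards; rows are blocks (controllers) and columns are treatments (systems). The function name `randomized_block_anova` is made up.

```python
def randomized_block_anova(data):
    """data[i][j] is the observation for block i under treatment j."""
    b = len(data)                 # number of blocks
    k = len(data[0])              # number of treatments
    n_total = k * b               # nT = kb
    overall = sum(sum(row) for row in data) / n_total

    treatment_means = [sum(row[j] for row in data) / b for j in range(k)]
    block_means = [sum(row) / k for row in data]

    sst = sum((x - overall) ** 2 for row in data for x in row)
    sstr = b * sum((m - overall) ** 2 for m in treatment_means)
    ssbl = k * sum((m - overall) ** 2 for m in block_means)
    sse = sst - sstr - ssbl       # whatever is left over is error

    mstr = sstr / (k - 1)
    mse = sse / ((k - 1) * (b - 1))
    return {"SST": sst, "SSTR": sstr, "SSBL": ssbl, "SSE": sse,
            "MSTR": mstr, "MSE": mse, "F": mstr / mse}


# Illustrative data: 6 blocks (controllers) x 3 treatments (systems).
scores = [
    [15, 15, 18],
    [14, 14, 14],
    [10, 11, 15],
    [13, 12, 17],
    [16, 13, 16],
    [13, 13, 13],
]
result = randomized_block_anova(scores)
for name, value in result.items():
    print(name, round(value, 3))
```

For these numbers SST = 70, SSTR = 21, SSBL = 30, and SSE = 19, so the error degrees of freedom are (3 - 1)(6 - 1) = 10 and F = 10.5 / 1.9.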
75
read notes - i left some out
read notes - i left some out
76
Describe a factorial experiment
an experimental design that allows simultaneous conclusions about 2 or more factors
77
Why use Factorial
used because the experimental conditions include all possible combinations of the factors
78
Give an example of a Factorial Experiment
study involving the GMAT - scores range from 200 to 800, and higher scores imply higher aptitude - to improve GMAT scores, consider 3 prep programs (first factor, 3 treatments) - second factor: whether a student's undergrad college (Business, Engineering, Arts) affects the GMAT score (also 3 treatments)
79
How many treatment combinations are there in a factorial design with 2 factors (factor 1 - the prep program, factor 2 - college attended) that have 3 treatments each
3 x 3 = 9 treatment combinations
80
What is replications
the sample size of 2 for each treatment combination indicates we have 2 replications
81
What is the formula in Block design for SST
sum of (observation - overall mean) squared (for all observations)
82
What is the formula in block design for SSTR
# of blocks x [(treatment mean #1 - overall mean) squared + (treatment mean #2 - overall mean) squared + (treatment mean #3 - overall mean) squared]
83
ANOVA Table Definition
A table used to summarize the analysis of variance computations and results. It contains columns showing the source of variation, the sum of squares, the degrees of freedom, the mean square, and the F value(s)
84
Blocking Definition
The process of using the same or similar experimental units for all treatments. The purpose of blocking is to remove a source of variation from the error term and hence provide a more powerful test for a difference in population or treatment means.
85
Comparisonwise Type I error rate - Definition
The probability of a Type I error associated with a single pairwise comparison.
86
Completely randomized design - Definition
An experimental design in which the treatments are randomly assigned to the experimental units.
87
Experimental units - Definition
The objects of interest in the experiment
88
Experimentwise Type I error rate - Definition
The probability of making a Type I error on at least one of several pairwise comparisons
89
Factor - Definition
Another word for the independent variable of interest
90
Factorial Experiment - Definition
An experimental design that allows simultaneous conclusions about two or more factors
91
Interaction - Definition
The effect produced when the levels of one factor interact with the levels of another factor in influencing the response variable.
92
Multiple comparison procedures - Definition
Statistical procedures that can be used to conduct statistical comparisons between pairs of population means
93
Partitioning - Definition
The process of allocating the total sum of squares and degrees of freedom to the various components.
94
Randomized block design - Definition
An experimental design employing blocking.
95
Replications - Definition
The number of times each experimental condition is repeated in an experiment
96
Response variable - Definition
Another word for the dependent variable of interest.
97
Single-factor experiment - Definition
An experiment involving only one factor with k populations or treatments
98
Treatments - Definition
Different levels of a factor
99
If the sum of squares due to treatments (SSTR) is 300, the total sum of squares (SST) is 460, and 7 experimental units are used for each of the 5 levels of the factor, what are the degrees of freedom for treatments?
# of treatments k = 5 levels = 5, so df = k - 1 = 5 - 1 = 4
100
If SSTR is 300, SST is 460, and 7 experimental units are used for each of the 5 levels of the factor, what is the sum of squares due to error and what are its degrees of freedom
SSE = SST - SSTR = 460 - 300 = 160 | df = total sample size - # of treatments = (7 x 5) - 5 = 35 - 5 = 30
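
The two cards above can be carried through to the rest of the ANOVA table. This sketch reads the 300 as SSTR and the 460 as SST, which is an interpretation of the card wording; the variable names are illustrative.

```python
sstr = 300          # sum of squares due to treatments
sst = 460           # total sum of squares
k = 5               # levels of the factor (treatments)
n_per_level = 7     # experimental units per level

n_total = k * n_per_level          # nT = 35
sse = sst - sstr                   # 160

df_treatments = k - 1              # 4
df_error = n_total - k             # 30

mstr = sstr / df_treatments        # 75.0
mse = sse / df_error               # about 5.333
f_ratio = mstr / mse               # about 14.06

print(df_treatments, df_error, sse)
print(round(mstr, 3), round(mse, 3), round(f_ratio, 2))
```

The F value would then be compared with an F critical value using 4 and 30 degrees of freedom.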
101
What is ANOVA interested in
the status of the populations that generate the data sets, and not in the data sets themselves
102
What are some assumptions in ANOVA
1. assume that the data in each data set have come from a single pop | 2. assume that all the pops have the same σ^2
103
How do we interpret the variation in the data sets for ANOVA
we interpret the variation in each data set as being caused by small and random sources, collectively called error
104
What is a question regarding ANOVA's Errors
the question, then, is should we regard the variation in the values across the data sets as "error" or should they be attributed to some other source of variation that is not random (ie different pops)
105
What is used as the benchmark in ANOVA
variation within the data sets (SSE)
106
what is used to compare the benchmark to in ANOVA
variation between the data sets (SSTR) is compared
107
In ANOVA if the between variation is much larger than the within variation, we may conclude what
that the data sets have in fact been generated from different populations
108
If we assume that the pops have the same σ^2, we can pool these variations and obtain one measure of the
within variation (SSE) - this is the benchmark
109
MSE is an extension of the concept of what
pooled variance
110
in the case of two samples, the formula for MSE reduces to what
S^2p
111
In the case of two samples, we can prove that the F ratio obtained from the ANOVA table equals the
square of the t value obtained from applying the two-sample t test F=t^2
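
A small numeric check of F = t^2 for two samples, using made-up data and the pooled-variance t test. The function names `pooled_t` and `anova_f` are illustrative.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(x, y):
    """Two-sample t statistic using the pooled variance S^2p."""
    n1, n2 = len(x), len(y)
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    return (mean(x) - mean(y)) / sqrt(sp2 * (1 / n1 + 1 / n2))


def anova_f(x, y):
    """One-way ANOVA F ratio for k = 2 samples."""
    n1, n2 = len(x), len(y)
    overall = (sum(x) + sum(y)) / (n1 + n2)
    sstr = n1 * (mean(x) - overall) ** 2 + n2 * (mean(y) - overall) ** 2
    sse = (n1 - 1) * variance(x) + (n2 - 1) * variance(y)
    mstr = sstr / (2 - 1)            # k - 1 with k = 2
    mse = sse / (n1 + n2 - 2)        # nT - k; same value as S^2p
    return mstr / mse


a, b = [1, 2, 3], [3, 4, 5]          # made-up samples
t_value = pooled_t(a, b)
f_value = anova_f(a, b)
print(round(t_value ** 2, 6), round(f_value, 6))
```

Both printed numbers agree (6 for this data), and the MSE inside `anova_f` is the same quantity as the pooled variance inside `pooled_t`.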
112
The F test is a direct extension of the
t test for testing the equality of the population means of several populations, with the assumption that the populations are normally distributed with a common variance
113
The total sum of squares and the total degrees of freedom are ________. So, any variation and any degrees of freedom left over from the total variation and total degrees of freedom unaccounted for by the "between" source go to the ______ _______.
fixed | within source
114
What does the mean squares column in the ANOVA table give us
the relative effect of each source.
115
If, relatively speaking, the systematic effects are larger than the random effects, this results in a ________ F value and the hypothesis of equal means must be _______
large F value | rejected
116
What is the systematic effect
Between Source
117
What is the nonsystematic or random effects called
within source
118
What are the 3 assumptions required to use ANOVA
1. for each population, the response variable is normally distributed | 2. the variance of the response variable (σ^2) is the same for all the populations | 3. the observations must be independent
119
If the sample sizes are equal, ANOVA is not sensitive to what
to the departure of the assumption of normally distributed populations
121
If the means for the 3 pops are equal, we would expect what
the three sample means to be close together
122
the closer the 3 sample means are to one another, the
weaker the evidence we have for the conclusion that the pop means differ
123
If the variability among the sample means is "small", it supports what
HO
124
If the variability among the sample means is "Large" it supports what
Ha
125
The between-treatments estimate of σ^2 is based on the assumption that the
null hypothesis is true (H0 is true)
126
Does the variation within each of the samples have an effect on the conclusion as well
yes
127
When a simple random sample is selected from each pop, each of the sample variances provides what
an unbiased estimate of σ^2
128
Why do we call it the pooled or within-treatments estimate of σ^2
because each sample variance provides an estimate of σ^2 based only on the variation within each sample; the within-treatments estimate of σ^2 is not affected by whether the pop means are equal
129
When the sample sizes are equal, the within-treatments estimate of σ^2 can be obtained by computing what
the average of the individual sample variances
130
Between-treatments approach provides a good estimate of σ^2 only if what
the null hypothesis is true
131
If the null hypothesis is false in ANOVA, the between-treatments approach does what
overestimates σ^2
132
The within-treatments approach provides what for σ^2
a good estimate of σ^2 whether H0 or Ha is true
133
If the null hypothesis is true in ANOVA, the two estimates will be what
similar and their ratio will be close to 1
134
When do we need to use multiple comparison procedures?
whenever we are performing a series of tests and are concerned with the overall level of significance attached to the whole experiment. When there are several tests, each at some level of significance α, we still have control over the probability of a Type I error for individual tests, but we have no such control over the series of tests. Multiple comparison procedures help us out in this regard
135
What is an example of a multiple comparison procedure for ANOVA
Fisher's LSD
136
If the null hypothesis is true, MSTR and MSE provide what
two independent and unbiased estimates of σ^2
137
The between-treatments approach in ANOVA provides what
a good estimate of σ^2 ONLY if H0 is true | if the null hypothesis is true, then this estimate and the within-treatments estimate will be similar and their ratio will be close to 1
138
The within-treatments approach in ANOVA provides what
a good estimate of σ^2 regardless of whether H0 is true or not
139
ANOVA is based on the development of two independent estimates of the common what
population variance σ^2
140
What are the two independent estimates of variance in ANOVA
1. MSTR - b/w treatments (computed from SSTR) | 2. MSE - within treatments (computed from SSE)
142
By comparing MSTR and MSE what can we determine
whether the population means are equal
143
ANOVA is most used for how many pops
3 or more, but it can be used for two when testing whether the means of two pops are equal; that doesn't usually happen (use the two-sample t test instead)
144
How do you calculate the overall mean if the sample sizes are not all the same?
sum of all of the observations / the total # of observations
145
If H0 is true, MSTR provides
an unbiased estimate of σ^2
146
if the means of the k populations are not equal, MSTR is
not an unbiased estimate of σ^2
147
When does MSTR overestimate σ^2
When H0 is false (the pop means are not all equal)
148
What is MSE based on
based on the variation within each of the treatments; it is not influenced by whether the null hypothesis is true.
149
Is MSE influenced by the null hypothesis HO?
no
150
if the null hypothesis is false, the value of MSTR/MSE will be _______ because MSTR _________ σ^2.
inflated | overestimates
151
What is the test statistic in ANOVA
F = MSTR/MSE
152
What can SST be Partitioned into
Two different sums of squares: SSTR and SSE SSTR + SSE = SST
153
If sample sizes are not equal, what must you do for LSD
you must calculate a separate LSD for each pair of samples
154
When the sample sizes are equal, what can you do with LSD
you only need to calculate one LSD
155
What is fisher's LSD used for
to determine where differences occur
156
Why is LSD referred to as a protected or restricted LSD test
because it is used only if we first find a significant F value by using ANOVA
157
In a one-way ANOVA (first part of chpt 13) we focus on testing what
the effect of one independent variable
158
What might a one way ANOVA not do
may not be able to detect differences in means if the differences are caused by another factor than the independent variable we are considering
159
How can you overcome this limitation of one-way ANOVA and test the effect of the underlying factor
use the randomized block design
160
What does the randomized block design allow us to test
the effect of the independent variable as well as the block effect
161
What does a two way ANOVA allow us to do
test the effect of two independent variables and the interaction between these variables
162
What type of ANOVA do we use if the experimental units are homogeneous
completely randomized design
163
What type of ANOVA do we use if the experimental units are heterogeneous
blocking is often used to form homogeneous groups
164
What is the purpose of the block design
to control some of the extraneous sources of variation by removing such variation from the MSE term
165
What does the randomized block design tend to provide
a better estimate of the true error variance and leads to a more powerful hypothesis test in terms of the ability to detect differences among treatment means
166
Experimental studies in business often involve experimental units that are ____________; as a result, we should use _________________
highly heterogeneous | randomized block design
167
Blocking in experimental design is similar to what
Stratification in sampling
168
what does nT represent
total sample size
169
The experimental design described in block design is a ________ design. What does this mean
complete block design | the word complete indicates that each block is subject to all k treatments; that is, all controllers (blocks) were tested with all 3 systems (treatments)
170
What is an incomplete block design
experimental designs in which some but not all treatments are applied to each block - not in this text
171
what is important to note about the F tests in the block design
we have an F value to test for treatment effects but not for blocks | blocking was used to remove variation from the MSE term | could compute MSB/MSE and use that statistic to test for significance of the blocks
172
The error degrees of freedom are _______ for a randomized block design than for a completely randomized design because _______
fewer | b - 1 degrees of freedom are lost for the b blocks
173
if n is small, the potential effects due to blocks can be
masked because of the loss of error degrees of freedom; for large n, the effects are minimized
175
If we want to draw conclusions about more than one variable or factor what can we use
factorial experiment
176
what is factorial experiment
an experimental design that allows simultaneous conclusions about two or more factors
177
why do we use the term factorial in factorial experiment
because the experimental conditions include all possible combinations of the factors
178
what does interaction in factorial design mean
refers to a new effect that we can now study because we used a factorial experiment
179
If the interaction effect has a significant impact on what we are studying (e.g. GMAT scores), we can conclude what
that the effect of the type of preparation program depends on the undergrad college
180
In a two-factor experiment we compute a mean square for
Factor A: MSA = SSA / (a - 1) | Factor B: MSB = SSB / (b - 1) | Interaction: MSAB = SSAB / [(a - 1)(b - 1)] | Error: MSE = SSE / [ab(r - 1)]
181
In a two-factor experiment we calculate F for
Factor A: MSA/MSE | Factor B: MSB/MSE | Interaction: MSAB/MSE
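
A sketch of the two-factor mean squares and F ratios, starting from sums of squares that are assumed to be already computed. The sums of squares below are made-up numbers; a = 3 and b = 3 levels with r = 2 replications mirror the GMAT setup, and `two_factor_f` is an illustrative name.

```python
def two_factor_f(ssa, ssb, ssab, sse, a, b, r):
    """Mean squares and F ratios for a two-factor factorial experiment."""
    msa = ssa / (a - 1)
    msb = ssb / (b - 1)
    msab = ssab / ((a - 1) * (b - 1))
    mse = sse / (a * b * (r - 1))
    return {
        "MSA": msa, "MSB": msb, "MSAB": msab, "MSE": mse,
        "F_A": msa / mse, "F_B": msb / mse, "F_AB": msab / mse,
    }


# Made-up sums of squares for a 3 x 3 design with 2 replications.
out = two_factor_f(ssa=60, ssb=120, ssab=40, sse=90, a=3, b=3, r=2)
for name, value in out.items():
    print(name, value)
```

Each F ratio shares the same denominator (MSE), so a large error term weakens all three tests at once.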
182
define factor
the independent variable of interest
183
define treatments
different levels of a factor
184
single-factor experiment - define
an experiment involving only one factor with k populations or treatments
185
define response variable
another word for the dependent variable of interest
186
define experimental units
the objects of interest in the experiment
187
define completely randomized design
an experimental design in which the treatments are randomly assigned to the experimental units
188
define ANOVA table
a table used to summarize the analysis of variance computations and results
189
Define Partitioning
the process of allocating the total sum of squares and degrees of freedom to the various components
190
define Multiple comparison procedures
statistical procedures that can be used to conduct statistical comparisons between pairs of population means
191
define comparisonwise Type I error rate
the probability of a Type I error associated with a single pairwise comparison
192
define Experimentwise Type I error rate
the probability of making a type 1 error on at least one of several pairwise comparisons
193
define blocking
the process of using the same or similar experimental units for all treatments.
194
What is the purpose of blocking
is to remove a source of variation from the error term and hence provide a more powerful test for a difference in population or treatment means
195
Define randomized block design
an experimental design employing blocking
196
define factorial experiment
an experimental design that allows simultaneous conclusions about two or more factors
197
Define replications
the number of times each experimental condition is repeated in an experiment.
198
Define interaction
the effect produced when the levels of one factor interact with the levels of another factor in influencing the response variable
199
What added advantage does the two-way ANOVA have
allowing us to study the interaction effect between the variables
200
With the TWO way ANOVA, when interpreting the results, it is a good idea to focus on what
the interaction effect first.
201
in a two way ANOVA, if the interaction effect proves to be significant, what do you do
a further detailed analysis can be applied to this aspect
202
In a two way ANOVA, if the interaction effect proves to be insignificant, what can you do
focus can be directed on the main effects
203
What is a factor in ANOVA
the I.V.
204
The factor is also a
variable of interest
205
A treatment is
different levels of a factor