Tests Involving Several Population Means Flashcards

1
Q

Tests Involving Several Population Means
Aim: To detect differences among several population means
Hypotheses
H_0: µ_1 = µ_2 = µ_3 = µ_4 = . . . = µ_k
i.e. the means are all the same. In other words, there is no difference (zero difference) between the means
H_1: not all of µ_1, µ_2, µ_3, . . . , µ_k are equal
i.e. at least one of the equalities does not hold

Thus, we can formulate the hypotheses as:
H_0: There is no difference between the means
H_1 : At least one of the equalities does not hold

Assumptions of ANOVA

?

A

Sampling from each of the k populations must be independent and random
Each of the k populations must be normally distributed with mean µ_i (the means need not be equal) and a common variance σ^2
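
A minimal sketch of such a test in Python, assuming three hypothetical samples drawn independently from normal populations with a common variance; scipy's f_oneway is one standard implementation of the one-way ANOVA F test:

from scipy import stats

treatment_a = [23, 25, 21, 24]   # hypothetical sample from population 1
treatment_b = [30, 28, 31, 29]   # hypothetical sample from population 2
treatment_c = [22, 26, 24, 23]   # hypothetical sample from population 3

# H_0: µ_1 = µ_2 = µ_3   vs   H_1: at least one mean differs
f_stat, p_value = stats.f_oneway(treatment_a, treatment_b, treatment_c)
print(f"F = {f_stat:.3f}, p = {p_value:.4f}")
# A small p-value (e.g. below 0.05) leads to rejecting H_0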

2
Q

Sources of Variation:
The ANOVA test is based on comparing the amount of variation in each of the k treatments. If the variation from one treatment to the next is significantly large, it can be concluded that the treatments have dissimilar effects on the population. There are three types or sources of variation. They are:

A

(i) Total variation: the variation among all n observations taken together
(ii) Between-sample-means variation: the variation between the various treatments administered to the observations (populations)
(iii) Within-sample variation: the variation within any one given treatment (sample)

Note: it is by comparing these different sources of variation that ANOVA can be used to test for the equality of means of different populations. Any difference that the treatments may have will be detected by a comparison of these forms of variation.
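
A minimal numerical sketch of the three sources of variation on hypothetical samples, showing that the total variation splits into the between-sample and within-sample parts:

import numpy as np

samples = [np.array([23, 25, 21, 24]),
           np.array([30, 28, 31, 29]),
           np.array([22, 26, 24, 23])]

all_obs = np.concatenate(samples)
grand_mean = all_obs.mean()

sst = ((all_obs - grand_mean) ** 2).sum()                          # (i) total variation
ssb = sum(len(s) * (s.mean() - grand_mean) ** 2 for s in samples)  # (ii) between sample means
ssw = sum(((s - s.mean()) ** 2).sum() for s in samples)            # (iii) within samples

print(sst, ssb + ssw)   # the two values agree: total = between + within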

3
Q

The Principle behind ANOVA

A

To determine if the different treatments have different effects on their respective populations, a comparison is made between the variation between sample means and the variation within samples.

The variation between samples can be caused, in part, by differences in the treatments administered to them. On the other hand, the variation within a given sample is caused by random factors. Variation within a sample is independent of the treatments (since all observations within a sample receive the same treatment). Thus, within-sample variation is the result of random sampling error within the sample.

ANOVA measures this contrast by taking the ratio of the “variation between samples” to the “variation within samples”; this ratio is the F-ratio.

Note: When population means are different, a treatment effect is present and the deviation between samples will be large compared to the error deviation within samples. Thus, the F-value, which is the ratio of the treatment variation to the error variation, will rise.

The total variation is the sum of the variation caused by different treatments and the variation caused by the random error elements
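
A minimal sketch of the F-ratio idea, using hypothetical sums of squares and sample sizes and comparing the ratio against a critical F value:

from scipy import stats

k, n = 3, 12            # hypothetical: 3 treatments, 12 observations in total
ssb, ssw = 98.0, 22.5   # hypothetical sums of squares

msb = ssb / (k - 1)     # treatment (between-sample) variation
msw = ssw / (n - k)     # error (within-sample) variation
f_ratio = msb / msw

f_critical = stats.f.ppf(0.95, dfn=k - 1, dfd=n - k)   # critical value at the 5% level
print(f_ratio, f_critical)   # an F-ratio above the critical value signals a treatment effect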

4
Q

Once the F test is significant, it means that the means are not all the same. We may wish to find out where the differences lie, i.e. which means are actually different. This is accomplished by conducting a post hoc test using the Tukey or LSD formula
Tukey formula:

A

HSD = q_α √(MSE/a)
LSD = √((2 MSE F_α)/a)
Where
MSE = mean square error
α = level of significance
a = number of observations in each treatment group (assumed equal)
q_α = value obtained from the studentized range distribution table with k and n-k degrees of freedom
F_α = F value at the specified level of significance
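
A minimal sketch of a post hoc comparison after a significant F test, on hypothetical data; statsmodels' pairwise_tukeyhsd is one readily available implementation of the Tukey procedure:

import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

values = np.array([23, 25, 21, 24, 30, 28, 31, 29, 22, 26, 24, 23])
groups = np.array(["A"] * 4 + ["B"] * 4 + ["C"] * 4)

result = pairwise_tukeyhsd(endog=values, groups=groups, alpha=0.05)
print(result)   # lists each pair of means and whether their difference is significant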

5
Q

Note:
F_calculated = MSB/MSW
What are the formulas

A
MSB = SSB/df, where df = k-1
MSW = SSW/df, where df = n-k
Where
MSB = mean square between means
MSW = mean square within means
SSB = sum of squares between means
SSW = sum of squares within means
df = degrees of freedom

The major task involves the computation of the sum of squares (SS)

Computation of sum of squares
SSB = ∑_i (T_i.^2 / n_i) - T..^2/n for row treatments; and
SSB = ∑_j (T_.j^2 / n_j) - T..^2/n for column treatments
SSW = ∑_i ∑_j X_ij^2 - ∑_i (T_i.^2 / n_i) for row treatments; and
SSW = ∑_i ∑_j X_ij^2 - ∑_j (T_.j^2 / n_j) for column treatments
Where T_i. = total of row (treatment) i, T_.j = total of column (treatment) j, T.. = grand total of all observations, n_i (n_j) = number of observations in that row (column), and n = total number of observations

In simple terms
SSB = (each column total squared, divided by the number of observations in that column), summed over all columns,
minus (the grand total squared, divided by the total number of observations)

SSW = (every individual observation squared, summed over all observations),
minus (each column total squared divided by the number of observations in that column, summed over all columns)

ANOVA Table
Source of Variation         df     SS            MS            Test Ratio
Between Means (Treatment)   k-1    SSB or SSTR   MSB or MSTR   F = MSB/MSW
Within Means (Error)        n-k    SSW or SSE    MSW or MSE
Total                       n-1    SST

MSB = SSB/(k-1)
MSW = SSW/(n-k)

F = MSB/MSW
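
A minimal sketch of this computation on hypothetical data laid out one column per treatment, using the column totals T_.j as in the formulas above:

import numpy as np

columns = [np.array([23, 25, 21, 24]),   # treatment 1
           np.array([30, 28, 31, 29]),   # treatment 2
           np.array([22, 26, 24, 23])]   # treatment 3

n = sum(len(c) for c in columns)               # total number of observations
k = len(columns)                               # number of treatments
grand_total = sum(c.sum() for c in columns)    # T..

ssb = sum(c.sum() ** 2 / len(c) for c in columns) - grand_total ** 2 / n
ssw = sum((c ** 2).sum() for c in columns) - sum(c.sum() ** 2 / len(c) for c in columns)

msb = ssb / (k - 1)
msw = ssw / (n - k)
print("SSB =", ssb, "SSW =", ssw, "F =", msb / msw)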
