unit 2 - chapter 10 - hypothesis testing with 2 samples Flashcards
2 sample test for the mean
TS = [(xbar1 - μ1) - (xbar2 - μ2)] / s(xbar1 - xbar2)
TS = (observed - expected) / chance
changes in the formula
Group 1 and group 2
Bottom is standard error of the difference
TS = [(xbar1 - xbar2) - (μ1 - μ2)] / s(xbar1 - xbar2)
TS = (observed - expected) / chance
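A minimal sketch of the observed / expected / chance structure for two independent samples. The function name, the argument names, and the sample numbers are made up for illustration, and the "chance" term here assumes the non-pooled standard error of the difference.

```python
import math

def two_sample_ts(xbar1, xbar2, s1, s2, n1, n2, mu_diff=0.0):
    """(observed - expected) / chance for two independent samples."""
    observed = xbar1 - xbar2      # difference between the sample means
    expected = mu_diff            # mu1 - mu2 under the null (usually 0)
    # Assumption: non-pooled standard error of the difference
    chance = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return (observed - expected) / chance

# Hypothetical numbers, chosen only to keep the arithmetic simple
ts = two_sample_ts(xbar1=10.0, xbar2=8.0, s1=3.0, s2=4.0, n1=9, n2=16)
print(round(ts, 4))
```

With these inputs each variance term is 1, so the chance term is sqrt(2) and the TS is 2 / sqrt(2).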
4 facets of the null
The status quo
Everything is unrelated
No difference between groups
Everything arises from chance
two standard error of the difference terms
Non-pooled error term (unequal variances): SED = sqrt(s1^2/n1 + s2^2/n2)
Pooled error term (equal variances): SED = sqrt(sp^2 (1/n1 + 1/n2)), where sp^2 = [(n1 - 1)s1^2 + (n2 - 1)s2^2] / (n1 + n2 - 2)
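A rough sketch of the two error terms side by side. The function names and input values are hypothetical. The pooled version uses the standard pooled variance sp^2 = [(n1 - 1)s1^2 + (n2 - 1)s2^2] / (n1 + n2 - 2).

```python
import math

def sed_non_pooled(s1, s2, n1, n2):
    """Standard error of the difference, unequal variances."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)

def sed_pooled(s1, s2, n1, n2):
    """Standard error of the difference, equal variances (shared estimate)."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return math.sqrt(sp2 * (1 / n1 + 1 / n2))

# Hypothetical inputs: same s values, equal vs. unequal sample sizes
equal_n = (sed_non_pooled(3, 4, 10, 10), sed_pooled(3, 4, 10, 10))
unequal_n = (sed_non_pooled(3, 4, 5, 20), sed_pooled(3, 4, 5, 20))
print(equal_n)
print(unequal_n)
```

When n1 = n2 the two terms give the same number. They separate once the sample sizes and variances both differ, which is why the choice between them matters in that case.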
pool
Pool = sharing
non-pooled error term (unequal variances)
pooled error term (equal variances)
general rules regarding SED
Starting point →
1. Assume the 2 population variances are NOT equal → non-pooled
How similar is similar ^
- If you know the 2 population variances are equal → pooled
- If n1 = or similar to n2 and s1^2 = or similar to s2^2 → pooled
As n increases, the standard error decreases
As n increases, s approaches the population standard deviation
If n1 is equal or similar to n2, then s1 will tend to be equal or similar to s2
N1 = N2 = N3 = N4
Xbar 1 = xbar 2 = xbar 3 = xbar 4
S1 and s2 decrease s3 and s4 increases
SED and SEM for 2 sample vs 1 sample
2 sample HT: SED → depends on n1, n2, s1^2 and s2^2
1 sample HT: SEM = s / sqrt(n)
(image of 4 curves. 2 pairs. top is thinner/closer and bottom is thicker/further)
For which grouping are you likely to reject Ho and for which are you likely to fail to reject Ho?
1 and 2 reject
Lesser dispersion = less in common = Reject
3 and 4 FTR
Greater dispersion = more in common = FTR
Same sample size
More in common
Less in common means reject
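A small illustration of the dispersion point, under made-up numbers: both groupings have the same gap between means and the same sample size, and only the spread differs. The helper name and values are hypothetical, and both groups in a pair share the same s to keep it simple.

```python
import math

def ts_for_gap(mean_gap, s, n):
    """TS for two groups with the same s and n (illustrative simplification)."""
    sed = math.sqrt(s**2 / n + s**2 / n)  # non-pooled SED with s1 = s2, n1 = n2
    return mean_gap / sed

# Same gap between means, same n, different dispersion
tight = ts_for_gap(mean_gap=2.0, s=1.0, n=25)  # thin curves, like 1 and 2
wide = ts_for_gap(mean_gap=2.0, s=5.0, n=25)   # thick curves, like 3 and 4
print(round(tight, 3), round(wide, 3))
```

The tighter pair produces a TS five times larger for the same mean gap, which is what makes rejecting Ho more likely there.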
assumptions for 2 samples test
- The null is true
- Random and independent samples
- Central limit theorem is satisfied
- Interval or ratio data
Note: all assume 2 independent samples
All tests in this chapter are t-tests
When might 2 datasets NOT be independent (dependent)
- 2 samples that are paired/matched based on related criteria
- Only works if it is the same basket of goods (dependent)
- What they bought in store B is dependent on store A
- 1 sample with a repeated measure
Before…
…Time changes…
…After - measures whether what happened during this time changed the sample
difference
- 2 samples that are paired/matched based on related criteria
- 1 sample with a repeated measure
Ho: μD = 0
H1: μD ≠ 0
D = difference
p value
The probability of attaining a value equal to or more extreme than the test stat, given the test assumptions are true and a sound test structure
p value and alpha
P-value is the probability of getting a TS at least as extreme as the one observed
Compare p value to alpha
Compare value to value
P-value < α = Reject
P-value > α = FTR
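The comparison rule can be sketched as a tiny helper. The function name and the default alpha of 0.05 are assumptions for illustration. How to treat p exactly equal to alpha is a convention, and this sketch treats it as FTR.

```python
def decide(p_value, alpha=0.05):
    """Compare value to value: reject only when p is below alpha."""
    return "Reject" if p_value < alpha else "FTR"

print(decide(0.03))        # p below alpha
print(decide(0.20))        # p above alpha
print(decide(0.03, 0.01))  # same p, stricter alpha
```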
precision is good, no?
Given the test assumptions are true
For all HT, the assumption is that the null is true
Rejecting is evidence against the null being true
But there are other assumptions: normality, random sampling etc
and a sound test structure?
And a sound test structure
Yeah, it's precise, but you can't focus on precision and forget about other things
Accuracy of parents data
Is it meaningful to compare GPAs from 2020 to 1970 or even across majors
Students whose parents did not go to college?
chapter 7 tie in
As n increases, our distribution turns into a bell curve
The relationship between the CLT and 1-sample HT
Ho: μ = ???
H1: μ ≠ ???
X bar → TS = formula
Does this formula/sample come from H1
The relationship between the CLT and 2-sample HT
Compare two x bars
μ1 = μ2 = same population = FTR
Example 1: Trindale
μ1 ≠ μ2 = different populations = Reject
To compare two means or two proportions, you work with two groups. The groups are classified either as
independent or matched pairs.
Independent groups consist of two samples that are independent, that is, sample values selected from one population are not related in any way to sample values selected from the other population.
Matched pairs consist of two samples that are dependent. The parameter tested using matched pairs is the population mean. The parameters tested using independent groups are either population means or population proportions of each group.
examples of independent population means
The comparison of two independent population means is very common and provides a way to test the hypothesis that the two groups differ from each other.
EX:
Is the night shift less productive than the day shift, are the rates of return from fixed asset investments different from those from common stock investments, and so on?
An observed difference between two sample means depends on both the means and the sample standard deviations.
Very different means can occur by chance if there is great variation among the individual samples.
The test statistic will have to account for this fact. The test comparing two independent population means with unknown and possibly unequal population standard deviations is called the Aspin-Welch t-test.
The test statistic (t-score) is calculated as follows:
TS = [(xbar1 - μ1) - (xbar2 - μ2)] / s(xbar1 - xbar2)
TS = (observed - expected) / chance
Where:
s(xbar1 - xbar2) is the standard error of the difference, computed from s1 and s2.
s1 and s2, the sample standard deviations, are estimates of σ1 and σ2, respectively, and
σ1 and σ2 are the unknown population standard deviations.
xbar1 and xbar2 are the sample means. μ1 and μ2 are the unknown population means.
degrees of freedom
The number of degrees of freedom (df) requires a somewhat complicated calculation. The df are not always a whole number. The test statistic above is approximated by the Student’s t-distribution with df as follows:
df = (s1^2/n1 + s2^2/n2)^2 / [ (s1^2/n1)^2 / (n1 - 1) + (s2^2/n2)^2 / (n2 - 1) ]
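The df calculation mentioned above can be sketched as follows. This uses the standard Welch-Satterthwaite approximation for the unequal-variance t-test. The function name and inputs are hypothetical.

```python
def welch_df(s1, s2, n1, n2):
    """Approximate degrees of freedom for the unequal-variance t-test."""
    v1 = s1**2 / n1  # variance contribution of sample 1
    v2 = s2**2 / n2  # variance contribution of sample 2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# Hypothetical inputs
df = welch_df(s1=3.0, s2=4.0, n1=9, n2=16)
print(df)  # not a whole number in general
```

The result falls between the smaller of (n1 - 1, n2 - 1) and n1 + n2 - 2, which is a quick sanity check on a hand calculation.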
Typically we can never expect to know any of the
population parameters, mean, proportion, or standard deviation.
When testing hypotheses concerning differences in means
we are faced with the difficulty of two unknown variances that play a critical role in the test statistic.
When using a hypothesis test for matched or paired samples, the following characteristics may be present:
Simple random sampling is used.
Sample sizes are often small.
Two measurements (samples) are drawn from the same pair of individuals or objects.
Differences are calculated from the matched or paired samples.
The differences form the sample that is used for the hypothesis test.
Either the matched pairs have differences that come from a population that is normal or the number of differences is sufficiently large so that distribution of the sample mean of differences is approximately normal.
In a hypothesis test for matched or paired samples
subjects are matched in pairs and differences are calculated. The differences are the data. The population mean for the differences, μd, is then tested using a Student’s-t test for a single population mean with n – 1 degrees of freedom, where n is the number of differences, that is, the number of pairs not the number of observations.
the null and alternative for matched or paired samples (differences) are:
Ho: μd = 0
H1: μd ≠ 0
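A minimal sketch of the matched-pairs setup, with invented before/after values: the differences become the single sample, and the TS is a one-sample t statistic with n - 1 degrees of freedom, where n is the number of pairs.

```python
import math
import statistics

# Hypothetical repeated measures on the same 5 subjects
before = [72, 68, 75, 80, 66]
after = [70, 65, 76, 74, 63]

diffs = [a - b for a, b in zip(after, before)]  # the differences are the data
n = len(diffs)                                  # number of pairs, not observations
d_bar = statistics.mean(diffs)                  # observed mean difference
s_d = statistics.stdev(diffs)                   # sample sd of the differences
ts = (d_bar - 0) / (s_d / math.sqrt(n))         # expected difference under Ho is 0
df = n - 1

print(diffs, d_bar, df)
print(ts)
```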