Reading Quiz 13 Flashcards
two-sample problem characteristics
- the goal of inference is to compare the response to two treatments or to compare characteristics of two populations
- have a separate sample from each treatment or population
- responses of each group are independent of those in the other group
conditions for two-sample problem test comparing two means
two independent SRSs, each drawn from a normally distributed population (both samples may come from the same population, in which case that population must be normally distributed)
significance tests and confidence intervals for the difference between the means μ1 and μ2 of two normal populations start from the difference…
xbar1 - xbar2 between the two sample means
due to the central limit theorem, the resulting procedures are approximately correct for other population distributions when the sample sizes are large
two-sample z test for means
draw independent SRSs of sizes n1 and n2 from two normal populations with parameters μ1, σ1, μ2, σ2
the two-sample z statistic has the standard normal distribution
two-sample z statistic for means
z = ( (xbar1 - xbar2) - (μ1 - μ2) ) / sqrt( (σ1²/n1) + (σ2²/n2) )
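The z statistic above can be sketched in a few lines of Python. The function name, the `mu_diff` parameter (the hypothesized value of μ1 - μ2), and the example numbers are made up for illustration; the population standard deviations are assumed to be known.

```python
import math

def two_sample_z(xbar1, xbar2, sigma1, sigma2, n1, n2, mu_diff=0.0):
    """Two-sample z statistic for means with known population SDs.

    mu_diff is the hypothesized value of mu1 - mu2 (hypothetical parameter name).
    """
    # Standard deviation of xbar1 - xbar2: the variances of the two sample means add.
    sd_diff = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return ((xbar1 - xbar2) - mu_diff) / sd_diff

# Made-up example values, not data from the reading.
z = two_sample_z(xbar1=52.0, xbar2=50.0, sigma1=4.0, sigma2=3.0, n1=16, n2=9)
print(round(z, 3))
```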
two-sample t statistic for means
t = ( (xbar1 - xbar2) - (μ1 - μ2) ) / sqrt( (s1²/n1) + (s2²/n2) )
doesn’t exactly have a t distribution; good approximations are available with calculators
for conservative inference procedures to compare μ1 and μ2
use the two-sample t statistic for means with the t(k) distribution
number of degrees of freedom, k, is the smaller of n1 - 1 and n2 - 1
for more accurate probability values
use the t(k) distribution with degrees of freedom estimated from the data
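A minimal sketch of the two-sample t statistic together with the two choices of degrees of freedom. The cards do not say which formula "estimated from the data" refers to; the Welch-Satterthwaite approximation used here is one common choice and is an assumption, as are the function names and example numbers.

```python
import math

def two_sample_t(xbar1, xbar2, s1, s2, n1, n2, mu_diff=0.0):
    """Two-sample t statistic; s1 and s2 are sample standard deviations."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return ((xbar1 - xbar2) - mu_diff) / se

def conservative_df(n1, n2):
    """Conservative degrees of freedom: the smaller of n1 - 1 and n2 - 1."""
    return min(n1 - 1, n2 - 1)

def welch_df(s1, s2, n1, n2):
    """Welch-Satterthwaite degrees of freedom (assumed to be the data-based estimate)."""
    v1 = s1**2 / n1
    v2 = s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# Made-up example values.
t = two_sample_t(52.0, 50.0, 4.0, 3.0, 16, 9)
k_conservative = conservative_df(16, 9)
k_welch = welch_df(4.0, 3.0, 16, 9)
print(round(t, 3), k_conservative, round(k_welch, 2))
```

The data-based value is never smaller than the conservative one, which is why the conservative choice gives wider intervals and larger p-values.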
significance tests for Ho: μ1 = μ2 are based on
t = (xbar1 - xbar2) / sqrt( (s1²/n1) + (s2²/n2) )
significance tests for Ho: μ1 = μ2 have a true p-value
no higher than that calculated using the conservative degrees of freedom
the level C confidence interval for μ1 - μ2 given by
(xbar1 - xbar2) ± t* sqrt( (s1²/n1) + (s2²/n2) )
has confidence level at least C if you use the more conservative number of degrees of freedom
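The interval can be sketched as below. To stay within the standard library, the critical value t* is passed in as a number looked up from a t table; the function name and the example summary statistics are hypothetical.

```python
import math

def two_sample_t_interval(xbar1, xbar2, s1, s2, n1, n2, t_star):
    """Level C interval for mu1 - mu2; t_star is the t(k) critical value for level C."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    center = xbar1 - xbar2
    margin = t_star * se
    return center - margin, center + margin

# Made-up summary statistics. Conservative df = min(16, 9) - 1 = 8,
# and the 95% critical value for t(8) from a table is 2.306.
low, high = two_sample_t_interval(52.0, 50.0, 4.0, 3.0, 16, 9, t_star=2.306)
print(round(low, 3), round(high, 3))
```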
when we want to compare the proportions p1 and p2 of successes in two populations, the comparison is based on the difference
phat1 - phat2 between the sample proportions of successes. when the sample sizes n1 and n2 are large enough, we can use z procedures because the sampling distribution of phat1 - phat2 is close to normal
approximate level C confidence interval for p1 - p2
(phat1 - phat2) ± z* sqrt( (phat1*qhat1/n1) + (phat2*qhat2/n2) )
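A short sketch of this interval using only the standard library, with z* taken from `statistics.NormalDist`. The function name, the `level` parameter, and the counts in the example are made up for illustration.

```python
import math
from statistics import NormalDist

def two_prop_interval(x1, n1, x2, n2, level=0.95):
    """Approximate level C interval for p1 - p2 from success counts x1 and x2."""
    phat1, phat2 = x1 / n1, x2 / n2
    # z* leaves (1 - level) / 2 in each tail of the standard normal distribution.
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    se = math.sqrt(phat1 * (1 - phat1) / n1 + phat2 * (1 - phat2) / n2)
    diff = phat1 - phat2
    return diff - z_star * se, diff + z_star * se

# Made-up counts: 60 of 100 successes versus 45 of 100.
low, high = two_prop_interval(60, 100, 45, 100)
print(round(low, 4), round(high, 4))
```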
significance tests for Ho: p1 = p2 use the
combined sample proportion and the z statistic for a two-sample z test for proportions
p-values can be determined using the standard normal table
combined sample proportion
phatc = (count of successes in both samples combined) / (count of individuals in both samples combined) = (X1 + X2) / (n1 + n2)
two-sample z test for proportions
z = ( (phat1 - phat2) - (pnot1 - pnot2) ) / sqrt( phatc*qhatc * ((1/n1) + (1/n2)) )
significance tests for Ho: p1 - p2 = 0 are based on
z = (phat1 - phat2) / sqrt( phatc*qhatc * ((1/n1) + (1/n2)) )
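The test for Ho: p1 - p2 = 0 can be sketched as follows, with the combined sample proportion in the standard error and a two-sided p-value from the standard normal distribution. The function name and counts are hypothetical, and the choice of a two-sided alternative is an assumption.

```python
import math
from statistics import NormalDist

def two_prop_z_test(x1, n1, x2, n2):
    """Two-sample z test for proportions under Ho: p1 = p2 (two-sided p-value)."""
    phat1, phat2 = x1 / n1, x2 / n2
    phat_c = (x1 + x2) / (n1 + n2)  # combined sample proportion
    se = math.sqrt(phat_c * (1 - phat_c) * (1 / n1 + 1 / n2))
    z = (phat1 - phat2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Made-up counts: 60 of 100 successes versus 45 of 100.
z, p_value = two_prop_z_test(60, 100, 45, 100)
print(round(z, 3), round(p_value, 4))
```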
how to check normality conditions for two-sample proportion confidence interval
n1phat1, n1qhat1, n2phat2, n2qhat2 are all greater than or equal to 5 (or 10)
how to check normality conditions for two-sample proportion significance test
n1phatc, n1qhatc, n2phatc, n2qhatc are all greater than or equal to 5 (or 10)
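The two checks differ only in which proportion is plugged in: each sample's own proportion for the confidence interval, the combined proportion for the significance test. A minimal sketch, with hypothetical function names and a `minimum` parameter standing in for the 5-or-10 threshold:

```python
def interval_counts_ok(x1, n1, x2, n2, minimum=10):
    """Check n1*phat1, n1*qhat1, n2*phat2, n2*qhat2 (the observed counts)."""
    counts = [x1, n1 - x1, x2, n2 - x2]
    return all(c >= minimum for c in counts)

def pooled_counts_ok(x1, n1, x2, n2, minimum=10):
    """Check n1*phatc, n1*qhatc, n2*phatc, n2*qhatc using the combined proportion."""
    phat_c = (x1 + x2) / (n1 + n2)
    counts = [n1 * phat_c, n1 * (1 - phat_c), n2 * phat_c, n2 * (1 - phat_c)]
    return all(c >= minimum for c in counts)

# Made-up counts where the two checks disagree at a threshold of 5:
# sample 1 has only 3 successes, but the pooled expected counts are 6.5 and 13.5.
print(interval_counts_ok(3, 20, 10, 20, minimum=5))
print(pooled_counts_ok(3, 20, 10, 20, minimum=5))
```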
One researcher randomly samples two groups from a population, and gives training to one and not the other. The researcher uses a t procedure to compare the test scores of the two groups. Another researcher samples a group from the population, and gives a test to the group two times, once before training and once after. The researcher uses a t procedure to compare the results after training with those before training. How are these two situations different, and what different statistical procedures should they result in?
A. In the first case, the samples are independent of one another, and in the second, they are not. So in the first case, you use a two-sample t to study the difference in the means. In the second case, you create a new variable, the post-score minus pre-score, and use a one-sample t to study the mean of the differences (this is a paired t test).
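The contrast between the two procedures can be illustrated with a small sketch. The scores below are invented, and the function names are hypothetical; the two-sample statistic is computed on the paired data only to show how different the answer is when the pairing is ignored, not because that analysis would be appropriate.

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """One-sample t on the differences (post minus pre); returns (t, df)."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / math.sqrt(n)), n - 1

def two_sample_t(group1, group2):
    """Two-sample t statistic for independent groups."""
    se = math.sqrt(stdev(group1) ** 2 / len(group1) + stdev(group2) ** 2 / len(group2))
    return (mean(group1) - mean(group2)) / se

# Invented pre- and post-training scores for the same five people.
before = [60, 72, 55, 80, 68]
after = [64, 75, 60, 83, 73]

t_paired, df_paired = paired_t(before, after)
t_ignoring_pairs = two_sample_t(after, before)
print(round(t_paired, 3), df_paired, round(t_ignoring_pairs, 3))
```

Because each person serves as their own control, the differences vary far less than the raw scores do, so the paired statistic is much larger than the one that ignores the pairing.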
Suppose someone were to draw many pairs of samples from two populations, and compute the difference between the sample means for each pair. What would the mean of this difference approach as the number of samples drawn approached infinity?
the difference in population means
The fact that the mean of the difference in sample means approaches the difference in population means as the number of samples gets larger is a long way of saying that the difference in sample means is a(n) ____ estimator of the difference in population means.
unbiased
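A quick simulation can make the unbiasedness claim concrete. This is only a sketch: the normal populations, their parameters, the sample sizes, and the seed are all arbitrary choices made for illustration.

```python
import random

def simulate_mean_differences(mu1, sigma1, n1, mu2, sigma2, n2, reps, seed=0):
    """Draw `reps` pairs of samples and return the difference in sample means for each pair."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        xbar1 = sum(rng.gauss(mu1, sigma1) for _ in range(n1)) / n1
        xbar2 = sum(rng.gauss(mu2, sigma2) for _ in range(n2)) / n2
        diffs.append(xbar1 - xbar2)
    return diffs

# Arbitrary populations with mu1 - mu2 = 3.
diffs = simulate_mean_differences(10.0, 2.0, 20, 7.0, 3.0, 15, reps=5000)
average_diff = sum(diffs) / len(diffs)
print(round(average_diff, 3))  # should land close to 3
```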
True or False: just as the difference in sample means estimates the difference in population means, the difference in sample standard deviations estimates the population standard deviation of the difference between two means.
A. This is a triple false! First, what you would combine would be variances, not standard deviations. Second, to find the variance of the difference between two random variables you add the variances; you don’t subtract them. Third, the sample variances would have to be divided by n to estimate the variance of the sample mean.
True or False: the variance of the difference between two sample means is estimated by s1²/n1 + s2²/n2, where s1 and s2 are the sample standard deviations (and thus s1² and s2² are the sample variances) and where n1 and n2 are the sample sizes.
true
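As a rough check on why the variances add and are each divided by the sample size, the simulation sketch below compares the observed variance of many differences in sample means with σ1²/n1 + σ2²/n2, the quantity that s1²/n1 + s2²/n2 estimates. The populations, sample sizes, and seed are arbitrary assumptions.

```python
import random
from statistics import variance

def simulated_variance_of_difference(sigma1, n1, sigma2, n2, reps=5000, seed=1):
    """Variance of xbar1 - xbar2 across many simulated pairs of normal samples."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        xbar1 = sum(rng.gauss(0.0, sigma1) for _ in range(n1)) / n1
        xbar2 = sum(rng.gauss(0.0, sigma2) for _ in range(n2)) / n2
        diffs.append(xbar1 - xbar2)
    return variance(diffs)

theoretical = 2.0**2 / 20 + 3.0**2 / 15  # sigma1^2/n1 + sigma2^2/n2
simulated = simulated_variance_of_difference(2.0, 20, 3.0, 15)
print(round(theoretical, 3), round(simulated, 3))
```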