Chapter 8 Flashcards
What are the key properties of the t-distribution?
- Mean, median, and mode of the t-distribution are equal to 0.
- Bell shaped and symmetric about the mean.
- Total area under a t-curve is 1.
- Tails are “thicker” than those of the normal distribution (especially for smaller degrees of freedom).
- The SD of the t-distribution varies with the sample size, but it is greater than 1.
- The t-distribution is a family of curves, each determined by a parameter called the degrees of freedom (d.f.), which are the number of free choices left after a sample statistic such as x̄ is calculated. When a t-distribution is used to estimate a population mean, the d.f. are equal to one less than the sample size, or n − 1.
- As d.f. increase, the t-distribution approaches the normal distribution. For d.f. ≥ 30, the t-distribution is very close to the standard normal distribution (the z-distribution).
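Two-sample t-test (equal variances)
Standardized test statistic (how to test the difference between means; the denominator changes depending on whether the variances of the two groups are equal)
t = [(x̄1 − x̄2) − (μ1 − μ2)] / s(x̄1 − x̄2)
where s(x̄1 − x̄2) is the standard error of the difference between the sample means
μ1 − μ2 is typically 0 under the null hypothesis
When the variances are equal, d.f. = ?
n1 + n2 − 2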
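As a minimal sketch of the equal-variance case above, the Python below computes the pooled standard error, the degrees of freedom n1 + n2 − 2, and the standardized test statistic. The pooled standard deviation formula is the usual textbook one and is not spelled out on these cards; the function name `pooled_t` and the sample numbers are hypothetical, chosen only for illustration.

```python
import math

def pooled_t(x1, s1, n1, x2, s2, n2, mu_diff=0.0):
    """Two-sample t statistic, assuming equal population variances."""
    df = n1 + n2 - 2
    # Pooled standard deviation: sample variances weighted by their d.f.
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)
    # Standard error of x1 - x2 under the equal-variance assumption.
    se = sp * math.sqrt(1 / n1 + 1 / n2)
    t = ((x1 - x2) - mu_diff) / se
    return t, df

# Hypothetical summary statistics (not taken from the cards).
t, df = pooled_t(x1=50.0, s1=4.0, n1=10, x2=47.0, s2=4.0, n2=12)
print(f"t = {t:.3f}, d.f. = {df}")
```

The resulting t would then be compared with a critical value from a t-table at the computed d.f.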
Each of the following is a condition to use the z-test to test the difference between two population means except…
σ1 and σ2 are known.
The samples are random.
The samples are independent.
The populations are normally distributed.
The populations do not need to be normally distributed as long as each sample size is at least 30.
The samples are required to be random and independent, and σ1 and σ2 must be known in order to use the z-test.
A researcher wishes to determine if body temperature changes throughout the day. To test this claim, seven subjects were selected and their body temperatures were measured at 8am and again at 4pm. Then the differences (4pm temperature - 8am temperature) were computed.
The statistics are as follows: d̄ = 0.871, s_d = 1.234
The researcher then tested the claim μd > 0 using a 1% level of significance and got a P-value of 0.055.
What is the appropriate conclusion?
There is not enough evidence to support the claim that body temperature rises during the day.
The claim is an alternative hypothesis, which means the conclusion is worded with supporting the claim.
Since the P-value > α, the null hypothesis can’t be rejected. Therefore, there is not enough evidence to support the claim.
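The numbers on this card can be checked with a short sketch. It recomputes the standardized test statistic for the paired differences from the summary statistics given on the card and applies the P-value decision rule. The P-value itself is taken from the card, not recomputed, because Python's standard library has no t-distribution.

```python
import math

# Summary statistics from the card: mean and SD of the paired differences.
d_bar, s_d, n = 0.871, 1.234, 7

# Standardized test statistic for H0: mu_d <= 0 (claim Ha: mu_d > 0).
t = d_bar / (s_d / math.sqrt(n))
df = n - 1

alpha = 0.01
p_value = 0.055  # as reported on the card
reject_h0 = p_value <= alpha

print(f"t = {t:.3f}, d.f. = {df}, reject H0: {reject_h0}")
```

Because the P-value exceeds α, `reject_h0` is false, which matches the conclusion on the card.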
A food manufacturer claims that eating its new cereal as part of a daily diet lowers total blood cholesterol. To test this claim, seven patients had their cholesterol measured before starting to include the cereal into their diets for one year.
Assuming that the samples are collected at random, which hypothesis test should be used to test this claim?
t-test for the difference between population means (dependent samples)
Since this is a before/after situation in which the same subjects are used twice, the samples are dependent. Since the means are being compared, the t-test for the difference between population means with dependent samples would be used to test the claim.
What does it mean if the P-value is greater than the level of significance α?
The null hypothesis cannot be rejected.
A pet association claims that the mean annual costs of food for dogs and cats are the same. A sample of 16 dogs gave an average of $255 and a standard deviation of $30, while a sample of 18 cats yielded a mean of $231 and a standard deviation of $18.
Assuming that the samples are collected at random, which hypothesis test should be used to test this claim?
t-test for the difference between population means (independent samples)
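A sketch of the standardized test statistic for this card follows. The card does not say whether the population variances are assumed equal, so the code assumes they are not: it uses the unpooled standard error and the conservative convention d.f. = smaller of n1 − 1 and n2 − 1. Both choices are assumptions for illustration, not something the card specifies.

```python
import math

# Summary statistics from the card (sample 1 = dogs, sample 2 = cats).
x1, s1, n1 = 255.0, 30.0, 16
x2, s2, n2 = 231.0, 18.0, 18

# Unpooled standard error of x1 - x2 (variances NOT assumed equal).
se = math.sqrt(s1**2 / n1 + s2**2 / n2)

# H0: mu1 = mu2, so the hypothesized difference mu1 - mu2 is 0.
t = ((x1 - x2) - 0.0) / se

# Conservative degrees of freedom (an assumed textbook convention).
df = min(n1 - 1, n2 - 1)

print(f"t = {t:.3f}, d.f. = {df}")
```

Since the claim is that the means are the same, the test is two-tailed, and t would be compared with the two-tailed critical values for the chosen α.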
What is a pooled proportion?
The pooled proportion is used in hypothesis testing, specifically when comparing two population proportions (as in a two-sample z-test for proportions). It combines the successes from both samples into one overall proportion.
Formula:
p̄ = (x1 + x2) / (n1 + n2)
Where:
x1 = number of successes in sample 1
x2 = number of successes in sample 2
n1 = size of sample 1
n2 = size of sample 2
Population proportions vs. pooled proportion vs. sample proportions
p1 and p2 are the population proportions being compared in the null and alternative hypotheses.
p̂1 and p̂2 are the sample proportions.
p̄ is the pooled proportion (calculated from the sample successes and trials), and q̄ = 1 − p̄.
P-value > α
In hypothesis testing, if the P-value is greater than α (the significance level, often 0.05 or 0.01), it means:
The result is not statistically significant.
You fail to reject the null hypothesis.
There isn’t enough evidence to support the alternative hypothesis.
The observed data could reasonably happen by chance, assuming the null hypothesis is true.
t test vs z test
A college counselor wants to determine if there is a difference between the proportions of males who are employed based on their educational background.
Let p1= the proportion of males aged 18-24 who are employed and have taken some college but have not earned a bachelor’s degree.
Let p2= the proportion of males aged 18-24 who are employed and have earned at least a bachelor’s degree.
The counselor tested the claim that p1 < p2 using α = 0.05, and the P-Value for the test is 0.1124. What can the counselor conclude?
Since the claim is the alternative hypothesis, the conclusion is worded around supporting the claim.
Since the P-value is greater than the level of significance (0.1124 > 0.05), the null hypothesis cannot be rejected, which means that the alternative hypothesis cannot be supported.
The correct conclusion is “There is not enough evidence to support the claim that the proportion of males aged 18-24 who are employed and have taken some college but have not earned a bachelor’s degree is lower than the proportion of males aged 18-24 who are employed and have earned at least a bachelor’s degree.”
A researcher claims that a smaller proportion of people wear seatbelts in Oregon than in Washington. After collecting samples, he found that 947 of 1056 surveyed in Washington wear a seatbelt, while 865 of 1078 in Oregon wear a seatbelt.
Which hypothesis test should be used to test this claim?
z-test for the difference between population proportions
The researcher is comparing proportions from two independent samples (Oregon and Washington).
The goal is to see if the proportion of seatbelt use in Oregon is smaller than in Washington — this suggests a one-tailed test (specifically a left-tailed test).
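The seatbelt counts on this card can be run through a minimal sketch of the two-sample z-test for proportions, using the pooled proportion. Labeling Oregon as sample 1 and Washington as sample 2 is a choice made here so that the claim reads p1 < p2 (left-tailed). The card gives no significance level, so the code stops at the P-value.

```python
import math
from statistics import NormalDist

# Counts from the card: sample 1 = Oregon, sample 2 = Washington.
x1, n1 = 865, 1078
x2, n2 = 947, 1056

p1_hat, p2_hat = x1 / n1, x2 / n2

# Pooled proportion and its complement.
p_bar = (x1 + x2) / (n1 + n2)
q_bar = 1 - p_bar

# Standardized test statistic under H0: p1 - p2 = 0.
se = math.sqrt(p_bar * q_bar * (1 / n1 + 1 / n2))
z = (p1_hat - p2_hat) / se

# Left-tailed P-value from the standard normal distribution.
p_value = NormalDist().cdf(z)

print(f"p_bar = {p_bar:.4f}, z = {z:.2f}, P-value = {p_value:.2g}")
```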
The samples in a before/after study are dependent; therefore, the hypotheses contain…
μd
A two-sample t-test is used to test the difference between two population means μ1 and μ2 when…
σ1 and σ2 (the population standard deviations) are unknown but you have both s1 and s2,
the samples are random,
the samples are independent, and
the populations are normally distributed or both n1 ≥ 30 and n2 ≥ 30.
When these conditions are met, the sampling distribution for the difference between the sample means is approximated by a t-distribution with mean μ1 − μ2 (which you assume to be zero). So, you can use the two-sample t-test to test the difference between the population means. You also need the standard error and the degrees of freedom.
When to use a z test…
What assumptions are needed?
- The population standard deviations are known (both σ1 and σ2).
- The samples are randomly selected.
- The samples are independent.
- The populations are normally distributed or each sample size is at least 30 (n1 ≥ 30 and n2 ≥ 30).
When these assumptions are met, the sampling distribution for x̄1 − x̄2, the difference of the sample means, is a normal distribution with mean μ1 − μ2 and standard error √(σ1²/n1 + σ2²/n2).
Z test vs t test
Sample Size:
Z-test: Large sample (n ≥ 30)
T-test: Small sample (n < 30); also valid for larger samples when σ is unknown
Standard Deviation:
Z-test: Population SD known
T-test: Population SD unknown (use sample SD)
Distribution:
Z-test: Normal distribution
T-test: T-distribution (fatter tails, more variability)
Uses:
Z-test: Large samples, population comparisons
T-test: Small samples, paired data, unknown SD
Standardized test statistic vs test statistic
Test Statistic Formula:
z = (p̂ − p0) / √(p0·q0 / n)
Where:
p̂ = Sample proportion
p0 = Population proportion (under the null hypothesis)
q0 = 1 − p0
n = Sample size
Finding the Sample Proportion:
If the sample proportion is not given, calculate it using:
p̂ = x / n
Where:
x = Number of successes in the sample
n = Sample size
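The two formulas above can be combined in a short sketch: compute the sample proportion from x and n, then standardize it against the hypothesized proportion. The function name `one_proportion_z` and the counts are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
import math

def one_proportion_z(x, n, p0):
    """Standardized test statistic for a single population proportion."""
    p_hat = x / n                     # sample proportion
    q0 = 1 - p0                       # complement of the hypothesized proportion
    se = math.sqrt(p0 * q0 / n)       # standard error under H0
    return (p_hat - p0) / se

# Hypothetical example: 56 successes in 100 trials, H0: p = 0.5.
z = one_proportion_z(x=56, n=100, p0=0.5)
print(f"z = {z:.2f}")
```

Note that the standard error uses the hypothesized proportion p0, not the sample proportion, because the statistic is computed assuming the null hypothesis is true.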
The test statistic is x̄1 − x̄2.
The standardized test statistic uses the whole formula.
Note that if the claim is the null hypothesis use the “….” wording, and if the claim is the alternative hypothesis use the “….” wording.
reject
support
To perform a two-sample t-test for the difference between two population means, the steps include…
First, state the hypotheses and identify the claim. Then specify a level of significance. Next, determine the degrees of freedom. Find the critical value(s) and identify the rejection region(s). Then find the standardized test statistic. Finally, make a decision and interpret it in the context of the original claim.
find critical values t test vs z test
Critical Values: Z-test vs. T-test
Z-test:
Used when: Population standard deviation (σ) is known, or as an approximation when n ≥ 30.
Distribution: Standard Normal (mean = 0, SD = 1).
Critical Values:
Two-tailed test (α = 0.05): ±1.96
One-tailed test (α = 0.05): 1.645 (right-tailed), −1.645 (left-tailed).
Usage: Known 𝜎 , or large sample size.
T-test
Used when: Population standard deviation (σ) is unknown, often with n < 30.
Distribution: T-distribution (heavier tails).
Critical Values:
Two-tailed test (α=0.05): Varies with degrees of freedom (df).
One-tailed test: Varies by df.
Degrees of Freedom (df):
df=n−1 for single sample.
Usage: Small sample size, unknown σ.
Key Differences
Z-test: Normal distribution, fixed critical values.
T-test: T-distribution, critical values depend on df.
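The fixed z critical values listed above can be reproduced with Python's standard library, as a minimal sketch. The helper name `z_critical` and its `tail` argument are hypothetical. The standard library has no t-distribution, so t critical values still come from a t-table or a statistics package.

```python
from statistics import NormalDist

def z_critical(alpha, tail):
    """Critical value(s) of the standard normal for a given alpha and tail."""
    std_normal = NormalDist()  # mean 0, SD 1
    if tail == "two":
        # Reject when |z| exceeds this value; alpha is split across both tails.
        return std_normal.inv_cdf(1 - alpha / 2)
    if tail == "right":
        return std_normal.inv_cdf(1 - alpha)
    if tail == "left":
        return std_normal.inv_cdf(alpha)
    raise ValueError("tail must be 'two', 'right', or 'left'")

for tail in ("two", "right", "left"):
    print(tail, round(z_critical(0.05, tail), 3))
```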