4.2 Hypothesis Tests - T-Test Flashcards
chi-squared distribution, t-distribution, two-sided t-test, one-sided t-test, large sample size, two-sample t-test, p-values
What is the t-test used for?
-in situations where the exact variance of observed samples is unknown, we can test the hypothesis Ho:μ=μo against the alternative H1:μ≠μo using the test statistic |t|
Two Sided T-Test Test Statistic |t|
Definition
|t| = |1/√n * Σ (xi-μo)/σx^|
- where σx^ is the sample standard deviation and the sum is taken from i=1 to i=n
- since σx^ is not a constant but depends on the data, the t value will not be normally distributed, even if we compute it from samples which follow a normal distribution
T-Test
Definition
- the t-test rejects Ho:μ=μo if |t|>c for a critical value c
- before we can use the t-test, we need to choose a value of c that will keep the probability of type I errors below α
What do we use to derive the critical values for the t-test?
-the chi-squared probability distribution
Chi-Squared Distribution
Definition
- let X1,…,Xv ~ N(0,1) be i.i.d
- then the distribution of ΣXi² is called the χ²-distribution with v degrees of freedom, the sum is taken from i=1 to i=v
- the distribution is denoted by χ²(v)
Chi-Squared Distribution
Expectation and Variance Lemma
-let Y~χ²(v), then:
E(Y) = v
and:
Var(Y) = 2v
Chi-Squared Distribution
Expectation and Variance Lemma Proof
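-write Y as the sum X1²+…+Xv², where X1,…,Xv~N(0,1) are i.i.d.
-then E(Y) = Σ E(Xi²) = v, since E(Xi²) = Var(Xi) = 1
-by independence, Var(Y) = Σ Var(Xi²) = v(E(Xi⁴) - E(Xi²)²) = v(3-1) = 2v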
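The lemma can also be checked numerically. The following is only a rough Monte Carlo sketch: the seed, the choice v = 5 and the number of replications are arbitrary illustrative values, not part of the notes.

```r
# Simulate Y = X1^2 + ... + Xv^2 with Xi ~ N(0,1) i.i.d.
set.seed(1)        # arbitrary seed, for reproducibility only
v <- 5             # degrees of freedom (illustrative choice)
N <- 100000        # number of simulated values of Y

x <- matrix(rnorm(N * v), nrow = N, ncol = v)
y <- rowSums(x^2)  # each row gives one draw from chi^2(v)

mean_y <- mean(y)  # should be close to v
var_y  <- var(y)   # should be close to 2 * v
print(c(mean = mean_y, variance = var_y))
```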
CDF of the Chi-Squared Distribution in R
-the R command:
pchisq(x,v)
-gives the value Ф(x) of the CDF of the χ²(v) distribution
α-Quantile of the Chi-Squared Distribution in R
-the R command:
qchisq(α,v)
-can be used to obtain the α-quantile of the χ²(v) distribution
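A short usage sketch for the two commands, showing that qchisq inverts pchisq. The values v = 3 and x = 2.5 are arbitrary examples.

```r
v <- 3                    # degrees of freedom (illustrative)
x <- 2.5                  # point at which to evaluate the CDF (illustrative)

p <- pchisq(x, v)         # P(Y <= x) for Y ~ chi^2(v)
x_back <- qchisq(p, v)    # the p-quantile, which recovers x

q95 <- qchisq(0.95, v)    # 0.95-quantile of chi^2(3), approximately 7.815
print(c(p = p, x_back = x_back, q95 = q95))
```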
Chi-Squared Theorem
-let X1,…,Xn~N(μ,σ² ) i.i.d. and consider:
Y = Σ (Xi-X_)² /σ²
-where the sum is taken from i=1 to i=n and X_ is the sample average of Xi
-then:
a) Y~χ²(n-1)
b) the random variables X_ and Y are independent
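A simulation can make the theorem plausible. This is a sketch under arbitrary assumptions (n = 8, μ = 3, σ = 2, and the seed are illustrative). A sample correlation near zero is consistent with independence of X_ and Y but does not prove it.

```r
set.seed(2)                      # arbitrary seed
n <- 8; mu <- 3; sigma <- 2      # illustrative parameters
N <- 50000                       # number of simulated samples

# Each row is one sample X1,...,Xn ~ N(mu, sigma^2)
samples <- matrix(rnorm(N * n, mean = mu, sd = sigma), nrow = N)
xbar <- rowMeans(samples)

# Subtracting a length-N vector from an N-row matrix recycles it down the
# columns, so row i has xbar[i] subtracted from every entry
y <- rowSums((samples - xbar)^2) / sigma^2

mean_y <- mean(y)        # chi^2(n-1) has mean n - 1
var_y  <- var(y)         # chi^2(n-1) has variance 2 * (n - 1)
rho    <- cor(xbar, y)   # should be near 0 if X_ and Y are independent
print(c(mean = mean_y, variance = var_y, correlation = rho))
```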
T-Distribution
Definition
-let Z~N(0,1) and Y~χ²(v) be independent
-then the distribution of:
T = Z / √(Y/v)
-is called the t-distribution with v degrees of freedom
-this distribution is denoted by t(v)
T-Test Sample of Xi Random Variables Normally Distributed Lemma
-let X1,…,Xn~N(μ,σ²) i.i.d., and let:
T = 1/√n * Σ (Xi-μo)/σx^
-where σx^ is the sample standard deviation of Xi
-then T~t(n-1) whenever μ=μo, i.e. under Ho
CDF of the T-Distribution in R
-the R command pt(x,v) gives the value Ф(x) of the CDF of the t(v)-distribution
α-Quantile of the T-Distribution in R
-the R command qt(α,v) gives the α-quantile of the t(v)-distribution
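A brief sketch of how pt and qt relate, and how qt produces the critical values used by the t-test. The values v = 9 and α = 0.05 are illustrative.

```r
v <- 9          # degrees of freedom, e.g. a sample of size n = 10
alpha <- 0.05   # significance level (illustrative)

c_two <- qt(1 - alpha / 2, v)   # (1 - alpha/2)-quantile, approximately 2.262
c_one <- qt(1 - alpha, v)       # (1 - alpha)-quantile, approximately 1.833

p_check <- pt(c_two, v)         # the CDF at the quantile returns 1 - alpha/2
lower   <- qt(alpha / 2, v)     # by symmetry this equals -c_two
print(c(c_two = c_two, c_one = c_one, p_check = p_check, lower = lower))
```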
Two-Sided T-Test
Description
-let x1,…,xn be observations of Xi~N(μ,σ²) with unknown variance σ² and assume that we want to test Ho:μ=μo against H1:μ≠μo
Two-Sided T-Test
Test Statistic
-as a test statistic we use |t|, where:
t = 1/√n * Σ (xi-μo)/σx^
= √n * (x_-μo)/σx^
Two-Sided T-Test
Critical Value
- we know that under Ho, the test statistic follows a t(n-1) distribution
- thus we can use critical value tn-1(α/2), the (1-α/2) quantile of t(n-1)
Two-Sided T-Test
Summary
data: x1,…,xn∈R
model: X1,…,Xn~N(μ,σ²) i.i.d
test: Ho:μ=μo vs H1:μ≠μo
test statistic: |t| = |1/√n * Σ (xi-μo)/σx^| = √n * |x_-μo|/σx^
critical value: tn-1(α/2), the (1-α/2) quantile of t(n-1)
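The summary can be turned into a few lines of R. This is a minimal sketch: the data vector, μo = 5 and α = 0.05 are hypothetical values chosen for illustration. The built-in t.test is called only as a cross-check.

```r
x <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.5, 4.7)  # hypothetical observations
mu0 <- 5                                        # hypothesised mean (assumed)
alpha <- 0.05                                   # significance level (assumed)
n <- length(x)

# Test statistic t = sqrt(n) * (x_ - mu0) / sigma_x^ ; sd() uses the n-1 divisor
t_stat <- sqrt(n) * (mean(x) - mu0) / sd(x)

# Critical value: the (1 - alpha/2)-quantile of t(n-1)
crit <- qt(1 - alpha / 2, df = n - 1)

reject <- abs(t_stat) > crit
print(c(t = t_stat, critical = crit, reject = reject))

# Cross-check against R's built-in two-sided one-sample t-test
builtin <- t.test(x, mu = mu0)
```

The sum form 1/√n * Σ (xi-μo)/σx^ gives the same number as the form using the sample mean, which the test below also checks.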
How does the t-test differ from the z-test?
- the critical value in the t-test depends not only on the significance level α, as in the z-test, but also on the sample size n
- we have to consider quantiles of the t-distribution with v=n-1 degrees of freedom
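To see the dependence on n, one can tabulate the t critical values next to the fixed z critical value. The sample sizes and α = 0.05 below are arbitrary illustrative choices.

```r
alpha <- 0.05                     # significance level (illustrative)
n <- c(5, 10, 30, 100, 1000)      # sample sizes (illustrative)

t_crit <- qt(1 - alpha / 2, df = n - 1)  # depends on n through v = n - 1
z_crit <- qnorm(1 - alpha / 2)           # the same for every n

print(data.frame(n = n, t_crit = t_crit, z_crit = z_crit))
```

The t critical values are larger than the z value for every n and decrease towards it as n grows.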
One-Sided T-Test
Description
-assume X1,..,Xn~N(μ,σ²) with unknown mean μ and unknown variance σ²
-we want to test Ho:μ≤μo against H1:μ>μo
-we compute the same test statistic as for the two-sided test but without the modulus:
t = 1/√n * Σ (xi-μo)/σx^
-but now we reject Ho if t>tn-1(α), the (1-α) quantile of t(n-1), i.e. we use α in place of α/2
One-Sided T-Test
Summary
data: x1,…,xn∈R
model: X1,…,Xn~N(μ,σ²) i.i.d
test: Ho:μ≤μo vs H1:μ>μo
test statistic: t = 1/√n * Σ (xi-μo)/σx^ = √n * (x_-μo)/σx^
critical value: tn-1(α), the (1-α) quantile of t(n-1)
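A sketch of the one-sided test in R, again with hypothetical data, μo = 5 and α = 0.05. The built-in t.test with alternative = "greater" serves as a cross-check.

```r
x <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.5, 4.7)  # hypothetical observations
mu0 <- 5                                        # hypothesised bound (assumed)
alpha <- 0.05                                   # significance level (assumed)
n <- length(x)

# Same statistic as in the two-sided test, but without the modulus
t_stat <- sqrt(n) * (mean(x) - mu0) / sd(x)

# Critical value: the (1 - alpha)-quantile of t(n-1)
crit <- qt(1 - alpha, df = n - 1)

reject <- t_stat > crit
print(c(t = t_stat, critical = crit, reject = reject))

# Cross-check against the built-in test of Ho: mu <= mu0 vs H1: mu > mu0
builtin <- t.test(x, mu = mu0, alternative = "greater")
```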
Large Sample Size T-Test
Description
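-assume that Xi are not normally distributed but are still independent with Var(Xi)=σ²
-for large n, the central limit theorem makes the statistic t approximately N(0,1) under Ho, so quantiles of the standard normal distribution can be used as critical values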
Large Sample T-Test
Summary
data: x1,…,xn∈R, n large
model: X1,…,Xn i.i.d with mean μ and variance σ²
Two Sample T-Test
Description
-can be used to compare the means of two independent populations with unknown variances (assumed equal when a joint variance estimate is used)
-assume we have observed x1,…,xn and y1,…,ym
-the joint variance can be estimated using:
σp^² = 1/(n+m-2) * (Σ(xi-x_)² + Σ(yi-y_)²)
Two Sample T-Test
Summary
data: x1,…,xn,y1,…,ym∈R
model: X1,…,Xn~N(μx,σx²), Y1,…,Ym~N(μy,σy²) independent, with σx²=σy²
test: Ho:μx=μy vs H1:μx≠μy
test statistic: |t| = |x_-y_|/√(σp^²(1/n + 1/m)), where σp^² is the joint variance estimate
critical value: tn+m-2(α/2), the (1-α/2) quantile of t(n+m-2)
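A minimal sketch of the two-sample test using the joint (pooled) variance estimate. Both data vectors and α = 0.05 are hypothetical, and equal variances are assumed. The built-in t.test with var.equal = TRUE is used as a cross-check.

```r
x <- c(12.1, 11.4, 13.0, 12.7, 11.9, 12.3)   # hypothetical first sample
y <- c(11.2, 10.8, 12.0, 11.5, 10.9)         # hypothetical second sample
alpha <- 0.05                                # significance level (assumed)
n <- length(x)
m <- length(y)

# Joint variance estimate with n + m - 2 degrees of freedom
pooled_var <- (sum((x - mean(x))^2) + sum((y - mean(y))^2)) / (n + m - 2)

# Test statistic and critical value
t_stat <- (mean(x) - mean(y)) / sqrt(pooled_var * (1 / n + 1 / m))
crit <- qt(1 - alpha / 2, df = n + m - 2)

reject <- abs(t_stat) > crit
print(c(t = t_stat, critical = crit, reject = reject))

# Cross-check: var.equal = TRUE selects the pooled-variance test
builtin <- t.test(x, y, var.equal = TRUE)
```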
p-values
- call a general test statistic s and denote the density of s under Ho by φ
- let our observed value of the test statistic be s*
- the p-value is the area under φ in the upper tail from s* to ∞, i.e. the probability under Ho that s≥s*
- we reject Ho whenever s*>q_α, where q_α is the critical value; this is equivalent to the p-value being smaller than α
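For the t-test, the p-value can be computed with pt. The sketch below takes s = |t| as the test statistic, so the upper-tail probability P(|T|≥s*) equals twice the upper tail of t(n-1). The data, μo and α are hypothetical.

```r
x <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.5, 4.7)  # hypothetical observations
mu0 <- 5                                        # hypothesised mean (assumed)
alpha <- 0.05                                   # significance level (assumed)
n <- length(x)

s_obs <- abs(sqrt(n) * (mean(x) - mu0) / sd(x))  # observed value s* = |t|

# P(|T| >= s*) under Ho, with T ~ t(n-1)
p_value <- 2 * pt(s_obs, df = n - 1, lower.tail = FALSE)

# The two decision rules agree
reject_by_p    <- p_value < alpha
reject_by_crit <- s_obs > qt(1 - alpha / 2, df = n - 1)
print(c(p_value = p_value, reject_by_p = reject_by_p,
        reject_by_crit = reject_by_crit))
```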