4.1 Hypothesis Tests - Z-Test Flashcards
single observation, multiple observations, comparing the mean of populations, one-sided test, large sample size, z-test in R
What are statistical tests used for?
-statistical tests answer the question of whether or not the observed data are compatible with a given statistical model
Z-Test
Definition
-let z be an observation of Z~N(μ,1)
-the z-test, with significance level α for the null hypothesis
Ho : μ=0
-with alternative:
H1 : μ≠0
-rejects Ho if and only if:
|z| > q_α/2
-where q_α/2 is the (1-α/2) quantile of N(0,1)
-i.e. the null hypothesis Ho is rejected if the observation z falls in the critical region, which covers both the lower and the upper tail of the N(0,1) distribution
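The rejection rule above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name z_test_rejects and the default α = 0.05 are assumptions made here, and the quantile comes from the standard library's statistics.NormalDist.

```python
from statistics import NormalDist

def z_test_rejects(z, alpha=0.05):
    """Return True if the two-sided z-test rejects Ho: mu = 0."""
    # Critical value: the (1 - alpha/2) quantile of N(0,1).
    q = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > q

print(z_test_rejects(2.5))   # |z| lies beyond the critical value (about 1.96)
print(z_test_rejects(-1.0))  # |z| lies below the critical value
```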
Z-Test
Wrong Rejection Probability Lemma
- assume that Ho is true, i.e. that the observed data is a random sample from N(0,1)
- then the z-test with significance level α wrongly rejects Ho with a probability of α
Z-Test
Wrong Rejection Probability Proof
-assume Z~N(0,1)
-then:
P(Ho is rejected) = P(|Z|>q_α/2)
= P(Z < -q_α/2 OR Z > q_α/2)
= P(Z < -q_α/2) + P(Z > q_α/2)
-since N(0,1) is symmetrical, these two probabilities are equal so we can just write:
P(Ho is rejected) = 2P(Z>q_α/2)
= 2(1 - P(Z≤q_α/2))
= 2(1 - (1-α/2))
= α
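As a numerical sanity check of this proof, the sketch below evaluates 2(1 - P(Z≤q_α/2)) exactly and also estimates the rejection rate by simulation. The function names, the number of draws and the seed are arbitrary choices made for this illustration.

```python
import random
from statistics import NormalDist

def wrong_rejection_probability(alpha):
    """Exact value of 2 * (1 - P(Z <= q)) for Z ~ N(0,1)."""
    std_normal = NormalDist()
    q = std_normal.inv_cdf(1 - alpha / 2)
    return 2 * (1 - std_normal.cdf(q))

def simulated_rejection_rate(alpha, draws=100_000, seed=1):
    """Fraction of simulated N(0,1) observations that the z-test rejects."""
    rng = random.Random(seed)
    q = NormalDist().inv_cdf(1 - alpha / 2)
    rejected = sum(abs(rng.gauss(0, 1)) > q for _ in range(draws))
    return rejected / draws

alpha = 0.05
print(wrong_rejection_probability(alpha))  # equals alpha up to rounding error
print(simulated_rejection_rate(alpha))     # close to alpha, up to sampling noise
```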
Z-Test
Test Statistic, Critical Value and Critical Region
- for the z-test the modulus of the observation |z| is called the test statistic
- the critical value is q_α/2
- the interval (q_α/2,∞) is called the critical region
- using these terms, Ho is rejected if the test statistic exceeds the critical value or equivalently, if the test statistic falls into the critical region
What do we need to know to apply the z-test?
-the numerical values for the (1-α/2) quantiles of the standard normal distribution
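These quantiles can be looked up in tables or computed. A minimal sketch using Python's standard library follows; the helper name critical_value and the list of α values are chosen here purely for illustration.

```python
from statistics import NormalDist

def critical_value(alpha):
    """The (1 - alpha/2) quantile of the standard normal distribution."""
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}: q = {critical_value(alpha):.3f}")
# alpha = 0.10: q = 1.645
# alpha = 0.05: q = 1.960
# alpha = 0.01: q = 2.576
```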
Z-Test
Type I Errors
- if Ho is true, the test might wrongly reject Ho
- this outcome is a type I error
- type I errors occur with a probability of α
Z-Test
Type II Errors
- an error occurs if H1 is true but the test does not reject Ho (despite Ho being false)
- this outcome is a type II error
- the probability of this error depends on the value of μ
- if μ≈0, these errors occur with a probability of approximately 1-α
- as μ gets further away from 0, the probability of hitting the interval [-q_α/2,q_α/2] and thus the probability of a type II error decreases to 0
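The dependence on μ can be made concrete. For Z~N(μ,1), a type II error occurs when Z lands in [-q_α/2,q_α/2], so its probability is P(Z≤q_α/2) - P(Z≤-q_α/2). The sketch below evaluates this for a few values of μ; the function name and the chosen values are assumptions for illustration only.

```python
from statistics import NormalDist

def type_ii_probability(mu, alpha=0.05):
    """P(-q <= Z <= q) for Z ~ N(mu, 1), i.e. the chance of not rejecting Ho."""
    q = NormalDist().inv_cdf(1 - alpha / 2)
    z_dist = NormalDist(mu, 1)
    return z_dist.cdf(q) - z_dist.cdf(-q)

for mu in (0.0, 0.5, 1.0, 2.0, 3.0, 5.0):
    print(f"mu = {mu:.1f}: P(type II) = {type_ii_probability(mu):.4f}")
```

At μ=0 the value equals 1-α exactly (this is the limiting case, since H1 requires μ≠0), and it shrinks towards 0 as |μ| grows.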
Z-Test
Errors and Choosing α
- choosing small values of α reduces the probability of type I errors
- but this also reduces the size of the critical region and increases the chance of type II errors
- the two types of error probabilities must be balanced when choosing α
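To see the trade-off in numbers, the sketch below fixes a true mean (μ=2 is an arbitrary choice for illustration) and reports both error probabilities for several values of α. The helper name error_probabilities is hypothetical.

```python
from statistics import NormalDist

def error_probabilities(alpha, mu):
    """Return (type I, type II) error probabilities of the z-test.

    The type II value assumes the true mean is `mu`, with mu != 0.
    """
    q = NormalDist().inv_cdf(1 - alpha / 2)
    z_dist = NormalDist(mu, 1)
    type_ii = z_dist.cdf(q) - z_dist.cdf(-q)
    return alpha, type_ii

for alpha in (0.10, 0.05, 0.01):
    type_i, type_ii = error_probabilities(alpha, mu=2.0)
    print(f"alpha = {alpha:.2f}: type I = {type_i:.2f}, type II = {type_ii:.3f}")
```

In this setting, lowering α makes the type II probability larger, which is the balance described above.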
Z-Test
Multiple Observations Description
- data sets being tested in practice will consist of more than one observation
- in this case we can apply the z-test by considering the sample average
- assume that we have observed a sample x1,…,xn which can be described by the statistical model X1,…,Xn~N(μ,σ²) i.i.d. with known variance σ²
- we want to test the hypothesis Ho : μ=μo against the alternative H1:μ≠μo
Z-Test
Multiple Observations Lemma
-let μ, μo ∈ R and X1,…,Xn~N(μ,σ²) be i.i.d
-define:
Z = (1/√n) Σ (Xi-μo)/σ
-then:
Z~N(√n(μ-μo)/σ , 1)
Z-Test
Multiple Observations Lemma Proof
-we have X1,…,Xn~N(μ,σ²)
-and we know then that:
Σ(Xi-μo)~N(n(μ-μo),nσ²)
-dividing by σ√n divides the mean by σ√n and the variance by nσ², which gives the result
Z-Test
Multiple Observations Null Hypothesis
- using the multiple observations lemma, we see that the hypothesis Ho:μ=μo is equivalent to the hypothesis that Z has mean 0
- we also know that Z has variance 1 and thus we can apply the z-test on Z to decide whether to reject Ho or not
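A quick simulation can illustrate that Z has mean 0 and variance 1 when Ho is true. In the sketch below, the parameter values, the number of repetitions and the seed are all arbitrary choices, and simulate_z is a hypothetical helper written for this check.

```python
import random
from math import sqrt
from statistics import fmean, pvariance

def simulate_z(mu, mu0, sigma, n, repetitions, seed=0):
    """Draw `repetitions` samples of size n and return the Z value of each."""
    rng = random.Random(seed)
    values = []
    for _ in range(repetitions):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        values.append(sum((x - mu0) / sigma for x in sample) / sqrt(n))
    return values

# Ho is true here because mu equals mu0.
z_values = simulate_z(mu=3.0, mu0=3.0, sigma=2.0, n=10, repetitions=20_000)
print(f"mean of Z:     {fmean(z_values):.3f}")      # should be close to 0
print(f"variance of Z: {pvariance(z_values):.3f}")  # should be close to 1
```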
Z-Test
Multiple Observations Applying the Test to Z
-we reject Ho at significance level α if and only if:
|Z| = |(1/√n) Σ (Xi-μo)/σ| > q_α/2
-where q_α/2 is the (1-α/2) quantile of the standard normal distribution
-for this test we need to know the value of the population variance σ² in order to compute the test statistic |Z|
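Put together, the test on a sample might look like the following sketch. The data, μo and σ are made-up values for illustration, and the function name z_test is an assumption of this example rather than an established API.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided z-test of Ho: mu = mu0 with known standard deviation sigma.

    Returns the test statistic |Z| and whether Ho is rejected.
    """
    n = len(sample)
    z = sum((x - mu0) / sigma for x in sample) / sqrt(n)
    q = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z), abs(z) > q

# Hypothetical measurements; mu0 and sigma are invented for this example.
sample = [5.1, 4.9, 5.6, 5.8, 5.3, 5.2, 5.7, 5.4]
statistic, rejected = z_test(sample, mu0=5.0, sigma=0.5)
print(f"|Z| = {statistic:.3f}, reject Ho: {rejected}")
```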
Z-Test
Multiple Observations Manually Computing the Test Statistic
-an alternative representation can make the test statistic easier to compute:
Z = (1/√n) Σ (Xi-μo)/σ
= √n * (x̄-μo)/σ
-where x̄ is the sample average of the xi
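The two representations can be checked against each other numerically. The sketch below computes Z once from the sum and once from the sample average; the sample values, μo and σ are hypothetical, and both function names are introduced here for illustration.

```python
from math import isclose, sqrt
from statistics import fmean

def z_from_sum(sample, mu0, sigma):
    """Z as the scaled sum of standardised deviations from mu0."""
    return sum((x - mu0) / sigma for x in sample) / sqrt(len(sample))

def z_from_mean(sample, mu0, sigma):
    """Z computed from the sample average."""
    return sqrt(len(sample)) * (fmean(sample) - mu0) / sigma

# Hypothetical data, chosen only to compare the two formulas.
sample = [5.1, 4.9, 5.6, 5.8, 5.3, 5.2, 5.7, 5.4]
z1 = z_from_sum(sample, mu0=5.0, sigma=0.5)
z2 = z_from_mean(sample, mu0=5.0, sigma=0.5)
print(z1, z2, isclose(z1, z2, rel_tol=1e-9))
```

The second form needs only the sample average, which is why it is the more convenient one when computing by hand.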