4.3 Hypothesis Tests - χ²-Test Flashcards
normal approximation to the binomial distribution, χ² test for model fit, χ² test for independence
Motivation Behind the χ²-Test
-consider a test for attribute data
-assume we have n independent observations of a variate which can take K different values
-to build a model for these observations, we can use i.i.d. random variables X1,…,Xn∈{1,…,K} with:
P(Xi=k) = pk
-for all i∈{1,…,n} and k∈{1,…,K}, where the probabilities pk satisfy Σpk=1, the sum taken from k=1 to k=K
-since the observations are independent, the order does not matter and we only need to consider how often each class occurs, let:
Yk = |{i|Xi=k}| = Σ 1_{k}(Xi)
-where the sum is from i=1 to i=n, for all k∈{1,…,K}
-if the model is correct, then Yk~B(n,pk) for all k∈{1,…,K}, but since ΣYk=n (sum over k) the counts Y1,…,YK are not independent
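For instance, the class counts yk can be computed directly from a sample (the data below is a made-up example with K=3):

```python
from collections import Counter

# Hypothetical sample of n = 10 observations taking values in {1, 2, 3} (K = 3).
x = [1, 3, 2, 1, 1, 3, 2, 2, 1, 3]
K = 3

counts = Counter(x)
y = [counts.get(k, 0) for k in range(1, K + 1)]  # y[k-1] = |{i : x_i = k}|

assert sum(y) == len(x)  # the counts always sum to n
print(y)  # [4, 3, 3]
```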
Multinomial Distribution
Definition
-the joint distribution of (Y1,…,YK) is called a multinomial distribution with parameters n and p1,…,pK
Normal Distribution as an Approximation to the Binomial
Overview
-for large n, we can approximate the distribution of Yk using a normal distribution
Normal Distribution as an Approximation to the Binomial
Proof
-use the fact that Yk is a sum of n i.i.d. random variables 1_{k}(Xi), so we can apply the central limit theorem
-the central limit theorem states that for any i.i.d. sequence Zi, i∈ℕ, of random variables with mean μ=E(Zi) and variance σ²=Var(Zi), the sum Z1+…+Zn is approximately N(nμ,nσ²) for large n
-here Zi=1_{k}(Xi) has mean pk and variance pk(1-pk), so:
Yk~N(npk,npk(1-pk))
-approximately for large n
Normal Distribution as an Approximation to the Binomial
Summary
-for large n, we can approximate a B(n,p) distribution by a N(np,np(1-p)) distribution
Normal Distribution as an Approximation to the Binomial
Rule of Thumb
-the normal approximation for B(n,p) can be used if np≥5 and n(1-p)≥5
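A quick numerical check of this approximation (the choices n=50, p=0.3, which satisfy the rule of thumb since np=15 and n(1-p)=35, are hypothetical), comparing the exact binomial CDF with its continuity-corrected normal approximation:

```python
import math

def binom_cdf(y, n, p):
    """Exact P(Y <= y) for Y ~ B(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(y + 1))

def norm_cdf(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

n, p = 50, 0.3                               # np = 15 >= 5 and n(1-p) = 35 >= 5
mu, sigma = n * p, math.sqrt(n * p * (1 - p))
y = 18

exact = binom_cdf(y, n, p)
approx = norm_cdf((y + 0.5 - mu) / sigma)    # +0.5 is the continuity correction
print(exact, approx)                         # the two values agree closely
```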
χ²-Test for Model Fit
Comparing Ho to the Observed Data
-assume that we have observed attribute data x1,…,xn∈{1,…,K} and we want to test the hypothesis Ho:P(Xi=k)=pk for all k∈{1,…,K}
-let:
yk = |{i|xi=k}| = Σ 1_{k}(xi)
-where the sum is from i=1 to i=n; yk is the sample count for class k∈{1,…,K}
-if Ho is true, we expect yk≈npk for all k
-so we can use:
c = Σ (yk-npk)²/(npk)
-where the sum is over k=1 to K
-as a measure of how far the data is from Ho
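As a worked sketch, computing c for hypothetical data (the 9:3:3:1 ratio and the observed counts are purely illustrative):

```python
# Hypothetical model fit: n = 160 observations in K = 4 classes,
# Ho: (p1, p2, p3, p4) = (9/16, 3/16, 3/16, 1/16).
y = [95, 30, 25, 10]                     # observed counts y_k
p0 = [9 / 16, 3 / 16, 3 / 16, 1 / 16]    # probabilities under Ho
n = sum(y)                               # n = 160, expected counts n*pk = 90, 30, 30, 10

c = sum((yk - n * pk) ** 2 / (n * pk) for yk, pk in zip(y, p0))
print(round(c, 2))  # 1.11
```

A small value of c means the observed counts are close to their expected values under Ho.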
χ²-Test for Model Fit
Lemma
-assume Ho is true, let:
C = Σ (Yk-npk)²/(npk)
-sum from k=1 to k=K
-then C converges in distribution to χ²(K-1) as n→∞
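The lemma can be illustrated by simulation (all parameter choices below are hypothetical): under Ho, the statistic C should be approximately χ²(K-1)-distributed, and the mean of a χ²(K-1) distribution is K-1.

```python
import random

random.seed(0)
K, n, reps = 4, 100, 1000
p = [0.1, 0.2, 0.3, 0.4]                  # true class probabilities (Ho holds)
cum = [sum(p[:k + 1]) for k in range(K)]  # cumulative probabilities

def draw():
    """Draw one observation X in {0, ..., K-1} with P(X = k) = p[k]."""
    u = random.random()
    return next((k for k in range(K) if u < cum[k]), K - 1)

cs = []
for _ in range(reps):
    y = [0] * K                           # class counts Y_1, ..., Y_K
    for _ in range(n):
        y[draw()] += 1
    cs.append(sum((y[k] - n * p[k]) ** 2 / (n * p[k]) for k in range(K)))

mean_c = sum(cs) / reps
print(mean_c)  # should be close to K - 1 = 3
```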
χ²-Test for Model Fit
Lemma Proof for K=2
-we have Y1+Y2=n and p1+p2=1, so Y2-np2 = -(Y1-np1)
-substituting into the formula for C gives C = (Y1-np1)²(1/(np1)+1/(np2)) = (Y1-np1)²/(np1(1-p1))
-by the normal approximation to the binomial, (Y1-np1)/√(np1(1-p1)) is approximately N(0,1) for large n, so C is approximately the square of a standard normal, i.e. χ²(1)-distributed
χ²-Test for Model Fit
Construct the Test for the Null Hypothesis
-if we write cn(α) for the (1-α)-quantile of the χ²(n)-distribution, then assuming Ho, we have:
P(C > c_{K-1}(α)) ≈ α
-for large n, and thus we reject Ho at level α if the observed test statistic c satisfies c > c_{K-1}(α)
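A minimal decision-rule sketch (the counts and the uniform Ho are hypothetical; 11.07 is the tabulated 0.95-quantile of the χ²(5)-distribution):

```python
# Hypothetical die-fairness test: K = 6 classes, Ho: pk = 1/6, alpha = 0.05.
y = [8, 12, 11, 9, 10, 10]              # observed counts over n = 60 rolls
n, K = sum(y), len(y)
expected = n / K                        # n * pk = 10 under Ho

c = sum((yk - expected) ** 2 / expected for yk in y)
crit = 11.07                            # (1 - alpha)-quantile of chi2(K - 1) = chi2(5)
reject = c > crit
print(round(c, 2), reject)  # 1.0 False
```

Here c is well below the critical value, so the data gives no evidence against a fair die.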
χ²-Test for Model Fit
Summary
data: x1,…,xn∈{1,…,K}
model: X1,…,Xn∈{1,…,K} i.i.d. with P(Xi=k)=pk for all i∈{1,…,n}, k∈{1,…,K}
test: Ho: pk=πk for all k∈{1,…,K} vs H1: pk≠πk for at least one k∈{1,…,K}
test statistic: c = Σ (yk-nπk)²/(nπk), sum from k=1 to K, where yk=|{i|xi=k}|=Σ 1_{k}(xi)
critical value: c_{K-1}(α), the (1-α)-quantile of the χ²(K-1)-distribution
χ²-Test for Model Fit
Rule of Thumb
-the χ²-test can be applied if nπk≥5 for all k∈{1,…,K}
χ²-Test for Independence
Purpose
-tests whether two categorical variates are independent
χ²-Test for Model Fit
Number of Degrees of Freedom
K-1
χ²-Test for Independence
Description
a) arrange the observed counts of the two variates in a contingency table
b) estimate the marginal probability for each row and each column from the corresponding totals
c) use these to compute the expected count for each cell: e_ij = (row i total)(column j total)/n
d) compute the test statistic c = Σ (o_ij-e_ij)²/e_ij over all cells, where o_ij is the observed count in cell (i,j)
e) find the critical value at the chosen significance level, with degrees of freedom = (no. of rows - 1)(no. of columns - 1)
f) if the test statistic exceeds the critical value, reject the null hypothesis of independence; if it is less than the critical value, we can't reject it
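The steps above can be sketched for a hypothetical 2×3 table (the counts are made up; 5.991 is the tabulated 0.95-quantile of the χ²(2)-distribution):

```python
# Hypothetical contingency table: rows = variate A (2 levels),
# columns = variate B (3 levels).
table = [[20, 30, 10],
         [30, 20, 40]]
n = sum(sum(row) for row in table)            # total sample size, n = 150
row_tot = [sum(row) for row in table]         # step b): row totals
col_tot = [sum(col) for col in zip(*table)]   # step b): column totals

# steps c) + d): expected counts under independence and the test statistic
c = 0.0
for i, row in enumerate(table):
    for j, obs in enumerate(row):
        exp = row_tot[i] * col_tot[j] / n     # e_ij = (row i total)(col j total)/n
        c += (obs - exp) ** 2 / exp

# step e): degrees of freedom = (rows - 1)(columns - 1) = 2
df = (len(table) - 1) * (len(table[0]) - 1)
crit = 5.991                                  # 0.95-quantile of chi2(2), from tables
print(round(c, 3), c > crit)  # 16.667 True
```

Since c exceeds the critical value, this (made-up) data would lead us to reject independence at the 5% level.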