Probability Distributions: Chi Square Distribution Flashcards
Chi Square (χ2) Distribution
- Best method to test a population variance against a known or assumed value of the population variance.
- Continuous distribution whose shape is determined by a single parameter, the degrees of freedom
- Describes the distribution of a sum of squared independent standard normal random variables
- Also used to test the goodness of fit of a distribution to data, to test whether series are independent, and to estimate confidence intervals for the variance and standard deviation of a random variable from a normal distribution.
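The "sum of squared random variables" description can be checked with a small simulation. This is a minimal sketch, not part of the original cards: the helper name `simulate_chi_square`, the choice of 5 degrees of freedom, the number of draws, and the seed are all illustrative assumptions.

```python
import random
import statistics

def simulate_chi_square(df, n_draws, seed=0):
    """Draw n_draws values, each the sum of df squared standard normal variates."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df)) for _ in range(n_draws)]

# Hypothetical settings chosen only for illustration.
draws = simulate_chi_square(df=5, n_draws=20000)
sample_mean = statistics.fmean(draws)
sample_var = statistics.variance(draws)

print(f"sample mean     ~ {sample_mean:.2f} (theory: df = 5)")
print(f"sample variance ~ {sample_var:.2f} (theory: 2 * df = 10)")
print(f"smallest draw   = {min(draws):.4f} (a sum of squares is never negative)")
```

The simulated mean and variance should land close to the degrees of freedom and twice the degrees of freedom, respectively, up to sampling noise.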
Chi Square Statistics
- The chi square distribution is skewed to the right, with a long tail toward the large values of the distribution.
- The overall shape of the distribution will depend on the number of degrees of freedom in a given problem.
- For a test on a single sample variance, the degrees of freedom are 1 less than the sample size.
Chi Square Properties
- The mean of the distribution is equal to the number of degrees of freedom: μ=ϑ
- The variance is equal to two times the number of degrees of freedom: σ2 = 2*ϑ
- When the degrees of freedom are greater than or equal to 2, the probability density reaches its maximum value at χ2=ϑ-2
- As the degrees of freedom increase, the chi square curve approaches a normal distribution and the graph becomes more symmetric
- Because the random variable on which it is based is squared, the distribution has no negative values; it is skewed to the right, most strongly when the degrees of freedom are small
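The properties above can be illustrated numerically. The sketch below is an assumption-laden illustration rather than anything from the original cards: `chi_square_pdf` writes out the standard chi square density, `grid_mode` is a hypothetical helper that finds the peak by brute-force grid search, and the skewness formula sqrt(8 / df) is the standard closed form, used here to show the curve becoming more symmetric.

```python
import math

def chi_square_pdf(x, df):
    """Density of the chi square distribution with df degrees of freedom, for x > 0."""
    half = df / 2.0
    return x ** (half - 1.0) * math.exp(-x / 2.0) / (2.0 ** half * math.gamma(half))

def grid_mode(df, upper=60.0, step=0.01):
    """Locate the peak of the density with a simple grid search (illustrative only)."""
    grid = [step * i for i in range(1, int(upper / step) + 1)]
    return max(grid, key=lambda x: chi_square_pdf(x, df))

for df in (3, 6, 12):
    skewness = math.sqrt(8.0 / df)  # shrinks toward 0 as df grows
    print(
        f"df={df}: mean={df}, variance={2 * df}, "
        f"peak near {grid_mode(df):.2f} (df - 2 = {df - 2}), skewness={skewness:.2f}"
    )
```

The printed peak locations should sit at the degrees of freedom minus 2, and the skewness should fall as the degrees of freedom rise.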
Chi Square (χ2) Hypothesis Test
- Usually the objective of the Six Sigma team is to find the variation of the output, not just the population mean.
- Most importantly, the team would like to know how much variation the production process exhibits about the target to see what adjustments are needed to reach a defect-free process.
- For a comparison between sample variances, or a comparison between frequency proportions, the standard test statistic, called the chi square (χ2) statistic, is used.
- The distribution of the chi square statistic is called the chi square distribution
Types of Chi Square Hypothesis Tests
- Chi-Square Test of Independence
- Chi Square Test of Variance
Chi Square Test of Independence
- Chi Square Test of Independence determines whether there is an association between two categorical variables (such as gender and course selection)
- For example:
- Chi Square Test of Independence examines the association between one categorical variable, such as gender (male or female), and another, such as absenteeism in school grouped into percentage bands
- Chi Square Test of Independence is a non-parametric test
- In other words, the assumption of normality is not required to perform the test
The chi square test uses a contingency table to analyze the data. Each row shows the categories of one variable. Each column shows the categories of another variable. Each variable must have two or more categories. Each cell holds the total number of cases for a specific pair of categories
Assumptions of Chi-Square Test of Independence
- Variables must be nominal or categorical
- The categories of each variable are mutually exclusive
- The sample is drawn by simple random sampling
- The data in the contingency table are frequencies or counts
Contingency Tables
- Two-way classification table containing the frequencies with which each combination of categories occurs; it can be used to determine whether 2 variables are independent or significantly associated.
- Since the actual measured values may not agree with the theoretical values predicted, you can use the chi square calculation to make the determination
- Additionally, a correlation coefficient can be calculated.
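The "theoretical values predicted" for a contingency table are the counts expected if the two variables were independent. A minimal sketch of that calculation follows; the function name `expected_counts` and the 2 x 3 table of counts are made up for illustration.

```python
def expected_counts(observed):
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

# Hypothetical 2 x 3 table of counts (rows: one variable, columns: the other).
observed = [
    [10, 20, 30],
    [20, 20, 20],
]
expected = expected_counts(observed)

for obs_row, exp_row in zip(observed, expected):
    print("observed:", obs_row, "expected:", exp_row)
```

The gap between each observed count and its expected count is what the chi square calculation measures.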
Steps to perform Chi Square Test of Independence
- Step 1: Define the null hypothesis and alternative hypothesis
- Null hypothesis (H0): There is no association between the two categorical variables
- Alternative hypothesis (H1): There is a significant association between the two categorical variables
- Step 2: Specify the level of significance
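- Step 3: Compute the χ2 statistic: χ2 = Σ (O - E)² / E, summed over all cells of the contingency table, where O is the observed count and E = (row total * column total) / grand total is the expected count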
- Step 4: Calculate the degrees of freedom = (number of rows - 1) * (number of columns - 1) = (r-1) * (c-1)
- Step 5: Find the critical value based on the degrees of freedom and the level of significance
- Step 6: Finally, draw the statistical conclusion: If the test statistic value is greater than the critical value, reject the null hypothesis, and hence we can conclude that there is a significant association between two categorical variables.
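The steps above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the worked example from the original cards: the 2 x 2 counts are invented, the function name `chi_square_independence` is hypothetical, no continuity correction is applied, and the critical values are a short excerpt of a standard chi square table at the 0.05 level.

```python
# Step 2: level of significance 0.05. Upper-tail critical values copied from a
# standard chi square table (only a few degrees of freedom are listed here).
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

def chi_square_independence(observed):
    """Return (chi square statistic, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / grand_total  # expected count
            statistic += (obs - exp) ** 2 / exp                # Step 3
    df = (len(observed) - 1) * (len(observed[0]) - 1)          # Step 4
    return statistic, df

# Hypothetical counts: rows are two groups, columns are two course choices.
observed = [
    [30, 20],
    [20, 30],
]
statistic, df = chi_square_independence(observed)
critical = CRITICAL_05[df]         # Step 5
reject_h0 = statistic > critical   # Step 6

print(f"chi square = {statistic:.3f}, df = {df}, critical value = {critical}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

With these made-up counts every expected cell is 25, so the statistic is 4 * (5² / 25) = 4.0, which exceeds 3.841 and leads to rejecting the null hypothesis at the 0.05 level.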
Chi Square Test of Independence Example
Part 1
Chi Square Test - Comparing Variances
Part 1
- The chi square test is the best option for two applications:
- Case I: Comparing variances when the variance of the population is known
- Case II: Comparing observed and expected frequencies of test outcomes when there is no defined population variance
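Both cases can be sketched briefly. The code below is a hedged illustration, not the example from the original cards: the sample values, the assumed population variance of 0.04, the observed and expected counts, and the function names are all hypothetical. Case I uses the standard statistic (n - 1) * s² / σ0² with n - 1 degrees of freedom and assumes normally distributed data and an upper-tailed test; Case II uses Σ (O - E)² / E with (number of categories - 1) degrees of freedom, assuming no parameters were estimated from the data.

```python
import statistics

# Upper-tail critical values at the 0.05 level from a standard chi square table.
CRITICAL_05 = {2: 5.991, 9: 16.919}

def variance_statistic(sample, sigma0_sq):
    """Case I: chi square = (n - 1) * s^2 / sigma0^2, with df = n - 1."""
    n = len(sample)
    return (n - 1) * statistics.variance(sample) / sigma0_sq, n - 1

def goodness_of_fit_statistic(observed, expected):
    """Case II: chi square = sum((O - E)^2 / E), with df = categories - 1."""
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1

# Case I: hypothetical measurements tested against an assumed variance of 0.04.
sample = [10.2, 9.8, 10.5, 9.9, 10.1, 10.4, 9.6, 10.0, 10.3, 9.7]
stat_var, df_var = variance_statistic(sample, sigma0_sq=0.04)
print(f"Case I:  chi square = {stat_var:.3f}, df = {df_var}, "
      f"reject H0: {stat_var > CRITICAL_05[df_var]}")

# Case II: hypothetical observed counts against equal expected counts.
stat_fit, df_fit = goodness_of_fit_statistic([18, 22, 20], [20, 20, 20])
print(f"Case II: chi square = {stat_fit:.3f}, df = {df_fit}, "
      f"reject H0: {stat_fit > CRITICAL_05[df_fit]}")
```

In this sketch the Case I statistic exceeds its critical value, suggesting the process variance is larger than the assumed 0.04, while the Case II statistic falls well below its critical value.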