Stats Topic 5 Flashcards
Single Categorical Variable
- Frequency data represents the count of observations in each category.
- The distribution is often visualized using bar charts or frequency tables.
Joint Distribution for Two Categorical Variables
- Examines how data is distributed across combinations of two categorical variables.
- Represented using contingency tables or stacked bar charts.
Chi-Square Tests
Chi-square tests are used to determine whether observed categorical data differs from what is expected.
Chi-Square Goodness-of-Fit Test:
- Tests whether the observed distribution of a single categorical variable matches an expected distribution.
- Example: Checking if the color distribution of M&Mโs matches the expected proportions provided by the company.
- Formula:
๐: is the observed frequency and
๐ธ: is the expected frequency.
Chi-Square Test of Independence
Determines whether there is an association between two categorical variables.
Example: Testing if a personโs preference for a drink is related to their gender.
Degrees of Freedom (df) calculation
where
๐: is the number of rows and
๐: is the number of columns in the contingency table.
Statistical Significance
A p-value less than 0.05 leads to rejecting the null hypothesis, indicating a significant association or deviation from the expected distribution.
Reporting the Results
When reporting the chi-square test results, include:
1) The type of test used.
2) The null and alternative hypotheses.
3) The observed and expected frequencies.
4) The chi-square statistic and degrees of freedom.
5) The p-value and conclusion (whether to reject or fail to reject the null hypothesis).