Weeks 9, 10, 11, 12 Flashcards
Expected contingency table
The contingency table you expect from a population where the null hypothesis is true.
Hypothesis testing for categorical data is based on __
contingency tables
Difference between 1 way and 2 way expected contingency tables
A 1-way contingency table tests whether the counts differ between the levels of a single variable
-Expectation: Counts are distributed equally among cells; take the total count and divide by the number of levels
A 2-way contingency table tests whether the counts of two variables are independent of each other
-Expectation: Counts are distributed independently among cells
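The 1-way expectation can be sketched in a few lines of Python. This is only an illustration: the function name `expected_one_way` and the counts are made up for the example.

```python
def expected_one_way(observed):
    """Expected counts for a 1-way table when counts are spread equally."""
    total = sum(observed)      # total count in the table
    k = len(observed)          # number of levels (cells)
    return [total / k] * k     # same expected count in every cell


# Made-up observed counts for a variable with three levels.
observed = [18, 22, 20]
expected = expected_one_way(observed)
print(expected)
```

With a total of 60 spread over 3 levels, every cell gets an expected count of 20.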
How to calculate independence for 2 way tables
Calculating independence requires first calculating the marginal distributions (row totals and column totals) as proportions of the table total
The expected table is the product of the row and column proportions for each cell, multiplied by the table total
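A minimal sketch of that calculation, assuming the observed table is stored as a list of rows. The name `expected_two_way` and the counts are hypothetical.

```python
def expected_two_way(observed):
    """Expected counts for a 2-way table under independence."""
    total = sum(sum(row) for row in observed)
    # Marginal distributions expressed as proportions of the table total.
    row_props = [sum(row) / total for row in observed]
    col_props = [sum(col) / total for col in zip(*observed)]
    # Each cell: row proportion * column proportion * table total.
    return [[rp * cp * total for cp in col_props] for rp in row_props]


# Made-up 2 x 2 table of observed counts.
observed = [[10, 20],
            [30, 40]]
expected = expected_two_way(observed)
for row in expected:
    print([round(cell, 2) for cell in row])
```

Here the row proportions are 0.3 and 0.7 and the column proportions are 0.4 and 0.6, so the first cell is 0.3 x 0.4 x 100 = 12.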
What does the chi-square score measure
Measures the distance between observed and expected contingency tables. It works by calculating the squared difference between the two tables on a cell-by-cell basis
The four steps in calculating the chi-square score
Take the difference between each observed and expected cell
Square the difference
Divide by the expected value
Sum over all cells in the table
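The four steps above can be sketched as follows, assuming the observed and expected cells are given as flat lists in the same order. The function name and the counts are illustrative only.

```python
def chi_square_score(observed, expected):
    """Chi-square score from matching lists of observed and expected cells."""
    score = 0.0
    for o, e in zip(observed, expected):
        diff = o - e            # step 1: observed minus expected
        squared = diff ** 2     # step 2: square the difference
        scaled = squared / e    # step 3: divide by the expected value
        score += scaled         # step 4: sum over all cells
    return score


# Made-up 1-way example: 60 counts over 3 levels, so 20 expected per cell.
observed = [18, 22, 20]
expected = [20, 20, 20]
score = chi_square_score(observed, expected)
print(round(score, 2))
```

For a 2-way table, the same function works once both tables are flattened cell by cell.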
Chi Square Distribution
The null distribution for hypothesis testing with categorical data
It is the distribution of chi-square scores you would get from sampling an imaginary statistical population where the null hypothesis was true
Only non-negative values (a chi-square score cannot be below zero)
Degrees of freedom for 1 way tables vs 2 way tables
1 way: df=k-1
Number of cells minus 1
2 way: df=(r-1)(c-1)
(number of rows minus 1) multiplied by (number of columns minus 1)
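As a quick check on the two formulas, here is a small sketch. The function names are made up for the example.

```python
def df_one_way(k):
    """Degrees of freedom for a 1-way table with k cells."""
    return k - 1


def df_two_way(r, c):
    """Degrees of freedom for a 2-way table with r rows and c columns."""
    return (r - 1) * (c - 1)


print(df_one_way(3))      # 3 cells
print(df_two_way(2, 3))   # 2 rows, 3 columns
```

Both calls give 2 degrees of freedom: 3 - 1 for the 1-way table and (2 - 1)(3 - 1) for the 2-way table.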
Statistical test conclusions for chi square scores
-Reject the null hypothesis if X^2 observed > X^2 critical, or if p < alpha
-Fail to reject the null hypothesis if X^2 observed < X^2 critical, or if p > alpha
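The two equivalent decision rules can be sketched like this. The function names are hypothetical, and the critical value of about 3.841 (df = 1, alpha = 0.05) is taken from a standard chi-square table as an example input.

```python
def decide_by_critical(chi2_observed, chi2_critical):
    """Decision based on comparing the observed score to the critical score."""
    if chi2_observed > chi2_critical:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


def decide_by_p(p_value, alpha=0.05):
    """Decision based on comparing the p-value to alpha."""
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


# Example inputs only: critical value for df = 1 at alpha = 0.05 is about 3.841.
print(decide_by_critical(4.26, 3.841))
print(decide_by_p(0.373))
```

For the same test, both rules lead to the same conclusion, so only one of them needs to be applied.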
Reports for chi square tests should include:
Name of Test
Degrees of freedom
Total count in the observed table
The observed chi-squared value (two decimal places)
P-value (three decimal places)
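One way to assemble those pieces into a single report line is sketched below. The layout of the string, the function name, and the numbers are all assumptions for illustration; the course may expect a different layout.

```python
def format_chi_square_report(test_name, df, n, chi2, p):
    """Build a one-line report with the rounding listed above."""
    # Chi-square value to two decimal places, p-value to three.
    return f"{test_name}: X^2({df}, N = {n}) = {chi2:.2f}, p = {p:.3f}"


# Illustrative values only, not results from a real data set.
report = format_chi_square_report(
    "Chi-square test of independence", df=1, n=100, chi2=4.256, p=0.0391
)
print(report)
```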
Write out the formula for the t-observed for correlation
Refer to formula sheet
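For practice, a commonly used form of this statistic is t = r * sqrt((n - 2) / (1 - r^2)) with df = n - 2. The sketch below assumes that form; the formula sheet may write it in an equivalent way, and the function name and inputs are made up.

```python
import math


def t_observed(r, n):
    """t statistic for a sample correlation r from n paired observations.

    Assumes the common form t = r * sqrt((n - 2) / (1 - r^2)), df = n - 2.
    """
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Made-up example: r = 0.6 from n = 18 pairs, so df = 16.
t = t_observed(0.6, 18)
print(round(t, 2), "with df =", 18 - 2)
```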
Reporting of correlation test should include
Symbol for the test
Degrees of freedom
Observed correlation value
P-value (three decimal places)
Correlation
Evaluate the association between two numerical variables (looking for a pattern)
No implied causation between the variables
Both variables are assumed to have variation
Not used for prediction
Pearson’s correlation coefficient (r for a sample, ρ for the population) measures the strength of the association
Write out the formula for r
refer to formula sheet
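As a study aid, the standard definition of Pearson's r (the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations) can be sketched as below. This assumes the formula sheet uses the standard definition or an equivalent; the function name and data are made up.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up paired measurements.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 3))
```

Here the sum of cross-products is 6 and the sums of squares are 10 and 6, so r = 6 / sqrt(60), roughly 0.775.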
Difference between correlation and linear regression
Correlation only describes an association and cannot be used to predict; linear regression can be used for prediction (experimental studies)