15- Association and Correlation I Flashcards
What is the chi-squared test of independence?
A non-parametric test whose test statistic follows a chi-squared (χ²) distribution under the null hypothesis; the statistic is used to find a p-value
Purpose of the chi-squared test of independence?
Used to determine whether there is a significant association between two categorical variables
What is chi-squared distribution?
Begins at zero and is positively skewed (no negative values). The mean equals the degrees of freedom and the variance equals twice the degrees of freedom; the shape of the distribution depends on the number of degrees of freedom
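These properties can be checked empirically. The sketch below is an illustration only: it assumes the standard construction of a chi-squared variable as the sum of `df` squared standard-normal values, and the seed, sample size, and function name `chi_squared_draw` are arbitrary choices, not part of the flashcards.

```python
import random
from statistics import fmean, median, pvariance

def chi_squared_draw(df, rng):
    """One draw: the sum of `df` squared standard-normal values."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))

rng = random.Random(0)  # fixed seed so the run is repeatable
df = 4
draws = [chi_squared_draw(df, rng) for _ in range(20000)]

sample_mean = fmean(draws)
sample_var = pvariance(draws)
sample_median = median(draws)

print(min(draws) >= 0)               # True: squared values cannot be negative
print(sample_mean > sample_median)   # mean above median suggests positive skew
print(round(sample_mean, 1), round(sample_var, 1))  # roughly df and 2 * df
```

With `df = 4` the sample mean should land near 4 and the sample variance near 8, up to simulation noise.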
Contingency Table
Displays the frequency distribution of two or more categorical variables as a table of cell counts
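As a small illustration, a two-way contingency table can be built by counting how often each pair of categories occurs. The data below are made up for the example, and `contingency_table` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical observations: one (smoking status, exercise level) pair per subject.
observations = [
    ("smoker", "low"), ("smoker", "low"), ("smoker", "high"),
    ("non-smoker", "low"), ("non-smoker", "high"), ("non-smoker", "high"),
    ("non-smoker", "high"), ("smoker", "low"),
]

def contingency_table(pairs):
    """Return (row_labels, col_labels, counts), where counts[i][j] is a cell frequency."""
    cell_counts = Counter(pairs)
    row_labels = sorted({row for row, _ in pairs})
    col_labels = sorted({col for _, col in pairs})
    counts = [[cell_counts[(row, col)] for col in col_labels] for row in row_labels]
    return row_labels, col_labels, counts

rows, cols, table = contingency_table(observations)
print(rows)   # ['non-smoker', 'smoker']
print(cols)   # ['high', 'low']
print(table)  # [[3, 1], [1, 3]]
```

Each subject contributes to exactly one cell, so the cell counts sum to the number of subjects.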
Degrees of Freedom
Related to the amount of information in the table; for a table with r rows and c columns, df = (r - 1)(c - 1). Needed to calculate the p-value
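For the chi-squared test of independence, the degrees of freedom of a table with r rows and c columns are (r - 1)(c - 1). A minimal sketch, with `degrees_of_freedom` as a hypothetical helper name:

```python
def degrees_of_freedom(table):
    """(rows - 1) * (columns - 1) for a rectangular table of counts."""
    n_rows = len(table)
    n_cols = len(table[0])
    return (n_rows - 1) * (n_cols - 1)

print(degrees_of_freedom([[10, 20], [30, 40]]))     # 2 x 2 table -> 1
print(degrees_of_freedom([[5, 6, 7], [8, 9, 10]]))  # 2 x 3 table -> 2
```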
Assumptions which come with chi-squared test of independence
1) Categorical data with observed counts, with two or more levels per variable 2) Each subject is counted in one and only one cell 3) No cell should have an expected frequency below 1, and no more than one cell in five (20%) should have an expected frequency below 5
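The expected-frequency assumption can be checked before running the test. The sketch below assumes the usual expected count under independence (row total times column total, divided by the grand total); the function names and example tables are hypothetical.

```python
def expected_counts(observed):
    """Expected count per cell under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def expected_counts_ok(observed):
    """True if no expected count is below 1 and at most 20% of cells are below 5."""
    cells = [e for row in expected_counts(observed) for e in row]
    below_five = sum(1 for e in cells if e < 5)
    return min(cells) >= 1 and below_five <= 0.2 * len(cells)

large_sample = [[30, 10], [20, 40]]
small_sample = [[3, 1], [1, 3]]

print(expected_counts(large_sample))     # [[20.0, 20.0], [30.0, 30.0]]
print(expected_counts_ok(large_sample))  # True
print(expected_counts_ok(small_sample))  # False: every expected count is 2.0
```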
Properties of the chi-square test of independence?
1) Tells us whether two categorical variables are associated 2) The null hypothesis is no association (the variables are independent) 3) The test statistic is used to find a p-value 4) Shows the significance of an association between two categorical variables but not its strength
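A minimal sketch of the whole test, using only the standard library. It assumes the Pearson statistic (the sum of (observed - expected)² / expected over all cells) with no continuity correction, and it computes the upper-tail probability from closed-form expressions that hold for whole-number degrees of freedom. The function names and example tables are hypothetical.

```python
import math

def chi_squared_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-squared variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over i < df/2 of (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite sum of x^i / (1 * 3 * ... * (2i + 1))
    term, total = 1.0, 0.0
    for i in range((df - 1) // 2):
        if i > 0:
            term *= x / (2 * i + 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2 * x / math.pi) * math.exp(-half) * total

def chi_squared_independence(observed):
    """Return (statistic, df, p_value) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, chi_squared_sf(statistic, df)

statistic, df, p_value = chi_squared_independence([[30, 10], [20, 40]])
print(round(statistic, 2), df)  # 16.67 1
print(p_value < 0.05)           # True: reject the null hypothesis of no association
```

A table whose rows are exactly proportional, such as `[[10, 20], [20, 40]]`, gives a statistic of 0 and a p-value of 1, which matches the null hypothesis of no association.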
What is Cramer’s V?
A measure (not a test) of the strength of association between two categorical variables, ranging from 0 (no association) to 1 (perfect association); typically reported once the chi-squared test has found the variables to be dependent
Result of Cramer’s V?
A number between 0 and 1 that serves as an effect size measure for assessing the practical significance of the chi-squared test of association
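Cramer's V is commonly computed as the square root of χ² / (n × min(r - 1, c - 1)), where n is the total count and r and c are the numbers of rows and columns. The sketch below assumes that formula without any bias correction; `cramers_v` and the example tables are hypothetical.

```python
import math

def cramers_v(observed):
    """Cramer's V from a table of observed counts (no bias correction)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Pearson chi-squared statistic: sum of (observed - expected)^2 / expected
    chi_squared = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi_squared += (count - expected) ** 2 / expected
    smaller_dim = min(len(row_totals) - 1, len(col_totals) - 1)
    return math.sqrt(chi_squared / (n * smaller_dim))

print(round(cramers_v([[30, 10], [20, 40]]), 3))  # 0.408
print(cramers_v([[50, 0], [0, 50]]))              # 1.0: perfect association
print(cramers_v([[10, 20], [20, 40]]))            # 0.0: no association
```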
Bivariate Relationships
Can be positive, negative or neutral
Association strength in Bivariate Relationships
1) Perfect Association- all points fall exactly on a straight line 2) Strong Association- points are NEAR linear 3) Weak Association- points are only somewhat linear 4) No Association- points are randomly distributed
Benefits of Bivariate Relationships
1) Visual methods (scatterplots, bivariate maps) give us subjective, qualitative insight 2) Correlation provides us with a quantitative, objective measure of the nature of the relationship between variables
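As one illustration of a quantitative measure, the sketch below computes Pearson's correlation coefficient r, which runs from -1 (perfect negative linear relationship) through 0 (no linear relationship) to +1 (perfect positive linear relationship). The choice of Pearson's r is an assumption, since the flashcards do not name a specific coefficient, and the data are made up.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(spread_x * spread_y)

xs = [1, 2, 3, 4, 5]
perfect_positive = [2 * x + 1 for x in xs]   # points exactly on a rising line
perfect_negative = [10 - 3 * x for x in xs]  # points exactly on a falling line
near_linear = [2.1, 3.9, 6.2, 7.8, 10.1]     # points close to a rising line

print(round(pearson_r(xs, perfect_positive), 3))  # 1.0
print(round(pearson_r(xs, perfect_negative), 3))  # -1.0
print(pearson_r(xs, near_linear) > 0.99)          # True: strong positive association
```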