Stats Exam 4 Flashcards
Chi Square Null
Ho: There is no relationship between the two categorical variables.
Chi Square Alternative
Ha: There is a relationship between the two categorical variables.
Assumptions of The Chi-Square Test for Independence
- The sample should be random.
- Expected counts should be large enough: all expected counts need to be greater than 1, with at least 80% exceeding 5, to ensure reliable use of the test. In general, the larger the sample, the more accurate and reliable the test results are. Note: this rule applies only to expected frequencies. It is acceptable for an observed frequency to be 0, provided the expected frequencies meet the criterion.
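A minimal sketch of the expected-count check, assuming a made-up 2×3 table. The counts and the function names are hypothetical, and the thresholds simply mirror the wording of the card above. Each expected count is row total × column total / grand total.

```python
def expected_counts(observed):
    """Expected count for each cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def expected_counts_ok(expected):
    """Rule as worded on the card: all expected counts > 1 and at least 80% of them > 5."""
    cells = [e for row in expected for e in row]
    share_above_5 = sum(e > 5 for e in cells) / len(cells)
    return min(cells) > 1 and share_above_5 >= 0.8


def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all cells."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical observed counts; note that one observed cell is 0.
observed = [[12, 0, 18],
            [8, 12, 10]]
expected = expected_counts(observed)
chi2 = chi_square_statistic(observed, expected)
df = (len(observed) - 1) * (len(observed[0]) - 1)  # (rows - 1) * (columns - 1)

print(expected)
print(expected_counts_ok(expected), round(chi2, 3), df)
```

The observed 0 does not break the check, because only the expected counts are tested.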
Linear correlations
- Have two components: direction and size
- Both are described by “r” (sample) or “ρ” (rho, population)
r = Pearson’s Correlation Coefficient
Properties of linear correlation coefficient r
- Range: -1 ≤ r ≤ 1
- Scale is irrelevant (r is based on standardized scores)
- Only measures the strength of linear associations
- DOES NOT IMPLY CAUSALITY
r^2
r^2 = proportion of the variation in y that is explained by its linear relationship with x
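A short sketch of how r and r^2 can be computed by hand, assuming made-up data (hours studied vs. exam score). The data and the name pearson_r are illustrative only. It also checks the "scale is irrelevant" property by converting hours to minutes.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r: sum of cross-products of deviations, divided by the
    square root of the product of the two sums of squared deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: hours studied and exam score.
hours = [1, 2, 3, 4, 5]
score = [52, 60, 61, 70, 77]

r = pearson_r(hours, score)
r_rescaled = pearson_r([h * 60 for h in hours], score)  # hours converted to minutes

print(round(r, 4), round(r ** 2, 4))
print(abs(r - r_rescaled) < 1e-12)  # changing the units leaves r unchanged
```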
Interpreting r
0.5 0.9 : correlation is very strong
r = ±1.00 : correlation is perfect
Is the given r value statistically significant?
A weak correlation (small r) can be significant.
A moderate/large correlation can occur by chance alone and be statistically insignificant.
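The cards do not name a specific test. One common choice (an assumption here) is the t statistic t = r·sqrt(n − 2) / sqrt(1 − r^2) with n − 2 degrees of freedom. The sketch below uses two hypothetical cases to show why sample size matters as much as the size of r.

```python
from math import sqrt


def t_for_r(r, n):
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


# Hypothetical cases: weak r with a large sample, strong r with a tiny sample.
t_weak = t_for_r(0.10, 1000)   # df = 998
t_strong = t_for_r(0.80, 5)    # df = 3

# Two-tailed critical values at alpha = 0.05, taken from a standard t table
# (approximately 1.96 for df = 998 and 3.182 for df = 3).
print(round(t_weak, 2), t_weak > 1.96)       # weak r, yet significant
print(round(t_strong, 2), t_strong > 3.182)  # strong r, yet not significant
```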
If r is NOT significant…
the best predictor of x is x̄ (the mean of x)
the best predictor of y is ȳ (the mean of y)
Regression line
A “best fit” line, y = mx + b, where m is the slope and b is the y-intercept.
Residuals
The differences between observed and predicted y values; they represent the variation not explained by the regression model.
Least Squares Property
Linear regression produces the smallest possible sum of squares for residuals.
Sum of squares (S.O.S.) of residuals = unexplained variation
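A small sketch of the least squares property, assuming made-up data. The names fit_line and sum_sq_residuals are illustrative. It fits y = mx + b, then shows that moving the slope or intercept away from the fitted values makes the sum of squared residuals larger.

```python
def fit_line(x, y):
    """Least-squares slope m and intercept b for y = mx + b."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b


def sum_sq_residuals(x, y, m, b):
    """Sum of squared residuals; each residual is observed y minus predicted y."""
    return sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))


# Hypothetical data.
x = [1, 2, 3, 4, 5]
y = [52, 60, 61, 70, 77]

m, b = fit_line(x, y)
best = sum_sq_residuals(x, y, m, b)

print(m, b, round(best, 2))
# Any other line gives a larger sum of squared residuals.
print(best < sum_sq_residuals(x, y, m + 0.5, b))
print(best < sum_sq_residuals(x, y, m, b - 1))
```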
If no significant correlation exists, the best estimate of Y is
the MEAN of Y
F statistic
Mean Square Regression / Mean Square Residual
F Test for Regression
Tells us if the regression model is statistically significant.
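A rough sketch of the F calculation for a one-predictor model, assuming made-up data. The function name and the data are hypothetical. Each mean square is a sum of squares divided by its degrees of freedom: 1 for the regression and n − 2 for the residuals in this simple case.

```python
def f_statistic(x, y):
    """F = Mean Square Regression / Mean Square Residual for a one-predictor model."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    m = sxy / sxx
    b = mean_y - m * mean_x

    ss_total = sum((yi - mean_y) ** 2 for yi in y)                       # total variation
    ss_residual = sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))  # unexplained
    ss_regression = ss_total - ss_residual                               # explained

    df_regression = 1    # one predictor
    df_residual = n - 2  # n minus the two estimated coefficients
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    return ms_regression / ms_residual


# Hypothetical data.
x = [1, 2, 3, 4, 5]
y = [52, 60, 61, 70, 77]

f_value = f_statistic(x, y)
print(round(f_value, 2))
```

The resulting F is then compared with a critical value from an F table for (1, n − 2) degrees of freedom; a larger F means the model explains significantly more variation than it leaves unexplained.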
Multiple Regression
Bivariate regression can be extended to multivariate data
- Used when 2 or more independent variables may be related to a dependent variable
Advantages
- Improved predictive value (higher r^2)
- Estimates are more precise
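A sketch of the "improved predictive value" idea, assuming made-up data with two predictors. All names and numbers are hypothetical. It solves the normal equations (XᵀX)·coef = Xᵀy in plain Python and compares the r^2 of a one-predictor model with that of a two-predictor model.

```python
def solve(a, b):
    """Solve the linear system a * coef = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (aug[r][n] - known) / aug[r][r]
    return coef


def r_squared(predictors, y):
    """Fit y on an intercept plus the given predictor columns; return r^2."""
    rows = [[1.0] + [col[i] for col in predictors] for i in range(len(y))]
    k = len(rows[0])
    # Normal equations: (X^T X) coef = X^T y
    xtx = [[sum(row[i] * row[j] for row in rows) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(rows, y)) for i in range(k)]
    coef = solve(xtx, xty)
    predicted = [sum(c * v for c, v in zip(coef, row)) for row in rows]
    mean_y = sum(y) / len(y)
    ss_residual = sum((yi - p) ** 2 for yi, p in zip(y, predicted))
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_residual / ss_total


# Hypothetical data: exam score against hours studied and hours slept.
hours_studied = [1, 2, 3, 4, 5, 6]
hours_slept = [6, 8, 5, 7, 9, 6]
score = [50, 62, 58, 70, 82, 76]

r2_one = r_squared([hours_studied], score)
r2_two = r_squared([hours_studied, hours_slept], score)
print(round(r2_one, 3), round(r2_two, 3))
print(r2_two > r2_one)
```

Note that r^2 never decreases when a predictor is added, so a higher r^2 on its own does not prove that the extra variable is useful.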