AP Stat Ch 11-12 Flashcards
Chi squared test statistic
Chi squared = sum of (observed count - expected count)^2 / expected count, added up over every category (cell)
Calculate it on the calculator from the test menu, or by building a list L3 of the (observed - expected)^2 / expected terms and then summing that list
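A minimal Python sketch of the formula above, mirroring the list method (one term per category, then sum the list). The counts are made-up numbers for illustration, not data from these cards.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts, for illustration only.
observed = [18, 22, 20]
expected = [20, 20, 20]

chi_sq = chi_square_statistic(observed, expected)
print(chi_sq)
```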
Chi square goodness of fit test
Allows us to determine whether a specific population distribution seems valid. 1 variable in 1 population.
GOF degrees of freedom calculation
# of categories - 1. So if there are seven M and M colors, then df = 6.
Chi squared GOF test steps
Used for a problem with one variable in one population. Compare the observed counts to the expected counts to see whether the actual proportions equal the hypothesized proportions.
- Chi squared GOF test. Alpha = .05 for p1 = true proportion of … (red M and Ms for example), p2=… , …
Ho: p1 = …, p2=…, …
Ha: Ho isn’t true (at least one proportion differs from its hypothesized value)
- Conditions:
A. Random
B. Sample size is large. Need to show the expected cell counts are all greater than or equal to five. If not true, then combine categories or proceed with caution
C. Independent. The population needs to be at least 10 times the sample size
- Calculate:
P(chi squared > …) with df = # categories - 1.
Calculate this by going to the stat tests, chi squared GOF test, have observed in one list and expected in the other and it’ll calculate for you.
- Conclude:
If p is smaller than alpha, reject Ho and conclude that at least one proportion differs from its hypothesized value. If p is bigger than alpha, fail to reject Ho and cannot conclude the proportions are different.
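The GOF steps can be sketched in Python as a cross-check on the calculator. This is only a sketch under assumptions: the hypothesized proportions and observed counts are made up, and chi_square_upper_tail is a hypothetical helper name that uses the standard closed-form series for a whole-number df instead of the calculator's built-in test.

```python
import math

def chi_square_upper_tail(x, df):
    """P(chi squared > x) for a whole-number df >= 1 (closed-form series)."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: e^(-x/2) * sum of (x/2)^i / i! for i = 0 .. df/2 - 1
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= (x / 2) / i
            total += term
        return math.exp(-x / 2) * total
    # Odd df: two-sided normal tail plus a finite correction series
    chi = math.sqrt(x)
    normal_tail = math.erfc(chi / math.sqrt(2))        # P(|Z| > chi)
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return normal_tail + 2 * density * total

# Hypothetical example with 4 categories, so df = 4 - 1 = 3.
hypothesized = [0.4, 0.3, 0.2, 0.1]   # Ho proportions (made up)
observed = [90, 55, 35, 20]           # observed counts (made up)
n = sum(observed)
expected = [n * p for p in hypothesized]

# Large-sample condition: every expected count is at least 5.
counts_ok = min(expected) >= 5

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1
p_value = chi_square_upper_tail(chi_sq, df)

alpha = 0.05
decision = "reject Ho" if p_value < alpha else "fail to reject Ho"
print(round(chi_sq, 3), df, round(p_value, 3), decision)
```

With these made-up counts the statistic is about 2.29 with df = 3 and the p-value is about 0.51, so we fail to reject Ho.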
Chi squared test for homogeneity of proportions (HOP)
Here, we are looking at one variable in 2 or more populations.
Ask the same question to two or more different populations.
For example, to see whether girls prefer different subjects than boys, we ask the same question of the girls and of the boys. Girls and boys are the two different populations here.
Degrees of freedom in HOP test
(Rows-1)(columns-1)
Four step process for HOP test
- Chi squared HOP test for alpha = …
Ho: the proportions of interest are the same for all populations (in context)
Ha: the proportions of interest are not all the same (in context)
- Conditions:
A. SRS
B. Sample size is large. Expected cell counts need to be greater than or equal to five. Need to show this: enter the observed values into a matrix, run the chi squared test (which puts the expected counts into a different matrix), then circle back and check the expected matrix at the end
C. Independence (population at least 10 times the sample)
- Calculate:
Do the chi squared test on the menu with observed and expected matrices. P(chi squared >…)=…
Df = (r-1)(c-1)
- Decision and conclusion in context. If p is less than alpha, reject Ho and conclude the proportions of interest are not the same for all populations.
If p is bigger than alpha, fail to reject Ho and cannot conclude the proportions of interest are not the same for all populations.
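A rough Python sketch of the HOP calculation, using the girls and boys idea with made-up counts. The table is 2 rows by 3 columns so that df = 2, which allows the shortcut P(chi squared > x) = e^(-x/2). That shortcut is an assumption tied to df = 2 only; for other tables use the calculator or a chi squared table.

```python
import math

# Hypothetical observed counts: rows are the populations,
# columns are the answers to the one question asked of each.
observed = [
    [30, 50, 20],  # girls: math, English, science
    [50, 30, 20],  # boys:  math, English, science
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell = row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Large-sample condition: every expected count is at least 5.
counts_ok = all(e >= 5 for row in expected for e in row)

chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Shortcut valid only when df = 2: P(chi squared > x) = e^(-x/2).
p_value = math.exp(-chi_sq / 2) if df == 2 else None

alpha = 0.05
reject_ho = p_value is not None and p_value < alpha
print(chi_sq, df, p_value, reject_ho)
```

Here every expected count is at least 20, the statistic is 10 with df = 2, and the p-value is far below .05, so for these made-up counts we would reject Ho.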
Chi squared test for association/indep
2 variables in 1 population.
Designed to examine the association between two variables in a single population. Take one sample and ask two questions.
Difference between GOF, HOP, IND
GOF: 1 variable, 1 population. Ex: take one sample and ask one question
HOP: 1 variable, 2 or more populations. Ex: take 2 or more samples and ask 1 question.
Ind: 2 variables in 1 population. Ex: take 1 sample and ask 2 questions.
How to calculate expected cell count:
Column total * row total / grand total
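The formula can be applied to a whole two-way table at once. A short sketch with a made-up 2 by 2 table; expected_counts is a hypothetical helper name.

```python
def expected_counts(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

# Hypothetical observed counts, for illustration only.
table = [
    [10, 20],
    [30, 40],
]
expected = expected_counts(table)
print(expected)
```

For the top-left cell the row total is 30, the column total is 40, and the grand total is 100, so the expected count is 30 * 40 / 100 = 12.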
4 step process for chi squared IND test
- Chi squared test for independence
Alpha = …
Ho: there is NO association between _ and _. Or ___ and ___ ARE independent.
Ha: there IS an association. Or they are not independent.
- Conditions:
A. SRS
B. Sample size is large. Check the expected cell counts by doing the chi squared test and looking at matrix B
C. Independence (population at least 10 times the sample)
- Calculate: P(chi squared >…) with df = (r-1)(c-1)
Calculate the same way as HOP test
- Conclude:
If p < alpha, reject Ho and conclude that there is an association.
If p > alpha, fail to reject Ho and cannot conclude there is an association.
R squared
The percent of the variability in y that is explained by the linear relationship with x (the least squares regression line).
R
Square root of r squared
Positive or negative: r has the same sign as the slope
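A small Python sketch tying r squared, r, and the sign of the slope together. The data are made up, and r squared is computed here as 1 - SSE/SST (one standard way to get it, assumed here rather than taken from these cards).

```python
import math

# Made-up data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)   # total variation in y (SST)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

slope = sxy / sxx                    # b in y-hat = a + bx
intercept = y_bar - slope * x_bar    # a in y-hat = a + bx

# r squared = 1 - (variation left over after the line) / (total variation)
sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
r_squared = 1 - sse / syy

# r is the square root of r squared, carrying the sign of the slope.
r = math.copysign(math.sqrt(r_squared), slope)
print(round(r_squared, 3), round(r, 3))
```

For these numbers the slope is 0.6 (positive), r squared is 0.6, and r is about 0.775, so about 60% of the variability in y is explained by the line.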
Confidence intervals for the regression slope
Confidence interval for the slope:
b +/- t* SEb, where t* has df = n-2 and SEb is given on the table next to the X variable under stdev
On the table, the constant coef will be a in the form a + bx
The X coefficient will be the slope statistic, b
The X stdev will be SEb, which we multiply by t*
Get t* from invT or the table. Round down if the table doesn’t have the exact number for df.
Then calculate the interval b +/- t* SEb.
If zero is not in the interval, conclude there is a linear relationship. If zero is in the interval, can’t conclude there is a linear relationship.
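A quick sketch of the interval arithmetic. The values of b, SEb, and n are hypothetical stand-ins for what would be read off a regression table, and t* = 3.182 is the 95% t table value for df = 3; slope_confidence_interval is a hypothetical helper name.

```python
def slope_confidence_interval(b, se_b, t_star):
    """Return (lower, upper) for b +/- t* * SEb."""
    margin = t_star * se_b
    return b - margin, b + margin

# Hypothetical values standing in for a regression table (not real output).
b = 0.6          # X coefficient (the slope)
se_b = 0.2828    # X stdev (standard error of the slope)
n = 5            # sample size, so df = n - 2 = 3
t_star = 3.182   # 95% critical value for df = 3, from a t table

lower, upper = slope_confidence_interval(b, se_b, t_star)
zero_in_interval = lower <= 0 <= upper
print(round(lower, 3), round(upper, 3), zero_in_interval)
```

With these made-up values the interval runs from about -0.300 to about 1.500. Zero is inside it, so we can't conclude there is a linear relationship.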
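P and t value from Minitab output
The t value next to the X variable is the test statistic for the significance test of the slope. The p value next to it is for a 2 sided test. Divide by 2 if the test is one sided (as long as the sign of t matches the direction of Ha).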
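A tiny sketch of the divide-by-2 rule. It assumes the usual convention that halving only applies when t points in the direction of Ha; otherwise the one-sided p-value is 1 minus half. The function name and the numbers are made up.

```python
def one_sided_p(two_sided_p, t_value, alternative):
    """Convert a two-sided p-value to a one-sided one.

    alternative is "greater" (Ha: Beta > 0) or "less" (Ha: Beta < 0).
    """
    half = two_sided_p / 2
    if alternative == "greater":
        t_matches_ha = t_value > 0
    else:
        t_matches_ha = t_value < 0
    return half if t_matches_ha else 1 - half

# Hypothetical output values, for illustration only.
p_greater = one_sided_p(0.124, 2.12, "greater")
p_less = one_sided_p(0.124, 2.12, "less")
print(p_greater, p_less)
```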
Testing the hypothesis of no linear relationship (performing a significance test for the slope)
Ho: Beta = 0 (no relation)
Ha: Beta > 0, Beta < 0, or Beta not equal to 0 (pick the one that matches the question)
Go to STAT, TESTS, LinRegTTest
Df = n-2
If p < alpha, reject Ho and conclude there is a linear relationship.
If p > alpha, fail to reject Ho and cannot conclude there is a linear relationship.
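A sketch of what the slope test computes, written out by hand in Python with made-up data. Instead of a p-value it compares |t| to the t table value t* = 3.182 (df = 3, two-sided alpha = .05), which gives the same decision as p < alpha for a two-sided Ha. The SEb formula s / sqrt(sum of (x - x-bar)^2) is the standard one and is assumed here, not taken from these cards.

```python
import math

# Made-up data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b = sxy / sxx              # slope
a = y_bar - b * x_bar      # intercept

# s = standard deviation of the residuals, with n - 2 degrees of freedom
sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
df = n - 2
s = math.sqrt(sse / df)

se_b = s / math.sqrt(sxx)  # standard error of the slope
t_stat = b / se_b          # test statistic for Ho: Beta = 0

t_star = 3.182             # t table, df = 3, two-sided alpha = .05
reject_ho = abs(t_stat) > t_star
print(round(t_stat, 3), df, reject_ho)
```

Here t is about 2.121 with df = 3, which is smaller than 3.182, so for these made-up data we fail to reject Ho and cannot conclude there is a linear relationship.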