W4 - Analysis of Count and Proportion Data Flashcards
What are the two types of Chi-Squared Tests?
Goodness of fit or “one-way” (1 independent variable)
Contingency table/cross-tab or 2-way analyses (2 independent variables)
What does a “goodness of fit” test measure?
How well an observed distribution of scores matches a distribution expected by chance.
hence the name “goodness of fit”
How would you conduct a “goodness of fit” test?
Obtain observed frequencies for each level of the IV
Compare it to what a chance outcome would be.
What shape does a distribution based on chance look like?
Rectangle
These tests are an example of what type of test?
Goodness of Fit
If we are conducting a goodness of fit test on 3 vending machines and we observe 120 people, how many people should go to each vending machine if the distribution was chance?
40 each
120/3
What is distributional similarity?
How similar one distribution is to another
e.g. in a Goodness of Fit test, how well does the observed data match the chance data
What is the formula for a Chi-Squared Test?
χ² = Σ (O - E)² / E
Sum over all levels of [observed - expected frequency per level]² / expected frequency
O = Observed count per level
E = Expected count per level
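A minimal sketch of the formula applied to the vending machine card. The observed counts (50, 40, 30) are hypothetical and only there for illustration; the expected count of 40 per machine comes from 120 / 3.

```python
def chi_squared(observed, expected):
    """Sum of (O - E)^2 / E over all levels."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical observed counts for the 3 vending machines (120 people in total).
observed = [50, 40, 30]

# Under chance, every machine gets the same share: 120 / 3 = 40 each.
expected = [sum(observed) / len(observed)] * len(observed)

chi_sq = chi_squared(observed, expected)
print(chi_sq)  # 5.0
```

With these made-up counts the obtained value is 5.0, which would then be compared with the critical value for the test.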
What is the symbol that we use for the “obtained test value” in a Chi-Squared Test?
χ²
How do we know if a Chi-Squared obtained test value is significant?
Compare the obtained value with the critical value
If the obtained value is greater than the critical value then it is significant
The critical value depends on the degrees of freedom (df); see about 20 mins into the lecture to rewatch, though it did not seem important
What is the degrees of freedom in a Goodness of Fit test?
The number of levels (categories) of the variable minus 1
How should you interpret a Goodness of Fit (or one-way) test?
Check whether the observed distribution of scores differs significantly from what would be expected by chance.
Are there any frequencies which occur more than others?
What does a 2-way chi-squared test aim to accomplish?
To determine whether there is an ASSOCIATION between 2 categorical / ordinal variables.
It is a test of independence
What is meant by “a 2-way chi-squared test is a test of independence”?
Are the variables independent (no effect) or interdependent (related / an effect)?
Which test do we use to compare whether 2 variables are independent of one another (a familiar test)?
2-way chi-squared test
To test these hypotheses, what test should you conduct?
2-way chi-squared test
“familiar” or “independence” test
What is the formula for calculating expected frequencies for a 2-way chi-squared test?
(Row total / Total N) * Column total
In the example:
For “Men Mach 1”: (180 / 290) * 90 ≈ 55.86 = expected frequency for the “40” square
For “Women Mach 2”: (110 / 290) * 80 ≈ 30.34 = expected frequency for the “20” square
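The same rule can be sketched for every cell at once. The row totals (180, 110), the column totals 90 and 80, and N = 290 come from the example; the third column total (120) is an assumption, taken as 290 - 90 - 80.

```python
def expected_frequencies(row_totals, col_totals):
    """Expected count per cell: (row total / total N) * column total."""
    n = sum(row_totals)
    return [[(r / n) * c for c in col_totals] for r in row_totals]

row_totals = [180, 110]     # men, women (from the example)
col_totals = [90, 80, 120]  # Mach 1, Mach 2, and an assumed third column

expected = expected_frequencies(row_totals, col_totals)

men_mach1 = expected[0][0]    # (180 / 290) * 90, about 55.86
women_mach2 = expected[1][1]  # (110 / 290) * 80, about 30.34
```

Each row of expected counts adds back up to its row total, which is a quick check that the calculation is right.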
What is the critical value for a 2-way chi-squared test?
It is looked up using the degrees of freedom: (number of rows minus 1) times (number of columns minus 1)
i.e. the observed chi-sq value is compared with the critical value for that df
df = (r - 1) × (c - 1) = 1 × 2 = 2
Critical value = 5.99
Your value is 18.69
18.69 is greater than 5.99, so the two variables are associated (NOT independent)
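A small sketch of the df calculation and the comparison with the critical value. It relies on a shortcut that holds for df = 2 only: the upper-tail probability of a chi-squared value x is exp(-x / 2), so the critical value at alpha = .05 is -2 × ln(.05). The function names are made up for illustration, and the 2 × 3 table size is inferred from df = 1 × 2.

```python
import math

def degrees_of_freedom(n_rows, n_cols):
    """df for a 2-way chi-squared test: (r - 1) * (c - 1)."""
    return (n_rows - 1) * (n_cols - 1)

def chi2_critical_df2(alpha=0.05):
    """Critical value for df = 2 ONLY (upper tail is exp(-x / 2))."""
    return -2 * math.log(alpha)

def chi2_p_value_df2(chi_sq):
    """p value for df = 2 ONLY."""
    return math.exp(-chi_sq / 2)

df = degrees_of_freedom(2, 3)       # 2 rows, 3 columns -> df = 2
critical = chi2_critical_df2(0.05)  # about 5.99
p_value = chi2_p_value_df2(18.69)   # well below .05
significant = 18.69 > critical      # True
```

For other df values the critical value would come from a chi-squared table or a statistics package instead.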
How do you interpret whether the chi-squared value (χ²) is significant?
It is significant if the observed value is greater than the critical value
Example:
Critical value = 5.99
Your value is 18.69
The two variables are associated (NOT independent)
What is the minimum expected frequency (MEF) that is required in order to use a chi-squared test?
Minimum expected frequency (MEF) assumption
No more than 25% of cells can have an expected frequency of less than 5
SPSS will give you an error if more than 25% of cells have an expected frequency of less than 5
What test should you use if the minimum expected frequency (MEF) assumption for a chi-squared test is violated?
Kolmogorov-Smirnov Test
For a chi-squared test, what effect size measure do we use?
Effect size is given by: Cramér’s V
With the exception of a 2x2 chi-squared
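A minimal sketch of Cramér’s V using the usual formula V = √(χ² / (N × min(r - 1, c - 1))). The chi-squared value (18.69) and N (290) are taken from the vending machine example; the 2 × 3 table size is an assumption inferred from df = 2.

```python
import math

def cramers_v(chi_sq, n, n_rows, n_cols):
    """Cramér's V: sqrt(chi-squared / (N * min(rows - 1, columns - 1)))."""
    return math.sqrt(chi_sq / (n * min(n_rows - 1, n_cols - 1)))

v = cramers_v(18.69, 290, 2, 3)  # about 0.25
```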
What are “standardised residuals” in chi-squared tests and what are they used for?
A standardised measure of the difference between the observed frequency and the expected frequency
They show how big the difference between the two is in each cell of a chi-squared test
Values of 2+ (in absolute size) are large
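A minimal sketch, assuming the common definition (O - E) / √E for a standardised residual. It uses the “Men Mach 1” cell and assumes that 40 was the observed count there and (180 / 290) × 90 the expected count.

```python
import math

def standardised_residual(observed, expected):
    """(O - E) / sqrt(E) for a single cell."""
    return (observed - expected) / math.sqrt(expected)

expected_men_mach1 = (180 / 290) * 90  # about 55.86
residual = standardised_residual(40, expected_men_mach1)  # about -2.12
```

Under that assumption the residual is just past 2 in absolute size, so this cell would count as a large departure from the expected frequency (fewer men than expected).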
When should you use a Fisher Exact Test?
When 2 x 2 chi-sq is invalid due to violation of minimum expected frequencies (MEF) assumption
When you have really small frequencies
How do you report a Fisher Exact Test?
Reported as: “Fisher Exact, p < .05”
Interpretation same as Chi-sq
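A minimal sketch of how a two-sided Fisher Exact p value can be computed for a 2 × 2 table, using the hypergeometric probabilities of every table with the same row and column totals. The counts in the example table are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher Exact p value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Probability of x in the top-left cell with all margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Add up every table that is as likely as, or less likely than, the observed one.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )

# Hypothetical small-frequency table: 3 and 1 in the first row, 1 and 3 in the second.
p_value = fisher_exact_two_sided(3, 1, 1, 3)  # 34 / 70, about .49
```

Here p is above .05, so this made-up table would not show a significant association.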
What are binomial tests used for?
To test for the probability of binary outcomes.
e.g. left vs. right turns, good vs. bad in a batch, left vs. right handers etc.
Does the probability obtained for one event (e.g., Right) differ from what is expected by chance?
What symbol do we usually use in a binomial test?
Z-approximation or Z-test
What is the calculation for a Z-test?
Z = (X - M) / SE
X = observations of binary outcome
M = number expected by chance
SE = standard error = √(n × p × (1 - p)),
where p = probability of the event
(n = number of trials)
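A minimal sketch of the Z calculation, taking the number expected by chance as M = n × p. The example numbers (60 right turns out of 100, with p = .5) are hypothetical.

```python
import math

def binomial_z(x, n, p=0.5):
    """Z-approximation for a binomial test: (X - M) / SE."""
    mean = n * p                    # M: number expected by chance
    se = math.sqrt(n * p * (1 - p)) # SE: standard error
    return (x - mean) / se

z = binomial_z(60, 100, 0.5)
print(z)  # 2.0
```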
What type of test is a proportion difference test?
Parametric Test
Describe what a Proportion Difference test examines?
Examines whether two samples are drawn from populations with the same proportions of a given characteristic
like a t-test for proportions
These questions should be investigated using what type of test?
Proportion Difference (PD) test
Why would you use a PD test instead of a Chi-sq test?
1) Easier to interpret: you get a Z score and the test statistic (the difference between proportions is clear)
2) Can calculate a confidence interval
How do you calculate a proportion difference (PD) test?
Following from image
p’ = the overall probability of the event or quality (e.g., smoking in the whole sample)
If E1 = instances in sample 1 (n1) and E2 = instances in sample 2 (n2)
Then p’ = [E1 + E2] / [n1 + n2]
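The formula image is not reproduced in these cards, so the sketch below assumes the standard pooled two-proportion Z test, Z = (p1 - p2) / √(p’ × (1 - p’) × (1/n1 + 1/n2)), which uses p’ exactly as defined above. The sample counts are hypothetical.

```python
import math

def proportion_difference_z(e1, n1, e2, n2):
    """Pooled two-proportion Z test (assumed form of the PD test)."""
    p1 = e1 / n1
    p2 = e2 / n2
    p_pooled = (e1 + e2) / (n1 + n2)  # p' = [E1 + E2] / [n1 + n2]
    se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se, p_pooled

# Hypothetical: 30 of 100 smokers in sample 1, 20 of 100 in sample 2.
z, p_pooled = proportion_difference_z(30, 100, 20, 100)  # z is about 1.63
```

With these made-up numbers Z is below 1.96, so the two proportions would not differ significantly at the .05 level (two-tailed).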
What is a Kappa test used for?
Used to examine inter-rater reliability in situations involving 2 independent raters
Which test should you use to test these two examples?
Example 1: 2 clinical psychologists are asked to assess 50 clients as having or not having clinical depression
Example 2: 30 job applicants are rated for suitability by 2 different interviewers
Kappa test
What is the formula for calculating Kappa tests?
What symbol do we use when reporting a Kappa test?
K
What value is considered very good, good, fair and then not good?
Very good = .70 or above
Good = .50
Fair = .30
Not good = less than .30
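The Kappa formula is not captured in these cards. As a sketch only, and assuming the test meant is Cohen’s kappa for 2 raters, the usual form is K = (Po - Pe) / (1 - Pe), where Po is the observed agreement and Pe the agreement expected by chance. The ratings below are hypothetical, loosely based on Example 1 (2 psychologists, 50 clients).

```python
def cohens_kappa(both_yes, a_yes_b_no, a_no_b_yes, both_no):
    """Cohen's kappa for 2 raters making a yes/no judgement."""
    n = both_yes + a_yes_b_no + a_no_b_yes + both_no
    p_observed = (both_yes + both_no) / n
    a_yes = (both_yes + a_yes_b_no) / n  # proportion rater A says "yes"
    b_yes = (both_yes + a_no_b_yes) / n  # proportion rater B says "yes"
    p_chance = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical: the raters agree on 40 of 50 clients (20 "yes", 20 "no").
kappa = cohens_kappa(20, 5, 5, 20)  # about 0.6
```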
What tests can we use for Repeated Measures in Categorical Data Analysis?
McNemar Test:
examines whether the probability of a given event is more or less likely in the same sample over time
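A minimal sketch of the McNemar statistic in its common form without continuity correction, χ² = (b - c)² / (b + c), where b and c are the counts of people who changed in each direction between the two time points. The counts are hypothetical, and the statistic is compared with the chi-squared critical value for df = 1 (3.84 at the .05 level).

```python
def mcnemar_chi_squared(b, c):
    """McNemar statistic from the two 'changed' cells (no continuity correction)."""
    return (b - c) ** 2 / (b + c)

# Hypothetical: 15 people changed from "no" to "yes", 5 changed from "yes" to "no".
chi_sq = mcnemar_chi_squared(15, 5)
print(chi_sq)  # 5.0

significant = chi_sq > 3.84  # critical value for df = 1 at alpha = .05
```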
What does Cochran’s Q measure?