Chi-Square Test of Association Flashcards
What is the probability of an event always between?
0 & 1
What does a probability of 0 mean?
There’s no chance that the event will occur.
What does a probability of 1 mean?
The event will always occur.
What does P(E) = 0.5 mean?
The probability of event E is 0.5 (50%).
Why do we use p-values (from a statistical test)?
To decide whether or not to reject the null hypothesis.
What do p-values represent?
The probability of obtaining data at least as extreme as ours if the null hypothesis is true.
What do inferential statistical tests spit out?
P-values (& other valuable information)
When do you use the Chi-Square Test of Association?
When testing for the presence of a relationship between 2 categorical variables & when you have a between-subjects (unrelated-subjects) design.
What type of test is the Chi-Square Test of Association?
An inferential test
When do you use the t-test?
When testing for a difference between the means of 2 numerical variables, when you have a between-subjects (unrelated-subjects) design, or a within-subjects (related-subjects) design
What type of test is the t-test?
An inferential test
What are 2 examples of experimental hypotheses (HE)?
“Medical outcomes differ by treatment centre” & “there’s a relationship between age & playing video games (N=100; 50 adults)”
What is an example of a null hypothesis (H0)?
Medical outcomes don’t differ by treatment centre.
How do we test that the results of a study are compatible with the null hypothesis being true?
By calculating the probability of obtaining the study’s results if the null hypothesis were true.
How do we come up with the theory behind a lab report?
By developing a novel hypothesis
How do we identify questions when planning a lab report?
By designing a study to test your hypothesis
How do we come up with the experimental design behind a lab report?
By applying for ethical approval.
How do we incorporate statistics into lab reports?
By performing a study, collecting data & analysing the data.
What needs to be presented & evaluated when we write lab reports?
Our findings
What do we need to have to perform a study & gather statistical data?
An IV & a DV
What is an example of an IV?
Type of university
What is an example of a DV?
Whether a group survives or dies.
How do we know the level of measurement of data sets?
By designing studies to test hypotheses, applying for ethical approval, then performing the studies & collecting & analysing the data.
In a study looking at whether type of hospital, out of 4 different types, affects survival rates, should the proportion of people surviving be similar or different across all 4 locations if the type of hospital makes no difference to the outcome?
Similar
In a study looking at whether type of hospital, out of 4 different types, affects survival rates, if all 4 treatment centres are drawn from the same population, are any differences we see due to chance or due to causation?
Due to chance
What are 2 examples of categorical data?
4 different types of treatment centres & whether patients die or survive.
When can we not compare mean differences between data sets?
When we have categorical data
What can we do when we can’t compare mean differences between data sets?
Compare proportions & decide whether or not the data collected is compatible with the null hypothesis being true
What is an example of a binary outcome?
Whether patients die or survive.
In a study looking at whether type of hospital, out of 4 different types, affects survival rates, what do we do as we can’t compare mean differences between the data sets as each data set contains categorical data?
Compare proportions & decide whether or not the data collected is compatible with the null hypothesis being true (i.e., that there’s no systematic difference in survival rates across hospitals).
How can we calculate the probability of observing the pattern of results obtained if the null hypothesis is true?
Using inferential statistics
How do we decide whether or not the data collected from 2 or more proportions is compatible with the null hypothesis being true?
By using p-values
When do we reject the null hypothesis?
If the probability of obtaining a difference this large between the observed values (results obtained) & the values expected by chance (i.e. if the null hypothesis were true & there were no relationship between the variables) is low (the p-value is small).
When do we decide that it’s unlikely that we would’ve observed a pattern of results by chance alone?
If the probability of obtaining a difference this large between the observed values (results obtained) & the values expected by chance (i.e. if the null hypothesis were true & there were no relationship between the variables) is low (the p-value is small).
When can we conclude that there’s a significant relationship between 2 values (e.g. treatment centre & mortality)?
If we decide that it’s unlikely that we would’ve observed a pattern of results by chance alone.
What do we consider to be a low probability?
p <= 0.05
When do we fail to reject a null hypothesis?
If the probability of obtaining a difference this large between the observed values (results obtained) & the values expected by chance (i.e. if the null hypothesis were true & there were no relationship between the variables) is high (the p-value is large).
When do we decide that it’s likely that we would’ve observed a pattern of results by chance alone?
If the probability of obtaining a difference this large between the observed values (results obtained) & the values expected by chance (i.e. if the null hypothesis were true & there were no relationship between the variables) is high (the p-value is large).
When can we conclude that there’s no significant relationship between 2 values (e.g. treatment centre & mortality)?
If we decide that it’s likely that we would’ve observed a pattern of results by chance alone.
What do we consider to be a high probability?
p > 0.05
What is the procedure for testing hypotheses when data are categorical (or nominal)?
Carrying out a Chi-Square Test of Association
When do we use a Chi-Square Test of Association?
When we want to ask questions about the relationship between categorical variables.
What are examples of questions about the relationship between categorical variables?
“Is there a relationship between life stage & wanting to be a psychologist?”, “does the incidence of mental health disorders differ by country?”, & “is there a relationship between BRCA1/2 test results & clinical decision making?”
What are categorical or nominal data?
Named (or sometimes numbered) discrete categories (e.g. religion, marital status, disease, & job field)
How is nominal (or categorical) data produced?
By counting the number of observations in each of multiple categories
What are examples of named discrete categories of gender identity?
“M”, “F”, “T”, “NB”, & “Ot”
What kind of data has no kind of intrinsic/ meaningful order?
Categorical/ nominal data
What are the basic principles of the Chi-Square Test of Association?
It must be used to examine the relationship between 2 variables & establish the probability that group membership by 2 variables occurs by chance, & a standard probability threshold must be applied to test our hypothesis & conclude significance.
How are observed (collected) data for the Chi-Square Test of Association organised?
In a contingency table
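As a rough illustration of how counting observations produces a contingency table, the sketch below tallies raw records into a 2 x 2 table. The raw records are hypothetical: they are constructed only so that the counts match the video-game cards later in this deck (35, 15, 20 & 30; N = 100), and the names `records`, `rows`, `cols` & `table` are assumptions made for this example.

```python
from collections import Counter

# Hypothetical raw records: one (life_stage, plays_games) pair per participant,
# constructed to match the counts used in the video-game cards (N = 100).
records = (
    [("adolescent", "plays")] * 35
    + [("adolescent", "does not play")] * 15
    + [("adult", "plays")] * 20
    + [("adult", "does not play")] * 30
)

# Count how many observations fall into each combination of categories.
cell_counts = Counter(records)

rows = ["adolescent", "adult"]
cols = ["plays", "does not play"]

# Arrange the counts as a contingency table (one inner list per row).
table = [[cell_counts[(r, c)] for c in cols] for r in rows]

for label, row in zip(rows, table):
    print(label, row)
```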
How do we know what we would expect by chance?
By estimating it from collected data
What is the formula for expected frequencies per cell when carrying out the Chi-Square Test of Association?
(Row total x column total)/ grand total
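The expected-frequency formula can be sketched in a few lines. This is a minimal example, assuming the contingency table is stored as a list of rows; the function name `expected_frequencies` is hypothetical, and the counts are the ones from the video-game cards in this deck.

```python
def expected_frequencies(observed):
    """Return expected counts: (row total x column total) / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Rows: adolescents, adults. Columns: plays video games, does not play.
observed = [[35, 15], [20, 30]]
expected = expected_frequencies(observed)
print(expected)  # [[27.5, 22.5], [27.5, 22.5]]
```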
What does the Chi-Square Test of Association test?
Whether there’s a deviation (or difference) between observed & expected values.
What do we need a way to measure with observed & expected values?
The overall degree to which the observed & expected values differ from each other.
What does the Chi-Square Test of Association test?
How closely our observations fit the expected model.
What is the formula of the Chi-Square Test of Association?
χ² = Σ[(O - E)² / E], where E = expected frequency, O = observed frequency, & Σ = the sum of
How do you carry out the formula for the Chi-Square Test of Association?
By squaring the deviation of each observed value from its expected value, dividing each squared deviation by its expected value, & then summing (Σ) the results.
When carrying out the formula for the Chi-Square Test of Association, do you carry out the division first or the addition first?
The division
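To see why the order matters, the sketch below compares dividing each cell first with adding first. The two observed & expected values are made-up numbers chosen only to keep the arithmetic easy; they do not come from any study in this deck.

```python
# Made-up observed and expected frequencies, for illustration only.
observed = [12, 8]
expected = [10, 10]

# Correct order: divide each squared deviation by its own expected value, then add.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Wrong order (shown only for contrast): add everything first, then divide once.
wrong_order = sum((o - e) ** 2 for o, e in zip(observed, expected)) / sum(expected)

print(chi_square)   # 0.8
print(wrong_order)  # 0.4
```

The two results differ, which is why each cell's division is carried out before the sum.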
Logically, if the null hypothesis were true (p > 0.05), would the observed & expected frequencies of 2 categorical variables be similar or different?
Similar
Logically, if the null hypothesis were true (p > 0.05), would the calculated value of χ² be large or small?
Small
If the observed frequency of adolescents playing video games were 35, & the expected frequency were 27.5, what would this cell contribute to the χ² statistic?
(35 - 27.5)² / 27.5 ≈ 2.04
If the observed frequency of adults playing video games were 20, & the expected frequency were 27.5, what would this cell contribute to the χ² statistic?
(20 - 27.5)² / 27.5 ≈ 2.04
If the observed frequency of adolescents not playing video games were 15, & the expected frequency were 22.5, what would this cell contribute to the χ² statistic?
(15 - 22.5)² / 22.5 = 2.50
If the observed frequency of adults not playing video games were 30, & the expected frequency were 22.5, what would this cell contribute to the χ² statistic?
(30 - 22.5)² / 22.5 = 2.50
If (O - E)² / E for each of the 4 cells was 2.04, 2.5, 2.04, & 2.5, what would the χ² statistic be?
2.04 + 2.5 + 2.04 + 2.5 = 9.08
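The worked example can be checked with a short script. This is a sketch using the observed & expected frequencies from the cards above; the dictionary keys are hypothetical labels. Note that the total of 9.08 comes from adding cell values already cut to two decimals; carrying full precision gives a total of about 9.09, which leads to the same decision.

```python
# (observed, expected) for each cell of the 2 x 2 video-game table.
cells = {
    "adolescents, play": (35, 27.5),
    "adults, play": (20, 27.5),
    "adolescents, do not play": (15, 22.5),
    "adults, do not play": (30, 22.5),
}

# Each cell contributes (O - E)^2 / E to the statistic.
contributions = {name: (o - e) ** 2 / e for name, (o, e) in cells.items()}

for name, value in contributions.items():
    print(f"{name}: {value:.4f}")

# The chi-square statistic is the sum of the four contributions.
total = sum(contributions.values())
print(f"chi-square (unrounded cells) = {total:.4f}")
```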
What are the 3 steps involved in the Chi-Square Test of Association?
Extracting/ computing information, calculating the Chi-Square statistic, & converting the Chi-Square statistic to a probability
How do we convert a Chi-Square statistic into a probability?
By referring to the table of critical values of χ².
To what do we need to refer to determine whether a result is significant (i.e. if we can reject the null hypothesis)?
The table of critical values of χ².
If observed frequencies of categorical data are random (& the null hypothesis is true), what will be 0 (or close to 0)?
The χ² value
If observed frequencies of categorical data are very different from those expected by the null hypothesis, what will be large?
The χ² value
When converting a χ² statistic into a probability, which column of the critical χ² values table do we look at?
The p-value threshold of 0.050
What do we need to calculate when we perform a Chi-Square Test of Association?
The degrees of freedom (d.f)
How is d.f calculated for a Chi-Square Test of Association?
d.f = (C - 1)(R - 1), where C = number of columns in a contingency table, & R = number of rows in the contingency table
What is the d.f of a contingency table with 2 rows & 2 columns?
(2 - 1)(2 - 1) = 1
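The d.f formula depends only on the shape of the contingency table, as this small sketch shows. The function name `degrees_of_freedom` is hypothetical, and the second table is filled with placeholder zeros because only its dimensions matter here.

```python
def degrees_of_freedom(table):
    """Return (C - 1)(R - 1) for a contingency table stored as a list of rows."""
    n_rows = len(table)
    n_cols = len(table[0])
    return (n_cols - 1) * (n_rows - 1)


# 2 rows x 2 columns, as in the video-game example.
print(degrees_of_freedom([[35, 15], [20, 30]]))  # 1

# 2 rows (survives, dies) x 4 columns (hospital types); placeholder counts only.
print(degrees_of_freedom([[0, 0, 0, 0], [0, 0, 0, 0]]))  # 3
```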
What is the critical value of Chi-Square at p <= 0.05 & d.f = 1?
3.841
When performing a Chi-Square Test of Association, when can we reject the null hypothesis?
If the χ² value is larger than its relevant critical value.
How would you report the results of a Chi-Square Test of Association if the χ² value was 9.08, the d.f was 1, N was 100, & the p-value was less than 0.05?
χ² = 9.08, d.f = 1, significance level: p < 0.05
Given that the critical value for the following results: χ² = 9.08, d.f = 1, significance level: p < 0.05, is 3.841, does our calculated χ² value exceed or fall short of the critical value?
It exceeds the critical value
If our calculated χ² value exceeds our critical value, do we reject or fail to reject the null hypothesis?
We reject it
What are the 4 steps involved in calculating the Chi-Square Test of Association by hand?
- Compute the Chi-Square statistic (χ² value)
- Establish the critical value of Chi-Square for a probability (p-value) of 0.05 & your degrees of freedom.
- If your calculated χ² value is equal to or greater than the critical value, you can reject the null hypothesis & conclude that you have found a significant effect.
- If your calculated χ² value is less than the critical value, you fail to reject the null hypothesis
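The four steps above can be sketched as a single function. This is an illustrative outline rather than a full implementation: the names `CRITICAL_AT_05` & `chi_square_by_hand` are hypothetical, and the lookup holds only the standard p = 0.05 critical values for 1 to 4 degrees of freedom, so larger tables would need more entries.

```python
# Critical values of chi-square at p = 0.05, keyed by degrees of freedom
# (standard table values, rounded to three decimals).
CRITICAL_AT_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def chi_square_by_hand(observed):
    """Return (statistic, d.f, critical value, reject_null) for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # Step 1: compute the chi-square statistic from observed and expected counts.
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total
            statistic += (o - e) ** 2 / e

    # Step 2: look up the critical value for p = 0.05 and the table's d.f.
    df = (len(observed[0]) - 1) * (len(observed) - 1)
    critical = CRITICAL_AT_05[df]

    # Steps 3 and 4: reject the null hypothesis only if the statistic
    # is equal to or greater than the critical value.
    reject_null = statistic >= critical
    return statistic, df, critical, reject_null


# Video-game table from this deck: rows = adolescents, adults.
# The statistic is about 9.09 when the cell values are not rounded first.
statistic, df, critical, reject_null = chi_square_by_hand([[35, 15], [20, 30]])
print(round(statistic, 2), df, critical, reject_null)
```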
How do you calculate the Chi-Square Test of Association using SPSS?
By inputting the data in the correct format (SPSS will need to know whether you’re entering raw data or contingency table values), then telling SPSS to perform a Chi-Square Test of Association. SPSS will then calculate the exact p-value for you (as well as reporting your expected values & degrees of freedom).
When do you not need to use a Chi-Square critical table of values (& just interpret the reported p-value)?
When using SPSS to perform the Chi-Square Test of Association
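For a sense of what an exact p-value looks like without SPSS, the sketch below computes it for the special case of d.f = 1 using only Python's standard library. This is not how SPSS works internally; it relies on the fact that a χ² value with 1 degree of freedom is a squared standard normal value, so the shortcut does not apply to larger tables. The function name `p_value_df1` is hypothetical.

```python
import math


def p_value_df1(chi_sq):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    # With d.f = 1, P(chi-square > x) equals P(|Z| > sqrt(x)) for a standard normal Z.
    return math.erfc(math.sqrt(chi_sq / 2))


# Reported statistic from the video-game example in this deck.
p = p_value_df1(9.08)
print(f"p = {p:.4f}")
print("significant at 0.05" if p <= 0.05 else "not significant at 0.05")
```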
What differs by age category (χ² = 9.08, d.f = 1, significance level: p < 0.05)?
The percentage of people that play computer games.
What is 1 conclusion that could be drawn from a Chi-Square Test of Association being performed to examine the relationship between life stage and playing computer games, & the relation between these variables being significant, χ² (d.f = 1, N = 100) = 9.08, p <= 0.05?
That adolescents are more likely to play computer games than adults.
When should you use the Chi-Square Test of Association?
When you have nominal/ categorical data, when individual frequencies are independent (i.e. when using a between-subjects design) & when expected frequencies are (ideally) not less than 5.
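The expected-frequency condition can be checked before running the test. The sketch below lists any cells whose expected frequency falls below 5; the function name `small_expected_cells` is hypothetical, and the second table is a made-up example included only to show a failing case.

```python
def small_expected_cells(observed, minimum=5):
    """Return (row, column, expected) for every cell with expected < minimum."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    flagged = []
    for i, r in enumerate(row_totals):
        for j, c in enumerate(col_totals):
            expected = r * c / grand_total
            if expected < minimum:
                flagged.append((i, j, expected))
    return flagged


# Video-game table from this deck: every expected frequency is at least 5.
print(small_expected_cells([[35, 15], [20, 30]]))  # []

# Made-up small table: the first column's expected frequencies are below 5.
print(small_expected_cells([[1, 9], [2, 8]]))  # [(0, 0, 1.5), (1, 0, 1.5)]
```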
When does the Chi-Square Test of Association not lend itself well to directional (e.g. uni-directional hypothesis) interpretation?
If a contingency table is larger than 2 x 2 cells.