L4 - Nonparametrics Flashcards
When we compare means, what type of data do we use?
Metric data (numeric)
What is a parametric test?
Parametric tests assume a normal distribution of values, or a “bell-shaped curve.”
What is a non-parametric test?
A non-parametric test (sometimes called a distribution-free test) does not assume anything about the underlying distribution (for example, that the data come from a normal distribution).
It is usually used when you know the population data are not normally distributed.
“Analysing the difference between indigenous and non-indigenous smokers”
What type of data is this?
Categorical
How many people are using services, how many people do “x”.
Count type data - difference between two categories
What is a binomial test useful for?
Useful for tests with binary outcomes
What is a “proportion difference test”?
The proportion equivalent of a t-test: it tests whether two proportions differ significantly.
E.g. are men more likely to smoke than women?
Are men more likely to report abuse than women? etc.
What is a “t-test”?
A statistical test to see whether there is a significant difference between two means.
What do non-parametric tests for related samples analyse (e.g. McNemar’s test and Cochran’s Q)?
Whether the probability of a binary outcome changes over time
e.g. pass or fail test; 40% fail on first try; only 20% fail on second try
What is a chi-square test used for?
A chi-square test is used to determine whether there is a statistically significant difference between the expected frequencies and the observed frequencies in one or more categories of a contingency table.
What are the two types of chi-square tests?
- Goodness of fit test (or one-way test - 1 independent variable)
- Contingency table/cross-tab (or two way analysis - 2 independent variables)
What does a ‘Goodness of Fit Test’ analyse?
Tests how well an observed distribution of scores matches a distribution expected by chance
Why it’s called a “goodness of fit test” - how well does the IV fit the distribution expected by chance?
How many IV’s does a goodness of fit test have?
1
What does a distribution expected by chance look like (in a bar-graph)?
A rectangle
Each level (category) has the same expected frequency, so every bar is the same height
Which test would you use to test - “8 starting gates on a race-track: are there more wins in some starting gates?”
Why?
Goodness of fit test.
There is only one IV (starting gates) and you are looking to see if there is any difference from chance.
You observe 120 people using three vending machines. Based on chance alone, how many people would you expect to use each vending machine?
40
120/3 = 40
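The arithmetic above can be sketched in Python. The function name `expected_by_chance` is made up for illustration; it simply splits the total number of observations evenly across the levels.

```python
def expected_by_chance(total_observations, n_levels):
    """Expected frequency per level if every level is equally likely."""
    return total_observations / n_levels

# 120 people across 3 vending machines (figures from the card above)
print(expected_by_chance(120, 3))  # 40.0
```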
Explain the “chi-squared calculation” χ² = Σ (O - E)² / E
For each level, subtract the expected frequency from the observed frequency, square the difference, and divide by the expected frequency. Then sum these values across all levels. The total is the chi-squared value χ²
χ² = Σ (O - E)² / E
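A minimal sketch of this calculation, reusing the three vending machines and 120 people from the earlier card. The observed counts (50, 40, 30) are invented purely for illustration; only the expected count of 40 per machine comes from the cards.

```python
def chi_squared(observed, expected):
    """Sum of (O - E)^2 / E over all levels."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [50, 40, 30]  # hypothetical counts, not from the cards
expected = [40, 40, 40]  # 120 people / 3 machines

chi2 = chi_squared(observed, expected)
print(chi2)  # 5.0
```

Each level contributes its own (O - E)² / E term: here 2.5, 0 and 2.5.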
How do you know if a “chi-squared calculation” (χ2) is significant?
It must be greater than the critical value
If your value is greater than the critical value, it is significant
What is the critical value based on?
- The probability level you want to test at
- Degrees of freedom
In a goodness of fit test, what is degrees of freedom?
The number of levels (categories) of the variable minus 1
Df if 3 levels = 3 - 1 = 2
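As a rough sketch, the degrees of freedom and the matching critical value can be computed as below. The function names are made up for illustration. The shortcut used for the critical value, -2 × ln(alpha), holds only when df = 2; for any other df you would look the value up in a chi-squared table or use a statistics library (for example `scipy.stats.chi2.ppf`).

```python
import math

def degrees_of_freedom(n_levels):
    """Goodness of fit: number of levels minus 1."""
    return n_levels - 1

def critical_value_df2(alpha):
    """Critical chi-squared value, valid for df = 2 ONLY."""
    return -2 * math.log(alpha)

df = degrees_of_freedom(3)
crit = critical_value_df2(0.05)
print(df, round(crit, 3))  # 2 5.991
```

A chi-squared value above 5.991 would therefore be significant at the 0.05 level with 2 degrees of freedom.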
If the chi-squared calculation in a “goodness of fit test” is greater than the critical value, how would we interpret the results?
The observed distribution of scores differs significantly from what would be expected by chance.
E.g. some lanes are used disproportionately more than others in a way that differs from chance
What does a 2-way chi-squared test look at?
This is the ‘familiar test’ where one tries to determine whether there is an association between 2 categorical/ordinal variables
It is a test of independence.
Are the variables independent (no effect) or interdependent (related/an effect)?
A 2-way chi-squared test is trying to determine if two variables are _____ or _____ of one another.
Independent (no effect on each other) or Interdependent (related to each other)
What test would we use to determine; “Is the person’s SES and ethnic group related?”
2-way chi-squared test
Are SES and ethnic group independent or interdependent variables?
Other examples - “are smokers more likely to be drinkers” etc.
What is the formula for the expected frequency of a cell in a 2-way chi-squared test?
Expected frequency = [Row total / Total N] × Column total
E.g. for Mach 1: [row total (180) / grand total (290)] × Mach 1 column total (90) ≈ 55.86
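The expected frequency for one cell can be sketched as follows: divide the row total by the grand total and multiply by the column total. The function name `expected_count` is an illustrative choice; the numbers are the Mach 1 figures from the card.

```python
def expected_count(row_total, column_total, grand_total):
    """Expected cell frequency if rows and columns are independent."""
    return (row_total / grand_total) * column_total

# Mach 1 cell: row total 180, column total 90, grand total 290
e = expected_count(180, 90, 290)
print(round(e, 2))  # 55.86
```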
How is degrees of freedom calculated for a 2-way chi-squared test?
Df = (rows - 1) × (columns - 1)
For the above example: 2 rows minus 1 = 1; 3 columns minus 1 = 2; Df = 1 × 2 = 2
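Putting the pieces together, a 2-way chi-squared test might be sketched like this. The cell counts are invented for illustration; they were chosen only so that the first row totals 180, the first column totals 90 and the grand total is 290, matching the example. The 0.05 critical value of 5.991 for df = 2 is the standard table value.

```python
def two_way_chi_squared(table):
    """Return (chi-squared value, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count: (row total / grand total) * column total
            expected = row_totals[i] / grand_total * col_totals[j]
            chi2 += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical 2 x 3 table (2 rows, 3 columns)
table = [
    [60, 60, 60],  # row total 180
    [30, 40, 40],  # row total 110
]
chi2, df = two_way_chi_squared(table)
significant = chi2 > 5.991  # critical value for df = 2 at the 0.05 level
print(df, significant)  # 2 False
```

With these made-up counts the chi-squared value is well below the critical value, so the two variables would be treated as independent.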