chi-squared and t-test Flashcards
what is the chi-square test?
- test of difference among categorical (nominal/ordinal) variables
what are the two types of chi-square tests?
- chi-square goodness-of-fit test
- chi-square test of association
what does the chi-square goodness-of-fit test do?
- tests proportions with more than two levels
- how the proportions in data fit to fixed (expected) proportions
what are binomial tests limited to? how does this differ from chi-square?
- binomial test limited to dichotomous variables (heads/tails, success/fail)
- chi-square test can test more than two categories
what is the null hypothesis for a die in the chi-square goodness-of-fit test?
- the die is fair, i.e., each face (1-6) has 1/6 probability
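A minimal sketch of the goodness-of-fit calculation for the die example, in plain Python. The roll counts are made-up illustration data and the function name `chi_square_gof` is hypothetical; 11.07 is the standard table value for df = 5 at the 0.05 level.

```python
def chi_square_gof(observed, expected_props):
    """Return (chi2, df) for observed counts vs. expected proportions."""
    n = sum(observed)
    expected = [p * n for p in expected_props]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # number of categories minus one
    return chi2, df

# Hypothetical counts for faces 1-6 from 60 rolls (illustration only).
rolls = [8, 12, 9, 11, 10, 10]
chi2, df = chi_square_gof(rolls, [1 / 6] * 6)

CRITICAL_05_DF5 = 11.07  # chi-square critical value, df = 5, alpha = 0.05
print(f"chi2({df}) = {chi2:.2f}, reject H0 (fair die): {chi2 > CRITICAL_05_DF5}")
```

With these counts the statistic stays well below the critical value, so the null hypothesis of a fair die would not be rejected.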
what is Benford’s law?
- the first digits of naturally occurring numerical data (prices, populations, lengths, etc.) follow a particular set of proportions
what does the chi-square test for Benford’s law test?
- tests whether the frequency of first digits in the data follows the known proportions
what is the null hypothesis of Benford’s law?
- Benford’s law is preserved
i.e., numbers are naturally occurring
what happens if the null hypothesis of Benford’s law is rejected?
- it suggests the data set may be fabricated (a flag for further checking, not proof)
what is Benford’s law used in?
- various fraud detection scenarios
e.g., accounting, election, and scientific reports
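The Benford check described in the cards above can be sketched as a goodness-of-fit test against the proportions log10(1 + 1/d) for first digits d = 1 to 9. This is only an illustration: the function names are hypothetical, the input is restricted to positive integers, and the powers of 2 stand in for a real data set.

```python
import math

def benford_expected():
    """Expected proportions for leading digits 1-9: log10(1 + 1/d)."""
    return [math.log10(1 + 1 / d) for d in range(1, 10)]

def first_digit(n):
    """Leading digit of a positive integer."""
    return int(str(n)[0])

def benford_chi_square(values):
    """Return (chi2, df) comparing leading-digit counts with Benford's law."""
    counts = [0] * 9
    for v in values:
        counts[first_digit(v) - 1] += 1
    n = len(values)
    expected = [p * n for p in benford_expected()]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(counts, expected))
    return chi2, len(counts) - 1

# Illustration data: the first 100 powers of 2, which track Benford's law closely.
values = [2 ** k for k in range(1, 101)]
chi2, df = benford_chi_square(values)

CRITICAL_05_DF8 = 15.51  # chi-square critical value, df = 8, alpha = 0.05
print(f"chi2({df}) = {chi2:.2f}, reject H0 (Benford holds): {chi2 > CRITICAL_05_DF8}")
```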
how do you report results of a chi-square goodness-of-fit test?
- explain the experiment
- χ² value with df (degrees of freedom)
- p-value
what does the χ² value show?
- bigger χ² = bigger difference between observed and expected frequencies
what does the chi-square test of association test?
- compares proportions across two or more groups
- how proportions of two data sets are associated (test of independence)
what variables does the chi-square test of association check association between?
- two nominal/ordinal variables
how are descriptive tendencies for chi-square test of association summarised?
- summarised into a contingency table
how is the chi-square test of association reported?
- reported by chi-square value with df and N (number of samples)
- followed by p-value
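A rough sketch of the test of association on a contingency table. The 2 by 2 counts are invented for illustration, the function name is hypothetical, and no continuity correction is applied (an assumption; some software applies one for 2 by 2 tables by default).

```python
def chi_square_association(table):
    """Return (chi2, df, n) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, n

# Hypothetical counts: rows = group A / group B, columns = yes / no.
table = [[30, 10],
         [20, 40]]
chi2, df, n = chi_square_association(table)

CRITICAL_05_DF1 = 3.84  # chi-square critical value, df = 1, alpha = 0.05
print(f"chi2({df}, N = {n}) = {chi2:.2f}, reject H0: {chi2 > CRITICAL_05_DF1}")
```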
what do the one-sample, independent and paired-samples t-tests correspond to?
- each corresponds to a test for nominal/ordinal variables
what test does the one-sample t-test correspond to?
- binomial or chi-square goodness-of-fit test
what does the independent (unpaired) samples t-test correspond to?
- chi-square test of association
what does the paired-samples t-test correspond to?
- McNemar’s test
what does the one-sample t-test compare?
- compares the mean of one sample group against a fixed value
what is the null hypothesis of the one sample t-test?
- the population underlying the sample has the mean equal to the fixed value
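As a minimal sketch, the one-sample t statistic is the difference between the sample mean and the fixed value, divided by the standard error of the mean. The measurements and the function name below are hypothetical.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t, df) for a one-sample t-test against the fixed value mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)       # sample SD (n - 1 in the denominator)
    t = (mean - mu0) / (sd / math.sqrt(n))
    return t, n - 1

# Hypothetical measurements, tested against a fixed value of 5.0.
sample = [5.1, 4.9, 5.3, 5.2, 5.0]
t, df = one_sample_t(sample, 5.0)
print(f"t({df}) = {t:.3f}")
```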
what do the two other t-tests compare? what are they?
- the two other t-tests compare a measure across two groups
- independent and paired
what does the independent samples t-test compare?
- the observed difference between the means of two independent samples or categories
why is the independent samples t-test called this?
- because the data come from different groups
what is the null hypothesis of the independent samples t-test?
- the populations underlying the two samples have equal means
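A sketch of the independent samples (Student's) t statistic, assuming equal variances so that a pooled variance can be used. The two groups are invented illustration data and `independent_t` is a hypothetical name.

```python
import math
import statistics

def independent_t(a, b):
    """Return (t, df) for Student's independent samples t-test (pooled variance)."""
    na, nb = len(a), len(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    # Pooled variance: weighted average of the two sample variances.
    pooled = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    t = (statistics.mean(a) - statistics.mean(b)) / se
    return t, na + nb - 2

# Hypothetical scores from two different groups of people.
group_a = [1, 2, 3, 4, 5]
group_b = [3, 4, 5, 6, 7]
t, df = independent_t(group_a, group_b)
print(f"t({df}) = {t:.3f}")
```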
what are paired samples?
- means the data points are paired across two groups
what is the test for paired samples? what is it available for?
- McNemar’s test (for paired categorical data)
- only available for two dichotomous variables i.e., 2 by 2 contingency table
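For a 2 by 2 table of paired outcomes, McNemar's statistic uses only the two discordant cells (the pairs that changed category). A small sketch without continuity correction, using invented counts and a hypothetical function name:

```python
def mcnemar_chi2(b, c):
    """McNemar's chi-square (df = 1) from the two discordant cell counts.

    b = pairs that went from 'yes' to 'no', c = pairs that went from 'no' to 'yes'.
    """
    return (b - c) ** 2 / (b + c)

# Hypothetical paired data: 15 pairs changed one way, 5 changed the other way.
chi2 = mcnemar_chi2(15, 5)

CRITICAL_05_DF1 = 3.84  # chi-square critical value, df = 1, alpha = 0.05
print(f"chi2(1) = {chi2:.2f}, reject H0: {chi2 > CRITICAL_05_DF1}")
```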
what does the paired-samples t-test compare?
- compares the mean difference of one group measured on two occasions
what is the null hypothesis of the paired sample t-test?
- the population mean did not change
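One way to sketch the paired-samples t-test is as a one-sample t-test on the per-person differences, tested against zero. The before/after scores and the function name are hypothetical.

```python
import math
import statistics

def paired_t(before, after):
    """Return (t, df) for a paired-samples t-test on matched measurements."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    # Standard error of the mean difference.
    se = statistics.stdev(diffs) / math.sqrt(n)
    t = statistics.mean(diffs) / se
    return t, n - 1

# Hypothetical scores for the same five people on two occasions.
before = [10, 12, 9, 11, 13]
after = [12, 14, 10, 13, 14]
t, df = paired_t(before, after)
print(f"t({df}) = {t:.3f}")
```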
what does the Student’s t-test compare?
- compares the means of two populations (for three or more means we use a different test)
what does the Student’s t-test show a difference in?
- difference between groups on a measure (interval or ratio variables)
what is the null hypothesis of the Student’s t-test?
- null hypothesis is that the means are equal
what is the main assumption of the Student’s t-test? what does it mean?
- normality
- the sampling distribution of the mean is normal: if you repeatedly take samples of size n from the distribution and calculate the mean of each sample, those means are normally distributed
when does the main assumption for the Student’s t-test hold?
- holds when the sample size n is sufficiently large
what is the theorem that describes the main assumption for the Student’s t-test?
- central limit theorem
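The central limit theorem can be illustrated with a small simulation: sample means drawn from a clearly skewed population still cluster symmetrically around the population mean. The exponential population, sample size, repetition count and seed below are all arbitrary choices for the sketch.

```python
import random
import statistics

def sample_means(n, reps, seed=0):
    """Means of `reps` samples of size n from a skewed (exponential) population."""
    rng = random.Random(seed)
    return [
        statistics.mean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]

means = sample_means(n=50, reps=2000)

# The population has mean 1 and SD 1, so theory predicts the sample means
# centre on 1 with an SD of about 1 / sqrt(50).
print(f"mean of sample means: {statistics.mean(means):.3f}")
print(f"SD of sample means:   {statistics.stdev(means):.3f}")
```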
what are statistical tests based on the normality assumption called?
- parametric tests
what should we not always assume in statistical tests?
- shouldn’t assume normality
can the normality assumption be checked?
- yes
- using another stats test
- test of normality e.g., Shapiro-Wilk test
what is violation of normality indicated by?
- low p-value
i.e., p < 0.05
what other tests are there relating to the assumptions?
- non-parametric tests
- don’t require the normality assumption
what happens when the test of normality fails?
- use an alternative, e.g., a non-parametric test
what is the assumption of the independent samples t-test?
- equality of variance
- homogeneity of variance
- variances of the two populations are equal
how is the assumption of the independent samples t-test tested?
- tested by Levene’s test of equal variance
how is the significance of difference in variance reported? what is equal and what isn’t?
- reported as p-value
- p > 0.05 = variances treated as equal
- p < 0.05 = variances not equal
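A rough sketch of the statistic behind Levene's test, using absolute deviations from each group's mean (the original form of the test; some software centres on the median by default). The function name and data are hypothetical, and the result is compared with an F table value rather than converted to a p-value.

```python
import statistics

def levene_w(*groups):
    """Return (W, df1, df2) for Levene's test with mean-centred deviations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of each observation from its own group mean.
    z = [[abs(x - statistics.mean(g)) for x in g] for g in groups]
    z_means = [statistics.mean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    w = (n_total - k) / (k - 1) * between / within
    return w, k - 1, n_total - k

# Hypothetical groups with visibly different spread.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
w, df1, df2 = levene_w(group_a, group_b)

CRITICAL_F_05 = 5.32  # F critical value for (1, 8) df at alpha = 0.05
print(f"W({df1}, {df2}) = {w:.3f}, variances differ: {w > CRITICAL_F_05}")
```

With samples this small the test does not reach significance even though the sample variances differ, which is a reminder that p > 0.05 means no evidence of a difference rather than proof of equality.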
what happens if variance isn’t equal?
- there’s another test
- Welch’s t-test
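A minimal sketch of Welch's t-test, which keeps the two variances separate instead of pooling them and adjusts the degrees of freedom with the Welch-Satterthwaite formula. The data and the function name are hypothetical.

```python
import math
import statistics

def welch_t(a, b):
    """Return (t, df) for Welch's t-test (no equal-variance assumption)."""
    na, nb = len(a), len(b)
    # Squared standard error of each group mean.
    se2_a = statistics.variance(a) / na
    se2_b = statistics.variance(b) / nb
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation; usually not a whole number.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df

# Hypothetical groups with unequal spread.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
t, df = welch_t(group_a, group_b)
print(f"t({df:.2f}) = {t:.3f}")
```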
what are t-tests based on?
- based on the t-statistic
what are t-statistics?
- they are like the z-score but use the mean and SD of the sample
- not the population
what does the t-value depend on?
- depends on the degrees of freedom
how is df worked out?
- df = sample size - number of groups
what does a higher t-value mean?
- greater difference
what is the t-value usually reported with?
- usually with descriptive statistics
- Mean and standard deviation
what does ANOVA compare?
- compares a measure across more than two groups