Exam revision questions Flashcards
What are the assumptions for t-tests, chi-squared tests, ANOVA and regression?
What is the effect size measure for a chi-squared test of independence?
Cramer’s V.
What is the difference between a one-sample and a two-sample (independent or paired) t-test?
A one-sample t-test compares a sample mean to a fixed value, such as a known population mean or a specific benchmark.
A two-sample t-test compares two sample means.
If the assumption of normality is violated, what can you do if you originally wanted to do a one-sample t-test?
You can do a Wilcoxon test. Wilcoxon tests compare data by ranks rather than actual values.
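A minimal sketch of this in Python; the data and the comparison value mu0 are invented for illustration:

```python
# One-sample Wilcoxon signed-rank test (data and mu0 are made up).
import numpy as np
from scipy import stats

x = np.array([102, 98, 110, 105, 95, 112, 101, 99])
mu0 = 100  # hypothetical comparison value

# wilcoxon() tests whether differences are symmetric about zero, so we
# subtract mu0 first; it works on the ranks of the differences, not
# their raw values.
stat, p = stats.wilcoxon(x - mu0)
print(stat, p)
```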
What does a p-value represent?
Can we say the null hypothesis is true or false?
A p-value tells us the probability, if the null hypothesis is true, of observing a test stat at least as extreme as ours.
We cannot claim that either the null hypothesis or the alternative hypothesis is true based on p-values. We can only say how surprising our data would be if the null were true.
What is the standard deviation of the sampling distribution of the mean?
The standard error of the mean: SEM = sample sd / square root of the sample size.
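A quick illustration in Python (the sample data are invented); the hand formula should match scipy's built-in:

```python
# Check the SEM formula against scipy.stats.sem.
import numpy as np
from scipy import stats

x = np.array([4.2, 5.1, 3.8, 6.0, 5.5, 4.9])
sem_manual = x.std(ddof=1) / np.sqrt(len(x))  # sample sd / sqrt(n)
print(sem_manual, stats.sem(x))  # the two values should agree
```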
What factors influence power?
What is power?
Power is the probability of rejecting the null hypothesis when the null hypothesis is actually false. It is measured by 1 - beta, where beta is the type II error rate.
Power is dependent on sample size, alpha and effect size.
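A sketch of how these three factors trade off, using statsmodels' power solver for an independent-samples t-test; the effect size and targets here are arbitrary examples:

```python
# Solve for the sample size needed at a given alpha, power and effect size.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# n per group for 80% power at alpha = .05 and Cohen's d = 0.5
n = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(round(n))  # larger d or alpha -> smaller n; more power -> larger n
```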
Can you say that p is the probability that the null hypothesis is true?
No!
The p-value is how likely you are to see your data IF the null was true.
What are the assumptions of a chi-squared test?
- Large expected frequencies: at least 5 expected in each cell. If violated, use a Fisher Exact Test.
- Independence of data: no one contributed more than one piece of data. If violated, use a McNemar test.
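A minimal sketch in Python (the contingency table is invented); scipy's chi2_contingency conveniently returns the expected counts, so the first assumption can be checked directly:

```python
# Chi-squared test of independence on a 2x2 table, falling back to
# Fisher's exact test when expected counts are small.
import numpy as np
from scipy import stats

table = np.array([[12, 8],
                  [5, 15]])

chi2, p, dof, expected = stats.chi2_contingency(table)
if (expected < 5).any():
    # small expected frequencies -> use Fisher's exact test instead
    odds, p = stats.fisher_exact(table)
print(p)
```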
What is the common effect size used when doing chi-squared tests?
Cramer’s V.
The larger Cramer's V, the stronger the association between the variables, i.e. the less likely they are to be independent.
Think chi-square test for association or independence.
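A sketch of Cramer's V computed from the chi-squared statistic using the standard formula V = sqrt(chi2 / (n * (min(rows, cols) - 1))); the table is invented:

```python
# Cramer's V from a chi-squared test of independence.
import numpy as np
from scipy import stats

table = np.array([[20, 30, 25],
                  [15, 10, 40]])

chi2, p, dof, expected = stats.chi2_contingency(table)
n = table.sum()
k = min(table.shape) - 1  # min(rows, cols) - 1
cramers_v = np.sqrt(chi2 / (n * k))
print(cramers_v)  # 0 = independent, 1 = perfectly associated
```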
By definition do all z-scores have a mean of 0 and standard deviation of 1?
Yes. A z-score is computed by subtracting the mean and dividing by the standard deviation, so by construction the resulting set of z-scores has a mean of 0 and a standard deviation of 1.
How do we come up with the t-distribution? Even if we know the population mean we do not know the standard deviation…
The t-distribution arises because we have to estimate the population sd from the sample: it is effectively the distribution of the standardised sample mean averaged over the plausible values of the population sd. As N increases, our estimate of the population sd improves, and the t-distribution becomes tighter and closer to normal.
As the t-distribution is dependent on the degrees of freedom, is it true that whether a t-statistic is significant depends on the sample size?
Yes.
A given value for a t-stat may be significant for a sample of 20, but not for a sample of 10.
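A sketch of this in Python: the same t-statistic (the value 2.15 is invented) clears the critical value at n = 20 but not at n = 10:

```python
# The critical t value depends on the degrees of freedom (n - 1).
from scipy import stats

t_obs = 2.15
for n in (10, 20):
    crit = stats.t.ppf(0.975, df=n - 1)  # two-tailed critical value, alpha = .05
    print(n, round(crit, 3), t_obs > crit)
```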
Do the degrees of freedom used in a Welch independent samples t-test take into account how different the variances of the groups/samples are?
Yes.
What type of t-test assumes homogeneity of variance?
What type of t-test takes into account the different standard deviations of the samples?
Student independent samples t-test.
Welch independent samples t-test.
What are the assumptions made for an independent samples t-test?
- The data in each sample are normally distributed.
- Data are independent.
- The variances of the samples are the same if using a Student independent samples t-test. If this is violated, which it often is, use a Welch independent samples t-test.
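A minimal sketch of the two variants in Python (the groups are invented); the only change is the equal_var flag:

```python
# Student vs Welch independent-samples t-tests in scipy.
import numpy as np
from scipy import stats

g1 = np.array([5.1, 4.8, 6.2, 5.5, 5.9])
g2 = np.array([7.0, 3.1, 8.4, 2.5, 9.0])  # much more variable group

print(stats.ttest_ind(g1, g2, equal_var=True))   # Student: assumes equal variances
print(stats.ttest_ind(g1, g2, equal_var=False))  # Welch: adjusts df for unequal variances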
When do we use a paired-samples t-test?
When we are interested in the difference scores, not just whether two means ARE different.
Examples would be in repeated measures designs, such as pre- and post-treatment. Also, if there is a common object in each group, such as two people rating the same set of hats, and we want to know whether their mean ratings of the hats differ overall.
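A sketch on invented pre/post scores, which also checks the equivalence discussed later in this deck: a paired test is the same as a one-sample test on the difference scores:

```python
# Paired-samples t-test vs one-sample t-test on the differences.
import numpy as np
from scipy import stats

pre = np.array([10, 12, 9, 14, 11, 13])
post = np.array([12, 14, 9, 17, 12, 15])

print(stats.ttest_rel(pre, post))        # paired t-test
print(stats.ttest_1samp(post - pre, 0))  # same t (sign flipped) and same p
```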
One of the assumptions of all t-tests is that the data in each group are normally distributed. How do we test this both qualitatively and quantitatively?
QQ plots can be used to qualitatively check this.
Shapiro-Wilk tests can be used to quantitatively assess this.
What does a Shapiro-Wilk test with a W less than 1 and a p < .05 suggest?
That the data are not normally distributed.
What is the null hypothesis for a Shapiro-Wilk test?
That the data are distributed normally.
Is it true that Shapiro-Wilk tests will often be significant, i.e. imply non-normality, even if the data are only trivially non-normal?
What can we do to check normality if we have a large sample size and the Shapiro-Wilk test is coming back significant?
Yes.
If the sample size is over 40-50, the Shapiro-Wilk test is likely to be significant even for trivial departures from normality.
We can look at QQ plots and histograms to assess whether there is a normal distribution of data.
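A sketch of both checks in Python on simulated data; with large samples, lean on the plot rather than the p-value:

```python
# Quantitative (Shapiro-Wilk) and qualitative (QQ plot) normality checks.
import numpy as np
from scipy import stats
import statsmodels.api as sm
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
x = rng.normal(size=200)  # simulated, genuinely normal data

w, p = stats.shapiro(x)
print(w, p)  # W near 1 and p > .05 -> no evidence of non-normality

sm.qqplot(x, line="s")  # points should hug the line if roughly normal
plt.show()
```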
For an independent samples t-test you need to check the normality of each group. When checking the assumption of normality for a paired-samples t-test, what group(s) are we testing?
The difference variable. This is because the test we are doing is essentially a one-sample t-test on the difference variable.
What are non-parametric tests?
Statistical tests that do not make assumptions about the distribution of data.
What are some limitations with non-parametric tests?
They are not as powerful, i.e. they have higher type II error rates.
If the assumption of normality was violated, what is a non-parametric test we can use instead of a t-test?
A Wilcoxon test.
The test stat is W.
Essentially it measures how many times values from one group are larger than values from the other.
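In scipy the two-group rank test is called the Mann-Whitney U test (equivalent to the Wilcoxon rank-sum test); a minimal sketch with invented groups:

```python
# Rank-based alternative to the independent-samples t-test.
import numpy as np
from scipy import stats

g1 = np.array([3.1, 4.2, 2.8, 5.0, 3.9])
g2 = np.array([5.5, 6.1, 4.8, 7.2, 6.6])

u, p = stats.mannwhitneyu(g1, g2)
print(u, p)  # U counts how often a g1 value beats a g2 value
```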
The effect size used for t-tests that meet assumption of normality is Cohen’s d. However, this effect size relies on normal distribution of data. What effect size can we use if we do a non-parametric test, such as a Wilcoxon?
The Wilcoxon effect size r, which has a similar interpretation to Cohen's d.
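One common large-sample way to get r is z / sqrt(N), with z from the normal approximation to U; a sketch on invented groups, ignoring tie corrections:

```python
# Rank-based effect size r via the normal approximation z / sqrt(N).
import numpy as np
from scipy import stats

g1 = np.array([3.1, 4.2, 2.8, 5.0, 3.9])
g2 = np.array([5.5, 6.1, 4.8, 7.2, 6.6])
n1, n2 = len(g1), len(g2)

u, p = stats.mannwhitneyu(g1, g2)
mu = n1 * n2 / 2                                # mean of U under the null
sigma = np.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)   # sd of U under the null
z = (u - mu) / sigma
r = abs(z) / np.sqrt(n1 + n2)  # interpret like a correlation-sized effect
print(r)
```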
What influences the size of the t-statistic?
- How different the means are (obviously).
- The degrees of freedom/ sample size.
- The variance of the two samples. Increased variance decreases t.
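A sketch of the Welch t-statistic computed by hand on invented groups, checked against scipy; you can see directly that a bigger mean difference or smaller variances make t larger:

```python
# t = (mean difference) / (standard error of the difference).
import numpy as np
from scipy import stats

g1 = np.array([5.0, 5.5, 6.0, 5.2, 5.8])
g2 = np.array([4.0, 4.6, 4.9, 4.3, 4.7])

diff = g1.mean() - g2.mean()
se = np.sqrt(g1.var(ddof=1) / len(g1) + g2.var(ddof=1) / len(g2))
t = diff / se  # larger variances inflate se, shrinking t
print(t, stats.ttest_ind(g1, g2, equal_var=False).statistic)  # should agree
```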
What is the difference between a one-way ANOVA and a two-way ANOVA?
A one-way ANOVA has a single grouping variable; a two-way ANOVA has two grouping variables and can also test their interaction.
In an ANOVA, does the sum of squares between measure the squared difference between each group mean and the grand mean, taking into account the sample size of each group?
Yes.
Why is the total variability in an ANOVA, SSb + SSw, not our test stat?
Because this does not tell us whether there are multiple populations or not. We are interested in the relationship between the variability within groups and the variability between groups, as this indicates whether we are observing multiple populations.
Hence, the test stat for ANOVA takes into account the ratio between SSb and SSw.
In an ANOVA, what are the degrees of freedom for the between groups variability?
What are the degrees of freedom for the within groups variability?
G-1, where G is number of groups.
N-G, where N is combined sample size and G is number of groups.
What is the test statistic for ANOVA?
F = MSb/MSw, where MSb = SSb/(G - 1) and MSw = SSw/(N - G).
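A sketch of building F from the sums of squares by hand (groups invented), checked against scipy's one-way ANOVA:

```python
# ANOVA F-statistic from SSb and SSw, verified with scipy.stats.f_oneway.
import numpy as np
from scipy import stats

groups = [np.array([4.0, 5.0, 6.0]),
          np.array([7.0, 8.0, 9.0]),
          np.array([5.0, 6.0, 7.0])]

all_data = np.concatenate(groups)
grand_mean = all_data.mean()
N, G = len(all_data), len(groups)

# SSb weights each squared group-mean deviation by the group's size
ss_b = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_w = sum(((g - g.mean()) ** 2).sum() for g in groups)

f = (ss_b / (G - 1)) / (ss_w / (N - G))  # MSb / MSw
print(f, stats.f_oneway(*groups).statistic)  # should agree
```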
What does a larger F indicate in an ANOVA?
That the means of the different groups are likely to be significantly different.
F is larger when the between groups variation is larger than the within groups variation.
What would we expect the F stat in an ANOVA to look like if the null hypothesis was true?
Small, around 1. If the null were true, the means of the different groups would be similar and the between groups sums of squares would be low. Even if SSw were low, the F stat would still be low, because SSb would be low too.
What is the effect size for ANOVA?
What does an eta-squared value of one indicate?
What does an eta-squared value of zero indicate?
This means that the between groups variance explains all of the total variance, i.e. they are the same, and therefore knowing which group something is in is all you need to know to know its value.
That the total variance arises completely from within group variance; therefore it is unlikely we are looking at different groups, and knowing which group something is in tells us nothing about the value of that entity.
Are eta-squared values from zero to one?
Yes.
What does eta-squared tell us?
The proportion of the total variance explained by the grouping variable.
e.g. an eta-squared of 0.5 suggests that 50% of the variance in the dependent or outcome variable is explained by the predictor variable or grouping variable.
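A sketch of eta-squared as SSb / SStotal, reusing the invented groups from the F-statistic example above:

```python
# Eta-squared: proportion of total variance explained by the grouping.
import numpy as np

groups = [np.array([4.0, 5.0, 6.0]),
          np.array([7.0, 8.0, 9.0]),
          np.array([5.0, 6.0, 7.0])]

all_data = np.concatenate(groups)
grand_mean = all_data.mean()

ss_b = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_total = ((all_data - grand_mean) ** 2).sum()

eta_sq = ss_b / ss_total
print(eta_sq)  # here 14/20 = 0.7, i.e. 70% of the variance explained
```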
What is a family-wise type I error rate?
What do we want this to be?
The type I error rate associated with multiple tests, such as multiple t-tests done as follow-ups to an ANOVA. In other words, it is the probability of obtaining at least one type I error across multiple tests.
We want the family-wise type I error rate to be 5%.
What are some ways we can adjust our p-values for individual tests when we are doing multiple tests that contribute to the same analysis?
Bonferroni correction.
Holm correction.
What are some limitations with the Bonferroni correction?
The Bonferroni correction is done by multiplying each p-value by the number of tests.
This is a very conservative approach and leads to a large loss of power, i.e. a high type II error rate, and potentially the loss of interesting and important information.
What is the Holm correction and why is it a preferable correction to use when doing multiple tests than Bonferroni correction?
The Holm correction has the same type I error rate, but lower type II error rate.
It works by multiplying the lowest p-value by the number of tests, then the next lowest by the number of tests minus 1, and so on, stopping at the first p-value it cannot reject.
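A sketch comparing the two corrections with statsmodels (the raw p-values are invented); at the same family-wise error rate, Holm rejects at least as many hypotheses as Bonferroni:

```python
# Bonferroni vs Holm adjustment of a set of raw p-values.
from statsmodels.stats.multitest import multipletests

raw_p = [0.004, 0.012, 0.020, 0.045]

for method in ("bonferroni", "holm"):
    reject, p_adj, _, _ = multipletests(raw_p, alpha=0.05, method=method)
    print(method, p_adj.round(3), reject)
# Here Holm rejects all four tests while Bonferroni rejects only two.
```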