Hypothesis Testing Flashcards
What does the p value tell us?
The p value tells us the probability of obtaining results at least as extreme as ours by chance alone (i.e. when our defined null hypothesis is true).
What does a p value of 0.025 mean?
It would tell us that, if the null hypothesis were true, the probability of obtaining results at least as extreme as ours by chance would be 0.025 (2.5%). This is unlikely, and we can use this result to reject the null hypothesis.
What does a small p value tell us?
It indicates that there is evidence of a difference (or association) and we can reject the null hypothesis.
What does a large p value indicate?
It indicates that there is no evidence of a difference (or association) and we fail to reject the null hypothesis.
True or false - the p value, the size of the effect and the number of observations (sample size) are all interrelated.
True. If you carry out a small study, you can get a p value which is not significant even when the effect is large. If you carry out a large study, even small differences that are clinically and epidemiologically irrelevant may achieve statistical significance.
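The interplay between effect size, sample size and p value can be illustrated with a rough sketch. It assumes two equal-sized groups with a common standard deviation and uses a large-sample normal approximation rather than an exact t-test, so the small-study figure is only indicative. The function name and the birth-weight style numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def approx_two_sided_p(mean_diff, sd, n_per_group):
    """Approximate two-sided p value for a difference in means
    between two equal-sized groups (normal approximation)."""
    se = sd * sqrt(2 / n_per_group)   # standard error of the difference
    z = mean_diff / se                # difference measured in standard errors
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical numbers: difference in mean birth weight in grams, SD 500 g.
p_small_study = approx_two_sided_p(mean_diff=300, sd=500, n_per_group=10)
p_large_study = approx_two_sided_p(mean_diff=20, sd=500, n_per_group=20000)

print("large effect, small study:", round(p_small_study, 3))
print("tiny effect, large study:", round(p_large_study, 5))
```

Under these assumptions the 300 g difference in the small study is not significant at the 0.05 level, while the 20 g difference in the very large study is.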
Summarise how to go about hypothesis testing.
1) Start by specifying the study hypothesis and the null hypothesis (which is usually that there is no difference between the groups).
2) We assume the null hypothesis is true, i.e. that there is no difference between our two groups.
3) We calculate the chance that we would get a difference at least as large as the one we observed if the null hypothesis were true. This chance is called the p value.
4) We then reject or fail to reject the null hypothesis on the basis of the size of the p value. If the p value is small, we reject the null hypothesis. If the p value is large, we fail to reject the null hypothesis.
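The four steps can be made concrete with a minimal sketch. It uses an exact permutation test, which is not one of the tests covered in these cards but follows the same logic: assume the group labels do not matter, then count how often a relabelling gives a difference at least as large as the observed one. The data and the function name are hypothetical.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p value for a difference in means."""
    # Step 1: null hypothesis = no difference between the groups.
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    as_extreme = 0
    n_splits = 0
    # Step 2: if the null is true, every split of the pooled data into
    # groups of these sizes is equally likely.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        # Step 3: count splits at least as extreme as what we observed
        # (small tolerance guards against floating-point rounding).
        if diff >= observed - 1e-12:
            as_extreme += 1
        n_splits += 1
    return as_extreme / n_splits

group_a = [3.1, 3.4, 3.6, 3.8, 4.0]   # hypothetical birth weights (kg)
group_b = [2.5, 2.7, 2.9, 3.0, 3.2]
p = permutation_p_value(group_a, group_b)

# Step 4: compare the p value with the chosen significance level.
print("p =", round(p, 4), "-> reject null" if p < 0.05 else "-> fail to reject null")
```

With only five observations per group the loop visits 252 splits, so exact enumeration is cheap here; larger samples would normally call for random sampling of splits or a standard test from statistical software.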
How do we choose what hypothesis test to use in order to obtain the p value?
We use a different test depending on the design of the study (unpaired or paired) and on the type of outcome variable we are dealing with: whether it is continuous or categorical and, if continuous, whether it is normally distributed or not. Each test may have some additional assumptions associated with it, and you should always check that these are valid before carrying out and reporting the test.
What is the most important question to answer when deciding how to analyse continuous variables?
One of the most important questions to answer when deciding how to analyse continuous variables is whether they follow a normal distribution or not, so this should be the first thing you look at. You need to assess the histogram and summary statistics.
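As a rough illustration of looking at a histogram and summary statistics, the sketch below prints a crude text histogram alongside the mean, median and standard deviation. The two datasets, the number of bins and the function names are hypothetical, and this is an informal check rather than a formal test of normality.

```python
from statistics import mean, median, stdev

def summarise(values, n_bins=6):
    """Return summary statistics and bin counts for a crude histogram."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0   # avoid zero width if all values are equal
    counts = [0] * n_bins
    for v in values:
        # the maximum value would fall just past the last bin, so clamp it
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    return {"mean": mean(values), "median": median(values),
            "sd": stdev(values), "counts": counts}

def show(label, values):
    s = summarise(values)
    print(f"{label}: mean={s['mean']:.2f} median={s['median']:.2f} sd={s['sd']:.2f}")
    for i, c in enumerate(s["counts"]):
        print(f"  bin {i + 1}: " + "#" * c)

# Hypothetical data: roughly symmetric birth weights (kg) and
# right-skewed lengths of hospital stay (days).
birth_weight = [2.6, 2.9, 3.1, 3.2, 3.3, 3.4, 3.4, 3.5, 3.6, 3.7, 3.9, 4.2]
length_of_stay = [1, 1, 2, 2, 2, 3, 3, 4, 5, 7, 12, 21]

show("birth weight", birth_weight)
show("length of stay", length_of_stay)
```

A mean close to the median and a roughly symmetric histogram are consistent with a normal distribution; a mean well above the median with a long right tail, as in the second dataset, suggests skewness.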
What is another name for the Student t-test?
The independent samples t-test.
What is the independent samples t-test used for?
The independent samples t-test (or Student t-test) is used when we want to compare two groups and the outcome variable is a continuous, normally distributed variable, such as birth weight. The standard t-test also assumes that the variation (scatter) in the two groups is approximately the same.
The t-test provides a p value which is interpreted as previously described. The probability that we are after is the probability of getting a difference in the sample means at least as large as the one we observed when the true difference is zero. This probability depends on how big the difference is between our two sample means, and on how much (sampling) error there might be in our estimate of the difference in means. We already have a measure of the amount of error in our estimate: this is called the standard error of the mean difference. We used this when we calculated the confidence interval for the difference between two means, but all you need to realise is that, just as for a single mean, this standard error is calculated from a formula which depends on the sample sizes and the standard deviation of the thing we are measuring.
The t-test uses the observed difference in the sample means, and the standard error (sampling error) for the difference in means to calculate the p value.
What is a t-statistic?
A t-statistic is the difference in sample means divided by the standard error of the difference in means.
A large difference in sample means and a small standard error will lead to a large t-statistic, indicating that the probability that the observed difference happened by chance is small and producing a small p value.
A small difference in means or a large standard error will produce a small t-statistic, meaning that the probability that the observed difference happened by chance is large and the p value will be large.
The t-statistic can be converted into a p value using statistical tables and the t-distribution, but more commonly using statistical software.
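The calculation described above can be sketched in a few lines. The sketch assumes the equal-variance (pooled) form of the standard error and converts the t-statistic to a two-sided p value by numerically integrating the t-distribution density with Simpson's rule. In practice statistical software does this step for you; the function names and data here are hypothetical.

```python
from math import sqrt, lgamma, exp, log, pi
from statistics import mean, variance

def t_statistic(group_a, group_b):
    """Pooled (equal-variance) t-statistic and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    se_diff = sqrt(pooled_var * (1 / n_a + 1 / n_b))  # standard error of the difference
    return (mean(group_a) - mean(group_b)) / se_diff, df

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))

def two_sided_p(t, df, steps=2000):
    """Two-sided p value: area under the density outside -|t|..|t|.
    Uses Simpson's rule on 0..|t|; steps must be even."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    area = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3                      # area between 0 and |t|
    return max(0.0, 1 - 2 * area)      # symmetric, so double it

group_a = [3.1, 3.4, 3.6, 3.8, 4.0]   # hypothetical birth weights (kg)
group_b = [2.5, 2.7, 2.9, 3.0, 3.2]
t, df = t_statistic(group_a, group_b)
p_value = two_sided_p(t, df)
print(f"t = {t:.3f}, df = {df}, p = {p_value:.4f}")
```

For these hypothetical data the difference in means is large relative to its standard error, so the t-statistic is large and the p value is small, as described above.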
When is it appropriate to use the t-test?
When comparing observations on a continuous variable between two groups, the t-test is valid only when the data are normally distributed and when the two populations have equal variances. Because of these assumptions about the underlying distribution, it is known as a parametric test.
What do we do if the assumptions for using a t-test (parametric test) do not hold?
If your data do not appear to be normally distributed, it will be necessary to use a non-parametric test, and to present medians (and difference in medians). Non-parametric tests make no assumptions about the underlying distribution of the data.
What are the advantages and disadvantages of using non-parametric tests?
Non-parametric tests make no assumptions about the underlying distribution of the data. However, they do have their disadvantages:
- they are less powerful than parametric tests, i.e. they are less likely to detect a true effect as significant.
- it is not easy to obtain confidence intervals using the non-parametric approach (SPSS does not calculate these; programs such as Minitab will give you an estimated confidence interval for some non-parametric tests.)
Describe the Wilcoxon rank sum test.
The Wilcoxon rank sum test is the non-parametric equivalent of the independent samples t-test. The Mann-Whitney U test is an alternative to the Wilcoxon rank sum test that uses a different formula but results in the same p value. The theory behind this test is given in the notes for information, but you will only be expected to carry out this test using SPSS.
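To see why the two tests agree, the sketch below computes the rank sum W for one group and the corresponding Mann-Whitney U. It pools the two groups, ranks the observations (tied values share the average rank) and shows that U is simply W minus a constant that depends only on the group size, so both statistics order the possible outcomes in the same way. The data and function names are hypothetical, and the p value itself is left to statistical software.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # extend the run while the next sorted value is tied with this one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg = (start + end) / 2 + 1    # average of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = avg
        start = end + 1
    return ranks

def rank_sum_and_u(group_a, group_b):
    """Wilcoxon rank sum W for group_a and the matching Mann-Whitney U."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a = len(group_a)
    w = sum(ranks[:n_a])               # sum of the ranks belonging to group_a
    u = w - n_a * (n_a + 1) / 2        # U differs from W only by a constant
    return w, u

# Hypothetical skewed data: length of hospital stay in days.
group_a = [2, 3, 3, 5, 12]
group_b = [4, 6, 7, 9, 21]
w, u = rank_sum_and_u(group_a, group_b)
print("W =", w, "U =", u)
```

U can also be read as the number of pairs (one observation from each group) in which the group_a value is the larger, counting ties as a half.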