t-test Flashcards
What does a one sample hypothesis consist of?
These relate the mean of a sample to a prespecified comparison value.
E.g. Attendance in class is more than 80%
E.g. Medical doctors’ stress levels are higher than the average stress level in the UK
(Null hypothesis states there’s no difference between the two)
What does a hypothesis do?
A hypothesis test compares the null hypothesis of no effect with the alternative hypothesis.
These don’t start on equal footing; we assume the null is true until proven otherwise.
Why do we need a null hypothesis?
- We start with the null and put the burden of proof on the alternative hypothesis.
- We know everything about our statistics when there is no effect
There are many ways for an effect to occur (alternative hypothesis), but there is only one way for there to be no effect (null hypothesis)
What is a test statistic designed to do?
Show the extent to which a data sample differs from what we would expect under the null hypothesis
What’s the calculation for a one sample t-test?
t-value = (mean of the observed data minus the comparison value) divided by (the standard error of the mean of the observed data)
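A minimal sketch of this calculation in Python, using only the standard library. The function name `one_sample_t` and the attendance figures are hypothetical, made up for illustration; the standard error is taken as the sample standard deviation divided by the square root of the sample size.

```python
import math
import statistics


def one_sample_t(sample, comparison_value):
    """Return the t-value for a one sample t-test, computed by hand."""
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error of the mean: sample standard deviation / sqrt(n)
    sem = statistics.stdev(sample) / math.sqrt(n)
    return (mean - comparison_value) / sem


# Hypothetical attendance percentages for six classes, tested against 80%
attendance = [82, 85, 79, 88, 90, 84]
t_value = one_sample_t(attendance, 80)
print(f"t = {t_value:.2f}")
```

A positive t-value here means the sample mean sits above the comparison value; whether it is significant still depends on the degrees of freedom.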
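What’s the comparison value in a t-test?
The prespecified value that the sample mean is tested against (e.g. 80% attendance). The top of the t-value fraction is the difference between our observed mean and this value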
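What is the standard error of the mean in a t-test?
An estimate of how far a sample mean is likely to be from the population mean, calculated as the sample standard deviation divided by the square root of the sample size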
What does a one sample t-test do?
Tests whether the mean of a sample is different from a comparison value
How were tests done before the t-test was introduced?
The answer was to restrict analyses to very large samples, where we can be confident in the estimate of the standard deviation
But this is very impractical
What is Student’s t-test?
- When we have lots of data, what we expect to see when the null is true is a normal distribution
- With small sample sizes, the t-test lets us adjust that expectation, accounting for the extra noise by expecting to see larger t-values
- So when the sample size is small we need a bigger t-value to reach significance
What is a two sample hypothesis?
Independent samples hypotheses = e.g. Football players run 200m faster than rugby players (null would say they are the same/ no difference)
Dependent samples hypotheses = e.g. students’ attention spans are longer on days with fewer teaching sessions (null would say they are the same/ no difference)
What are independent and dependent samples also known as?
Within subjects design (dependent samples) (paired samples) (repeated measures)
Participants contribute to both conditions
Between subjects design (independent samples)
Each participant is contributing to a single condition
When should you use a t-test?
- Comparisons of two group means or a single mean to a reference value
- Data must be interval or ratio type (as a t-test needs both an interpretable mean and standard deviation)
- Assumptions must be met
What are the assumptions of a t-test?
- Appropriate data type
- Assumption of normality
- Data observations are independent
- Groups have equal variance
What is the calculation for independent samples t-test?
t-value = (mean of group 1 minus mean of group 2) divided by (the pooled standard error of that difference), where pooled means the variance estimates of the two groups are combined
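A hedged sketch of the pooled calculation in Python (standard library only). The function name `independent_t` and the running times are hypothetical; the pooled variance weights each group's variance by its degrees of freedom, which is the usual textbook form rather than anything specific to these notes.

```python
import math
import statistics


def independent_t(group1, group2):
    """Return the pooled-variance t-value for two independent groups."""
    n1, n2 = len(group1), len(group2)
    var1 = statistics.variance(group1)
    var2 = statistics.variance(group2)
    # Pooled variance: each group's variance weighted by its degrees of freedom
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    # Standard error of the difference between the two means
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(group1) - statistics.mean(group2)) / se_diff


# Hypothetical 200m times in seconds (made-up numbers)
football = [24.1, 25.3, 23.8, 24.9, 25.0]
rugby = [26.0, 25.7, 27.1, 26.4, 25.9]
print(f"t = {independent_t(football, rugby):.2f}")
```

Because group 1 (football) has the lower mean time in this made-up data, the t-value comes out negative.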
What does a big number at the top of an independent samples t-test fraction mean?
The difference between the group means is large compared to our uncertainty in the estimate, which gives a larger t-value
What does a big number at the bottom of an independent samples t-test fraction mean?
Our uncertainty in the estimate (the standard error) is large compared to the difference between the group means, which gives a smaller t-value
What does a large positive t-value mean in an independent samples t-test?
The mean of group 1 is above the mean of group 2 (around 15 is large)
What does a t-value near 0 mean in an independent samples t-test?
The mean of group 1 is indistinguishable from the mean of group 2
What does a large negative t-value mean in an independent samples t-test?
The mean of group 1 is below the mean of group 2
What is homogeneity of variance and what is the test for it?
- Homogeneity of variance means the groups being compared have equal variances
- Levene’s test tests for homogeneity of variance
- A significant Levene’s test indicates that the groups are likely to have different variances, and therefore a pooled estimate of the standard deviation is not appropriate
What is Welch’s t-test?
Uses an unpooled measure of standard deviation
Valid when groups have different variances
What is the calculation for Welch’s t-test?
t-value = (mean of group 1 minus mean of group 2) divided by (the unpooled standard error of that difference)
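A minimal sketch of the unpooled version in Python (standard library only). The function name `welch_t` and the data are hypothetical. Here the standard error adds each group's own variance divided by its own sample size, instead of combining the variances first; the adjusted degrees of freedom that Welch's test also uses are not computed in this sketch.

```python
import math
import statistics


def welch_t(group1, group2):
    """Return Welch's t-value, using an unpooled standard error."""
    n1, n2 = len(group1), len(group2)
    # Each group contributes its own variance / sample size
    se_diff = math.sqrt(
        statistics.variance(group1) / n1 + statistics.variance(group2) / n2
    )
    return (statistics.mean(group1) - statistics.mean(group2)) / se_diff


# Hypothetical scores where group_b is far more spread out than group_a
group_a = [10, 11, 10, 12, 11]
group_b = [4, 15, 9, 20, 2, 13]
print(f"t = {welch_t(group_a, group_b):.2f}")
```

When the two groups have the same size, this gives the same t-value as the pooled calculation; the two differ when group sizes are unequal.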
What is the paired/ dependent samples t-test?
- Compares the means of two dependent distributions
- Computes a one sample t-test between the paired differences and zero
What is the calculation for a paired/ dependent samples t-test?
t-value = (mean of the paired differences minus zero) divided by (the standard error of the mean paired difference)
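A short sketch of the paired calculation in Python (standard library only), treating it as a one sample t-test on the differences. The function name `paired_t` and the attention-span numbers are hypothetical.

```python
import math
import statistics


def paired_t(condition_a, condition_b):
    """Return the t-value for paired samples (condition_b minus condition_a)."""
    if len(condition_a) != len(condition_b):
        raise ValueError("Paired samples need the same number of observations")
    # One difference per participant
    diffs = [b - a for a, b in zip(condition_a, condition_b)]
    sem = statistics.stdev(diffs) / math.sqrt(len(diffs))
    # One sample t-test of the mean difference against zero
    return (statistics.mean(diffs) - 0) / sem


# Hypothetical attention spans in minutes for the same four students
busy_day = [1, 2, 3, 4]
quiet_day = [2, 4, 4, 6]
print(f"t = {paired_t(busy_day, quiet_day):.2f}")
```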
What is a t distribution?
The sampling distribution of t-values we would see if the null hypothesis (no effect) were true and we could run the experiment an infinite number of times
- The higher the line of our t distribution at a particular t-value, the more probable it is to observe that t-value when the null is true
- Assumes the data are normally distributed
What does a t distribution’s shape depend on?
The distribution of t-values we’d expect to see isn’t just one distribution; it’s a family of distributions. The shape changes subtly depending on the degrees of freedom in our test (which are determined by the number of observations in the sample)
- At small sample sizes we are more likely to see extreme t-values
What is the degrees of freedom calculation for a one sample t-test?
DF = N - 1
What is the degrees of freedom calculation for an independent samples t-test?
DF = N1 + N2 - 2
What is the degrees of freedom calculation for a paired samples t-test?
DF = N - 1 (where N is the number of pairs)
What is the 7 step process for reporting a t-test result and its p value?
- Type t (to show it’s a t-test)
- In brackets specify the degrees of freedom
- Then an equals sign followed by the t-value
- Report the p value after the test statistic
- Report exact p values to two or three decimals (p = .006)
- Report p values less than .001 as p < .001
- Report the effect size after the p value
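The reporting steps above could be sketched as a small Python formatter. The function names `format_p` and `report_t` and the example numbers are hypothetical; the effect size is labelled d on the assumption that Cohen's d is being reported.

```python
def format_p(p):
    """Format a p value: exact to three decimals, or 'p < .001' if very small."""
    if p < 0.001:
        return "p < .001"
    # Drop the leading zero, e.g. 0.006 becomes .006
    return "p = " + f"{p:.3f}".lstrip("0")


def report_t(t, df, p, d):
    """Build a report string: t(df) = value, p value, then effect size."""
    return f"t({df}) = {t:.2f}, {format_p(p)}, d = {d:.2f}"


# Made-up results, for illustration only
print(report_t(2.87, 5, 0.035, 1.17))     # t(5) = 2.87, p = .035, d = 1.17
print(report_t(6.40, 28, 0.00004, 2.30))  # t(28) = 6.40, p < .001, d = 2.30
```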
What is an effect size?
Purely the size of the difference between the two groups
What statistic measures the effect size, and how do you report it?
Cohen’s d
Report the effect size after the p value
Report exact sizes to two decimal places
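A minimal sketch of Cohen's d for two independent groups in Python (standard library only). The formula itself is not given in these notes; the version below (difference in means divided by the pooled standard deviation) is the common textbook definition and should be treated as an assumption. The function name `cohens_d` and the data are hypothetical.

```python
import math
import statistics


def cohens_d(group1, group2):
    """Return Cohen's d: the mean difference in pooled standard deviation units."""
    n1, n2 = len(group1), len(group2)
    var1 = statistics.variance(group1)
    var2 = statistics.variance(group2)
    # Pooled standard deviation across both groups
    pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return (statistics.mean(group1) - statistics.mean(group2)) / pooled_sd


# Made-up scores: the group means differ by exactly one pooled standard deviation
print(f"d = {cohens_d([2, 3, 4], [1, 2, 3]):.2f}")  # d = 1.00
```

Unlike the t-value, d does not grow just because the sample gets bigger, which is why it is reported alongside the p value.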
What is a p value?
The probability of getting a result at least as extreme as the one observed in your data, under the assumption that the null hypothesis is true
How can a small sample affect the Shapiro-Wilk test?
More likely to say the data are no different from normal (looks more normal), because the test has little power to detect a difference
How can a large sample affect the Shapiro-Wilk test?
More likely to say there is a significant difference from normality (looks less normal), even when the deviation is small
What is a QQ plot?
- Takes a distribution and computes its quantiles
- We can compare the quantiles of two different distributions
- E.g. the quantiles of a normal distribution compared to the quantiles of our data
- Once plotted, if the quantiles from the data align closely with those of the normal distribution, the points should lie close to the line
- Could be used as an alternative to the Shapiro-Wilk test
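A rough sketch of how the points of a normal QQ plot could be computed in Python (standard library only, no plotting). The function name `qq_points`, the plotting positions (i - 0.5) / n and the reaction-time data are assumptions made for illustration; statistics packages may use slightly different positions.

```python
import statistics


def qq_points(data):
    """Return (theoretical normal quantile, standardised data value) pairs."""
    n = len(data)
    mean = statistics.mean(data)
    sd = statistics.stdev(data)
    standard_normal = statistics.NormalDist()  # mean 0, standard deviation 1
    points = []
    for i, value in enumerate(sorted(data), start=1):
        # Quantile of the normal distribution at this point's position
        theoretical = standard_normal.inv_cdf((i - 0.5) / n)
        # Standardise the data so both axes are on the same scale
        observed = (value - mean) / sd
        points.append((theoretical, observed))
    return points


# Hypothetical reaction times; points near the line y = x suggest normality
for theoretical, observed in qq_points([310, 295, 342, 301, 288, 330, 315]):
    print(f"{theoretical:6.2f}  {observed:6.2f}")
```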
When do you run rank based non parametric tests?
- When the assumption of normality is not met (but they can also be used for normally distributed data)
- Valid for ordinal, interval and ratio data
- Valid for comparing medians rather than means
- The median is a robust measure of central tendency, as it depends on the rank order of the data rather than the exact values
What is the Wilcoxon signed rank test an alternative for?
It is the non parametric alternative to the one sample and paired t-tests
What is the step by step process of the Wilcoxon signed rank test?
- Start with the data set minus the comparison value
- First we do an absolute sort: sort the values by their distance from 0 (ignoring the sign)
- We then compute the ranks (e.g. with 10 data points, the smallest is ranked 1 and the largest 10)
- We give each rank back the sign of its data point
- Once we have the signed ranks we make two groups, one for the positives and one for the negatives
- We add up the positive ranks and the negative ranks separately
- The smaller of the two sums is our Wilcoxon W
- That number is then compared to a distribution of expected rank sums
- This gives the probability of getting a result at least as extreme as the one we’ve observed (p value)
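The steps above, up to computing W, can be sketched in Python (standard library only). The function name `wilcoxon_w` and the data are hypothetical. Two details are assumptions not covered in these notes: values equal to the comparison value are dropped, and tied absolute values share their average rank. The p value step is left out.

```python
def wilcoxon_w(data, comparison_value=0):
    """Return (W, positive rank sum, negative rank sum) for a signed rank test."""
    # Subtract the comparison value; drop exact zeros (assumption)
    diffs = [x - comparison_value for x in data if x != comparison_value]
    # Absolute sort: order by distance from zero, ignoring the sign
    ordered = sorted(diffs, key=abs)
    # Rank from 1 upwards, giving tied absolute values their average rank
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + j + 2) / 2
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    # Give the sign back and add up the two groups separately
    positive = sum(r for d, r in zip(ordered, ranks) if d > 0)
    negative = sum(r for d, r in zip(ordered, ranks) if d < 0)
    # W is the smaller of the two sums
    return min(positive, negative), positive, negative


# Made-up data tested against a comparison value of 0
w, positive, negative = wilcoxon_w([3, -1, 4, -2, 5])
print(w, positive, negative)  # 3.0 12.0 3.0
```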
What is the Wilcoxon Mann Whitney U test?
- The non parametric alternative to the independent samples t-test
- Sorts the data from both groups together from smallest to largest (not an absolute sort; it doesn’t ignore the sign)
- The smallest value (most negative) gets a rank of 1 and so on
- There are no signed ranks, so we take our ranks and line them up with our original data
- Add up the ranks from the different groups
- If the sums from the groups are similar we fail to reject the null hypothesis (not a large effect)
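A sketch of the ranking and summing steps in Python (standard library only). The function name `rank_sums` and the data are hypothetical, and giving tied values their average rank is an assumption. Turning the rank sums into a test statistic and p value is not shown.

```python
def rank_sums(group1, group2):
    """Rank all values together and return the rank sum for each group."""
    # Label every value with its group, then sort smallest to largest
    pooled = [(value, 1) for value in group1] + [(value, 2) for value in group2]
    pooled.sort(key=lambda pair: pair[0])
    sums = {1: 0.0, 2: 0.0}
    i = 0
    while i < len(pooled):
        j = i
        # Find any run of tied values so they can share an average rank
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j + 2) / 2
        for k in range(i, j + 1):
            sums[pooled[k][1]] += average_rank
        i = j + 1
    return sums[1], sums[2]


# Made-up data: the ranks alternate between groups, so the sums are similar
print(rank_sums([1, 3, 5], [2, 4, 6]))  # (9.0, 12.0)
```

Comparing the raw sums directly only makes sense when the groups are the same size, since a bigger group collects more ranks.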
How are Wilcoxon rank tests reported?
There is a test statistic (W), a p value and an effect size (rank biserial correlation), but no degrees of freedom