T-test Flashcards
What is a z-score?
A z-score quantifies how many standard deviations an observation or data point is away from the mean
z = (x - μ) / σ
x = data point, μ = mean, σ = sd
What are the mean and the sd of a set of z-scores?
The mean will always be 0, and the sd 1.
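A quick check in R. The vector `x` is made up for illustration, and `sd()` is the sample sd (n - 1 in the denominator), used here as a stand-in for σ:

```r
# Hypothetical data, made up for illustration
x <- c(12, 15, 9, 20, 14, 18, 11)

# z-score for every observation: (x - mean) / sd
z <- (x - mean(x)) / sd(x)

mean(z)  # 0, up to floating-point rounding
sd(z)    # 1
```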
What is one benefit of calculating a z-score?
It allows us to compare values on different scales.
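A small sketch of such a comparison. All scores, means and sds below are hypothetical:

```r
# Hypothetical scores and class statistics, made up for illustration
maths_score <- 70; maths_mean <- 60; maths_sd <- 10
essay_score <- 16; essay_mean <- 12; essay_sd <- 2

z_maths <- (maths_score - maths_mean) / maths_sd  # 1 sd above the mean
z_essay <- (essay_score - essay_mean) / essay_sd  # 2 sd above the mean

z_essay > z_maths  # TRUE: the essay score sits further above its class mean
```

The raw scores (70 vs 16) cannot be compared directly, but the z-scores can.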
What is the equation for a one-sample t-test?
t = (sampMean - popMean) / (estimatedPop(sd) / √n)
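The equation can be checked against `t.test()`. The sample `x` and the hypothesised population mean `pop_mean` are assumptions for illustration:

```r
# Hypothetical sample, made up for illustration
x <- c(5.1, 4.8, 5.6, 5.3, 4.9, 5.4, 5.2, 5.0)
pop_mean <- 5  # hypothesised population mean (an assumption)

# t = (sample mean - population mean) / (estimated sd / sqrt(n))
t_manual <- (mean(x) - pop_mean) / (sd(x) / sqrt(length(x)))

# The same statistic as reported by t.test()
t_builtin <- unname(t.test(x, mu = pop_mean)$statistic)

all.equal(t_manual, t_builtin)
```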
What happens to our t-statistic if our mean and pop mean are close together? (one-sample)
The t-statistic moves closer to 0, because the difference between the two means is the numerator of the equation.
What will happen to our t-statistic if n increases?
Its magnitude is likely to increase as well, because a larger n shrinks the standard error (sd / √n) in the denominator.
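A sketch of the effect of n, holding the mean difference and the sd fixed. All values are hypothetical:

```r
# Hypothetical values, held fixed while n changes
mean_diff <- 2   # sample mean minus population mean
est_sd    <- 10  # estimated population sd
n <- c(10, 40, 160)

# t = mean difference / (sd / sqrt(n))
t_stat <- mean_diff / (est_sd / sqrt(n))
round(t_stat, 2)
```

Each time n is quadrupled, t doubles, because n enters the equation through its square root.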
How is a t-distribution calculated, and how does it differ from a normal distribution?
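It arises because the population sd is unknown and has to be estimated from the sample; it can be seen as averaging over lots of possible choices for pop(sd). It differs from a normal distribution in that it has heavier tails, and it gets closer to the normal as n increases.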
Why does the size of the rejection region change as n increases?
Because the t-distribution depends on the sample size (through its degrees of freedom). If the sample size goes down, the distribution looks wider, so the critical values move further out. If it goes up, the distribution becomes taller and narrower.
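The shift in the rejection region can be seen in the critical values. This sketch assumes a two-sided test at alpha = 0.05 and a one-sample test with df = n - 1:

```r
n <- c(5, 10, 30, 100)

# Two-sided critical value at alpha = 0.05, with df = n - 1
t_crit <- qt(0.975, df = n - 1)
z_crit <- qnorm(0.975)  # normal-distribution equivalent

round(t_crit, 2)
round(z_crit, 2)
```

The t critical values shrink towards the normal value as n grows, but stay above it.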
Why aren’t one-sided tests recommended?
Because the direction has to be fixed before seeing the data. Choosing it afterwards could double your chance of a Type I error.
What is the same and what is different about a Welch and a Student t-test?
They both assume that each population is normal and that observations are independently sampled. Student also assumes the populations have equal variance; Welch does not.
Welch is the safer test to run. Student estimates a pooled ‘sd’.
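A side-by-side sketch with `t.test()`, which runs Welch by default. The two groups are hypothetical and deliberately differ in spread:

```r
# Hypothetical groups with unequal spread, made up for illustration
group_a <- c(10, 12, 11, 13, 12, 11)
group_b <- c(8, 15, 5, 18, 9, 14, 3, 16)

welch   <- t.test(group_a, group_b)                    # default: Welch
student <- t.test(group_a, group_b, var.equal = TRUE)  # pooled sd

welch$parameter    # df, adjusted downward for unequal variances
student$parameter  # df = n1 + n2 - 2
```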
Why is there a debate around checking for normality?
Tests are fairly robust to small violations of normality, and with a large enough sample size a normality test is almost always likely to show that something is not normal.
When using a QQ-plot to assess normality, what should it look like and what is it using?
It should look like a scatter-plot where the points represent percentiles (quantiles): sample quantiles plotted against the quantiles expected under a normal distribution. Normality shows up as points falling along a straight line.
Limitation: you have to eyeball a graph.
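A minimal sketch using base R's `qqnorm()` and `qqline()`. The data are simulated and the seed is arbitrary:

```r
set.seed(1)     # arbitrary seed so the example is repeatable
x <- rnorm(50)  # simulated, roughly normal data

qqnorm(x)  # sample quantiles against theoretical normal quantiles
qqline(x)  # reference line; points close to it suggest normality
```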
What is one problem with a Shapiro-Wilk test?
It is often significant if the sample size is large (>50), even when the departure from normality is small.
What does a Shapiro-Wilk test print?
A W-score and a p-value.
A significant p-value means the data are not normal.
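Both values can be pulled out of the result of `shapiro.test()`. The data are simulated and the seed is arbitrary:

```r
set.seed(1)     # arbitrary seed so the example is repeatable
x <- rnorm(30)  # simulated data

result <- shapiro.test(x)
result$statistic  # the W score
result$p.value    # the p-value
```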
When testing the normality assumption, what do you have to remember?
That the assumption is that each group is normally distributed, not just one group, or both groups combined.
Test has to be run on each group individually.
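One way to run the test per group is `tapply()`. The scores and group labels are hypothetical:

```r
# Hypothetical two-group data, made up for illustration
scores <- c(12, 15, 11, 14, 13, 16, 12, 9, 8, 11, 10, 7, 9, 10)
group  <- rep(c("a", "b"), each = 7)

# Run the test once per group, not on the pooled scores
p_by_group <- tapply(scores, group, function(v) shapiro.test(v)$p.value)
p_by_group
```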
What are non-parametric tests? Give example + limitation.
These are tests that don’t make assumptions about the shape (normality) of a distribution, e.g. the Wilcoxon test. Limitation: they are not as powerful, so they can increase the rate of Type II errors.
What does a Wilcoxon test do?
Rather than testing the residuals, it compares each data point in one group with each data point in the other and counts how many times the value from one group is higher. The output is a W-score.
Null: you should expect W = approx. half the possible combinations.
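The count can be reproduced by hand and compared with `wilcox.test()`. The two groups are hypothetical and contain no tied values:

```r
# Hypothetical groups with no tied values, made up for illustration
x <- c(14, 18, 21, 25)
y <- c(10, 12, 17, 19, 23)

# Count the (x, y) pairs in which the x value is higher
w_manual <- sum(outer(x, y, ">"))

# W as reported by wilcox.test()
w_builtin <- unname(wilcox.test(x, y)$statistic)

# Expected W under the null: half of all possible pairs
w_null <- length(x) * length(y) / 2
```

Here W is 14 out of 20 possible pairs, against 10 expected under the null.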
How should a t-statistic stat block look?
t(df) = t_stat, p = p-value
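A sketch of building that block from a `t.test()` result. The sample is hypothetical and the rounding (2 decimals for t, 3 for p) is an assumed convention:

```r
# Hypothetical sample, made up for illustration
x <- c(5.1, 4.8, 5.6, 5.3, 4.9, 5.4, 5.2, 5.0)
res <- t.test(x, mu = 5)

# Assemble "t(df) = t_stat, p = p-value"
stat_block <- sprintf("t(%.0f) = %.2f, p = %.3f",
                      res$parameter, res$statistic, res$p.value)
stat_block
```

A Welch test usually has a non-integer df, so "%.2f" may suit the df better there.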
How do you calculate Cohen’s d?
d = (mean1 - mean2) / sd
Approx. reading of Cohen’s d?
0.2 (small)
0.5 (med)
0.8 (large)
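A sketch of the calculation. The groups are hypothetical, and the pooled sd is only one common choice for the "sd" in the formula (an assumption here):

```r
# Hypothetical groups, made up for illustration
g1 <- c(23, 25, 28, 30, 26, 27)
g2 <- c(20, 22, 25, 24, 21, 23)

n1 <- length(g1)
n2 <- length(g2)

# Pooled sd: one common choice for the "sd" in the formula (an assumption)
sd_pooled <- sqrt(((n1 - 1) * var(g1) + (n2 - 1) * var(g2)) / (n1 + n2 - 2))

d <- (mean(g1) - mean(g2)) / sd_pooled
d  # well above 0.8, so a large effect on the scale above
```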
What code do you need to include to run a Student t-test?
var.equal = TRUE inside t.test() (the default, var.equal = FALSE, runs Welch)
What code to run a one-sided test?
Add alternative = "less" or
alternative = "greater" to t.test()
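A sketch of the argument in use. The two groups are hypothetical:

```r
# Hypothetical groups, made up for illustration
treatment <- c(4.1, 3.8, 4.5, 3.9, 4.0, 3.6)
control   <- c(4.6, 4.9, 4.4, 5.1, 4.7, 4.8)

# alternative = "less": tests whether mean(treatment) < mean(control)
one_sided <- t.test(treatment, control, alternative = "less")
two_sided <- t.test(treatment, control)  # default: "two.sided"

one_sided$p.value
two_sided$p.value
```

Because the difference here falls in the predicted direction, the one-sided p-value is half the two-sided one.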