Week 4: T-Tests and ANOVA Flashcards

1
Q

What is the purpose of a t-test?

A

To compare the means of two groups and determine whether their difference reflects a real difference in the population or occurred by chance.
T-tests quantify how far apart the two means are. In other words, how many standard errors (SEs) is the observed difference in the sample means away from zero?
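
As a rough illustration of "how many SEs away from zero", the sketch below divides the difference in sample means by its standard error. The function name `t_statistic`, the unequal-variance form of the SE, and the data values are assumptions made for this example, not part of the card.

```python
import math
import statistics


def t_statistic(sample_a, sample_b):
    """Difference in sample means expressed in standard-error units."""
    mean_diff = statistics.mean(sample_a) - statistics.mean(sample_b)
    # Standard error of the difference, without assuming equal variances.
    se = math.sqrt(
        statistics.variance(sample_a) / len(sample_a)
        + statistics.variance(sample_b) / len(sample_b)
    )
    return mean_diff / se


# Made-up values, for illustration only.
group_a = [5.1, 4.9, 5.6, 5.8, 5.0]
group_b = [4.2, 4.4, 4.1, 4.8, 4.5]
print(t_statistic(group_a, group_b))
```

A value far from zero means the observed difference is large relative to its sampling variability.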

2
Q

When should the t-distribution be used instead of the z-distribution?

A

When the sample size is small or the population standard deviation (σ) is unknown and must be estimated from the sample

3
Q

What are the three types of t-test?

A

Paired t-test; Independent t-test (equal or unequal variance); One-sample t-test

4
Q

What are degrees of freedom in a t-test?

A

The number of values in a calculation that are free to vary, typically N - 1 for a sample

5
Q

List the steps for hypothesis testing

A
  1. Define H0
  2. Define H1
  3. Choose a significance level (α)
  4. Select and calculate the test statistic
  5. Compare the test statistic to the critical value or p-value
  6. Interpret results
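
The steps above can be sketched for a two-sided one-sample t-test. This is a minimal example under stated assumptions: the function name `one_sample_t_test` and the sample values are hypothetical, and the critical value is passed in by the caller as if read from a t table (2.262 is the two-sided table value for α = 0.05 with 9 df).

```python
import math
import statistics


def one_sample_t_test(sample, mu0, t_critical):
    """Two-sided one-sample t-test against a critical value from a t table."""
    # Steps 1-2: H0: mu = mu0, H1: mu != mu0.
    # Step 3: alpha is fixed by the caller through the choice of t_critical.
    n = len(sample)
    # Step 4: test statistic = (sample mean - mu0) / standard error.
    se = statistics.stdev(sample) / math.sqrt(n)
    t_obs = (statistics.mean(sample) - mu0) / se
    # Step 5: compare the observed statistic with the critical value.
    reject_h0 = abs(t_obs) >= t_critical
    # Step 6: the caller interprets the decision in context.
    return t_obs, reject_h0


# Made-up sample of 10 values, so df = 9.
sample = [5.2, 4.8, 5.5, 5.1, 4.9, 5.6, 5.3, 5.0, 5.4, 5.2]
T_CRITICAL_DF9 = 2.262  # two-sided, alpha = 0.05, df = 9
t_obs, reject_h0 = one_sample_t_test(sample, mu0=5.0, t_critical=T_CRITICAL_DF9)
print(t_obs, reject_h0)
```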
6
Q

What are H0 and H1 for a t-test?

A

H0: μA - μB = 0 (the means of the two groups are the same)
H1: μA ≠ μB (the two samples come from different populations)

7
Q

What assumptions are made for parametric tests like t-test and ANOVA?

A

Normality of data distribution; Homogeneity of variance (equal variance)

8
Q

What is ANOVA used for?

A

To compare the means of three or more groups (e.g., μ1, μ2, μ3) to determine if at least one mean differs significantly

9
Q

What is the F-statistic in ANOVA?

A

The ratio of between-group variance to within-group variance
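
A small sketch of this ratio for a one-way design, assuming each group is a plain list of numbers. The function name `one_way_f` and the data are made up for illustration; the function returns the F value with its two degrees of freedom and does not look up a p-value.

```python
import statistics


def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)
    grand_mean = statistics.mean(all_values)
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Within-group sum of squares: spread of observations around their own group mean.
    ss_within = sum(
        (x - statistics.mean(g)) ** 2 for g in groups for x in g
    )
    df_between = k - 1
    df_within = n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Made-up data: three groups of three observations.
print(one_way_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```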

10
Q

When is a non-parametric test preferred over a parametric?

A
  1. When the sample size is small
  2. When data are non-normal or contain outliers
  3. For analysing ordinal or ranked data
11
Q

Define the critical value for a t-test

A

A threshold that the test statistic must exceed to reject H0 at a given α

12
Q

What is the pooled variance in an independent t-test?

A

A weighted average of the variances of the two groups, used when assuming equal variances
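
One way to see the weighting: each group's sample variance is weighted by its degrees of freedom (n - 1). The sketch below assumes that standard formula; the function names `pooled_variance` and `pooled_t` and the data are illustrative only.

```python
import math
import statistics


def pooled_variance(sample_a, sample_b):
    """Average of the two sample variances, weighted by degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    weighted_sum = (
        (n_a - 1) * statistics.variance(sample_a)
        + (n_b - 1) * statistics.variance(sample_b)
    )
    return weighted_sum / (n_a + n_b - 2)


def pooled_t(sample_a, sample_b):
    """Equal-variance independent t statistic with n_a + n_b - 2 df."""
    n_a, n_b = len(sample_a), len(sample_b)
    se = math.sqrt(pooled_variance(sample_a, sample_b) * (1 / n_a + 1 / n_b))
    return (statistics.mean(sample_a) - statistics.mean(sample_b)) / se


# Made-up data with unequal group sizes.
print(pooled_variance([1, 2, 3], [2, 4, 6, 8]))
print(pooled_t([1, 2, 3], [2, 4, 6, 8]))
```

With unequal group sizes, the larger group contributes more to the pooled estimate.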

13
Q

What does a post-hoc test in ANOVA do?

A

Identifies which specific means differ after finding a significant F-test result (e.g., Tukey post-hoc test)

14
Q

What are the advantages of parametric tests?

A
  1. Greater statistical power for detecting differences
  2. Robust to violations of normality if the sample size is large
15
Q

What are advantages of non-parametric tests?

A
  1. Valid for small sample sizes and non-normal data
  2. Can handle ordinal data and outliers effectively
  3. Assess the median, which can be a better summary for highly skewed distributions
16
Q

What happens if multiple t-tests are used instead of ANOVA?

A

It increases the risk of Type I errors due to the inflation of the overall α
For example, with three independent tests each run at α = 0.05, the probability of making one or more Type I errors is 1 - (1 - 0.05)^3 ≈ 0.14
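
The arithmetic generalises to any number of tests. A minimal sketch, assuming the tests are independent; the function name `familywise_error_rate` is made up for this example.

```python
import math


def familywise_error_rate(alpha, n_tests):
    """Probability of at least one Type I error across independent tests."""
    return 1 - (1 - alpha) ** n_tests


# Three tests at alpha = 0.05, as on the card.
print(round(familywise_error_rate(0.05, 3), 2))  # 0.14

# Comparing every pair among 5 groups takes C(5, 2) = 10 t-tests.
n_pairwise = math.comb(5, 2)
print(n_pairwise, familywise_error_rate(0.05, n_pairwise))
```

The error rate grows quickly as the number of pairwise comparisons rises.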

17
Q

What is the difference between z and t distribution?

A

t is more spread out than Z (heavier tails)
The sampling distribution of t depends on n (and has n - 1 df). The larger the sample used to estimate σ (df → ∞), the closer the t distribution is to the normal distribution (z statistic)
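
The heavier tails can be checked numerically by comparing densities at a point far from the centre. This sketch uses the standard density formulas for Student's t and the standard normal; the function names and the choice of x = 3 are assumptions for illustration.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    # Work on the log scale so large df values do not overflow the gamma function.
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def normal_pdf(x):
    """Density of the standard normal (z) distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


# Tail density at x = 3 shrinks toward the normal value as df grows.
for df in (2, 10, 100, 1000):
    print(df, t_pdf(3.0, df), normal_pdf(3.0))
```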

18
Q

All other factors being equal, is a large mean difference (d̄) more or less likely to occur by chance than a small mean difference?

A

Less

19
Q

Given the same d̄, are groups with smaller SDs more or less likely to report a significant difference than groups with larger SDs?

A

More

20
Q

What’s the relationship between t-tests and sample sizes?

A

The more subjects, the more confident we can be that the differences we find did not occur by chance

21
Q

In a t-test, what do we need to assess how likely the difference between means is to be “real”?

A
  • The mean difference, d̄
  • The SD for each group
  • The number of subjects in each group
22
Q

When is there strong evidence against H0 for a t-test?

A

t_obs ≥ t_critical at the chosen α (for a two-sided test, compare |t_obs| with t_critical)

23
Q

What is a paired (or dependent) t-test?

A

Tests the difference between the average scores of a single sample of individuals who are assessed at two different times or on two different measures. It can also compare the average scores of individuals who are paired on a particular characteristic.
In the population, the two means measured at different times might either:
- be the same (μt1 = μt2 → μt1 – μt2 = δ = 0)
- be different (μt1 ≠ μt2; μt1 – μt2 < 0 or μt1 – μt2 > 0)
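
Because each individual contributes two scores, the paired test reduces to a one-sample test on the within-pair differences. A minimal sketch, with a hypothetical function name `paired_t` and made-up scores:

```python
import math
import statistics


def paired_t(time1, time2):
    """Return (t, df) for paired scores, using differences d = time1 - time2."""
    if len(time1) != len(time2):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(time1, time2)]
    n = len(diffs)
    # Standard error of the mean difference.
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1


# Made-up scores for four individuals measured twice.
print(paired_t([10, 12, 14, 16], [9, 10, 13, 13]))
```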

24
Q

What is an independent t-test?

A

Compares the means of two samples selected independently of each other (i.e., subjects in the two groups are not the same). There are then two types of independent t-tests: equal/pooled variance and unequal/separate variance.
In the population, the two means measured among two different groups might either:
- be the same (μA = μB → μA – μB = δ = 0)
- be different (μA ≠ μB; μA – μB < 0 or μA – μB > 0)

25
Q

What do we assume with t-tests?

A

Under H0, we assume there is no difference between the means.
The question is then: is the difference in the sample means (d̄ = x̄t1 – x̄t2) significantly different from zero?

26
Q

What is the difference between one- and two-way ANOVA?

A

One-way: One characteristic/variable to define three or more groups of data
Two-way: Considers two independent variables

27
Q

What are the null and alternative hypotheses for ANOVA?

A

H0: μ1 = μ2 = … = μg (g ≥ 3)
H1: not all population means are identical, i.e., at least one mean differs from the others

28
Q

Why consider variances with ANOVA?

A

The “relative location” of the group means can be identified more easily from the variance among the group means than by comparing many group means directly (especially when the number of groups is large)
The ANOVA method assesses the relative size of variance among group means (between-group variance) compared to the average variance within groups (within-group variance)

29
Q

What is the main interest of ANOVA? Why?

A

The ratio of between-group variance to within-group variance. If the ratio is greater than expected by chance, we may conclude that at least one mean is different.
When between-group variances are the same, mean differences among groups seem more distinct in the distributions with smaller within-group variances compared to those with larger within-group variances

30
Q

What distribution is the F statistic known to follow?

A

An F distribution with:
df1 = (number of groups - 1)
df2 = (n - number of groups)
To test our hypothesis, we compare the calculated F value with the critical value at the chosen α level (e.g., 0.05) in the relevant F table

31
Q

What does a larger F value imply?

A

The differences among the group means are large relative to the variation of the individual observations within each group
If the F value is larger than the critical value, there is strong evidence against H0: the differences between group means are larger than would be expected by chance if H0 were correct, so at least one group mean differs from the others

32
Q

What are the non-parametric alternatives to t-test and ANOVA?

A

T-test: Mann-Whitney U test (independent samples); Wilcoxon signed-rank test (paired samples)
ANOVA: Kruskal-Wallis test
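
To give a feel for how a rank-based test works, the sketch below computes the Mann-Whitney U statistic by direct pair counting: each (a, b) pair scores 1 if a > b and 0.5 if tied. The function name `mann_whitney_u` and the data are illustrative, and the sketch stops at the statistic without deriving a p-value.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b), counting pairs in which one sample's value is larger."""
    u_a = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in sample_a
        for b in sample_b
    )
    # The two U values always sum to the number of pairs.
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Made-up data: complete separation gives U = 0 for the lower group.
print(mann_whitney_u([1, 2, 3], [4, 5, 6]))  # (0.0, 9.0)
```

Only the ordering of the values matters here, which is why outliers have limited influence.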

33
Q

How do you decide whether to use a parametric or non-parametric test?

A

If the sample size is tiny, you may be forced to use a non-parametric test.
For larger samples, if the mean is a better measure of central tendency for the distribution of the data, a parametric test is usually better. If the median is a better measure, consider a non-parametric test (regardless of the larger sample size).
If some assumptions are violated, run both and compare the results.