Midterm 1 Flashcards

1
Q

What are the null hypothesis and alternative hypothesis for a one-way ANOVA?

A

H0: μ1 = μ2 = μ3 (all group means are equal; shown here for three groups)
H1: Not all μ’s are the same

2
Q

What’s a factor in a one-way ANOVA?

A

the independent variable

3
Q

What are the levels in a one-way ANOVA?

A

The different groups/treatment and control conditions

4
Q

What are the assumptions of a one-way ANOVA?

A
  • The population distribution of the DV is normal within each group
  • The variances of the population distributions are equal across groups (homogeneity of variance assumption)
  • Independence of observations
5
Q

What’s the familywise Type 1 error rate?

A

The probability of making at least one Type 1 error in the family of tests if the null hypotheses are true
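As a rough illustration of why the familywise rate grows with the number of tests, the sketch below assumes the tests are independent and all null hypotheses are true, in which case the rate is 1 - (1 - α)^m for m tests. The function name is hypothetical, and the independence assumption is not stated on the card.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """P(at least one Type 1 error) across n_tests tests.

    Assumes independent tests and that every null hypothesis is true.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


# The rate climbs quickly as the family of tests gets larger.
for m in (1, 3, 10):
    print(m, round(familywise_error_rate(0.05, m), 3))
```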

6
Q

What’s a family of tests?

A

a set of related hypotheses

7
Q

What does the Overall F-test or first test of ANOVA tell us?

A
  • The overall F-test evaluates whether H0 (all group means are equal) is false
  • If the overall F-test is significant then we use post-hoc tests to look at pairs of groups
8
Q

What kind of ratio does ANOVA give us?

A
  • F ratio
  • ANOVA gives us a ratio of variance due to group membership over variance that is not explained by group membership (MSm divided by MSr)
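The ratio MSm / MSr can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for a statistics package; the function name and the example scores are made up for the demonstration.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (MSm, MSr, F) for a list of groups of scores."""
    scores = [x for g in groups for x in g]
    grand_mean = mean(scores)
    k = len(groups)    # number of groups
    n = len(scores)    # total number of observations

    # Between-group (model) sum of squares: group means vs. grand mean.
    ss_model = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group (residual) sum of squares: scores vs. their own group mean.
    ss_resid = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_model = ss_model / (k - 1)   # dfM = k - 1
    ms_resid = ss_resid / (n - k)   # dfR = N - k
    return ms_model, ms_resid, ms_model / ms_resid


# Hypothetical scores for three groups.
ms_m, ms_r, f_ratio = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(ms_m, ms_r, f_ratio)
```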
9
Q

What is variance explained by the model (MSm)?

A

Between-group variance that is due to the IV, or different treatments/levels of a factor -> variance accounted for by group membership

10
Q

What is residual variance (MSr)?

A
  • Within-group variance that can’t be accounted for by group membership
  • Within each group, there is some random variation in the scores for the subjects
11
Q

How are the F statistic and degrees of freedom presented?

A

F(dfM, dfR) = x, where x is the observed value of F

12
Q

What kind of distribution is the F distribution?

A

A right-skewed distribution used most commonly in ANOVA

13
Q

When can you reject the null hypothesis in an ANOVA test?

A

If your F value is greater than or equal to the critical value, you may reject the null hypothesis

14
Q

How does the F ratio relate to the t statistic?

A
  • With only two groups, either a t test or an F test can be used for testing for a significant difference between means
  • Both procedures lead to the same conclusion
  • When the number of groups is 2, then F = t^2
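The F = t^2 relationship can be checked numerically. The sketch below assumes the t test is the pooled-variance (equal variances) independent-samples version; the helper names and the scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Independent-samples t statistic with pooled variance."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / n1 + 1 / n2))


def anova_f(groups):
    """One-way ANOVA F ratio (MSm / MSr)."""
    scores = [x for g in groups for x in g]
    grand_mean = mean(scores)
    ss_model = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_resid = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_model / (len(groups) - 1)) / (ss_resid / (len(scores) - len(groups)))


group_a, group_b = [1, 2, 3], [2, 3, 4]
t_value = pooled_t(group_a, group_b)
f_value = anova_f([group_a, group_b])
print(t_value ** 2, f_value)  # the two values should match
```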
15
Q

In the ANOVA formula, what does X-bar stand for?

A

The grand mean (across all observations)

16
Q

In the ANOVA formula, what does i stand for?

A

An observation (coming from N total observations)

17
Q

In the ANOVA formula, what does g stand for?

A

A group

18
Q

In the ANOVA formula, what does k stand for?

A

Total number of groups

19
Q

In the ANOVA formula, what does Ng stand for?

A

Size of group g

20
Q

In the ANOVA formula, what does Xbar-g stand for?

A

Group mean

21
Q

In the ANOVA formula, what does Xig stand for?

A

Observation i in group g

22
Q

What does SSt stand for?

A

The total sum of squares: the aggregate variation/dispersion of individual observations around the grand mean, across all groups

23
Q

What are MST, MSM, and MSR often called?

A

the total, model (between-group), and residual (within-group) Mean Squares, respectively

24
Q

Which effect size is more commonly reported in ANOVA?

A

η² (eta squared)

25
Q

What do the effect sizes (Pearson’s r squared, eta squared, and omega squared) all look for?

A

Proportion of variance in the DV that is explained by the IVs

26
Q

What’s the difference between eta squared and omega squared?

A
  • η² is positively biased (it overestimates the amount of variance in the DV explained by the IVs)
  • ω² corrects for this bias and is treated as (approximately) unbiased
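To make the difference concrete, the sketch below computes both effect sizes for a one-way design, assuming the usual formulas η² = SSm / SSt and ω² = (SSm - dfM × MSr) / (SSt + MSr). These formulas are not spelled out on the cards, and the function name and scores are hypothetical.

```python
from statistics import mean


def effect_sizes(groups):
    """Return (eta squared, omega squared) for a one-way between-subjects design."""
    scores = [x for g in groups for x in g]
    grand_mean = mean(scores)
    ss_total = sum((x - grand_mean) ** 2 for x in scores)
    ss_model = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_resid = ss_total - ss_model
    df_model = len(groups) - 1
    df_resid = len(scores) - len(groups)
    ms_resid = ss_resid / df_resid

    eta_sq = ss_model / ss_total
    # Omega squared shrinks the estimate by removing the part of SSm
    # that would be expected from error alone.
    omega_sq = (ss_model - df_model * ms_resid) / (ss_total + ms_resid)
    return eta_sq, omega_sq


eta_sq, omega_sq = effect_sizes([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(eta_sq, 2), round(omega_sq, 2))
```

With these made-up scores ω² comes out smaller than η², which is the direction of the correction described above.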
27
Q

What are the cut-offs for the effect size of omega squared?

A
  • Small ≈ .01
  • Medium ≈ .06
  • Large ≈ .14
  • Report ω², even if it’s negative
28
Q

What does fully-crossed mean in a factorial design?

A

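That every level of each factor is combined with every level of the other factor(s), so the number of treatment conditions is the product of the numbers of levels (ex: if factor 1 has 3 levels and factor 2 has 3 levels, it’s a 3x3 factorial design with 9 treatment conditions)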
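A fully-crossed design can be enumerated as a Cartesian product of the factor levels. The factor names and levels below are invented purely for illustration.

```python
from itertools import product

# Two hypothetical factors with three levels each.
dose = ["low", "medium", "high"]
therapy = ["none", "group", "individual"]

# Every level of one factor paired with every level of the other.
conditions = list(product(dose, therapy))
print(len(conditions))  # 3 x 3 = 9 treatment conditions
```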

29
Q

What elements should be included in the APA style analysis conclusion (in order)?

A
  1. 1-2 sentence overview of analyses that includes the independent and dependent variable, stated conceptually.
  2. Description of overall results of F -test, in a particular format, including effect size measure
  3. Description of the pattern of mean differences among groups, including whether significant differences were found (M for mean and SD for standard deviation) -> in an ANOVA with 3 or more groups, we’ll have to conduct post-hoc tests to evaluate which pairs of groups have significant mean differences
  4. A conceptual conclusion
30
Q

Provide an example of what elements should be included in the APA style analysis conclusion (in order)?

A
  1. To investigate whether level of fitness (low versus high) had an effect on ego strength (with higher scores indicating more ego strength), we conducted a one-way between-subjects ANOVA
  2. This analysis revealed a significant effect of fitness on ego strength, F(1, 8) = 5.32, p < .05, ω² = .61
  3. Participants in the low fitness group (M = 4.40, SD = 0.92) had significantly lower ego strength than those in the high fitness group (M = 6.36, SD = 0.55)
  4. We conclude that having high as opposed to low fitness may increase ego strength
31
Q

How to report numbers in APA format?

A
  • 2 decimal places
  • 3 decimal places for p-values
32
Q

True or False: with two groups the results of an independent samples t-test and a between-subjects ANOVA on the same data set will always agree

A

FALSE: they could disagree if they use different values of α

33
Q

What are assumptions of a single mean z-test?

A
  • The variable, X, in the population is normally distributed
  • The sample must be a simple random sample of the population (independence of observations)
  • The population standard deviation, σ, must be known
34
Q

What are the effect size cut-offs for r?

A

0.10 -> small effect
0.30 -> medium effect
0.50 -> large effect

35
Q

What does a 95% Confidence interval mean?

A

If we repeated our experiment many times and computed a 95% CI each time, about 95% of those intervals would contain the true effect

36
Q

What does the p-value represent?

A

The p-value represents the proportion of data sets that would yield a result as extreme or more extreme than the observed result if H0 is true

37
Q

What are the effect size cut-offs for r squared?

A

0.01 -> small
0.09 -> medium
0.25 -> large

38
Q

What are the effect size cut-offs for cohen’s d?

A

0.2 -> small
0.5 -> medium
0.8 -> large

39
Q

What are the assumptions in between subjects ANOVA?

A
  1. Independence of observations
  2. Identical distribution (within group)
  3. Identical distribution (between groups)
  4. Homogeneity of variance
  5. Normal Distribution
40
Q

Describe the formula Yij = μ + αj + Eij

A
  • Formula describing the linear model underlying everything we do in ANOVA
  • Yij = person i’s score on the outcome Y, where person i belongs to group j -> Y is the dependent variable
  • Eij -> experimental error - something that allows individual scores of people in that population to vary from this group mean (assumed to be normal)
  • Eij is random, but mu + alpha-j is fixed for every member of that population
  • In this equation, mu + alpha-j is constant for every person in the population (one population = one mean)
41
Q

The assumptions about normality and equal variances are assumptions about what?

A
  • The population
  • Usually we can examine the sample for evidence about whether these assumptions hold
42
Q

What are some methods for Assessing Normality?

A

Descriptive and Inferential Statistics:
- Looking at the mean, median, mode
- Tests for skewness (testing whether skewness is significant -> normal distribution has skew of 0, any type of skewness means that the distribution isn’t perfectly normal)
- Kolmogorov-Smirnov and Shapiro-Wilk tests

Visual methods:
- Histograms
- Normal Quantile (Q-Q) Plot

43
Q

Describe tests for skewness when assessing normality

A
  • Skewness measures asymmetry, i.e., whether the distribution has a long tail in one direction
  • Left (negative) skew = Mean < Median
  • Symmetric (normal) = Mean = Median
  • Right (positive) skew = Median < Mean
  • Skewness should be ~0
    > 0 - positive/right skew (longer right-hand tail)
    < 0 - negative/left skew (longer left-hand tail)
  • Also look at standard errors (SE skewness)
  • Conducting a significance test for whether skewness is significantly different from 0
  • To compute this, we will get an estimate of skewness of our variable, divided by the standard error, and then compare this against a value of 3.2 in absolute value
  • Reject the null hypothesis that skew is 0 in the population if the ratio t_skewness (skewness divided by its SE) is greater than 3.2 in absolute value
  • Here we don’t want to reject the null hypothesis because rejecting it would mean we have found evidence that our scores aren’t normally distributed
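The skewness ratio described above can be sketched as follows. The cards do not say which skewness estimator or standard error formula is used; this sketch assumes the adjusted Fisher-Pearson estimator and the standard error sqrt(6n(n-1) / ((n-2)(n+1)(n+3))) that statistics packages commonly report. The function name and data are hypothetical, and the 3.2 cut-off is taken from the card.

```python
from math import sqrt
from statistics import mean


def skewness_ratio(scores):
    """Return (skewness, SE of skewness, skewness / SE) for a sample."""
    n = len(scores)
    m = mean(scores)
    m2 = sum((x - m) ** 2 for x in scores) / n   # second central moment
    m3 = sum((x - m) ** 3 for x in scores) / n   # third central moment
    g1 = m3 / m2 ** 1.5
    skew = g1 * sqrt(n * (n - 1)) / (n - 2)      # small-sample adjustment (assumed)
    se = sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    return skew, se, skew / se


skew, se, ratio = skewness_ratio([1, 1, 2, 2, 3, 10])
# Compare |ratio| with the 3.2 cut-off from the card.
print(round(skew, 2), round(se, 2), abs(ratio) > 3.2)
```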
44
Q

What’s the more unbiased estimate of central tendency?

A

Median, rather than the mean (the median is less affected by skew and outliers)

45
Q

What are the statistical tests of normality?

A
  • The Kolmogorov-Smirnov (K-S) test
  • The Shapiro-Wilk (S-W) test
  • If a test is significant, reject the null hypothesis that the distribution of the variable is normal
46
Q

What’s the Kolmogorov-Smirnov (K-S) test?

A
  • Very general, but usually less power than Shapiro-Wilk (S-W) test
  • Conceptually, compares sample scores to a set of scores generated from e.g., a normal distribution with the sample mean and standard deviation
  • Used to see if the scores on your variable follow any distribution you think they follow
  • Conceptually, this test takes your observed scores on the variable and compares them to quantiles from the reference distribution whose fit to your data you are assessing
  • Large departures between your observed scores and the quantiles of the reference distribution are evidence against your scores following the distribution you think they follow
47
Q

What’s the Shapiro-Wilk (S-W) test?

A
  • Usually more powerful, but only for normal distributions
  • Follows a similar logic to the Kolmogorov-Smirnov (K-S) test
48
Q

What are limitations of the normality tests and solutions to overcome these?

A
  • It’s easy to find significant results (reject null hypothesis that data is normal) when sample size is large
  • Same with skewness tests -> as the sample size gets larger, SE gets smaller and with smaller SE, you’re more likely to get a t ratio value larger than 3.2, even with small values of skewness
  • Solution: do the tests, but plot data as well and examine the histogram for evidence of multimodality, extreme scores (outliers), and asymmetry
  • More than one mode is evidence of deviation from normality
49
Q

Describe the use of histograms to assess normality

A
  • Create separate histograms for each group to assess normality
  • Look for obvious signs of non-normality
  • Doesn’t have to be perfect, just roughly symmetric
  • Multiple modes may suggest that there are different subpopulations in the sample
  • If that’s the case, include a classification variable as an additional factor in the ANOVA
50
Q

Describe the use of normal quantile plot (or normal probability plot or Normal Q-Q plot) to assess normality

A
  1. Compute percentile rank for each score
    - Sort observations from smallest to largest
    - What percentage of scores are below score X?
  2. Calculate (theoretical or expected) z-scores from percentile rank
    - If the scores were normal, what would the z-score be?
  3. Calculate actual z-scores
  4. Plot the observed vs. theoretical z-scores
    - We get some percentiles from the z-distribution and we see how much our observed z-scores deviate from the percentiles from the normal distribution
    - If the data are close to normal, then the points will lie close to a straight line
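The steps above can be sketched with Python's standard library. The percentile rank here uses the plotting position (rank - 0.5) / n, which is one common convention and an assumption on my part; the function name and scores are hypothetical, and the actual plotting is left out.

```python
from statistics import NormalDist, mean, stdev


def normal_qq_points(scores):
    """Return (theoretical z, observed z) pairs for a normal Q-Q plot."""
    ordered = sorted(scores)                  # step 1: sort the observations
    n = len(ordered)
    m, s = mean(ordered), stdev(ordered)
    points = []
    for rank, x in enumerate(ordered, start=1):
        p = (rank - 0.5) / n                  # percentile rank (assumed convention)
        theoretical_z = NormalDist().inv_cdf(p)   # step 2: expected z if normal
        observed_z = (x - m) / s                  # step 3: actual z-score
        points.append((theoretical_z, observed_z))
    return points                             # step 4: plot these pairs


points = normal_qq_points([4.1, 5.0, 5.2, 5.9, 6.3, 7.4])
for theoretical_z, observed_z in points:
    print(round(theoretical_z, 2), round(observed_z, 2))
```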
51
Q

What do violations of the assumption of normality lead to?

A
  • Non-normality tends to produce Type I error rates that are lower than the nominal value
  • Depending on the context of the research study, this may be less concerning than an assumption violation that results in excessive Type I error rates (above the nominal value α)
  • When we select an alpha of say .05, we’re saying that if the null hypothesis is true, 5% of our findings in the long run will be false positives
  • If you don’t meet the assumption of normality and you pick an alpha level of .05 -> less than 5% of your results in the long run will be false positives if the null hypothesis is true
  • This means you have lower power to detect differences if there is an effect in the population
  • A consequence of the violation of the assumption of normality is that you might miss some effects (not inflating type 1 error rate but you are decreasing your power)
52
Q

Type 1 error rate and what go hand in hand?

A

Type 1 error rate and power go hand in hand (as one increases so does the other)

53
Q

What’s the assumption of homogeneity of variance?

A

Assuming that all of the group variances are equal

54
Q

What does violation of the assumption of homogeneity of variance lead to?

A
  • Serious violation of this assumption tends to inflate the observed value of the F statistic
  • Too many rejections of H0 = high Type I error
  • This is a more problematic assumption because if you violate this assumption, you will inflate your type 1 error rates
  • If you select an alpha of .05, but your assumption of homogeneity of variance is not met, you may end up with more than 5% of false positives if the null hypothesis is true
55
Q

What are the different tests that assess homogeneity of variance?

A
  • The Fmax test of Hartley
  • Levene’s test
  • Brown and Forsythe test
56
Q

What’s the Fmax test of Hartley?

A
  • Fmax = ratio of largest group variance to the smallest group variance
  • Calculate the sample variance for each group, and find the largest and smallest variances
  • Compute Fmax:
    Fmax = max(s²g) / min(s²g′), i.e., the largest group variance divided by the smallest
  • The observed Fmax value is compared against a critical value of this statistic
  • If the assumption of homogeneity of variance is satisfied, Fmax ratio would be close to 1
  • If the observed value of Fmax exceeds the critical value, we conclude that we have to reject the null hypothesis and the assumption is not met
  • Easy to compute, but assumes that each group has an equal number of observations
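A minimal sketch of the Fmax computation, using sample variances (n - 1 in the denominator). The function name and scores are hypothetical, and the critical value still has to be looked up in an Fmax table, which is not reproduced here.

```python
from statistics import variance


def hartley_fmax(groups):
    """Ratio of the largest to the smallest sample variance across groups."""
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)


# Three equal-sized groups; the second is more spread out than the others.
fmax = hartley_fmax([[1, 2, 3], [2, 4, 6], [5, 6, 7]])
print(fmax)
```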
57
Q

What’s Levene’s test?

A
  • Measures how much each score deviates from its group mean
    Zij = |Yij − Ybarj|, where Ybarj is the mean of group j
  • Instead of using the original scores Yij to run the ANOVA, you use the absolute deviation scores Zij
  • If we retain the null hypothesis, we can conclude that the assumption of homogeneity of variance is met
  • The downside of this test is that it’s easier to obtain a significant F-ratio for this ANOVA when your sample size is large
58
Q

What’s the Brown-Forsythe test?

A
  • It measures how much each score deviates from its group median
  • The median is less influenced by outliers than the mean and isn’t pulled toward the tail of a skewed distribution
  • Zij = |Yij − Mdj|, where Mdj is the median of group j
  • Instead of using the original scores Yij to run the ANOVA, you use the absolute deviation scores Zij
  • For both the Levene and Brown-Forsythe tests a statistically significant finding (e.g., p ≤ .05) leads to the conclusion that the variances are significantly different across groups (i.e., the assumption of homogeneity of variance is not met)
  • The Brown-Forsythe test is slightly more robust than Levene’s test
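Both tests can be sketched as an ordinary one-way ANOVA run on absolute deviation scores, differing only in the center used (group mean for Levene, group median for Brown-Forsythe). This is an illustration with hypothetical function names and data; it returns only the F ratio, so the p-value would still come from the F distribution with (k - 1, N - k) degrees of freedom.

```python
from statistics import mean, median


def anova_f(groups):
    """One-way ANOVA F ratio (MSm / MSr)."""
    scores = [x for g in groups for x in g]
    grand_mean = mean(scores)
    ss_model = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_resid = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_model / (len(groups) - 1)) / (ss_resid / (len(scores) - len(groups)))


def deviation_test_f(groups, center):
    """ANOVA F on Zij = |Yij - center of group j|."""
    z_scores = [[abs(x - center(g)) for x in g] for g in groups]
    return anova_f(z_scores)


groups = [[1, 2, 3], [0, 5, 10]]            # second group is far more variable
levene_f = deviation_test_f(groups, mean)
brown_forsythe_f = deviation_test_f(groups, median)
print(round(levene_f, 2), round(brown_forsythe_f, 2))
```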
59
Q

For both the Levene and Brown-Forsythe tests a statistically significant finding (e.g., p ≤ .05) leads to what conclusion?

A

That the variances are significantly different across groups (i.e., the assumption of homogeneity of variance is not met)

60
Q

Which test is recommended more than the other: Brown-Forsythe test or Levene’s test?

A

The Brown-Forsythe test is recommended over Levene’s test

61
Q

What are the 5 assumptions in ANOVA?

A
  • Independence of observations (random sampling)
  • Identical distribution (within groups) (random sampling)
  • Identical distribution (between groups)
  • Homogeneity of variance
  • Normal distribution