PSCH 443 - Final Flashcards

1
Q

Basic Logic of NHST

A
  1. Null hypothesis significance testing evaluates the probability of observing the data under the assumption that the null hypothesis is true.
  2. If we assume the null is true, we can generate a sampling distribution that characterizes the sample means we expect to observe.
  3. From the mean and expected variance of that distribution, we can estimate how far an observed sample mean should fall from the population mean due to sampling error alone.
  4. We can use this difference to determine the likelihood of seeing a real effect versus a difference produced by sampling error alone.
2
Q

Grand Mean (GM)

A

The combined mean of all the group or condition means used in the ANOVA.

  • We can take the average squared deviation of each group mean from the grand mean
  • This estimates the population variance
  • It estimates the distribution we would expect over an infinite number of samples
3
Q

Post Hoc Tests

A
  • Used when we do not have a theoretical basis to expect any particular differences b/w groups or conditions
  • More conservative measures of differences b/c they are not guided by theory
  • Used when there is a significant omnibus F stat, but no specific differences b/w groups were originally predicted
4
Q

Two categories of Post Hoc Tests

A

Fall into 2 broad categories:

  1. Adjusting type I error rate to accommodate multiple comparisons
  2. Calculating new and more conservative test statistic
5
Q

When using Bonferroni correction:

A
  1. Calculate a new alpha
  2. Take the desired familywise error rate for the experiment (e.g., 0.05) and divide it by the # of comparisons
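As a minimal sketch (the function name is mine, not from the course), the correction is just a division:

```python
# Bonferroni correction: divide the desired familywise alpha by the
# number of comparisons to get the per-comparison alpha.
def bonferroni_alpha(familywise_alpha, n_comparisons):
    return familywise_alpha / n_comparisons

# e.g., 3 pairwise comparisons at a familywise alpha of .05:
print(bonferroni_alpha(0.05, 3))  # ≈ 0.0167
```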

6
Q

When using a Tukey HSD we:

A
  1. Calculate a new test statistic that represents the mean difference that must be exceeded for a comparison to be statistically significant
  2. Assumes we want to compare all means
  3. Uses 0.05 as an arbitrary cut-off
7
Q

Planned Comparisons

A
  • Planned b/c they should be guided by theory; # of planned comparisons is generally small relative to the # of conditions b/c this reduces family wise error by default
  • Tests are only made b/w a few groups that have key differences as opposed to there being several tests across several conditions
  • Uses the error term from the omnibus F test, or the Within Groups Mean Squares
8
Q

The two types of planned comparisons:

A

2 types:

Pairwise – analyze simple differences b/w 2 means
Complex – analyze the difference b/w sets of means

9
Q

What do we do in complex comparisons:

A

In complex comparisons, we need to come up w/ contrast weights: coefficients by which each sample mean is weighted

  1. Choose sensible comparisons
  2. Groups with positive weights will be compared to those with negative weights.
  3. The sum of the weights should always be zero.
  4. Groups not involved in a comparison always get a coefficient equal to zero
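The rules above can be sketched in Python; the means and weights below are hypothetical:

```python
# Apply contrast weights to a set of condition means. The weights follow
# the rules above: positive groups are compared to negative groups, the
# weights sum to zero, and uninvolved groups get a weight of zero.
def contrast(means, weights):
    assert abs(sum(weights)) < 1e-9, "weights must sum to zero"
    return sum(w * m for w, m in zip(weights, means))

means = [10.0, 12.0, 15.0, 11.0]
weights = [0.5, 0.5, -1, 0]   # groups 1 & 2 combined vs. group 3
print(contrast(means, weights))  # 11.0 - 15.0 = -4.0
```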
10
Q

Family-wise Type I Error

A

Inflated probability of making a type I error based on the greater # of tests performed

  1. Reflects that multiple tests are independent and each has its own probability of committing a type I error (or of incorrectly rejecting the null)
  2. Multiply the probability of avoiding a type I error (1 - alpha) across all tests performed
  3. Subtract that product from 1 to get the probability of committing at least one type I error across the family of tests
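The underlying formula, familywise error = 1 - (1 - alpha)^c for c independent tests, can be sketched in Python (the function name is hypothetical):

```python
# Familywise type I error rate for c independent tests at a given alpha.
def familywise_error(alpha, n_tests):
    # P(at least one type I error) = 1 - P(no type I error on any test)
    return 1 - (1 - alpha) ** n_tests

print(familywise_error(0.05, 3))   # ≈ 0.143, well above the nominal .05
```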
11
Q

ANOVA as Regression

A

Can be understood as:

(Systematic Variation + Unsystematic Error) / Unsystematic Error
  • Both ANOVA and regression try to explain variability, although model estimation differs
  • ANOVA focuses on categorical predictor variables
  • If mean differences are larger than what we expect due to chance (error), the value of the F stat increases
  • The systematic variance our model explains = the effect of our IV on our DV
12
Q

Partition of ANOVA

A
  1. SSt represents the overall variability we are trying to explain
  2. Partitioned into SSm (variance accounted for) and SSr (variance unaccounted for)
  3. Unsystematic variance cannot be explained in any meaningful way using ANOVA models
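The partition SSt = SSm + SSr can be checked on toy data (group names and scores below are hypothetical):

```python
# Partition the total sum of squares into model and residual components.
groups = {
    "control": [3, 4, 5],
    "treatment": [6, 7, 8],
}
all_scores = [x for g in groups.values() for x in g]
grand_mean = sum(all_scores) / len(all_scores)

# SSt: squared deviation of every score from the grand mean
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

# SSm: squared deviation of each group mean from the grand mean,
# weighted by group size (variance the model accounts for)
ss_model = sum(len(g) * ((sum(g) / len(g)) - grand_mean) ** 2
               for g in groups.values())

# SSr: squared deviation of each score from its own group mean
ss_resid = sum((x - sum(g) / len(g)) ** 2
               for g in groups.values() for x in g)

print(ss_total, ss_model, ss_resid)  # SSt equals SSm + SSr
```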
13
Q

Dummy Coding ANOVA for a regression analysis

A
  1. We enter all dummy codes in one block
  2. The comparison (reference) group is given a value of 0 on every dummy variable
  3. Each other group is given a value of 1 on its own dummy variable and 0 on the rest
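A minimal sketch of this coding scheme, with hypothetical group labels:

```python
# Build dummy codes: one dummy variable per non-reference group.
# The comparison (reference) group is coded 0 on every dummy.
def dummy_code(labels, reference):
    levels = [g for g in sorted(set(labels)) if g != reference]
    return [[1 if lab == lev else 0 for lev in levels] for lab in labels]

labels = ["placebo", "low", "high", "placebo"]
# columns are ["high", "low"]; "placebo" rows are all zeros
print(dummy_code(labels, reference="placebo"))
```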
14
Q

Orthogonal comparisons

A
  • Info given by each comparison is independent of the other comparisons run on the data
  • The sum of the products of the weights for any two comparisons has to equal 0 to maintain independence
  • Does not inflate familywise error b/c outcomes are treated independently, so the tests' type I error probabilities do not overlap
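One common way to check orthogonality is that the sum of the products of two comparisons' weight vectors is zero; a Python sketch with hypothetical weights:

```python
# Two contrasts are orthogonal when the sum of the products of their
# weights is zero (each contrast's own weights already sum to zero).
def is_orthogonal(w1, w2):
    return sum(a * b for a, b in zip(w1, w2)) == 0

c1 = [1, -1, 0]       # group 1 vs. group 2
c2 = [0.5, 0.5, -1]   # groups 1 & 2 combined vs. group 3
print(is_orthogonal(c1, c2))  # True
print(is_orthogonal(c1, [1, 0, -1]))  # False: overlapping information
```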
15
Q

Assumptions of ANOVA

A
  1. Normality
    - The distributions of the residuals are normal
  2. Homogeneity of variance
    - Variances should be roughly equal across groups
    - Spread is roughly the same across levels, so there is about equal random error
  3. Independence of observations
    - The errors are independent of one another; one observation does not influence another
16
Q

Eta-squared

A

The most generally accepted measure of effect size

  • Will be = to R2 in a one-way ANOVA
  • Tends to overestimate the effect size in the population
  • Related to statistical power: power is the complement of type II error, or the likelihood that we will detect a significant effect when one exists
  • A larger effect size = better chance of detecting the effect
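As a quick sketch, eta-squared is simply the model sum of squares divided by the total sum of squares (the values below are hypothetical):

```python
# Eta-squared: proportion of total variability the model accounts for.
def eta_squared(ss_model, ss_total):
    return ss_model / ss_total

# e.g., a model sum of squares of 13.5 out of a total of 17.5:
print(eta_squared(13.5, 17.5))  # ≈ 0.771
```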
17
Q

Factorial ANOVA

A

Research designs that have more than one IV

18
Q

Main Effect

A
  • The effect of one IV on the DV
  • Breaks down the sums of squares for the model into sums of squares for the main effect of variable A and sums of squares for the main effect of variable B
19
Q

Interaction

A
  • The way that one IV's effect on the DV depends on the level (condition) of the other IV and how the two measures are interrelated
  • Needs to be evaluated using comparisons in relation to both theory and what best matches interpretation
20
Q

Between Subjects Designs

A

Between Subjects designs have the following features:

  1. Typical experimental procedure
  2. Each level of the independent variable is assigned to different groups of people
  3. Comparisons are between different groups
  4. Requires a larger number of participants b/c statistical power is lower
21
Q

Within Subjects Design

A

Within Subjects designs have the following features:

  1. The independent variable represents more than one assessment of the same group, under different conditions
  2. Statistical power increased: requires fewer subjects.
  3. Other IVs may be between subjects (Mixed Designs)
22
Q

Sums of Square Within Participant, SSW:

A
  • Reflects both systematic variation due to treatment and unsystematic differences across individuals
  • From it, we calculate the between-condition sum of squares, SSM
  • We subtract SSM from SSW to get our new error term, SSR
23
Q

Sphericity

A
  • Refers to the equality of variances of the differences between treatment levels
  • The variability across people is relatively uniform across the levels of the IV
  • If sphericity is violated, the omnibus error term may be too liberal for some comparisons and too conservative for others
  • Huge issue for post hoc tests; Bonferroni is generally considered the safest
24
Q

Carry-over effects

A
  • Tests subjects under all conditions, which can cause effects based on the order of the conditions
  • Can prime subjects to respond in a specific way based on past experience w/ the testing conditions of the experiment
  • Counterbalancing reduces the likelihood of carry-over effects b/c order effects are spread out when the order of conditions is varied across participants
25
Q

Partial Eta Squared

A
  • The ratio of variance accounted for by an effect and that effect plus its associated error variance
  • Sums of squares of effect / (Sums of squares for effect + Sums of squares of error term)
  • Measures effect size w/ other variance of effects partitioned out
  • Used when participants in levels are the same for different conditions or experimental tests
  • Part of repeated measure designs
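The formula above reduces to a one-line sketch (the values are hypothetical):

```python
# Partial eta-squared: SS for the effect over (SS effect + SS error for
# that effect), with variance from other effects partitioned out.
def partial_eta_squared(ss_effect, ss_error):
    return ss_effect / (ss_effect + ss_error)

print(partial_eta_squared(30.0, 90.0))  # 0.25
```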
26
Q

Discuss the logic of the F-ratio. What is it about the structure of the F-ratio that enables us to evaluate our null hypothesis while also allowing for the possibility that differences in our condition are not due to chance?

A

The F-ratio is the ratio of the population variance as estimated between groups vs. within groups. There is a certain amount of variance in our experiment that will always be the result of sampling error, or variance between the means b/c the baseline fluctuates per sample.

  • A bigger sample size = more stable measure of group variance
  • A larger amount of between-groups variance relative to error = more likely to be observing a true effect
  • If the null is true, the within and between groups estimates should be roughly equal, so the ratio should be roughly 1
27
Q

What is the f-ratio?

A

When we divide the between-groups variance by the within-groups (error) variance, we can determine whether the effect exceeds our 5% cut-off for type I error. If it does, we know the effect is unlikely to be due to chance, b/c it is unlikely that such a large amount of between-groups variance is not the product of true mean differences.

28
Q

Under what circumstances should we use post hoc tests? What are the advantages and disadvantages to post hoc tests?

A
  1. Post Hoc - no theoretical basis to assume there is a difference b/w means. Not planned before the experiment; generally done to test variations in the result after-the-fact.
  • Conservative
  • Used when there is a significant f-ratio but no specific differences were predicted
  • More difficult to reach significance
29
Q

Under what circumstances should we use planned comparisons? What are the advantages and disadvantages to planned comparisons?

A
  1. Planned comparisons - based on theory and interpretation, and chosen before conducting an experimental procedure.
  • generally a small # of tests to avoid inflating familywise error
  • Only examines the few comparisons of interest rather than testing across all conditions
  • Can be pairwise or complex - analyzes the differences b/w pairs or sets of means
  • If orthogonal, they can be very robust tests
  • Can be run even if the omnibus F is not significant, which is a huge advantage
  • May need its own error term to ensure the comparison is neither too conservative nor too liberal
30
Q

Why can’t we just do a bunch of t-tests to follow up a significant omnibus test?

A

We cannot run multiple t-tests b/c it would inflate the probability of committing a type I error. In other words, we would have too much family wise error and increase the risk of misidentifying an effect when there is not one.

31
Q

What is the relationship between the F-ratio and the t statistic?

A

The sampling distribution for a t-test is based on mean differences, and the t-distribution itself is based on mean differences relative to sampling error. In ANOVA, the sampling distribution is based on a ratio of variances called the F-ratio.

32
Q

What is interesting about the value for the f-ratio and the t-statistic?

A

The F-ratio is the squared t statistic (F = t²); by association, the t stat is the square root of the F-ratio. T-tests evaluate mean differences b/w two groups, whereas the F-ratio evaluates them relative to all the variance and is standardized.
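This can be verified numerically on two small hypothetical groups, computing both statistics in plain Python:

```python
# For two groups, the one-way ANOVA F-ratio equals the squared t statistic.
g1, g2 = [3, 4, 5, 6], [6, 7, 8, 9]   # hypothetical scores
n1, n2 = len(g1), len(g2)
m1, m2 = sum(g1) / n1, sum(g2) / n2

# Independent-samples t: mean difference over pooled standard error
ss1 = sum((x - m1) ** 2 for x in g1)
ss2 = sum((x - m2) ** 2 for x in g2)
pooled_var = (ss1 + ss2) / (n1 + n2 - 2)
t = (m1 - m2) / (pooled_var * (1 / n1 + 1 / n2)) ** 0.5

# One-way ANOVA on the same data: between-groups MS over within-groups MS
grand_mean = (sum(g1) + sum(g2)) / (n1 + n2)
ms_between = (n1 * (m1 - grand_mean) ** 2
              + n2 * (m2 - grand_mean) ** 2) / 1   # df = 2 groups - 1
ms_within = pooled_var
F = ms_between / ms_within

print(t ** 2, F)  # the two values match
```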

33
Q

Why is it generally desirable to run orthogonal planned contrasts? When might it be permissible to run non-orthogonal comparisons?

A

It is good to run orthogonal planned comparisons b/c if the tests are independent of one another, the familywise error rate will not be inflated by overlapping type I error probabilities within our design. Each time we run a non-independent test, the probability of a type I error compounds with the other tests.

  • We can run non-orthogonal comparisons when independent comparisons would not make theoretical sense, or when we have no theoretical justification for them to be orthogonal
34
Q

Why might it be permissible to run a planned comparison even if your omnibus F is non-significant?

A

It can depend on how the different variables interact with each other. If an effect on the dependent variable is small, the omnibus F test is sometimes not sensitive enough to detect it against the overall low amount of variability our model accounts for.

  • perhaps possible when testing a hypothesis that needs a comparison?
  • when running few tests b/c difference is being washed out
35
Q

Jacob Cohen’s suggested guidelines for evaluating effect size

A

Effect size can be broken down into three general categories.

  1. Small effect –η2 = .01
  2. Medium effect –η2 = .06
  3. Large effect –η2 = .14
36
Q

What is a shortcoming of Cohen’s guidelines for effect size?

A

A shortcoming of this approach is that these guidelines are arbitrary. We do not have a good set point for what constitutes a strong effect or otherwise. They only work in the absence of any other useful information.

37
Q

What factors should be considered when reviewing effect size?

A

When evaluating the practical significance of effects, we want to remember sample size. If we have an idea of how large an effect might be, we can determine the amount of subjects needed. In general, a larger sample size increases power and helps reduce the likelihood that we will fail to detect an effect if one does exist.

38
Q

When we conduct a factorial analysis of variance, we usually do not interpret the main effect and instead focus on the interaction. Why is it important that we focus on the variable insofar as it is involved in the interaction and not as a main effect?

A

Main effects only look for general mean differences, which just demonstrates that the dependent variable is related to the independent variable in some way. It does not outline where, or in which group/condition, this difference originates.

Interactions help illustrate how a variable's effect changes based on some other second or third or nth variable; interactions imply the main effect is the result of something else, such as treatment or condition, and it has to be interpreted as such.
39
Q

When we have a factorial research design, we compute several estimates of between groups variance. Why?

A

To partition the amount of variance our model accounts for versus total variance. We then have to break it down further to see what portion of the variance the main and interaction effects constitute. We do this by parsing out unsystematic (random) error to estimate each component's effect.

40
Q

When we have a within subjects factorial design, we compute several estimates of error variance. Why is this necessary?

A

To partition the amount of variance our model accounts for versus total variance for all participants. Basically the same as a typical factorial research design, but we get the mean differences for each participant as well.

41
Q

Why do within subjects factorial designs typically have far more statistical power than between subjects designs? Where in the statistical analysis is the advantage realized?

A

Within subjects designs have more statistical power b/c they observe changes within the same group of participants, so individual differences are held constant across conditions.

The differences in the means reflect only systematic variance if the error is properly partitioned, and this advantage is realized when analyzing the sum of squares within participant for the independent variable.

42
Q

Be able to define a set of orthogonal comparisons for a set of condition means.

A

Choose sensible comparisons

Groups with positive weights will be compared to those with negative weights

The sum of the weights should always be zero

Groups not involved in a comparison always get a coefficient equal to zero

43
Q

Be able to articulate the logic of a within subjects ANOVA.

A
  1. Eliminates a major source of unsystematic error variance by using the same participants in every condition
  2. Each participant acts as his or her own experimental control across conditions
  3. Able to partial out the unsystematic error from the systematic variance in our analysis
  4. Get the variance for Sums of Squares Within Participant, SSW
    SSW = the extent to which individuals tend to vary across conditions
  5. Because the same people are compared across every condition, SSM is now relatively free of unsystematic error
  6. If we partial the sums of squares for our IV (SSM) from the sum of squares within participants, we have a new measure of error variance:
    SSR = SSW - SSM
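The logic above can be sketched on hypothetical repeated-measures data:

```python
# Scores for 3 participants (rows) under 3 conditions (columns); hypothetical data.
data = [
    [8, 9, 13],
    [6, 8, 10],
    [7, 10, 13],
]
k = len(data[0])   # number of conditions
n = len(data)      # number of participants
grand_mean = sum(sum(row) for row in data) / (n * k)

# SSW: how much each person varies across conditions, around their own mean
ss_within = sum((x - sum(row) / k) ** 2 for row in data for x in row)

# SSM: variation of the condition means around the grand mean
cond_means = [sum(row[j] for row in data) / n for j in range(k)]
ss_model = sum(n * (cm - grand_mean) ** 2 for cm in cond_means)

# SSR: the leftover, unsystematic error term
ss_resid = ss_within - ss_model
print(ss_within, ss_model, ss_resid)
```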