Extras Flashcards

1
Q

What is the square of the standard deviation?

A

The variance

2
Q

Levene's test for equality of variances is an assumption check for which tests?

A

Independent-samples t-test

One-way ANOVA

It should not be significant. If it is significant, report the second line of the output (equal variances not assumed).

3
Q

How do you get a one-tailed probability from your p value?

A

Divide the two-tailed p by 2 (provided the observed effect is in the predicted direction)

4
Q

What is the error term used as the standard deviation of the sampling distribution of the mean difference in a t-test called?

A

The standard error of the mean difference (SEDMest)

SEDMest = √(variance1/n1 + variance2/n2)

The sample variances are used, which makes this error term an estimate (population variances are rarely known).

This is used to check how likely our observed mean difference is under the null hypothesis.
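A minimal sketch of this estimate in Python; the scores and group names are made up for illustration:

```python
# Unpooled standard error of the mean difference, and the resulting t.
import numpy as np

g1 = np.array([5.1, 6.2, 4.8, 5.9, 6.0])   # hypothetical group 1 scores
g2 = np.array([4.0, 4.5, 5.2, 3.9, 4.4])   # hypothetical group 2 scores

# Sample variances (ddof=1) stand in for the unknown population variances,
# which is why SEDMest is only an estimate.
se_dm = np.sqrt(g1.var(ddof=1) / len(g1) + g2.var(ddof=1) / len(g2))
t = (g1.mean() - g2.mean()) / se_dm   # how unlikely is this difference under H0?
print(se_dm, t)
```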

5
Q

What is the pooled error term?

A

This is used when sample sizes are unequal in an independent-samples t-test.

This is because each sample contributes differently to the estimate of the variance of the sampling distribution, so one sample may give you more information than the other.

The pooled term weights each sample's variance by its degrees of freedom.
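A minimal sketch of that weighting on simulated data; the function name and sample sizes are my own for illustration:

```python
# Pooled variance: each sample's variance is weighted by its df (n - 1).
import numpy as np

def pooled_variance(g1, g2):
    n1, n2 = len(g1), len(g2)
    return ((n1 - 1) * np.var(g1, ddof=1) + (n2 - 1) * np.var(g2, ddof=1)) / (n1 + n2 - 2)

big = np.random.default_rng(0).normal(0, 1, 40)    # larger sample: more weight
small = np.random.default_rng(1).normal(0, 1, 10)  # smaller sample: less weight
print(pooled_variance(big, small))
```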

6
Q

How do you calculate the df for the error term of an independent-samples t-test?

A

Two parameters have been estimated (variance1 and variance2),

so you subtract 2 from the total sample size:

df = n1 + n2 - 2

7
Q

What are the assumptions of independent samples t-test?

A

Population

  • normally distributed
  • have the same variance

Sample

  • independence: no two measures are drawn from the same participant
  • independent random sampling (no choosing of respondents on any kind of systematic basis)

Data (DV scores)

  • at least 2 observations per sample (factor level)
  • measured on a continuous scale (interval or ratio)

Homogeneity of variance
Levene's test (with equal sample sizes, heterogeneity of variance and mild non-normality are not a problem, e.g. the dice example)

If the groups are skewed in opposite directions this can force a…
Non-parametric alternative: Mann-Whitney U test (aka the Wilcoxon rank-sum test)

8
Q

Parametric tests are…

A

Calculated using estimates of the population parameters from the samples.

More restrictive, because a range of assumptions must be met. However, they are generally robust to violations.

They are also more powerful, so we generally use a parametric test unless the assumptions are not met.

9
Q

What is the F ratio?

A

F = MS effect / MS error

MS = mean square

The F ratio is the ratio of the systematic variance (i.e. your experimental manipulation) to the unsystematic variance.
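A minimal sketch, with made-up data, computing the partition by hand and checking it against scipy.stats.f_oneway:

```python
# F = MS_effect / MS_error for a one-way between-subjects ANOVA.
import numpy as np
from scipy import stats

groups = [np.array([3., 4., 5.]), np.array([5., 6., 7.]), np.array([7., 8., 9.])]
k = len(groups)
n_total = sum(len(g) for g in groups)
grand_mean = np.concatenate(groups).mean()

ss_effect = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)  # systematic
ss_error = sum(((g - g.mean()) ** 2).sum() for g in groups)             # unsystematic
ms_effect = ss_effect / (k - 1)
ms_error = ss_error / (n_total - k)
print(ms_effect / ms_error, stats.f_oneway(*groups).statistic)  # should match
```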

10
Q

If you square a t statistic what do you get?

A

F(1, df): squaring a t statistic on df degrees of freedom gives an F statistic with 1 and df degrees of freedom.

Conventionally ANOVA is never one-tailed, so choose a t-test if you want a one-tailed test.
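A quick numerical check of this identity on arbitrary data:

```python
# t squared from an independent-samples t-test equals F(1, df).
import numpy as np
from scipy import stats

a = np.array([1., 2., 3., 4., 5.])
b = np.array([2., 4., 4., 5., 7.])
t = stats.ttest_ind(a, b).statistic
f = stats.f_oneway(a, b).statistic
print(t ** 2, f)  # identical up to floating point
```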

11
Q

When you run a one-way ANOVA and get the output table, the mean square box relating to between groups = what?

A

MS effect

Mean square

12
Q

When you run a one-way ANOVA and get the output table, the mean square box relating to within groups = what?

A

MS error

Mean square

13
Q

How to calculate the DF for the MS effect?

A

Number of groups - 1

14
Q

What is the MSerror term?

A

It is a pooled variance term: a weighted average of the k sample variances.

Aka MS within, as it estimates the variance within groups.

An estimate of the error variance in the population, whether the null is true or false.

15
Q

What is MS effect?

A

The variance of the k sample means multiplied by n (the per-group sample size).

The estimated variance among, or between, the means.

An estimate of the population variance when the null is true.

So if the null is true, E(MS effect) = E(MS error) and F ≈ 1.

If the null is false, E(MS effect) > E(MS error) and F > 1.

16
Q

How do you calculate the DF total for the whole ANOVA?

A

N - 1

N = total number of participants

17
Q

How do you calculate DF effect?

A

K - 1

K = number of groups

18
Q

How do you calculate DF error?

A

k(n - 1)

k = number of groups
n = participants in one group (when group sizes are equal)

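A small helper collecting the df formulas from the last three cards, assuming equal group sizes; the function name is my own:

```python
# df bookkeeping for a one-way between-subjects ANOVA.
def anova_dfs(k, n_per_group):
    df_effect = k - 1                    # number of groups minus 1
    df_error = k * (n_per_group - 1)     # k(n - 1)
    df_total = k * n_per_group - 1       # N - 1
    assert df_total == df_effect + df_error
    return df_effect, df_error, df_total

print(anova_dfs(k=3, n_per_group=10))   # (2, 27, 29)
```
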
19
Q

What are the extra assumptions for ANOVA?

A

Homogeneity of population variances (the variance in each of the k populations sampled is the same): Levene's test (equality of error variances; we want this to be non-significant). To correct, we can use Box's test… but it's conservative, so you could also transform or trim the raw data.

Robustness of the above:
generally holds if the largest variance is no more than about 4 times the smallest and samples are reasonably large (n > 30).

Independence of observations (each observation is independent of every other; we randomly sample/assign, and the error terms are independent)

20
Q

What is an omnibus test?

A

Any test resulting from the preliminary partitioning of variance in ANOVA; it tells us an effect exists but not where the effect lies.

21
Q

Why is the probability of making a Type I error generally higher for post hoc comparisons than for a priori comparisons?

A

As you are usually making more comparisons

22
Q

What is a type 1 error?

A

False positive… finding an effect that isn't real

23
Q

There are two types of Type I error rate. What are they?

A

Per-comparison rate (alpha PC, or just alpha) = the probability of a Type I error on any single comparison (e.g. .05).

Familywise error rate (alpha FW) = the probability of making at least one Type I error in a family (or set) of comparisons.

alpha FW = 1 - (1 - alpha PC)^c

where c is the number of comparisons made, and comparisons are assumed to be independent.
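A minimal sketch of how alpha FW grows with the number of independent comparisons at alpha PC = .05:

```python
# Familywise error rate for c independent comparisons.
alpha_pc = 0.05
for c in (1, 3, 5, 10):
    alpha_fw = 1 - (1 - alpha_pc) ** c
    print(c, round(alpha_fw, 3))   # 0.05, 0.143, 0.226, 0.401
```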

24
Q

Explain the familywise error rate through the example of a dice

A

Think of each comparison as being like rolling a fair die, and imagine a Type I error is rolling a 6.

Each comparison is one roll.

What are the odds of getting a 6 on the first roll?
(1/6, and 5/6 of not)

Make another roll (NB each roll of the die is independent of any other).

What are the odds now? (1/6 and 5/6, the same)

So what are the odds of not getting a 6 at all over the 2 rolls?

(5/6) × (5/6) ≈ .69, or 69%

So the chance of getting at least one 6 ≈ 31%

(Over 20 throws there is a very high chance of getting at least one 6, about 97%, which demonstrates how Type I errors mount up.)
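The dice arithmetic, worked in Python:

```python
# A Type I error is "rolling a 6"; each comparison is one roll.
p_no_six_2 = (5 / 6) ** 2          # ~0.694: no 6 in two rolls
print(1 - p_no_six_2)              # ~0.306: at least one 6 in two rolls
print(1 - (5 / 6) ** 20)           # ~0.974: at least one 6 in twenty rolls
```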

25
Q

Most methods use a correction which maintains alpha familywise at the nominal level (e.g. p = .05)

A

The Bonferroni correction basically evaluates t at a more conservative level:

alpha PC = alpha / c

A related correction is the Šidák, or Dunn-Šidák, correction (esp. for t-tests):

alpha PC = 1 - (1 - alpha)^(1/c)
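A minimal comparison of the two per-comparison alphas for, say, c = 4 comparisons (the numbers are illustrative):

```python
# Sidak is very slightly less conservative than Bonferroni.
alpha, c = 0.05, 4
bonferroni = alpha / c               # 0.0125
sidak = 1 - (1 - alpha) ** (1 / c)   # ~0.0127
print(bonferroni, sidak)
```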

26
Q

There are two ways to report a result that has used a Bonferroni correction.

A

Keep alpha = .05 as the significance level and adjust the p value by multiplying it by c.

Or adjust the critical alpha: .05 / c.

The second is more common, but SPSS does the first.

27
Q

Pairwise comparisons are t-tests: discuss these.

A

Just run t-tests afresh on the pairs of cells you want to compare.

But it's better to use the overall error term from the omnibus ANOVA, as this increases power.

Useful if you want to make a few specific pairwise comparisons.

Assuming homogeneity of variance: use MS error from the overall ANOVA as the pooled error term, and evaluate t on the df error from the whole ANOVA (which will be larger).

Assuming heterogeneity of variance but equal sample sizes: use the pooled error term.

If sample sizes are unequal: use the Welch-Satterthwaite correction (SPSS gives this automatically).
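A minimal sketch of a protected pairwise t using the omnibus MS error as the pooled error term (assuming homogeneity of variance); the means, ns, MS error and df error are invented for illustration:

```python
# Pairwise comparison evaluated against the omnibus error term.
import numpy as np
from scipy import stats

m1, m2, n1, n2 = 10.2, 8.7, 12, 12
ms_error, df_error = 4.5, 33          # from the overall ANOVA table

t = (m1 - m2) / np.sqrt(ms_error * (1 / n1 + 1 / n2))
p = 2 * stats.t.sf(abs(t), df_error)  # two-tailed p on the larger df
print(t, p)
```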

28
Q

What are linear contrasts?

A

Useful if you want to know whether one group or set of groups is different from another group or set of groups.

Set up weights, or coefficients, to define a comparison of your groups.

The coefficients have to add up to 0 in a linear contrast (a weight of 0 means you are not interested in that group).

The linear contrast:
L = the sum, across all cells, of the coefficient weights × the means

So you basically multiply the mean of each group by the weight you give it.

This is then tested using another formula that includes MS error.

That gives you a t statistic, which is checked against the critical value as usual.
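A minimal sketch with invented means, weights, MS error and df; it assumes equal group sizes and homogeneity of variance:

```python
# Linear contrast L = sum(weights * means), tested against MS_error:
# here groups 1+2 (weights .5, .5) versus group 3 (weight -1).
import numpy as np
from scipy import stats

means = np.array([10.0, 11.0, 14.0])
weights = np.array([0.5, 0.5, -1.0])   # weights sum to 0
n = 10                                 # per-group n (assumed equal)
ms_error, df_error = 6.0, 27           # from the omnibus ANOVA

L = (weights * means).sum()                        # the contrast value
se = np.sqrt(ms_error * (weights ** 2 / n).sum())  # its standard error
t = L / se
print(t, 2 * stats.t.sf(abs(t), df_error))
```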

29
Q

What are orthogonal contrasts?

A

This is simply a set of linear contrasts that are each independent of one another.

Nothing about the nature of one contrast should be influenced by any other contrast.

So we want the correlations between our contrasts to be 0; look at the cross-products of the coefficients, each of which should sum to 0.

Imagine a set of 3 contrasts defined by coefficient vectors a, b and c:

sum(a×b) = 0, sum(a×c) = 0, sum(b×c) = 0

The number of comparisons in a complete orthogonal set = df effect (= number of groups - 1).

30
Q

How do you check if a contrast is orthogonal

A

Contrast coefficients sum to 0

Cross-products sum to 0

Number of contrasts = df effect (number of groups - 1)
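A minimal check of these three conditions for a hypothetical set of contrasts over k = 3 groups:

```python
# Orthogonality check: rows sum to 0 and cross-products sum to 0.
import numpy as np

contrasts = np.array([
    [1.0, -1.0, 0.0],   # group 1 vs group 2
    [0.5, 0.5, -1.0],   # groups 1+2 vs group 3
])

print(contrasts.sum(axis=1))        # each row sums to 0
print(contrasts[0] @ contrasts[1])  # cross-products sum to 0 -> orthogonal
# number of contrasts (2) equals df_effect = k - 1
```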

31
Q

What is a problem with a within-subjects (repeated measures) design compared to a between subjects ANOVA?

A

Each participant now takes part in every condition,

which violates the assumption of independence.

HOWEVER,
we can calculate and remove (partition out) any variance due to that dependence,

which reduces our error term and increases power.

32
Q

Why is the MS error in a between-subjects design overestimated (in a sense)?

A

Because the error term contains both the individual-differences variance and random error.

33
Q

Is the error term overestimated in a within-subjects design, as it is in a between-subjects design?

A

No, because we can take out the between-subjects (individual differences) effect.

This makes the MS error smaller, so the F ratio is bigger.

34
Q

What is an advantage of a within-subjects design?

A

Increased power, due to a reduced error term.

And fewer participants are required, as they take part in all conditions: potentially cheaper.

35
Q

What is a mean square?

A

It’s the sum of squares / degrees of freedom

36
Q

How do you calculate the df for the total sums of squares in a within-subjects ANOVA?

A

N-1

N = the total number of cells

E.g. 4 participants × 3 conditions = 12 cells

37
Q

How do you calculate the df subjects for a within-subjects ANOVA?

A

n-1

n = sample size

38
Q

How do you calculate the df within-subjects for a within-subjects ANOVA?

A

n(k - 1)

n = sample size

k = number of levels of the factor

39
Q

How do you calculate the df effect for a within-subjects ANOVA?

A

K-1

K = number of treatment levels

40
Q

How do you calculate the df error for a within-subjects ANOVA?

A

(n - 1)(k - 1)

k = number of levels
n = number of participants

41
Q

What are the assumptions of one-way within-subjects ANOVA?

A

Normality

  • observations are normally distributed
  • the error terms are normally distributed around 0

Compound symmetry:

  • homogeneity of variances across the levels of the repeated-measures factor
  • homogeneity of covariances (equal correlations/covariances between pairs of levels of the factor)

42
Q

Compound symmetry is a very restrictive assumption; sphericity is a related but less restrictive assumption. Which test in SPSS tests sphericity?

A

Mauchly's test of sphericity

It determines whether the values on the main diagonal (the variances) are roughly equal, and whether the values off the diagonal (the covariances) are roughly equal.

Evaluated as a chi-square with df = k(k - 1)/2 - 1. If significant, the assumption is violated.

k = number of levels of the repeated-measures factor

43
Q

How can you correct for violations of sphericity?

A

Box's adjustment
gives us a more stringent critical value, but it's usually too conservative.

Epsilon adjustment (lower bound):
epsilon is simply a value by which the degrees of freedom for the F test are multiplied. You basically multiply your dfs by

1/(k - 1)

k = the number of levels of the repeated-measures factor

Epsilon is equal to 1 when the sphericity assumption is met (hence no adjustment) and falls toward the lower bound of 1/(k - 1) as the violation gets worse.
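A minimal sketch of the lower-bound adjustment; the function name is my own, and n here is the number of participants:

```python
# Lower-bound epsilon: multiply both F-test dfs by epsilon = 1/(k - 1).
def lower_bound_adjusted_dfs(k, n):
    eps = 1 / (k - 1)                  # worst-case (lower-bound) epsilon
    df_effect = (k - 1) * eps          # always 1 at the lower bound
    df_error = (n - 1) * (k - 1) * eps
    return df_effect, df_error

print(lower_bound_adjusted_dfs(k=4, n=10))   # (1.0, 9.0)
```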

44
Q

How important is the violation of sphericity?

A

In between-subjects designs it doesn't matter, because the treatments are unrelated (note that the assumption of homogeneity of variance still holds).

When a within-subjects factor has only two levels it isn't a problem: only one covariance is being estimated, so it can't be heterogeneous (there is nothing to compare it to).

When it does matter:
in all other within-subjects designs. When the sphericity assumption is violated, F ratios are positively biased

(a liberal F test, so the probability of a Type I error increases).

45
Q

Pooling the error term is recommended for between-subjects designs where possible. Is this recommended for within-subjects ANOVA?

(Pooled across all of the levels, because this gives us higher df, better estimation and more power; unless there is heterogeneity of variance, as the errors wouldn't be similar enough.)

A

No; use the error term associated with just the effect you are looking at.

46
Q

What are some disadvantages of within-subjects designs?

A

Restrictive statistical assumptions (sphericity)

Sequencing effects: learning, practice, fatigue, habituation

Counterbalancing can reduce sequencing effects

47
Q

What is a two-way design?

A

Two factors (discrete IVs)

Each factor has 2 or more levels (e.g. male, female)

48
Q

One-way designs vs two-way designs: what are the research questions?

A

One-way design: do the population means (of DV scores) corresponding to the factor levels differ?

Two-way design:
Is there a main effect of factor 1?

Is there a main effect of factor 2?

Is there a factor 1 × factor 2 interaction?

49
Q

List the three types of two-way designs?

A

Two-way between-subjects design:
2 between-subjects factors

Two-way within-subjects design:
2 within-subjects factors

Mixed design (or split-plot design):
1 within and 1 between

50
Q

Why bother with two-way designs?

A

Fewer participants required

Allows us to examine INTERACTIONS:
does the effect of one factor depend on the levels of the other factor?

The generalisability of results can be assessed: is the difference described by a main effect the same over the levels of another factor?

51
Q

What are the marginal means?

A

The cell means for one factor ignoring the other.

E.g. means for males and females ignoring profession,

or for each profession ignoring gender.

52
Q

In a two-way ANOVA what is different for the MS effect?

A

There is an MS effect for each effect:

MSa: first main effect
MSb: second main effect
MSab: the interaction

NB:

F = MS effect / MS error, in each case

53
Q

If you get a significant interaction in a two-way ANOVA how can you understand it?

A

Significant interactions should be supplemented by graphs and by analysing 'simple main effects':

  • simple main effects describe differences among cell means within a row or column, i.e. the effect of one factor at each level of the other factor
  • just like a series of one-way ANOVAs conducted at each level of a factor, except that the pooled error term (MS error from the overall ANOVA) is used

54
Q

If you don't look at the usual 3 effects of a two-way ANOVA and instead run simple main effects or contrasts on no more than 3 different comparisons, do you need to protect your analysis from Type I error-rate inflation?

A

No

More than 3 comparisons and you would need to, but remember to keep to as few as possible

55
Q

What is your acronym for remembering the steps in data cleaning?

A
Mysterious Octopi Near Leicester Have Measles:

Missing data
Outliers
Normality
Linearity
Homoscedasticity
Multicollinearity

56
Q

There are three types of missing data. What are they, and how problematic are they?

A

Missing completely at random (MCAR)

- the best possible situation; should not be a problem if the loss is relatively small (< 5%)

Missing at random (MAR)

- missingness is related to other observed variables; more problematic

Missing not at random (MNAR)

- missingness is related to the missing values themselves; the most problematic

57
Q

How do you deal with univariate outliers?

A

Standardise the variables and look for absolute values above |z| = 3.29

  • use histograms and box plots

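A minimal sketch of the z-score screen on simulated data with one planted extreme case:

```python
# Flag univariate outliers at |z| > 3.29 (p < .001 under normality).
import numpy as np

x = np.append(np.random.default_rng(42).normal(50, 10, 200), 120.0)
z = (x - x.mean()) / x.std(ddof=1)
print(x[np.abs(z) > 3.29])   # the planted extreme case is flagged
```
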
58
Q

How do you test for multivariate outliers

A

Mahalanobis distance (the distance of a case from the centroid, where the centroid is the intersection of the variable means).

MD is tested against the chi-square distribution, usually at a conservative alpha (p < .001).
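A minimal sketch, computed by hand with numpy on simulated data, using the conventional chi-square cut-off at alpha = .001:

```python
# Squared Mahalanobis distances screened against chi-square(df = p vars).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
X = rng.normal(0, 1, size=(100, 3))          # 100 cases, 3 variables
diff = X - X.mean(axis=0)                    # distance from the centroid
inv_cov = np.linalg.inv(np.cov(X, rowvar=False))
md2 = np.einsum('ij,jk,ik->i', diff, inv_cov, diff)  # squared distances

cutoff = stats.chi2.ppf(1 - 0.001, df=X.shape[1])
print(np.where(md2 > cutoff)[0])             # multivariate outlier rows
```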

59
Q

What is the assumption of normality?

A

It is the assumption that each variable, and all linear combinations of the variables, are normally distributed.

Rule of thumb: skewness > 2 or kurtosis > 7 suggests substantial non-normality.

Skewness is the degree of (a)symmetry of the distribution.

Kurtosis is the peakedness or flatness of the distribution.

Check the normal probability plot or histograms

To fix: transform, or Winsorise/trim the data.

60
Q

What is the assumption of linearity?

A

That variables have linear (straight line) relationships with each other

Can be assessed using bivariate scatterplots

61
Q

What is the assumption of homoscedasticity?

A

It is the assumption that the variance in scores on one continuous variable is roughly the same at all levels of another continuous variable.

For grouped data this is the homogeneity of variance assumption: the variability in the DV is expected to be similar across all levels of the discrete IV.

For grouped data, use Levene's test.

For ungrouped data, inspect the bivariate scatterplots; heteroscedasticity is caused by non-normality in one or both of the variables.

Tests are usually robust to some heteroscedasticity.
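For grouped data, a minimal sketch of Levene's test via scipy, with one deliberately more variable simulated group:

```python
# Levene's test for homogeneity of variance across groups.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
g1 = rng.normal(0, 1.0, 30)
g2 = rng.normal(0, 1.1, 30)
g3 = rng.normal(0, 3.0, 30)   # deliberately more variable

print(stats.levene(g1, g2, g3))   # small p -> variances differ
```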

62
Q

What is multicollinearity?

A

A problem that occurs in a correlation matrix when variables are too highly related to each other (e.g. r > .90).
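A minimal screen for this on simulated data, flagging pairs with |r| > .90; the variables are illustrative:

```python
# Flag variable pairs whose absolute correlation exceeds .90.
import numpy as np

rng = np.random.default_rng(2)
a = rng.normal(size=100)
b = a * 0.98 + rng.normal(scale=0.1, size=100)   # nearly redundant with a
c = rng.normal(size=100)

R = np.corrcoef(np.vstack([a, b, c]))
i, j = np.triu_indices_from(R, k=1)
for x, y in zip(i, j):
    if abs(R[x, y]) > 0.90:
        print(f"variables {x} and {y}: r = {R[x, y]:.2f}")
```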

63
Q

What are the problems with multicollinearity?

A

Conceptually, it indicates redundancy in the variables; either remove or combine the relevant variables.

Statistically, it can result in unstable matrix inversion, which can cause large standard errors or problems with solution convergence.