Test Revision Flashcards

0
Q

What test would you use to test for differences between groups?

A

ANOVA

1
Q

What test would you use for ONE factor with two or more levels?

A

One-way ANOVA

2
Q

What test would you use for a continuous variable combined with a factor with 2 (or more) levels?

A

Analysis of Covariance (ANCOVA)

3
Q

What test would you use when you have data on two or more factors?

A

Factorial ANOVA

4
Q

Standard regression assumes…

A

That the distribution of the errors is both normal (Gaussian) and the same (constant) along the regression line

5
Q

How do you find the standard regression best fit line?

A

By minimising the sum of the squared differences between the data points and the line

6
Q

Type 1 error

A

Rejecting a correct null hypothesis

7
Q

Type 2 error

A

Failing to reject an incorrect null hypothesis

8
Q

Standard deviation

A

A measure of the spread of the data around the mean

9
Q

What is standard error of the mean?

A

A measure of how much the mean would vary from sample to sample
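
The card gives no formula, so the sketch below assumes the usual estimate of the standard error of the mean: the sample standard deviation divided by the square root of the sample size. The heights are made-up illustrative values.

```python
import math
import statistics

# Hypothetical sample of heights in cm (illustrative values only).
sample = [172.0, 168.0, 181.0, 175.0, 169.0, 177.0, 174.0, 176.0]

# Standard deviation: spread of the individual values around the mean.
sd = statistics.stdev(sample)

# Standard error of the mean (assumed formula): SD / sqrt(n).
se = sd / math.sqrt(len(sample))

print(f"mean = {statistics.mean(sample):.2f}, SD = {sd:.2f}, SE = {se:.2f}")
```

With this formula the SE shrinks as the sample size grows, which is why more replicates tend to give a more repeatable mean.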

10
Q

Taking another sample would probably generate a rather different mean value…under what circumstances?

A

Large SE

11
Q

Taking another sample would probably give a similar result (and the mean of a given sample is likely to be close to the true mean of the whole population)
Under what circumstances?

A

Small SE

12
Q

Parametric tests assume what?

A

Assume the data conforms to some underlying error distribution

13
Q

Non-parametric tests: what do they assume?

A

Make few assumptions about underlying statistical distributions

14
Q

Non-parametric tests: 3 points go!

A
  • Often used when data do not conform to the assumptions of a parametric test
  • tend to lack power compared to parametric methods
  • don’t always use all the information, usually ranks or categories
15
Q

Parametric tests: 4 points go!

A
  • known as standard tests
  • collectively referred to as GENERAL LINEAR MODELS
  • when data do not conform to assumptions of normality, may have to be transformed to normalise distribution of errors
  • parametric tests that assume alternative distributions (other than normal) collectively termed GENERALIZED LINEAR MODELS
16
Q

When analysing data consisting of small counts errors are likely to be…

A

Poisson distributed

17
Q

When analysing data consisting of proportions, errors are likely to be…

A

Binomially distributed

18
Q

What 6 things must be taken into consideration when designing a field experiment?

A
Blocking 
Plot size
Plot shape 
Edge effects 
Replication 
Randomisation
19
Q

Blocking

A

Used to overcome variability in experimental material

20
Q

Field experiments - replication

A

The more replicates, the lower the standard error, so the treatment means are estimated with less variability

21
Q

Field experiments - Randomisation.

A

Must be used in allocation of treatments to units

Each treatment must have the SAME PROBABILITY of being allocated to a particular unit

22
Q

Regression looks for…

A

An association between x and y

23
Q

Linear regression - what is the best fit and the residuals

A

Best fit - the line that minimises the sum of squared differences between the line and the data points
Residuals - the differences (distances) between each data point and the line

24
Q

Regression equation =

A

y = mx + c
where m is the slope
where c is the constant 'intercept', the value of y where the line crosses the y-axis
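
A minimal sketch of how m and c can be found by least squares, assuming the standard closed-form solution for a single explanatory variable. The function name `fit_line` and the data are illustrative and not part of the original cards.

```python
def fit_line(xs, ys):
    """Return slope m and intercept c of the least-squares line y = m*x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: sum of cross-products divided by the sum of squares of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y).
    c = mean_y - m * mean_x
    return m, c


# Made-up illustrative data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

m, c = fit_line(xs, ys)

# Residuals: vertical distance from each data point to the fitted line.
residuals = [y - (m * x + c) for x, y in zip(xs, ys)]
print(m, c, residuals)
```

For a least-squares line with an intercept, the residuals sum to zero, which is a quick sanity check on any fit.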

25
Q

What is r²? 4 points go!

A

The coefficient of determination
How much of the variation in y can be explained by the regression model
Must be between 0 and 1
Closer to 1 = most of the variation is explained by the model

26
Q

What is ‘principle of parsimony’?

A

The simplest adequate description of the data

27
Q

F values and statistical significance…what do you do? 2 points go!

A

Compare to a critical value in statistics tables
If F is larger than the critical value for 5%, then there is a 5% or smaller probability of the observed slope occurring in the data due to chance alone

28
Q

Regression examines relationships between…

A

Two continuous variables

29
Q

ANOVA… 2 points go!

A

Asks about differences between two or more groups

Can test differences between larger numbers of categories, regression is just 2

30
Q

Analysis of Covariance - ANCOVA tests for: 3 points go!

A

1) the slope (just like regression)
2) the between group differences (just like ANOVA)
3) a between group difference in the slopes …known as the interaction

31
Q

What test would you do when you have more than one discrete variable?

A

A factorial ANOVA of course!

32
Q

What test would you do when you have more than one continuous variable?

A

A multiple regression of course!

33
Q

A Multiple Regression tests for : 2 points go!

A

1) the slopes of each of the variables (just like regression)
2) differences between the slopes, i.e. the interactions (just like ANCOVA & factorial ANOVA)

34
Q

What tests come under the collective name of

GENERAL LINEAR MODELS

A
Regression
ANOVA
ANCOVA 
factorial ANOVA 
multiple regression
35
Q

What is an error? Define go!

A

E.g. the difference between the height of each man in the sample and the unobservable population mean (UK)

36
Q

What is a residual? Give example go!

A

Difference between the height of each man in the sample and the observable sample mean

37
Q

What tests assume normality ?

A
Regression
Multiple regression
ANOVA 
ANCOVA 
Factorial ANOVA
38
Q

Log-linear models assume errors are…

A

Poisson distributed

39
Q

Log-linear versions test significance using…

A

G - look up in chi-squared tables

40
Q

How do you transform data that is probably following a binomial distribution?

A

By arcsine square root transformation

But a better alternative is to use logistic models
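
As a small illustration of the transformation named above, the sketch below applies the arcsine square root to a proportion. The function name `arcsine_sqrt` is illustrative; results are in radians.

```python
import math


def arcsine_sqrt(p):
    """Arcsine square root transform of a proportion p in [0, 1] (radians)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("a proportion must lie between 0 and 1")
    return math.asin(math.sqrt(p))


# Made-up proportions, e.g. the fraction of seeds germinating per dish.
proportions = [0.0, 0.05, 0.5, 0.95, 1.0]
transformed = [arcsine_sqrt(p) for p in proportions]
print(transformed)
```

The transform stretches the scale near 0 and 1, where binomial proportions are least variable, and maps the range 0 to 1 onto 0 to pi/2.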

41
Q

What do logistic models assume?

A

Assume the errors are binomially distributed

Test significance using G and looking up in chi squared tables

42
Q

Poisson distribution - small count data. 2 points go!

A

Changes shape according to its mean

The variance should equal the mean
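
The two points can be checked numerically. The sketch below builds the Poisson probabilities directly from the standard formula P(k) = e^(-mean) * mean^k / k! and computes the mean and variance they imply. The helper names and the cut-off `max_k` are illustrative choices.

```python
import math


def poisson_pmf(k, mean):
    """Probability of observing a count of k when the expected count is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def poisson_moments(mean, max_k=100):
    """Mean and variance implied by the probabilities for counts 0..max_k."""
    probs = [poisson_pmf(k, mean) for k in range(max_k + 1)]
    mu = sum(k * p for k, p in enumerate(probs))
    var = sum((k - mu) ** 2 * p for k, p in enumerate(probs))
    return mu, var


for mean in (0.5, 2.0, 5.0):
    mu, var = poisson_moments(mean)
    # The probability of a zero count shows how the shape shifts with the mean.
    print(f"mean={mean}: P(0)={poisson_pmf(0, mean):.3f}, variance={var:.3f}")
```

For small means most of the probability sits at zero and the distribution is strongly skewed; as the mean grows it becomes more symmetrical.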

43
Q

Binomial distribution - proportional data. 2 points go!

A

Proportions must take a value 0.0 to 1.0

Distribution changes shape according to the mean

44
Q

What are the four assumptions of a regression?

A

1) each data point is independent
2) the explanatory variables are known without error
3) the distribution of errors is normal (Gaussian)
4) error variance is constant along the regression line

45
Q

Equation for regression sum of squares

A

Regression sum of squares (SSR) = total sum of squares (SST) - error sum of squares (SSE)

46
Q

Equation for degrees of freedom

A

Number of measurements (sample size) - the number of parameters estimated for the data

47
Q

R² equation

A

R² = regression sum of squares (SSR) / total sum of squares (SST)
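
A worked sketch of the sums of squares for a simple regression, taking R² as the ratio SSR / SST and SSR as SST minus SSE. The data are made up for illustration.

```python
# Made-up illustrative data; any paired x/y measurements would do.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Least-squares slope and intercept.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
slope = sxy / sxx
intercept = mean_y - slope * mean_x
fitted = [slope * x + intercept for x in xs]

# Total variation in y around its mean.
sst = sum((y - mean_y) ** 2 for y in ys)
# Variation left unexplained by the line (squared residuals).
sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))
# Variation explained by the regression.
ssr = sst - sse

r_squared = ssr / sst
print(f"SST={sst:.3f} SSE={sse:.3f} SSR={ssr:.3f} R^2={r_squared:.4f}")
```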

48
Q

Model Checking …

What would you look at to check normality of errors?

A

Normality Plot

- hope for a fairly straight line

49
Q

Model checking…

What would you look at to check constant variance

A

Residuals Plot

- watch out for fan shaped pattern - indicates the variance increases with the mean

50
Q

What happens if the regression is not a straight line (curvilinearity)?

A

Add a polynomial term (e.g. Quadratic term) into the regression equation

51
Q

Equation for treatment sum of squares (SSA)

A

Total sum of squares (SST) - error sum of squares (SSE)
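
A minimal one-way ANOVA sketch using the subtraction above. The F ratio assumes the usual degrees of freedom (number of groups minus 1 for treatments, number of observations minus number of groups for error). Group names and values are made up.

```python
# Made-up illustrative data: one factor with three levels.
groups = {
    "A": [4.0, 5.0, 6.0],
    "B": [7.0, 8.0, 9.0],
    "C": [1.0, 2.0, 3.0],
}

all_values = [v for values in groups.values() for v in values]
grand_mean = sum(all_values) / len(all_values)

# SST: squared deviations of every observation from the grand mean.
sst = sum((v - grand_mean) ** 2 for v in all_values)

# SSE: squared deviations of every observation from its own group mean.
sse = 0.0
for values in groups.values():
    group_mean = sum(values) / len(values)
    sse += sum((v - group_mean) ** 2 for v in values)

# SSA: what is left is the variation explained by the treatment.
ssa = sst - sse

df_treatment = len(groups) - 1
df_error = len(all_values) - len(groups)
f_value = (ssa / df_treatment) / (sse / df_error)
print(f"SST={sst} SSE={sse} SSA={ssa} F={f_value}")
```

The resulting F value would then be compared with the critical value from statistics tables for the same degrees of freedom.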

52
Q

ANCOVA - bottom up / forwards approach

4 steps go!

A

1) begin by plotting the overall mean
2) fit one factor (e.g. Light intensity) main effect & assess significance
3) add another main effect & assess significance
4) add the interaction & assess significant difference in slopes

53
Q

ANCOVA - backwards stepwise

4 steps go!

A

1) plot the overall mean
2) fit the maximal model
3) keep removing non-significant terms until everything left in the model is significant
4) if nothing is significant then the minimum adequate model is the overall mean

54
Q

What test considers two continuous explanatory variables?

A

Multiple Regression of course!

55
Q

What test would you do when you have two or more factors?

A

Factorial ANOVA of course!

56
Q

When lines cross on a factorial ANOVA graph what does this mean?

A

There is an interaction! (Whether it is significant still has to be tested)

57
Q

Experimental design:
What two things would you look for if you produced a random design?
2 points go!

A

1) EXPERIMENTAL UNITS within treatments should represent a RANDOM sample from the POPULATION : unbiased and reliable
2) RANDOM ALLOCATION of experimental units to treatments

58
Q

What are the two ‘random’ designs you could adopt?

A

1) COMPLETELY RANDOMISED DESIGN - allocate plots to varieties at random. SIMPLEST DESIGN - use when the experimental material is homogeneous

2) RANDOMISED BLOCK DESIGN - divide each block into as many plots as there are treatments & allocate varieties to plots at random within each block
Account for HETEROGENEITY
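
A sketch of how a randomised block layout could be generated, assuming one plot per treatment in every block. The function name `randomised_block_design` and the `seed` parameter are illustrative; the seed only makes the layout reproducible.

```python
import random


def randomised_block_design(treatments, n_blocks, seed=None):
    """Return one randomly ordered list of treatments per block."""
    rng = random.Random(seed)
    layout = []
    for _ in range(n_blocks):
        plots = list(treatments)
        # Shuffle independently within each block, so every treatment has
        # the same chance of landing on any plot in that block.
        rng.shuffle(plots)
        layout.append(plots)
    return layout


layout = randomised_block_design(["A", "B", "C", "D"], n_blocks=3, seed=1)
for block_number, plots in enumerate(layout, start=1):
    print(f"Block {block_number}: {plots}")
```

A completely randomised design would instead shuffle all plots in one go, without the block structure.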

59
Q

Why is regression not suitable for a split-plot design?

A

Because a split-plot design has TWO ERROR TERMS

60
Q

Advantages of split-plot designs (2 points go!)

Disadvantages of split-plot designs (1 point go!)

A

ADVANTAGES:

1) practical considerations
2) interaction estimated more precisely

DISADVANTAGES:
1) loss of precision on the main plot

61
Q

What two models use maximum likelihood rather than least squares?

A

Log-linear and logistic models

62
Q

What does maximum likelihood do?

A

It finds the parameter values (e.g. slopes/intercepts) that would make the observed data most likely.
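
A toy illustration of the idea, assuming Poisson-distributed counts and a single parameter (the mean). It tries a grid of candidate means and keeps the one with the highest log-likelihood. Real software uses iterative fitting rather than a grid; the grid and the data here are illustrative only.

```python
import math

# Made-up count data.
counts = [2, 3, 1, 2, 4]


def poisson_log_likelihood(mean, data):
    """Log of the probability of the data if counts are Poisson with this mean."""
    return sum(k * math.log(mean) - mean - math.lgamma(k + 1) for k in data)


# Candidate parameter values: 0.1, 0.2, ..., 10.0.
candidates = [i / 10 for i in range(1, 101)]

# Maximum likelihood: pick the candidate that makes the data most likely.
best_mean = max(candidates, key=lambda m: poisson_log_likelihood(m, counts))
print(best_mean, sum(counts) / len(counts))
```

For this simple one-parameter case the most likely value coincides with the sample mean.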

63
Q

How would you test for significance in a log-linear analysis?

A
  • assume Poisson distribution
  • assess significance by adding or removing terms
  • the change in deviance (G) is assessed using chi squared tables
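
The steps above can be sketched for the simplest case: does adding a two-level factor improve on a single overall mean? The sketch assumes the standard Poisson deviance formula and uses the tabulated 5% chi-squared critical value for 1 degree of freedom (3.841). Data and names are made up.

```python
import math


def poisson_deviance(observed, fitted):
    """Poisson deviance: 2 * sum(O * ln(O / E) - (O - E))."""
    total = 0.0
    for o, e in zip(observed, fitted):
        # By convention a zero count contributes nothing to the log term.
        log_term = o * math.log(o / e) if o > 0 else 0.0
        total += log_term - (o - e)
    return 2.0 * total


# Made-up counts for two groups.
group_a = [2, 3, 1, 2]
group_b = [6, 8, 7, 7]
observed = group_a + group_b

# Null model: one overall mean for every observation.
overall_mean = sum(observed) / len(observed)
null_fit = [overall_mean] * len(observed)

# Model with the factor: each group gets its own mean.
mean_a = sum(group_a) / len(group_a)
mean_b = sum(group_b) / len(group_b)
group_fit = [mean_a] * len(group_a) + [mean_b] * len(group_b)

# G: the drop in deviance from adding the factor (1 extra parameter).
g = poisson_deviance(observed, null_fit) - poisson_deviance(observed, group_fit)

CHI2_CRITICAL_5PCT_1DF = 3.841  # from chi-squared tables
print(f"G = {g:.3f}, significant at 5%: {g > CHI2_CRITICAL_5PCT_1DF}")
```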
64
Q

What is the assumption of Poisson errors?

A

Probability of the event of interest occurring under a given set of conditions is RANDOM

65
Q

Model checking for Poisson distribution… HETEROGENEITY FACTOR

A

HF < 1 indicates UNDER DISPERSION

HF > 1 indicates OVER DISPERSION (common) when it’s higher than ~ 1.5 = cause for concern

66
Q

Assumption of Poisson errors:
If the overall distribution of the response variable is aggregated rather than random what model checking would you suggest?

A

NEGATIVE BINOMIAL DISTRIBUTION
- has a mean and a parameter (k) that describes the degree of clumping

k = 0 : most clumped 
k = infinity : Poisson
67
Q

Poisson distribution:

OVER DISPERSION can lead to what errors? And how would you deal with them?

A

Type 1 errors
- try and deal with it via rescaling
If it cannot be fixed with rescaling:
- Assume negative binomial error structure, transform variables or use non-parametric analyses

68
Q

Poisson distribution:

UNDER DISPERSION can lead to what type of errors? How would you deal with this?

A

Type 2 errors

Can be dealt with by rescaling

69
Q

What analysis does not lead to negative values or predict values greater than one?

A

Logistic Analysis

70
Q

What distribution does logistic analyses assume?

A

Binomially distributed errors

  • distribution changes shape according to the mean
  • test significance using G rather than F values
71
Q

Examples of grouped and ungrouped binary data

A

1) grouped - several coins tossed together in a group

2) ungrouped - toss of a single coin (1 data point in data set)

72
Q

What binary data do you need to worry about over/under dispersion for and what don’t you need to?

A

Need to worry - grouped binary data

No need to worry - ungrouped binary data

73
Q

Non- parametric analysis 3 points go!

A

1) makes few assumptions about underlying statistical distributions
2) tend to lack power compared to parametric tests
3) tend to be less flexible compared to parametric methods

74
Q

Give the non-parametric equivalents to the following parametric tests:

1) mean
2) standard deviation
3) one sample t-test
4) paired t-test
5) unpaired t-test
6) one way ANOVA
7) repeated measure ANOVA
8) Pearson's correlation test

A

1) median or mode
2) quartiles & inter-quartile range
3) Wilcoxon signed-rank test, sign test
4) Wilcoxon signed-rank test, sign test
5) Mann-Whitney U test
6) Kruskal-Wallis test or ANOVA on ranked data
7) Friedman test or ANOVA on ranked data
8) Spearman's rank correlation coefficient

75
Q

What is the non-parametric equivalent of ANOVA on a factor with 2 levels?
And what is the extension of this test when the factor has more than 2 levels?

A

Wilcoxon-Mann-Whitney test
Power is almost as high as the t-test

Extension = Kruskal-Wallis test
Involves comparing mean ranks of each factor level with the mean of all the ranks
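
A sketch of that comparison, assuming the standard Kruskal-Wallis H statistic without the correction for ties: H = 12 / (N(N+1)) * sum of n_i * (mean rank of level i - mean of all ranks)². Function names and data are illustrative.

```python
def average_ranks(values):
    """Rank values from 1 to N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """H statistic (no tie correction) for a list of groups of measurements."""
    pooled = [v for group in groups for v in group]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    mean_of_all_ranks = (n_total + 1) / 2
    weighted_sum = 0.0
    start = 0
    for group in groups:
        group_ranks = ranks[start:start + len(group)]
        start += len(group)
        mean_rank = sum(group_ranks) / len(group)
        weighted_sum += len(group) * (mean_rank - mean_of_all_ranks) ** 2
    return 12 / (n_total * (n_total + 1)) * weighted_sum


# Made-up data: one factor with three levels.
h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(h)
```

H is then looked up in chi-squared tables with (number of levels - 1) degrees of freedom.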

76
Q

Why would error bars appear asymmetrical?

A

Due to back-transformation from the logit scale
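
A small numerical illustration: an estimate with symmetric error bars on the logit scale becomes asymmetric once converted back to a proportion. The estimate and standard error below are hypothetical values.

```python
import math


def inverse_logit(x):
    """Convert a value on the logit scale back to a proportion between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical estimate and standard error on the logit scale.
estimate = 2.0
standard_error = 0.5

# Symmetric on the logit scale: estimate +/- one standard error.
centre = inverse_logit(estimate)
lower = inverse_logit(estimate - standard_error)
upper = inverse_logit(estimate + standard_error)

# Lengths of the two error bars on the proportion scale.
lower_bar = centre - lower
upper_bar = upper - centre
print(f"proportion = {centre:.3f}, lower bar = {lower_bar:.3f}, upper bar = {upper_bar:.3f}")
```

Because the proportion scale is squeezed near 0 and 1, the bar pointing towards the nearer boundary comes out shorter.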