part one Flashcards

1
Q

what are the two types of variability

A
  1. intrinsic (natural system)
  2. extrinsic (measurement error)

2
Q

when do you define the population

A

the population must be defined before the sampling process begins, as it dictates how you sample

3
Q

how are frequency curves characterised

A

by 2 key parameters: location (eg middle, mean, mode) and dispersion (spread, variance)

4
Q

why are the parameters of the frequency curve important

A

we can never know the true population parameters, so we estimate (infer) them from samples

5
Q

what is mu (μ)

A

population mean

6
Q

what is x̄ (x-bar)

A

sample mean

7
Q

what is σ

A

population standard deviation

8
Q

what is s

A

sample standard deviation

9
Q

what is SEM

A

standard error of the mean: measures the variability of the sample mean

10
Q

what are the 6 main steps of the logical framework ?

A
  1. observations
  2. models
  3. hypothesis
  4. null hypothesis
  5. experiment and sampling
  6. interpretation and results
11
Q

what is the next step after you retain the null

A

you refute the model and hypothesis. therefore you go back to the observations and find out what was missing

12
Q

what is the next step after you reject the null

A

you retain the model and hypothesis, but you don't stop. you ask why this is the case, ie what are the mechanisms that make this model true?

13
Q

what are the 2 types of observations

A
  1. casual (personally seen in nature, with no prior knowledge)
  2. previously quantified in the literature

14
Q

what types of phrases must be used when making casual observations

A

it appears, it seems, it looks like

ie not certain

15
Q

what is a model

A

the reasoning behind the observations, used to explain the process

16
Q

how do you state model from a casual observation

A

it is correct because it happens in nature in this location where i saw it

17
Q

how do you state a model from a quantified observation

A

literature behind process eg this is because…

18
Q

what is a hypothesis

A

what you predict if the model is true.

use the structure… if I do this then i will observe this/i will expect this

19
Q

what is the difference between mensurative and manipulative

A

mensurative experiments are observational: they do not change the system.
manipulative experiments change the system to understand patterns (you need literature first)

20
Q

what is the point of a null hypothesis and what is this approach called

A

falsificationist approach. the hypothesis can never be proved because the whole population can't be measured. therefore you test the null (everything outside the hypothesis), and if the null is rejected, the hypothesis is what remains and is supported.

21
Q

limited by its design, a mensurative study can only give certain interpretations. what are they

A

correlative not causational

it doesn't let you understand cause and effect or mechanisms. merely descriptive/qualitative

22
Q

what is required for appropriate manipulative studies

A

appropriate controls and adequate prior biological knowledge of the system

23
Q

what is the difference between precision and accuracy

A

precision is a measure of spread (precise = narrow, imprecise = wide). you can test it using the standard error of the mean

accuracy is a measure of how close the sample mean is to the population mean (usually you cannot test accuracy)

24
Q

SEM

A

SEM = s / sqrt(n), where s is the sample standard deviation and n is the sample size
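
a minimal sketch of this formula in Python, using only the standard library. the function name `standard_error` and the example measurements are made up for illustration.

```python
import math
import statistics


def standard_error(sample):
    """Return the standard error of the mean: s / sqrt(n)."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    return s / math.sqrt(n)


# hypothetical measurements, for illustration only
lengths = [4.0, 5.0, 6.0, 7.0, 8.0]
print(standard_error(lengths))
```

a larger n shrinks the SEM, which is why more replicates give a more precise estimate of the mean.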

25
Q

when should you use random sampling

A

when information is not known about the population

26
Q

when should you use stratified sampling

A

when you know information about the population, so you can best represent that population. this increases precision and accuracy

27
Q

is random sampling always representative

A

no
by chance it may or may not be. therefore preliminary tests can be performed with lots of replicates to decide how to sample representatively

28
Q

assumptions that must be accounted for PRE sampling

A

independence
randomness
these are KEY

29
Q

assumptions that are analysed POST sampling

A

homogeneity of variances

normality of residuals

30
Q

how to ensure independent data

A

replicates need to be independent of each other (eg separated through space). look for possible relationships between replicates

31
Q

what is pseudoreplication

A

‘replicates’ that are not independent of each other, and therefore not true replicates, as you are not accounting for relationships between individuals. this increases type 1 error

32
Q

what is confounding

A

when you reject the null (ie your hypothesis is supported), however this is not because your model is correct. rather, you have not accounted for other factors/variables that cause this relationship

33
Q

how to mitigate confounding effects

A

by performing a manipulative study where you can control the variables

34
Q

why do we perform statistical tests

A

as we are taking a sample of the population that is subject to error, we can only make probabilistic statements rather than absolute statements. statistics allows us to quantify this uncertainty

35
Q

what are the three components of a statistical test

A

a null hypothesis
a test statistic
rejection region and critical value

36
Q

what is the logical null

A

everything not included in hypothesis (eg equal or opposite)

37
Q

what is the statistical null

A

there is no difference between groups

38
Q

what is the t-test

A

testing the difference between 2 means
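
as a rough sketch, the test statistic for two independent groups can be computed by hand. this assumes equal variances (the pooled form of the t test), and the function name `pooled_t` and the data are hypothetical. it returns the t value and degrees of freedom only, so a p value would still need a t table or a stats package.

```python
import math
import statistics


def pooled_t(group_a, group_b):
    """Return (t, df) for a two-sample t test assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance, n - 1 denominator
    var_b = statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / se_diff
    return t, df


# hypothetical data, for illustration only
t, df = pooled_t([1, 2, 3], [4, 5, 6])
print(t, df)
```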

39
Q

when do you use 2 tailed t test

A

when there is no direction in your hypothesis (ie there is no specified direction for the proposed difference)

40
Q

when do you use a 1 tailed t test

A

when you have a directional hypothesis (eg this pop is greater than this pop)

41
Q

what is a type 1 error

A

when the null is true however you reject it

42
Q

what is a type 2 error

A

when the null is false however you retain it

43
Q

how can you control type 1 error

A

by setting the significance level, which fixes the critical value (eg alpha = 0.05)

44
Q

why are the rejection regions smaller for 2 tailed t tests

A

the total probability is always alpha (eg 0.05), so with 2 tails you halve alpha in each tail (eg 0.05/2)
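
to see the effect of splitting alpha, this sketch compares critical values for 1 and 2 tails. note the assumption: it uses the standard normal (z) distribution from Python's `statistics.NormalDist` as a stand-in, because the t distribution is not in the standard library. t critical values are somewhat larger for small samples. the function name `critical_z` is made up.

```python
from statistics import NormalDist


def critical_z(alpha=0.05, tails=2):
    """Critical value of the standard normal for a 1 or 2 tailed test."""
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # each tail gets alpha / tails of the probability
    return NormalDist().inv_cdf(1 - alpha / tails)


print(critical_z(0.05, tails=1))  # all of alpha in one tail
print(critical_z(0.05, tails=2))  # alpha / 2 in each tail, so a larger cut-off
```

the 2 tailed cut-off sits further from zero, so each rejection region is smaller even though the total probability is still alpha.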

45
Q

why is the assumption of homogeneity important

A

if variances are not equal then the rejection regions will not be comparable across groups. this increases type 1 error

to reduce: large sample and balance n

can be fixed by transforming

46
Q

what is a residual

A

difference between data point and predicted value (ie mean)

47
Q

why is normality usually not important

A

the central limit theorem means the distribution of sample means approaches normal as sample size increases, therefore with a large enough sample normality is less critical.
can be fixed post sampling by transforming. normality is only important in really skewed/non normal data.

48
Q

what is anova

A

analysis of variance. compares the variation between groups with the variation within groups, for more than 2 groups

49
Q

why not use t tests on more than 2 groups

A

it increases the probability of a type 1 error (it rises with the number of tests conducted, to roughly 0.05 * the number of tests). a correction can be used, but this increases type 2 error and reduces power
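
a quick sketch of how fast the overall type 1 error grows when every pair of groups gets its own t test. it assumes the tests are independent, which is a simplification, and the function name `familywise_error` is made up for illustration.

```python
from math import comb


def familywise_error(n_groups, alpha=0.05):
    """Return (number of pairwise tests, chance of at least one type 1 error)."""
    n_tests = comb(n_groups, 2)  # every pair of groups is tested once
    # assumes independent tests: P(no false positive) = (1 - alpha) ** n_tests
    return n_tests, 1 - (1 - alpha) ** n_tests


for groups in (2, 3, 4, 5):
    print(groups, familywise_error(groups))
```

with 4 groups there are 6 pairwise tests and the chance of at least one false positive is about 0.26, close to the rough 0.05 * 6 = 0.30 rule of thumb.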

50
Q

what is a factor

A

ie treatment/group.

you have factors and levels within factor

51
Q

what is the linear model in descriptive terms

A

mean + effect of factor + noise

52
Q

what is the null testing in an anova

A

that there is no effect, ie the levels of a factor don't differ. therefore the MS ratio is about 1

53
Q

what is the alt testing in an anova

A

that there is an effect, ie there are differences between levels of a factor. the MS ratio is >1
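
the MS ratio (F) for a one way anova can be sketched by hand as below. the function name `one_way_f` and the data are hypothetical, and the sketch stops at the ratio itself. a p value needs the F distribution from a stats package.

```python
import statistics


def one_way_f(groups):
    """Return the MS ratio (F) = MS between levels / MS within levels."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # variation of the level means around the grand mean
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # variation of observations around their own level mean (the noise)
    ss_within = sum(
        sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups
    )
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# hypothetical data: 3 levels of one factor, 3 replicates each
print(one_way_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```

when the level means are similar the ratio stays near 1. when they differ by more than the within-level noise it climbs well above 1.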

54
Q

are covariates of an anova categorical or continuous. why is this unique?

A

categorical ie factors. it is a linear model

55
Q

when conducting a test in R for homogeneity, how do you read the output ?

A

the null is that there is no difference, ie the variances are homogeneous (what we want). therefore if NOT significant, the variances are not different and so they are homogeneous

56
Q

what is a one way anova?

A

1 factor with multiple levels (comparing between levels)

57
Q

what is a 2 way anova

A

2 factors with multiple levels (comparing between factors and between levels)

58
Q

when conducting a t test or anova and you get a very small p value what does this mean?

A

a small p value does not measure the magnitude of the effect, just that it is indeed significant. to look at the magnitude look at the data, not the p value

59
Q

what is a post hoc test?

A

tests conducted after getting a significant p value in an anova to determine which levels differ significantly (when there are more than 2 levels)

60
Q

how to read output of SNK post hoc test in R

A

look at the ranks given to each level to know which ones are being compared. then look at the comparisons (eg 2-1) and check for stars to see if they are significant. to see which is bigger look at the rank means.

61
Q

what is a correlation

A

testing for a relationship / association between 2 random variables

62
Q

what is a regression

A

testing whether a response variable depends on (can be predicted from) an explanatory variable
ie prediction

63
Q

what is the difference between correlation and regression

A

correlation is conducted first to establish that there is an association/pattern between variables and to test the strength of that relationship. once this is established you can test for prediction with regression (ie whether one variable predicts the other)

64
Q

are covariates of regression/correlation categorical or continuous.

A

continuous. it is a Linear model

65
Q

what is needed when sampling for correlation/regression

A

each unit in the population has a value for each variable

66
Q

what is the statistic used for a linear correlation

A

pearson's r. ranges from -1 to +1 (ie perfect negative to perfect positive). 0 = no relationship

67
Q

what is the r/ cor value

A

correlation r value. not the same as r2. measures the strength of the relationship, from -1 to +1. 0 = no relationship.
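
a small sketch of how r can be computed from paired values, written out by hand so the pieces are visible. the function name `pearson_r` and the data are made up for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient for paired values x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and have at least 2 values")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# hypothetical paired measurements
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # perfect positive relationship
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))  # perfect negative relationship
```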

68
Q

besides random sampling and independence what are the other assumptions for linear correlation

A

normally distributed variables, and the relationship between the variables is linear

69
Q

if a regression is shown between 2 variables, what are your limits with prediction

A

you can predict new values of Y (response) from new values of X (explanatory), however this is ONLY within your sampling range and you can't go beyond that.

70
Q

in the linear model, what is the slope

A

the relationship. if slope is at 0 there is no relationship.

71
Q

what is the pearsons r2

A

testing the precision of prediction: how much of the variation in y is explained by x (closer to 1 = stronger). if below 0.5, a lot of the variance is not explained purely by the relationship with x

72
Q

what is the null for a regression

A

the slope. the null is either that the slope is 0 (no relationship) or, for a directional alternative, that it is 0 or in the opposite direction to your alternative.

73
Q

what are the steps to completeing a regression test

A
  1. plot a scatter plot to check for a linear relationship
  2. perform anova and look at p value
  3. if significant, look at r squared to test strength
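
the fitting part of these steps can be sketched as below. this is only an illustration with made-up names (`fit_line`, `predict`) and data. it gives the slope, intercept and r squared by least squares but does not produce the anova p value from step 2, which needs a stats package. the range check in `predict` reflects the rule that you only predict within your sampling range.

```python
def fit_line(x, y):
    """Least-squares line for paired data. Returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    ss_total = sum((b - mean_y) ** 2 for b in y)
    # residual = observed y minus the value predicted by the line
    ss_resid = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    return slope, intercept, 1 - ss_resid / ss_total


def predict(x_new, x, slope, intercept):
    """Predict y for x_new, refusing to extrapolate beyond the sampled x range."""
    if not min(x) <= x_new <= max(x):
        raise ValueError("x_new is outside the sampled range")
    return intercept + slope * x_new


# hypothetical data, for illustration only
xs, ys = [1, 2, 3, 4], [3, 5, 7, 9]
slope, intercept, r2 = fit_line(xs, ys)
print(slope, intercept, r2)
print(predict(2.5, xs, slope, intercept))
```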
74
Q

what is a big assumption for linear regression

A

fixed X.
X is measured without error. as all measurement has some error, however, the error must be small relative to the measurement (eg measuring in cm with mm error is okay)

75
Q

what is the difference between correlation and causation

A

correlation is an association between 2 variables. just because there is a relationship it doesn't mean one causes the other.

causation means one variable is caused by the other.

76
Q

how do you determine causality

A

you can only disprove nulls, therefore you can never prove causality.
however, to infer causality you need to perform manipulative experiments

77
Q

what is a scale consideration when creating experiments

A

you usually work with smaller scales, will the same relationship be found at larger scales?

78
Q

what is a procedural control

A

another level of a factor to account for experimental artefacts, ie an effect you created with your experiment that previously wasn't in the system (confounding). you need a treatment, a control and a procedural control

79
Q

categorical covariates aka explanatory variables can be defined as..

A

factors:
fixed vs random
crossed vs nested

80
Q

can you have an interaction with a 1 way anova

A

no. an interaction means that the effect of the levels of one factor depends on the levels of another factor. you can only have interactions in a 2 or more way anova

81
Q

what is a nested design, give an example

A

levels of factor 1 are nested within levels of factor 2
a common one is location:
ie. treatments for factor 1 are separated between sites

82
Q

what is a crossed design

A

all levels of factor 1 are present within the other factor

ie all treatments are present within each site

83
Q

in crossed or nested can you find interactions

A

only in crossed designs can you find interactions

84
Q

what is an interaction

A

levels of one factor are dependent on the levels of another factor
ie whether the levels of factor one are significant will depend on which site (level) of factor 2 they are in. this means there is inconsistency through space.

85
Q

in an anova table how do you know if there is an interaction?

A

look at the bottom

factor1:factor2 and the pvalue

86
Q

what do you do if there is an interaction

A

post hoc tests. eg an SNK test will look at each level of factor 2 and the levels of factor 1 within it, eg each site and the levels within sites. if they are significant then look at the means

87
Q

say you dont have homogenous variances but you perform a test anyway when is this an issue ?

A

if you don't get a significant effect then there is no issue. however, if you get a significant effect then you need to transform and test again, because non-homogeneity increases type 1 error (saying there is an effect when there isn't)

88
Q

will you need to do post hoc if you do a 1 factor anova with no interaction?

A

yes, if there are more than 2 levels to know which levels are different.

89
Q

what happens if there is no significant interaction

A

look at the main effects, ie the effect of each factor on your response variable, to look at differences between levels

90
Q

what is a mixed model

A

a combination of fixed factors and random factors

91
Q

what is a random factor

A

fixed: treatment, specific.
random: general, representative

example:
fixed sites: you care about each site
random sites: you test for consistency spatially

92
Q

why is the difference between fixed and random factors important

A

it will change how the mean square is estimated.
fixed cares about means between levels. whereas random cares only about variability between sites
same for the null: are you looking at means (fixed) or variance (random)

93
Q

the extent of the inference for fixed vs random factors

A

you cannot extrapolate/generalise for fixed factors. what you get from your experiment is specific to your factors.
for random, it is more general and inferences can be applied to other spp/sites etc.

94
Q

why have mixed models?

A

avoid confounding (spatial or temporal), avoid non independence, test for consistency

95
Q

can you pool random sites together to increase sample sizes?

A

no. this increases type 1 error

96
Q

are random factors continuous or categorical

A

always categorical

97
Q

are chi tests for categorical or continuous variables?

A

categorical

98
Q

what are the two ways chi tests can be used

A
  1. goodness of fit, whether the sample matches the expected population
  2. contingency or association test. test for independence
99
Q

how to calculate the degrees of freedom for a chi test

A

(rows - 1)*(columns - 1) for a contingency table. for goodness of fit it is (number of categories - 1)

100
Q

how do you generate expected cell counts for chi tests

A

using the null hypothesis and the total samples

101
Q

what is the chi test statistic formula

A

χ² = Σ (observed - expected)² / expected
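
a sketch of this statistic for a test of independence. the function name `chi_square_independence` and the counts are hypothetical. expected counts come from the null of no association, ie (row total * column total) / grand total. it returns the statistic and degrees of freedom only, so the p value still needs a chi square table or a stats package.

```python
def chi_square_independence(table):
    """Return (chi2, df) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # expected count if rows and columns were independent (the null)
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# hypothetical counts: 2 habitats (rows) by 2 colour morphs (columns)
print(chi_square_independence([[10, 20], [20, 10]]))
```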

102
Q

what is a key assumption of the chi test goodness of fit

A

no more than 20% of expected frequencies are smaller than 5. if so, transformations / pooling can be used

103
Q

how is a test for independence similar and different from a correlation

A

both testing for associations between random variables however chi is categorical while correlation is continuous

104
Q

what is the null for a chi 2 independence test

A

no association

105
Q

are there post hoc tests for chi 2

A

no, only way to know is to plot the data on graphs

106
Q

what are parametric tests

A

make assumptions about the parameters (ie means, variances between groups/treatments) of a population's distribution

eg t test ANOVA linear regression

107
Q

what are non parametric tests

A

distribution free, not estimating parameters ie rank based tests

108
Q

what is a rank based non parametric test

A

no assumption about the underlying distribution, therefore good when you have very non normal data, or big outliers you want to keep, or the responses are already ranks

109
Q

assumptions of non parametric tests

A

independence between samples and homogeneity of variance

110
Q

how to perform a non parametric rank test

A

rank all observations ignoring groups from low to high

randomise ranks to develop a probability distribution. use the real data to see if it fits the distribution
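
one way to picture this procedure is a permutation test on rank sums. the sketch below is an illustration under stated assumptions, not the exact method used by any particular package: it assumes no tied values, compares 2 groups only, and the names `simple_ranks` and `rank_permutation_test` are made up.

```python
import random


def simple_ranks(values):
    """Rank from low to high (1 = smallest). Assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def rank_permutation_test(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided p value for a difference in rank sums between 2 groups."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    # rank all observations together, ignoring group membership
    ranks = simple_ranks(list(group_a) + list(group_b))
    n_a = len(group_a)
    expected = n_a * (len(ranks) + 1) / 2  # rank sum of group a under the null
    observed = abs(sum(ranks[:n_a]) - expected)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(ranks)  # randomly reassign ranks to groups
        if abs(sum(ranks[:n_a]) - expected) >= observed:
            extreme += 1
    return extreme / n_perm


# hypothetical data: group a is clearly lower than group b
print(rank_permutation_test([1, 2, 3, 4, 5], [6, 7, 8, 9, 10]))
```

the p value is the share of random reassignments that give a rank sum at least as extreme as the real one.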

111
Q

what is better, a parametric test or a non parametric test

A

always use parametric if you can, it is more powerful. try to transform the data first if it is non normal. best to use a non parametric test if your data is already ranks

112
Q

what is the mann whitney wilcoxon test

A

non parametric test to compare 1 factor with 2 levels (similar to a t test)

113
Q

when to use kruskal wallis test

A

an extension of MWW for a factor with more than 2 levels (similar to an anova)

114
Q

when to use spearman rank correlation

A

for non linear (monotonic) correlations between continuous variables

115
Q

when ranking data what do you do with tied observations with the same value

A

average of the ranks
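
a short sketch of ranking with ties, where tied observations share the average of the ranks they would have occupied. the function name `average_ranks` and the example values are made up for illustration.

```python
def average_ranks(values):
    """Rank from low to high (1 = smallest), giving ties their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # extend the run while the next sorted value equals the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based ranks in the run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


# the two 20s would take ranks 2 and 3, so each gets 2.5
print(average_ranks([10, 20, 20, 30]))
```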

116
Q

what is rho

A

rank coefficient for spearmans correlation. a measure of the strength of the relationship, -1 to 1.

117
Q

to get independent data when sampling. what is the best method

A

randomly take 1 sample per treatment in the same block

or take all samples for treatment in the same block and use the mean only

cannot use all samples as that is pseudo replication