Part 1 Flashcards

1
Q

Observation –> … –> … –> …

A

question, hypothesis, prediction

2
Q

Observation: Gammarus occurs almost entirely under stones (rather than open streams)

Question: … … Gammarus spend most of its time under stones?

A

why does

3
Q

Hypothesis - an … proposed to account for observed facts - there is often more than one hypothesis generated
e.g.

Gammarus occurs under stones because:

  • need to shelter from current
  • their food gets trapped and accumulates under stones
  • they are subject to predation by visually hunting fish and need to remain out of sight
A

explanation

4
Q

Predictions - what you would … … … if the hypothesis was true - should be testable and ideally unique to hypothesis it is based on

e.g. shelter hypothesis - a greater proportion of Gammarus should be found in the open in streams with slow flow (or slower flowing areas of a stream)

predation hypothesis - Gammarus should aggregate under stones more in streams where fish are present than where they are not

A

expect to see

5
Q

Hypotheses are … or not …, but rarely …

A

rejected, rejected, proved

  • just because one hypothesis is supported doesn’t mean there isn’t another underlying explanation, since we can’t think of all possible hypotheses; with the right evidence, however, we can be sure that a hypothesis is false
6
Q

Cycle of proposing hypotheses and then seeking evidence potentially capable of falsifying them is the scientific process often termed …

A

falsificationism

7
Q

A variable is…

A

any characteristic that can be measured or experimentally controlled on different items or objects

  • numeric or non-numeric (e.g. colour)
8
Q

A set of related variables is known as a … …

A

data set

9
Q

Numeric variables can be categorised as belonging to … or … scales

A

interval, ratio

10
Q

Categorical variables can be characterised as … or …

A

nominal, ordinal

11
Q

Nominal variables…

A

arise when observations are recorded as categories that have no natural ordering relative to one another, e.g. marital status, sex, colour morph

12
Q

Ordinal variables…

A

occur when observations can be assigned some meaningful order, but where the exact ‘distance’ between items is not fixed, or even known, e.g. degree of aggressiveness sorted into the categories: initiates attack (3), aggressive display (2), ignores (1), retreats (0).

Rank orderings are also a type of ordinal data (e.g. place in a race - 1st 2nd 3rd etc.)

  • can say something about relationship between categories: larger score = more aggressive response, greater score = slower runner. But cannot say aggressiveness score of 2 is twice as aggressive as a score of 1
13
Q

Interval scale variables take values on a … numerical scale, but where the scale starts at an … point. e.g. … on a … scale but not on a … scale

A

consistent, arbitrary, temperature, Celsius, Kelvin

  • can say difference between 60 and 70 degrees C is the same as that between -20 and -10, but cannot say 60 degrees C is double the temperature of 30 degrees C
14
Q

Ratio scale variables have a true … and a known consistent mathematical relationship between any points on the measurement scale, e.g. … scale for temperature

A

zero, Kelvin

  • on the Kelvin scale 60 K is double the temperature of 30 K
15
Q

Can meaningfully … or … with interval scales, but cannot meaningfully …, as you can with ratio scales

A

add, subtract, multiply

16
Q

In general … variables are the best suited to statistical analysis

A

ratio

17
Q

Accuracy is…

A

how close a measurement is to the true value

18
Q

Precision is…

A

how repeatable a measure is, irrespective of whether it is close to the true value
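
A minimal sketch of the accuracy/precision distinction, written in Python rather than the R these cards refer to. The true value and both sets of measurements are invented for illustration: one instrument is precise but inaccurate, the other accurate on average but imprecise.

```python
import statistics

TRUE_VALUE = 10.0  # hypothetical true length in mm (an assumption for the example)

# Hypothetical repeated measurements of the same object.
calipers_a = [10.9, 11.0, 11.1, 11.0, 10.9]  # tightly clustered, but all too high
calipers_b = [9.2, 10.9, 10.1, 9.5, 10.4]    # scattered around the true value


def summarise(measurements):
    """Return (bias, spread): mean error against the true value, and the sample SD."""
    bias = statistics.mean(measurements) - TRUE_VALUE
    spread = statistics.stdev(measurements)
    return bias, spread


bias_a, spread_a = summarise(calipers_a)
bias_b, spread_b = summarise(calipers_b)

# A small spread means high precision; a small bias means high accuracy.
print(f"A: bias={bias_a:+.2f}, spread={spread_a:.2f}")
print(f"B: bias={bias_b:+.2f}, spread={spread_b:.2f}")
```

Instrument A here is the more precise of the two, yet it is the less accurate because its readings all err in the same direction.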

19
Q

The number of … … we use suggests something about the precision of the result. A value of 12.4 actually measured with the same precision as 12.735 should properly be written …

A

significant figures, 12.400
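
A one-line illustration of the trailing-zero point, sketched in Python rather than the R these cards refer to. The variable names are made up for the example.

```python
measured = 12.4  # a value actually recorded to three decimal places

# Fixed-point formatting keeps the trailing zeros that signal the precision.
as_recorded = f"{measured:.3f}"
print(as_recorded)  # prints 12.400
```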

20
Q

Usually the worst form of error is …, a … lack of accuracy

A

bias, systematic (the data are not just inaccurate but all tend to deviate from the true measurements in the same direction)

21
Q

Examples of bias:

  • …-… sampling
  • … of biological material
  • … by the process of investigation (e.g. adrenaline increased by process of sampling adrenaline in blood)
  • … bias
A

non-random (selective sampling techniques), conditioning, interference, investigator

22
Q

What does “population” mean in statistics?

A

Any group of items that share certain attributes or properties

23
Q

The goal of statistics is to learn something about … by … data collected from them

A

populations, analysing

24
Q

Statistical populations are defined by the …

A

investigator

25
Q

What is a population parameter?

A

A numeric quantity that describes a particular aspect of the variables in the population (describes a feature of the distribution of variables in the population) - e.g. population mean, variance, correlation

26
Q

The sample chosen must be as … as possible of the whole population

A

representative

27
Q

A point estimate is useless on its own, as estimates are always derived from a … … of the wider population. They must be accompanied by a value of ….

A

limited sample, uncertainty

28
Q

The chance variation that arises in different estimates using different random samples is known as … …

A

sampling error (or sampling variation)

29
Q

The sampling distribution is the distribution we expect a particular estimate to follow

A

yes

30
Q

sample size is often denoted as “…”

A

n

31
Q

Sampling error is … as sample size is …

A

reduced, increased

32
Q

The standard error of an estimate is the … … of its … …

A

standard deviation, sampling distribution
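
The definition can be checked by simulation. The sketch below is in Python rather than R and assumes a hypothetical normal population (mean 50, SD 10). It draws many random samples, records each sample mean, and takes the standard deviation of those means as the standard error. It also shows the error shrinking as n grows.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

POP_MEAN, POP_SD = 50.0, 10.0  # hypothetical population parameters
N_REPEATS = 2000               # number of simulated samples per sample size


def simulated_standard_error(n):
    """SD of the sample mean across many random samples of size n."""
    sample_means = []
    for _ in range(N_REPEATS):
        sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(n)]
        sample_means.append(statistics.mean(sample))
    # The spread of the sampling distribution is the standard error.
    return statistics.stdev(sample_means)


se_small = simulated_standard_error(10)
se_large = simulated_standard_error(100)
print(f"n=10:  standard error ~ {se_small:.2f}")
print(f"n=100: standard error ~ {se_large:.2f}")
```

For a mean, the simulated values should sit close to the population SD divided by the square root of n, so roughly 3.2 and 1.0 under these assumptions.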

33
Q

R doesn’t like …

A

percentages (use decimals e.g. 0.4 to represent 40%)

34
Q

… statistics works by asking “what would have happened if we were to repeat an experiment or collection exercise many times, assuming that the … remains the same each time”

A

Frequentist, population

then working out how likely a particular result is based on the distribution of data

35
Q

The two most important ideas in frequentist statistics are …-… and … …

A

p-values, statistical significance

36
Q

Sampling with replacement: each artificial sample is called a … …

A

bootstrapped sample
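
A minimal sketch of bootstrapping, in Python rather than R. The observed values and the helper name `bootstrap_sample` are invented for the example. Each bootstrapped sample is the same size as the original and is drawn from it with replacement.

```python
import random
import statistics

random.seed(2)  # fixed seed so the sketch is repeatable

observed = [4.1, 5.3, 3.8, 6.0, 4.7, 5.5, 4.9, 5.1]  # hypothetical measurements


def bootstrap_sample(data):
    """Draw one artificial sample of the same size, sampling with replacement."""
    return random.choices(data, k=len(data))


# The spread of the means of many bootstrapped samples approximates
# the standard error of the sample mean.
boot_means = [statistics.mean(bootstrap_sample(observed)) for _ in range(1000)]
boot_se = statistics.stdev(boot_means)
print(f"bootstrap standard error of the mean ~ {boot_se:.2f}")
```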

37
Q

If a probability (p) value is less than the chosen … … we say the result is statistically significant

A

significance level

38
Q

The process of assigning random labels is called …

A

permutation

39
Q

The p-value is the … of obtaining a test statistic equal to or ‘more extreme’ than the … value, assuming the … hypothesis is true

A

probability, estimated, null
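
Permutation and the p-value definition fit together as a permutation test. The sketch below is in Python rather than R, with two hypothetical groups and the difference in means as the test statistic. It shuffles the group labels many times and counts how often the shuffled statistic is at least as extreme as the one actually obtained. Adding one to the numerator and denominator is a common convention assumed here, not something the cards specify.

```python
import random
import statistics

random.seed(3)  # fixed seed so the sketch is repeatable

# Hypothetical measurements from two groups.
group_a = [10.1, 9.8, 10.5, 10.2, 9.9, 10.4]
group_b = [12.0, 11.8, 12.3, 12.1, 11.9, 12.2]


def mean_difference(a, b):
    return statistics.mean(a) - statistics.mean(b)


def permutation_p_value(a, b, n_permutations=5000):
    """Two-sided p-value for a difference in means, under random relabelling."""
    observed = abs(mean_difference(a, b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        random.shuffle(pooled)  # assign the group labels at random
        perm_a, perm_b = pooled[:len(a)], pooled[len(a):]
        if abs(mean_difference(perm_a, perm_b)) >= observed:
            extreme += 1
    # Proportion of relabellings at least as extreme as the observed data.
    return (extreme + 1) / (n_permutations + 1)


p_value = permutation_p_value(group_a, group_b)
print(f"p = {p_value:.4f}")
```

Because the two groups do not overlap at all, very few relabellings match the observed difference and the p-value comes out small.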

40
Q

All frequentist statistical tests work by specifying a … … and then evaluating the observed data to see if they … from the … … in a way that is inconsistent with … variation

A

null hypothesis, deviate, null hypothesis, sampling

41
Q

H0 is the … hypothesis and H1 is the … (or …) hypothesis

A

null, test, alternative

42
Q

The alternative hypothesis is essentially a statement of the effect we are … … …

A

expecting to see (e.g. purple and green plants differ in their mean size)

43
Q

… the null hypothesis is not … the alternative hypothesis

A

rejecting, proving

44
Q

Large p value means observed result is quite likely if the null hypothesis is …

A

true (i.e. due to sampling variation)

- cannot reject the null hypothesis (not the same as accepting that the null hypothesis is true)

45
Q

Do not confuse … significance with … significance

A

statistical, biological

  • a result may be statistically significant but biologically trivial, e.g. pH in open water (7.1) vs in beds of submerged vegetation (6.9) is statistically significant but a very small effect and almost certainly of no importance to the invertebrates.
46
Q

The significance of a result depends on a combination of three things:

  1. The size of the true effect in the …
  2. The … of the data
  3. The … size
A

population, variability, sample

47
Q

We must always evaluate the … of an analysis to determine whether or not we trust it

A

assumptions

48
Q

In conceptual terms, the statistical models we use describe data in terms of a … component and a … component

A

systematic, random

observed data = systematic component + random component

49
Q

The normal distribution is completely described by its … (a measure of “central …”) and its … … (a measure of dispersion)

A

mean, tendency, standard deviation

50
Q

If a variable is normally distributed, then about … of its values will fall inside an interval that is … standard deviations wide

A

95%, four
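
The "four standard deviations wide" interval is the mean plus or minus two standard deviations. A quick check with Python's standard library (used here instead of R) on the standard normal distribution:

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Probability of falling within two standard deviations either side of the mean.
within_two_sd = standard_normal.cdf(2.0) - standard_normal.cdf(-2.0)
print(f"{within_two_sd:.4f}")

# Half-width (in standard deviations) of the interval holding exactly 95%.
half_width = standard_normal.inv_cdf(0.975)
print(f"{half_width:.2f}")
```

The first figure is a little over 95%, and the exact 95% interval is slightly narrower than two standard deviations each side, which is why the card says "about".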

51
Q

The variable name on the left of the ~ must be the variable whose…

A

mean we want to compare.

The variable on the right must be the indicator variable that says which group each observation belongs to.

52
Q

Correlations are statistical measures that quantify an … between two … variables

A

association, numeric

By contrast, a two-sample t-test compares a numeric variable between the groups of a categorical variable.

53
Q

A correlation quantifies, via a … …, the degree to which an association tends to a certain pattern

A

correlation coefficient

54
Q

If there is no relationship between the variables, the correlation coefficient will be …. The closer to … the value, the weaker the relationship. A perfect correlation will be either … or …, depending on the direction.

A

zero, zero, +1, -1
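
A sketch of the three cases using Pearson's correlation coefficient, which is assumed here as the coefficient in question (the card does not name one). It is written in Python rather than R, and the data are invented so that the outcomes are clear-cut.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


x = [1, 2, 3, 4, 5]
r_pos = pearson_r(x, [2, 4, 6, 8, 10])  # y rises in step with x
r_neg = pearson_r(x, [10, 8, 6, 4, 2])  # y falls in step with x
r_none = pearson_r(x, [3, 1, 4, 1, 3])  # no linear association

print(f"{r_pos:.2f} {r_neg:.2f} {r_none:.2f}")
```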

55
Q

A regression (not a correlation) allows us to make…

A

predictions about the value of one variable from the value of a second variable

  • as a line is fitted through the data
56
Q

A simple linear regression allows us to predict how one variable (… …) responds to another (… …), using a straight-line relationship

A

response variable, predictor variable

57
Q

How do we find line of best fit?

A

Line with lowest residual sum of squares

Residuals are the vertical distances of the data points from the line of best fit
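
A minimal sketch of least squares for a straight line, in Python rather than R and with made-up data. It uses the standard closed-form slope and intercept, then checks that nudging the line away from the fitted values increases the residual sum of squares.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for a straight line through (x, y)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def residual_sum_of_squares(x, y, intercept, slope):
    """Sum of squared vertical distances between the points and the line."""
    return sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))


# Hypothetical predictor (x) and response (y) values.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(x, y)
best_rss = residual_sum_of_squares(x, y, intercept, slope)

# Any other line gives a larger residual sum of squares.
steeper_rss = residual_sum_of_squares(x, y, intercept, slope + 0.2)
shifted_rss = residual_sum_of_squares(x, y, intercept + 0.2, slope)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, RSS={best_rss:.3f}")
```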

58
Q

Response variable on … axis, predictor variable on … axis

A

y, x

59
Q

Regression model: … variable on the left of the ~, … variable on the right

A

response, predictor

60
Q

Larger F values indicate a stronger relationship between…

A

x and y

61
Q

ANOVA:
- Measure total variation using sum of squares of deviations from the … …, … variation (within group variation = sum of squares of deviations from individual group means), and between-group variation (sum of squares of deviation of … from the … …)

  • Convert to measures of variability that don’t scale with sample size and number of groups (using … … …) - each of 3 sums of squares has different d.f. value
  • total d.f, treatment d.f., error d.f.

Then calculate mean square = sum of squares/ degrees of freedom

A

grand mean, residual, means, grand mean, degrees of freedom
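
The bookkeeping on this card can be sketched as follows, in Python rather than R. The three treatment groups and their values are hypothetical. The code computes the three sums of squares, their degrees of freedom, the mean squares, and the F ratio of treatment to error variation.

```python
# Hypothetical measurements for three treatment groups.
groups = {
    "control": [4.2, 4.8, 5.1, 4.5],
    "low": [5.6, 6.1, 5.9, 6.4],
    "high": [7.0, 6.8, 7.5, 7.1],
}


def mean(values):
    return sum(values) / len(values)


all_values = [v for vals in groups.values() for v in vals]
grand_mean = mean(all_values)

# Total: deviations of every observation from the grand mean.
ss_total = sum((v - grand_mean) ** 2 for v in all_values)
# Error (residual, within-group): deviations from each group's own mean.
ss_error = sum((v - mean(vals)) ** 2 for vals in groups.values() for v in vals)
# Treatment (between-group): deviations of group means from the grand mean.
ss_treatment = sum(
    len(vals) * (mean(vals) - grand_mean) ** 2 for vals in groups.values()
)

df_total = len(all_values) - 1
df_treatment = len(groups) - 1
df_error = len(all_values) - len(groups)

# Mean square = sum of squares / degrees of freedom.
ms_treatment = ss_treatment / df_treatment
ms_error = ss_error / df_error
f_ratio = ms_treatment / ms_error
print(f"F = {f_ratio:.1f} on {df_treatment} and {df_error} d.f.")
```

The treatment and error sums of squares add up to the total, and the same holds for the degrees of freedom.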

62
Q

Squaring negative deviations leads to…

A

a positive number

63
Q

The important message is that ANOVA works by making just one comparison: between the … variation and the … variation

A

treatment, error

64
Q

One-way anova does not require … …

A

equal replication - it will work even where sample sizes differ between treatments

65
Q

An experimental factor is a controlled variable whose levels are…

A

set by the experimenter

66
Q

Anova p-value of lower than 0.05 suggests that…

A

at least one of the treatments is having an effect - global test of significance as it doesn’t tell us anything about which means are different

67
Q

Find standard error stuff in…

A

one-way anova section

68
Q

Left skew - … data

Right skew - … data

A

square, log (i.e. try squaring left-skewed data and log-transforming right-skewed data)

69
Q

Independence: value of measurement from one object is not…

A

affected by the values of other objects

70
Q

Pseudoreplication is an … increase in the … … (and hence d.f.) caused by using …-… data

A

artificial, sample size, non-independent

71
Q

To carry out a t-test on paired data we have to:

  1. Find the mean … of all the pairs
  2. evaluate whether this is significantly different from ….

This is actually an application of the …-… …-…

A

difference, zero, one-sample t-test
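
The two steps can be sketched as below, in Python rather than R and with hypothetical before/after measurements on the same six individuals. The helper `one_sample_t` is a name introduced for the example. Only the t statistic and its degrees of freedom are computed; turning them into a p-value needs the t distribution, which is left out.

```python
import math
import statistics

# Hypothetical paired measurements on the same six individuals.
before = [12.1, 11.4, 13.0, 12.6, 11.9, 12.8]
after = [12.9, 11.9, 13.4, 13.5, 12.1, 13.6]


def one_sample_t(values, null_mean=0.0):
    """t statistic for testing whether the mean of values differs from null_mean."""
    standard_error = statistics.stdev(values) / math.sqrt(len(values))
    return (statistics.mean(values) - null_mean) / standard_error


# Step 1: reduce each pair to a single difference and find the mean difference.
differences = [a - b for a, b in zip(after, before)]
mean_diff = statistics.mean(differences)

# Step 2: a one-sample t-test of the differences against zero.
t_statistic = one_sample_t(differences, null_mean=0.0)
df = len(differences) - 1
print(f"mean difference={mean_diff:.2f}, t={t_statistic:.2f}, d.f.={df}")
```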

72
Q

In paired t-tests there is no need for the original data to be drawn from a … …. It is the differences between pairs that must be

A

normal distribution

73
Q

What does RCBD stand for?

A

Randomised Complete Block Design - each block sees each treatment exactly once

74
Q

… what you can; … what you cannot

A

block, randomise

75
Q

The only thing that distinguishes ANOVA and regressions is the…

A

type of predictor variable they accommodate (categorical vs numerical)

76
Q
ANCOVA:
 residuals generated for:
1. Separate means vs grand mean
2. Common slope vs separate means
3. Separate slopes vs common slope (interaction)
A

yes

77
Q

The word “treatment” should be used for … rather than … studies

A

experimental, observational

78
Q

chi-squared must be carried out on the actual … not … or …, or the … of data

A

counts, percentages, proportions, means
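
A sketch of why the raw counts matter, in Python rather than R. The 2 x 2 table of counts is hypothetical, and a contingency-table test is assumed as the example. The expected counts and the statistic both depend on the actual totals, which are lost if the table is converted to percentages or proportions.

```python
# Hypothetical 2 x 2 table of counts (rows: two habitats, columns: two colour morphs).
observed = [
    [30, 10],
    [20, 40],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count in each cell if rows and columns were independent.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-squared statistic: sum over cells of (observed - expected)^2 / expected.
chi_squared = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)
print(f"chi-squared = {chi_squared:.2f} on {df} d.f.")
```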

79
Q

A non-parametric test is just a catch-all term that applies to any test which doesn’t assume the data are…

A

drawn from a specific distribution

80
Q

Chi-squared tests are …-…

A

non-parametric - as they make weak assumptions about the frequency data

81
Q

non-parametric test calculations are done using the … … of the data

A

rank order

82
Q

Paired t-test:

Distribution of … does not need to be normal! Only distribution of … does!

A

samples, differences

If the differences are not normally distributed, a Wilcoxon signed-rank test can be used instead

83
Q

Mann-Whitney U null hypothesis: … are the same

A

medians (looking for differing central tendency)

  • significant p-value means medians are likely to be different