ED+A Flashcards

1
Q

Hypothetico-deductive reasoning

A

A hypothesis is proposed (not derived by induction) and experiments are then used to try to falsify it. A hypothesis that survives testing can be regarded as ‘good’, but arguably it can never be proved true.

2
Q

Hypothesis

A

Proposition tentatively put forward to explain an observation.

3
Q

Alternate Hypothesis

A

(H1) Hypothesis that makes a specific prediction about results, which can later be tested.

4
Q

Theory

A

Set of general ideas or rules to explain a group of observations. More general than hypothesis, less speculative.

5
Q

Paradigm

A

Describes a whole way of thinking or a particular way of viewing the world.

6
Q

Paradigm shift

A

Dramatic change in the way we think about a subject when evidence has accumulated in favour of rejecting a previous set of hypotheses or theories, or a creative genius moment.

7
Q

Null Hypothesis

A

(H0) Form of hypothesis we test using statistics following an observation. Predicts that NOTHING will happen: no effect, no difference or no relationship. We hope to reject it if the data support the alternate hypothesis. There is only one null.

8
Q

Statistics

A

Branch of mathematics scientists use for an objective assessment of patterns in data from experiments or observations.

9
Q

Nominal data

A

CATEGORICAL In the form of categories with names (e.g. male or female). Non-quantitative.

10
Q

Discrete data

A

QUANTITATIVE Count how many individuals in each group of Nominal data. Quantitative and always in the form of whole numbers.

11
Q

Ordinal data

A

CATEGORICAL Ranked in order of size or on a rating scale (e.g. 1st, 2nd; strongly agree, disagree). Not quantitative, as we do not know the size of the difference between 1st and 2nd, only that 1st is larger.

12
Q

Continuous data

A

QUANTITATIVE Can take any value within a range (e.g. temperature, time). Deciding whether to treat data as continuous or discrete can be subjective.

13
Q

Descriptive statistics

A

Measures calculated from a data-set which summarise some characteristic of the data (central tendency or variability).

14
Q

Sample size

A

(n) number of individuals sampled.

15
Q

Frequency

A

Number of times something occurs, or a count of the number of items in a category.

16
Q

Mean

A

A measure of central tendency. Average of a sample of numbers

17
Q

Median

A

A measure of central tendency. Middle number in a sample of numbers when placed in order. If sample is even then the average of the two middle numbers is taken.

18
Q

Mode

A

A measure of central tendency. The most common number.

19
Q

Measures of central tendency

A

Mean, Median, Mode - all tell about the position of the middle of the sample.
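
The three measures can be compared on one small sample. This is a minimal sketch using Python's standard `statistics` module; the egg counts are hypothetical data chosen so that one large value pulls the mean above the median and mode.

```python
import statistics

# Hypothetical sample: number of eggs counted in nine nests.
eggs = [2, 3, 3, 4, 4, 4, 5, 6, 14]

mean = statistics.mean(eggs)      # sum of the values divided by n
median = statistics.median(eggs)  # middle value once the sample is sorted
mode = statistics.mode(eggs)      # most common value

print(f"mean={mean}, median={median}, mode={mode}")
```

Here the single nest with 14 eggs raises the mean but leaves the median and mode unchanged, which is why the three measures can disagree for skewed data.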

20
Q

Frequency histogram

A

Graph showing the frequency of quantitative observations in each category.
Discrete: categories represent each possible total count made. Continuous: categories are arbitrary ranges that you decide (e.g. 1-10, 11-20).

21
Q

Distribution

A

Shape of data set as seen on frequency histogram. Described by mathematical equations.

22
Q

Deviate

A

Distance between a data point/observations and the mean. Also known as residual.

23
Q

Sum of Squares

A

(SS) total of all the squared deviates for a particular data-set. Gets rid of minus signs, quantifies the magnitude of the total variability but ignores direction of variability.

24
Q

Variance

A

(s^2) Average size of the squared deviates: the sum of squares divided by n - 1 for a sample. Measure of variability. Sample variance is an estimate of population variance.

25
Q

Standard deviation

A

(s) Square root of the variance. Gives a measure of the typical size of the deviates in the same units as the data and, unlike the sum of squares, does not grow simply because the sample is larger. A standard deviation of 2.5 means that a typical data point lies about 2.5 units above or below the mean.
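
The chain from deviates to sum of squares, variance and standard deviation can be followed step by step. This is a minimal sketch with hypothetical measurements; it assumes the sample (n - 1) form of the variance.

```python
import math

# Hypothetical sample of five measurements (e.g. wing lengths in mm).
data = [4.0, 6.0, 7.0, 9.0, 14.0]
n = len(data)
mean = sum(data) / n

# Deviate: distance of each data point from the mean (these sum to zero).
deviates = [x - mean for x in data]

# Sum of squares: squaring removes the minus signs before adding up.
sum_of_squares = sum(d ** 2 for d in deviates)

# Sample variance: sum of squares divided by n - 1.
variance = sum_of_squares / (n - 1)

# Standard deviation: square root of the variance, back in the data's units.
std_dev = math.sqrt(variance)

print(sum_of_squares, variance, round(std_dev, 3))
```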

26
Q

Population

A

All individuals in a group

27
Q

Sample

A

Sub-set of population normally chosen to represent the population.

28
Q

Normal distribution

A

‘Bell curve’ or Gaussian distribution. Continuous, symmetrical and widely useful. About 68% of all data points in a normal population lie within one standard deviation of the mean.

29
Q

Standard Error of the mean

A

Measure of confidence in the sample mean as an estimate of the real population mean. Small SEM = good estimate. SEM decreases as sample size increases, because more data means more confidence. Equals the standard deviation of a population of sample means, estimated as s/√n. Used to build a 95% confidence interval (roughly the mean ± 2 SEM). If error bars for two samples do not overlap, this suggests the sample means differ, but it is not a formal test.
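
The SEM can be computed directly from the standard deviation and the sample size. This is a minimal sketch: `standard_error` is a hypothetical helper name, the data are made up, and the 95% interval uses the large-sample multiplier 1.96 as an assumption (a small sample would call for a wider, t-based multiplier).

```python
import math
import statistics

def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))

# Hypothetical sample of eight measurements.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
mean = statistics.mean(sample)
sem = standard_error(sample)

# Approximate 95% confidence interval (assumes the normal multiplier 1.96).
ci_low = mean - 1.96 * sem
ci_high = mean + 1.96 * sem

print(f"mean={mean}, SEM={sem:.3f}, 95% CI=({ci_low:.2f}, {ci_high:.2f})")
```

Because n sits under a square root in the denominator, quadrupling the sample size halves the SEM for the same standard deviation.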

30
Q

Skew

A

Skewed to the right = long tail of the distribution on the right and no symmetry (mode closer to the left than the right) = not normal.

31
Q

Parametric statistics/statistical tests

A

Make several key assumptions about the distribution of the data, e.g. that it is normal.

32
Q

Non parametric statistics/test

A

Make fewer assumptions about the data.

33
Q

Poisson distribution

A

Common for discrete (count) data when the maximum possible count is much larger than the mean. Takes many different shapes: if the mean is near zero it is heavily skewed to the right; if the mean is large it approximates a normal distribution.

34
Q

Binomial distribution

A

Good for discrete data where the maximum possible count is close to the mean.

35
Q

Bar chart

A

type of graph used for visualising differences between samples.

36
Q

Scatter graph

A

Graph usually used for visualising trends between variables.

37
Q

Statistical significance

A

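Statistical tests give the probability (p) of obtaining data at least as extreme as ours if the null hypothesis is true. If p is smaller than a threshold decided before the test (conventionally 0.05), the result is statistically significant and we reject the null. Statistical significance does not by itself show that an effect is biologically meaningful.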

38
Q

Independent samples T-test

A

(t) Tests for a difference between the means of two independent samples of continuous data. Asks: are the samples from the same population with a single mean? The null is no difference between means; if true, t is expected to be close to 0. If t is far from zero in either direction, the difference is unlikely to be due to random chance, so we reject the null and conclude there is a difference between means. PARAMETRIC
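
The t statistic itself is a difference between means divided by its standard error. This is a minimal sketch of the pooled-variance form, which assumes equal variances in the two samples; `independent_t` and the data are hypothetical. The p-value would then be read from a t distribution with the returned degrees of freedom, which is not computed here.

```python
import math
import statistics

def independent_t(sample_a, sample_b):
    """Return (t, df) for an independent samples t-test with pooled variance."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)

    df = n_a + n_b - 2
    # Pooled variance: average of the two sample variances weighted by df.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    # Standard error of the difference between the two means.
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / se_diff, df

# Hypothetical control and treatment samples.
control = [1.0, 2.0, 3.0, 4.0, 5.0]
treated = [3.0, 4.0, 5.0, 6.0, 7.0]
t_value, df = independent_t(control, treated)
print(f"t={t_value:.2f}, df={df}")
```

Two identical samples give t = 0, matching the null expectation described on the card.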

39
Q

Degrees of freedom

A

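(df) Modified sample size: the number of values free to vary once the statistics needed for the test have been estimated (e.g. n1 + n2 - 2 for an independent samples t-test).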

40
Q

One tailed test

A

Test on a specific null hypothesis, e.g. ‘there is not a positive relationship between A and B’. Interested in only positive or only negative deviations of the statistic. The p-value associated with a particular t value is half the two-tailed value.

41
Q

Two tailed test

A

Test on the general null hypothesis ‘there is no relationship between A and B’. Interested in both positive and negative deviations of the statistic from its expected distribution.

42
Q

Type I error

A

Rejection of the null hypothesis when it is in fact true. The probability of making one is easy to state: it equals the significance threshold, so with a threshold of 0.05 we will reject a true null hypothesis 5% of the time.

43
Q

Type II error

A

Failure to reject null hypothesis when it is in fact false. Harder to estimate, influenced by design of experiment, sample size and statistical test we choose.

44
Q

Independence

A

Data points are independent if they have nothing special in common except for the treatment or variable of interest.

45
Q

Confound

A

When we cannot tell whether an observed difference between two groups was a result of the treatment or was caused by other (confounding) variables.

46
Q

Repeated measures

A

repeated observations made on the same subjects. Not independent.

47
Q

Pseudoreplication

A

Use of non-independent data points as if they were independent, e.g. treating replicates taken from the same animal as independent when testing for a difference between treatments.

48
Q

Paired design

A

‘Before and after’ studies: collection of two samples that are not independent of each other. Measures the average change in a variable caused by the treatment, so we look at the effect we are examining rather than the variation between different individuals.

49
Q

Homogeneity of variance

A

variance in each sample in the test is the same.

50
Q

Transformation

A

Parametric tests often assume the data are normal. If they are not, a simple transformation (e.g. taking the square root or log of the data) can often solve the problem.

51
Q

Paired t-test

A

t-test analysing two samples of data that form pairs (e.g. before and after). Samples are not independent.
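
A paired t-test works on the within-pair differences rather than on the two samples separately. This is a minimal sketch with hypothetical before/after scores; `paired_t` is an illustrative helper, and the p-value (from a t distribution with n - 1 degrees of freedom) is not computed.

```python
import math
import statistics

def paired_t(before, after):
    """Return (t, df) for a paired t-test on matched observations."""
    # One difference per individual: this removes variation between individuals.
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    se_diff = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / se_diff, n - 1

# Hypothetical measurements on the same four individuals.
before = [10.0, 12.0, 9.0, 11.0]
after = [12.0, 15.0, 10.0, 14.0]
t_value, df = paired_t(before, after)
print(f"t={t_value:.2f}, df={df}")
```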

52
Q

Levene’s test

A

Test for homogeneity of variance. Null hypothesis is variances of samples are the same.

53
Q

Shapiro-Wilk test

A

Test for normality. The null is that the data are normally distributed. A significant p-value means we reject the null, i.e. the data are not normal.

54
Q

Two-sample Wilcoxon test

A

Also known as the Mann-Whitney U test. Non-parametric equivalent of the independent samples t-test. Examines the difference between two samples of ranked data. A significant p-value means we reject the null (that the two samples come from a single population with a single mean rank). Samples are independent.

55
Q

Welch two-sample t-test

A

Used when the variances of the samples are significantly different (Levene’s test), so the assumptions of the independent samples t-test are not met. Tests for a difference between the means of two independent samples of continuous data. Data must still be normal; the test makes a small tweak to the degrees of freedom.
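
The tweak to the degrees of freedom can be made concrete. This is a minimal sketch using the Welch-Satterthwaite approximation for df; `welch_t` and the data are hypothetical, and no p-value is computed.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's t-test, which does not pool the variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Variance of each sample mean (sample variance divided by n).
    va = statistics.variance(sample_a) / n_a
    vb = statistics.variance(sample_b) / n_b

    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation: df is usually not a whole number.
    df = (va + vb) ** 2 / (va ** 2 / (n_a - 1) + vb ** 2 / (n_b - 1))
    return t, df

# Hypothetical samples with clearly unequal variances.
low_spread = [1.0, 2.0, 3.0, 4.0, 5.0]
high_spread = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
t_value, df = welch_t(low_spread, high_spread)
print(f"t={t_value:.3f}, df={df:.2f}")
```

With unequal variances the adjusted df falls below the n1 + n2 - 2 used by the standard test, which makes the test more conservative.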

56
Q

Paired-samples Wilcoxon test

A

Non-parametric equivalent of the paired t-test. Examines the difference between two samples of ranked data (like the two-sample Wilcoxon test), but it does not assume the samples are independent; it assumes they are paired.

57
Q

Chi-squared test

A

(X^2) Examines differences between observed and expected counts or frequencies. Asks whether the frequencies of individual observations made in two or more categories are significantly different from the frequencies we would expect if the null hypothesis is true. Quantifies the deviation of observed frequencies from expected frequencies. The p-value is the probability of finding a value of chi-squared at least as large as our observed value if the null is true.
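
The statistic sums, over categories, the squared deviation of observed from expected counts scaled by the expected count. This is a minimal sketch with hypothetical counts and a null of equal frequencies; `chi_squared` is an illustrative helper.

```python
def chi_squared(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts of males and females in a sample of 50 animals.
observed = [30, 20]
# Null hypothesis: equal numbers in each category.
expected = [25, 25]

statistic = chi_squared(observed, expected)
df = len(observed) - 1  # number of categories minus one
print(f"chi-squared={statistic}, df={df}")
```

For df = 1 the critical value at the 0.05 threshold is 3.841, so a statistic of this size would not lead us to reject the null.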

58
Q

Two way Chi-squared test

A

Chi-squared test used when observations are classified by 2 sets of categories simultaneously. Quantifies the deviation of the observed frequencies from the expected frequencies.

59
Q

Contingency table

A

Table of observed counts or frequencies in a number of categories (e.g. male/female, juvenile/adult).
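
For a two-way test, the expected count in each cell comes from the table's own margins: row total × column total / grand total. This is a minimal sketch on a hypothetical 2 × 2 table; the function names are illustrative.

```python
def expected_counts(table):
    """Expected count per cell if the two sets of categories are unrelated."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def two_way_chi_squared(table):
    """Return (chi-squared, df) for a contingency table of observed counts."""
    expected = expected_counts(table)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Hypothetical counts: rows = male/female, columns = juvenile/adult.
observed = [[10, 20],
            [30, 40]]
statistic, df = two_way_chi_squared(observed)
print(expected_counts(observed), round(statistic, 3), df)
```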

60
Q

Trend

A

Relationship between two variables. Positive (+ve) = both increase together; negative (-ve) = one decreases as the other increases.

61
Q

Causal relationship

A

trend or relationship between two variables where changes in one causes change in the other.

62
Q

Correlation

A

Changes in one variable coincide with changes in the other, but causality is not understood or not important. Can be the result of a causal relationship, but will also be generated when the two variables share a common cause.

63
Q

Covary

A

Variables that correlate

64
Q

Pearson’s correlation coefficient

A

(r) Parametric statistic used to test the significance of correlations between two variables. Both variables must be normally distributed and the relationship must be linear.

65
Q

Spearman’s rank correlation coefficient

A

(rho) non parametric statistic used to test significance of correlations between variables. Can be used when linearity and normality are violated.
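
The difference between the two coefficients shows up on data that rise steadily but not in a straight line. This is a minimal sketch with hypothetical data; the simple `ranks` helper assumes there are no tied values, and Spearman's rho is computed as Pearson's r applied to the ranks.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank from 1 (smallest) upwards. Assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r calculated on the ranks of the data."""
    return pearson_r(ranks(xs), ranks(ys))

# Hypothetical data: y always increases with x, but along a curve.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(round(pearson_r(x, y), 3), spearman_rho(x, y))
```

Spearman's rho is exactly 1 here because the ranks agree perfectly, while Pearson's r is below 1 because the relationship is not linear.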

66
Q

Data Dredging

A

Use of statistics to test large numbers of possible relationships between variables in the absence of specific hypotheses formulated in advance. Useful for spotting patterns and generating new hypotheses, but not a substitute for testing null hypotheses stated before the data are collected.

67
Q

ANOVA

A

Analysis of variance. Tests for differences between groups or samples caused by changes in one or more variables (factors); each different value that a factor can take is a level. A one-way ANOVA tests one null.

68
Q

Multi-way ANOVA

A

tests more than one null hypothesis simultaneously

69
Q

F-ratio

A

Statistic used to test the null in ANOVA. Compares the relative amounts of variation among (between) groups and within groups, using mean squares derived from the sums of squares (F = MSamong/MSwithin). A large value means large variation among groups compared to within groups, and therefore the differences between our samples are likely to be significant. We can calculate a p-value for a particular value of F, which tells us the probability of getting at least as much among-group variation as we have observed if our null hypothesis is actually true.

70
Q

Grand mean

A

Mean of all data points in groups in ANOVA.

71
Q

Group mean

A

Mean of the data points in an individual group/sample in ANOVA. The larger the variation between group means, the larger F becomes and the more likely we are to reject the null.

72
Q

Among group sum of squares

A

(SSamong) Total amount of variation among groups. For every data point, add up the squared difference between its group mean and the grand mean.

73
Q

Within-group sum of squares

A

(SSwithin) total amount of variation within groups, add up squared differences between each data point and relevant group mean

74
Q

Among group mean square

A

(MSamong) Average size of the squared differences between group means and the grand mean: SSamong divided by the among-group degrees of freedom.

75
Q

Within group mean square

A

(MSwithin) Average size of the squared differences between data points and the relevant group mean: SSwithin divided by the within-group degrees of freedom.

76
Q

ANOVA table

A

results of ANOVA normally presented in table showing among and within groups sums of squares, mean squares, df and F and p.
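
The entries of a one-way ANOVA table can be built up from the grand mean and group means. This is a minimal sketch with three hypothetical groups; `one_way_anova` is an illustrative helper and the p-value (from an F distribution with the two df values) is not computed.

```python
def one_way_anova(groups):
    """Return the sums of squares, df, mean squares and F for a one-way ANOVA."""
    all_points = [x for group in groups for x in group]
    grand_mean = sum(all_points) / len(all_points)
    group_means = [sum(group) / len(group) for group in groups]

    # Among groups: squared distance of each group mean from the grand mean,
    # counted once per data point in that group.
    ss_among = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within groups: squared distance of each data point from its own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    df_among = len(groups) - 1
    df_within = len(all_points) - len(groups)
    ms_among = ss_among / df_among
    ms_within = ss_within / df_within
    return {
        "ss_among": ss_among, "ss_within": ss_within,
        "df_among": df_among, "df_within": df_within,
        "ms_among": ms_among, "ms_within": ms_within,
        "F": ms_among / ms_within,
    }

# Hypothetical measurements from three treatment groups.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
table = one_way_anova(groups)
for name, value in table.items():
    print(f"{name}: {value:.2f}")
```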

77
Q

Post-hoc tests

A

Operate like simple t-tests, telling us whether individual pairs of samples/levels are different. Because many comparisons are made, the chance of a type I error increases. E.g. if ANOVA indicates a difference among 3 data sets, post-hoc tests identify which pairs differ from each other.

78
Q

Residual

A

Difference between the prediction of your statistical model and an individual data point. In the context of regression, the residual is the distance along the y axis between an individual data point and the line of best fit: variation in y which is not explained by variation in x.

79
Q

Kruskal-Wallis test

A

Non-parametric equivalent of one-way ANOVA. Tests the null hypothesis that there is no difference between the mean ranks of two or more groups/samples. Does not assume normality. One factor at a time.

80
Q

Interaction

A

An interaction between two factors in ANOVA occurs when the effect of one factor on the response variable is influenced by another factor.

81
Q

Nested ANOVA

A

analyse datasets which include some replicates which are not independent

82
Q

Repeated-measures ANOVA

A

special form of nested ANOVA where non independent replicate data points are recorded at different times from the same individual.

83
Q

Linear regression

A

Parametric test analysing relationships or trends where the pattern of cause and effect is known to exist or is of interest. Tests for the effect of changes in one variable (independent = x) on changes in a second variable (dependent = y). Tests the null that there is no relationship between changes in x and y. F is a measure of the amount of variation in y which is explained by variation in x; a large F gives a small p. Assumes both variables are continuous, the relationship is linear and the residuals are normally distributed.

84
Q

Line of best fit

A

Line which represents the most plausible alternative hypothesis (the most plausible relationship between x and y)

85
Q

r^2

A

more intuitive measure of the strength of the relationship we are studying/effect size. Proportion of the total amount of variation in y which is explained by variation in x. SSregression/SStotal.

86
Q

Regression equation

A

Equation of the line of best fit, y = mx + c. Predicts values of y for a particular value of x, but only reliably within the range of x values available.

87
Q

ANCOVA

A

Analysis of covariance. Combines ANOVA and linear regression. Tests the effects of a mixture of continuous and discrete independent variables on a continuous response variable.

88
Q

Covariate

A

Describes a continuous independent variable in situations where there is a mixture of continuous and discrete independent variables.

89
Q

SStotal, SSresidual, SSregression

A

Sums of squares used in linear regression allow us to quantify the amount of variation in y which is explained by variation in x. SStotal-SSresidual=SSregression (calculate the amount of variability in y that is actually explained by changes in x)
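
The three sums of squares, the regression equation and r^2 can all be derived from one small data set. This is a minimal sketch of a least-squares fit on hypothetical data; `linear_regression` is an illustrative helper, not a library function.

```python
def linear_regression(xs, ys):
    """Least-squares line y = m*x + c with its sums of squares and r^2."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)

    slope = sxy / sxx                    # m in y = mx + c
    intercept = mean_y - slope * mean_x  # c in y = mx + c
    predicted = [slope * x + intercept for x in xs]

    ss_total = sum((y - mean_y) ** 2 for y in ys)                 # all variation in y
    ss_residual = sum((y - p) ** 2 for y, p in zip(ys, predicted))  # unexplained
    ss_regression = ss_total - ss_residual                        # explained by x
    return {
        "slope": slope, "intercept": intercept,
        "ss_total": ss_total, "ss_residual": ss_residual,
        "ss_regression": ss_regression,
        "r_squared": ss_regression / ss_total,
    }

# Hypothetical paired observations.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
fit = linear_regression(x, y)
for name, value in fit.items():
    print(f"{name}: {value:.2f}")
```

In this example 60% of the variation in y is explained by variation in x, and the remaining 40% is left in the residuals.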