ED+A Flashcards
Hypothetico-deductive reasoning
Hypothesis is created (not derived by induction) and experiments are used to try to falsify it. A hypothesis can be ‘good’, however arguably there is no way of proving it to be true.
Hypothesis
Proposition tentatively put forward to explain an observation.
Alternative Hypothesis
(H1) Hypothesis that makes a specific prediction about results which can later be tested.
Theory
Set of general ideas or rules to explain a group of observations. More general than hypothesis, less speculative.
Paradigm
Describes a whole way of thinking or a particular way of viewing the world.
Paradigm shift
Dramatic change in the way we think about a subject, when evidence has accumulated in favour of rejecting a previous set of hypotheses or theories, or following a moment of creative insight.
Null Hypothesis
(H0) Form of hypothesis we test using statistics following an observation. Predicts nothing will happen: no effect, no difference, no relationship. We hope to reject it if the data support the alternative hypothesis. There is only one null per test.
Statistics
Branch of mathematics scientists use for an objective assessment of patterns in data from experiments or observations.
Nominal data
CATEGORICAL. Data in the form of categories with names (e.g. male or female). Non-quantitative.
Discrete data
QUANTITATIVE. Counts of how many individuals fall into each category of nominal data. Quantitative and always in the form of whole numbers.
Ordinal data
CATEGORICAL. Ranked in order of size or on a rating scale (e.g. 1st, 2nd; strongly agree to strongly disagree). Not quantitative, as we do not know the size of the difference between 1st and 2nd, only that 1st is larger.
Continuous data
QUANTITATIVE. Measured on a continuous scale (e.g. temperature, time). The decision between treating data as continuous or discrete can be subjective.
Descriptive statistics
Measures calculated from a data-set which summarise some characteristic of the data (central tendency or variability).
Sample size
(n) number of individuals sampled.
Frequency
Number of times something occurs, or a count of the number of items in a category.
Mean
A measure of central tendency. The average (sum divided by n) of a sample of numbers.
Median
A measure of central tendency. The middle number in a sample when the values are placed in order. If the sample size is even, the average of the two middle numbers is taken.
Mode
A measure of central tendency. The most common number.
Measures of central tendency
Mean, Median, Mode - all tell about the position of the middle of the sample.
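A minimal sketch of these three measures in Python (the data values are made up for illustration):

```python
import statistics

data = [2, 3, 3, 5, 7, 8, 9]  # hypothetical sample, n = 7

print(statistics.mean(data))    # 5.2857... (sum of values / n)
print(statistics.median(data))  # 5 (middle value when sorted)
print(statistics.mode(data))    # 3 (most common value)
```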
Frequency histogram
Graph showing the frequency of quantitative observations in each category.
Discrete: categories represent each possible count. Continuous: category boundaries are arbitrary (e.g. 1-10, 11-20); you decide.
Distribution
Shape of data set as seen on frequency histogram. Described by mathematical equations.
Deviate
Distance between a data point/observation and the mean. Also known as a residual.
Sum of Squares
(SS) The total of all the squared deviates for a particular data-set. Squaring gets rid of minus signs: SS quantifies the magnitude of the total variability but ignores its direction.
Variance
(s^2) The average of the squared deviates (SS divided by n-1 for a sample). A measure of variability; the sample variance is an estimate of the population variance.
Standard deviation
(s) By square-rooting the variance we get a measure of variability in the same units as the data and unaffected by sample size. A standard deviation of 2.5 means a typical data point lies about 2.5 units above or below the mean.
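In symbols, the sum of squares, variance and standard deviation cards fit together as follows (with sample mean x̄ and sample size n):

```latex
SS = \sum_{i=1}^{n}(x_i - \bar{x})^2, \qquad s^2 = \frac{SS}{n-1}, \qquad s = \sqrt{s^2}
```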
Population
All individuals in a group
Sample
Sub-set of the population, normally chosen to represent the population.
Normal distribution
‘Bell curve’ or Gaussian distribution. Continuous, symmetrical and very useful: about 68% of all data points in a normal population lie within one standard deviation of the mean.
Standard Error of the mean
Measure of confidence in the sample mean as an estimate of the real population mean (it is the standard deviation of a population of sample means). Small SEM = good estimate; SEM decreases with sample size, as more data means more confidence. If SEM error bars do not overlap, the sample means are likely to be different. Often used to construct a 95% confidence interval.
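As a formula (s = sample standard deviation, n = sample size); the 1.96 multiplier for the 95% confidence interval applies to large, normally distributed samples:

```latex
SEM = \frac{s}{\sqrt{n}}, \qquad 95\%\ \text{CI} \approx \bar{x} \pm 1.96 \times SEM
```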
Skew
Asymmetry of a distribution. Skewed to the right = a long tail on the right of the distribution (mode closer to the left than the right) = not normal.
Parametric statistics/statistical tests
Make several key assumptions about the distribution of the data, e.g. that it is normal.
Non parametric statistics/test
Make fewer assumptions about the data.
Poisson distribution
Common for discrete data where the maximum possible count is much larger than the mean. Takes many different shapes: if the mean is near zero the distribution is heavily skewed; if the mean is large it approaches a normal distribution.
Binomial distribution
Good for discrete data where the maximum possible count is close to the mean.
Bar chart
Type of graph used for visualising differences between samples.
Scatter graph
Graph usually used for visualising trends between variables.
Statistical significance
Statistics give the probability (p) that the data are consistent with the null hypothesis. If p is small (less than a threshold, conventionally 0.05, decided before the test) the result is statistically significant and the effect is more likely to be biologically meaningful.
Independent samples T-test
(t) Tests for a difference between the means of two independent samples of continuous data: are the samples from a single population with a single mean? The null is that there is no difference between the means; if it is true then t = 0. If t is far from zero in either direction, the difference is unlikely to be due to random chance, so we reject the null and conclude there is a difference between the means. PARAMETRIC.
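A minimal sketch in Python with SciPy (the two samples are made up; assumes normality and equal variances):

```python
from scipy import stats

sample_a = [5.1, 4.9, 6.2, 5.8, 5.5]  # hypothetical data
sample_b = [6.8, 7.1, 6.5, 7.4, 6.9]

t, p = stats.ttest_ind(sample_a, sample_b)  # two-tailed by default
print(t, p)  # t far from 0 and p < 0.05 -> reject the null
```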
Degrees of freedom
(df) A modified sample size used when calculating test statistics (e.g. n1 + n2 - 2 for an independent-samples t-test).
One tailed test
Test of a specific null hypothesis, e.g. ‘there is no positive (or no negative) relationship between A and B’. Interested in only positive or only negative deviations of the statistic, so the p-value associated with a particular t value is halved.
Two tailed test
Test of a general null hypothesis, ‘there is no relationship between A and B’. Interested in both positive and negative deviations of the statistic from its expected distribution.
Type I error
Rejection of the null hypothesis when it is in fact true. The probability of making one is easy to calculate: if we reject whenever p is less than 0.05, we accept a 5% chance of rejecting a true null.
Type II error
Failure to reject the null hypothesis when it is in fact false. Harder to estimate; influenced by the design of the experiment, the sample size and the statistical test we choose.
Independence
Data points are independent if they have nothing special in common except for the treatment or variable of interest.
Confound
A confounding variable differs between groups alongside the treatment, so we cannot tell whether an observed difference between two groups was a result of the treatment or was caused by the confounding variable.
Repeated measures
Repeated observations made on the same subjects. Not independent.
Pseudoreplication
Use of non-independent data points as if they were actually independent, e.g. treating replicates from the same animal as independent when testing for a difference between treatments.
Paired design
‘Before and after’ studies: collection of two samples that are not independent of each other. By analysing the average change in a variable caused by the treatment, we examine the effect of interest rather than variation between different individuals.
Homogeneity of variance
The variance in each sample in the test is the same.
Transformation
Parametric tests often assume data are normal. If they are not, simple transformations (e.g. taking the square root or log of the data) can often solve the problem.
Paired t-test
T-test analysing two samples of paired data (e.g. before and after). The samples are not independent.
Levene’s test
Test for homogeneity of variance. The null hypothesis is that the variances of the samples are the same.
Shapiro-Wilk test
Test for normality; the null is that the data are normally distributed. A significant p-value means we reject the null, i.e. the data are not normal.
Two-sample Wilcoxon test
Also known as the Mann-Whitney U test; the non-parametric equivalent of the independent-samples t-test. Examines the difference between two samples of ranked data. A significant p-value means we reject the null (that the two samples come from a single population with a single mean rank). Assumes the samples are independent.
Welch two-sample t-test
If the variances of the samples are significantly different (Levene's test), the assumptions of the independent-samples t-test are not met. Data must still be normal; a small tweak to the degrees of freedom lets the test examine the difference between the means of two independent samples of continuous data.
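A sketch combining the last few cards on made-up data: check normality with Shapiro-Wilk and variances with Levene's test, then choose between the standard t-test, the Welch t-test and the Mann-Whitney U test:

```python
from scipy import stats

a = [5.1, 4.9, 6.2, 5.8, 5.5, 5.0, 5.9]  # hypothetical samples
b = [6.8, 7.1, 6.5, 7.4, 6.9, 8.2, 6.0]

normal = all(stats.shapiro(s).pvalue > 0.05 for s in (a, b))
equal_var = stats.levene(a, b).pvalue > 0.05

if normal:
    # equal_var=False gives the Welch two-sample t-test
    result = stats.ttest_ind(a, b, equal_var=equal_var)
else:
    # non-parametric fallback: Mann-Whitney U (two-sample Wilcoxon)
    result = stats.mannwhitneyu(a, b)
print(result)
```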
Paired-samples wilcoxon test
Non-parametric equivalent of the paired t-test. Examines the difference between two samples of ranked data (like the two-sample Wilcoxon test) but does not assume the samples are independent; it assumes they are paired.
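A sketch of the two paired analyses on made-up before/after data:

```python
from scipy import stats

before = [12.1, 11.4, 13.0, 12.7, 11.9, 12.4]  # hypothetical data
after  = [12.9, 12.0, 13.8, 13.1, 12.5, 13.2]

print(stats.ttest_rel(before, after))  # paired t-test (parametric)
print(stats.wilcoxon(before, after))   # paired-samples Wilcoxon (non-parametric)
```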
Chi-squared test
(X^2) Examines differences between observed and expected counts or frequencies. Asks whether the frequencies of observations in two or more categories differ significantly from the frequencies we would expect if the null hypothesis were true. The statistic quantifies the deviation of observed frequencies from expected frequencies; the p-value is the probability of finding a value of chi-squared at least as large as the observed value if the null is true.
Two way Chi-squared test
Test used when there are two sets of categories simultaneously. Quantifies the deviation of observed frequencies from the expected frequencies.
Contingency table
Table of observed counts or frequencies classified by two sets of categories (e.g. male/female by juvenile/adult).
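A sketch of both chi-squared tests with made-up counts; the statistic itself is X^2 = sum((observed - expected)^2 / expected):

```python
from scipy import stats

# One-way: are 45 males and 55 females consistent with an expected 1:1 ratio?
print(stats.chisquare([45, 55], f_exp=[50, 50]))

# Two-way: contingency table of counts (sex x age class)
table = [[30, 10],   # male:   juvenile, adult
         [20, 40]]   # female: juvenile, adult
chi2, p, df, expected = stats.chi2_contingency(table)
print(chi2, p, df)
```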
Trend
Relationship between two variables. Positive = both increase together.
Causal relationship
Trend or relationship between two variables where changes in one cause changes in the other.
Correlation
Changes in one variable coincide with changes in the other, but causality is not understood or not important. Can be the result of a causal relationship, but will also be generated when the two variables share a common cause.
Covary
Variables that correlate are said to covary.
Pearson's correlation coefficient
(r) Parametric statistic used to test the significance of correlations between two variables. Both variables must be normally distributed and the relationship must be linear.
Spearman’s rank correlation coefficient
(rho) Non-parametric statistic used to test the significance of correlations between variables. Can be used when the assumptions of linearity and normality are violated.
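A sketch of both correlation coefficients on made-up data:

```python
from scipy import stats

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # hypothetical variables
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.3]

print(stats.pearsonr(x, y))   # parametric: assumes normality and linearity
print(stats.spearmanr(x, y))  # non-parametric: computed on ranks
```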
Data Dredging
Use of statistics to test large numbers of possible relationships between variables in the absence of specific hypotheses formulated in advance. Useful for spotting patterns and generating new hypotheses, not for assessing nulls stated before the data were collected.
ANOVA
Analysis of variance. Tests for differences between groups or samples caused by changes in one or more variables (factors); each different value that a factor can take is a level. A one-way ANOVA tests one null hypothesis.
Multi-way ANOVA
Tests more than one null hypothesis simultaneously.
F-ratio
Statistic used to test the null in ANOVA. Compares the relative amounts of among-group and within-group variation (from the sums of squares). A large value means large variation among groups compared to within groups, so the difference between samples is likely to be significant. We can calculate a p-value for a particular value of F: the probability of observing as much among-group variation as we have if the null hypothesis is actually true.
Grand mean
Mean of all data points in groups in ANOVA.
Group mean
Mean of the data points in an individual group/sample in ANOVA. Larger variation between group means gives a larger F, making us more likely to reject the null.
Among group sum of squares
(SSamong) Total amount of variation among groups: add up the squared differences between each data point's group mean and the grand mean.
Within-group sum of squares
(SSwithin) Total amount of variation within groups: add up the squared differences between each data point and its group mean.
Among group mean square
(MSamong) Average size of the differences between the group means and the grand mean.
Within group mean square
(MSwithin) Average size of the differences between the data points and their group means.
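Putting the last few cards together (k = number of groups, N = total number of data points):

```latex
MS_{among} = \frac{SS_{among}}{k-1}, \qquad MS_{within} = \frac{SS_{within}}{N-k}, \qquad F = \frac{MS_{among}}{MS_{within}}
```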
ANOVA table
Results of an ANOVA are normally presented in a table showing the among- and within-group sums of squares, mean squares, degrees of freedom, F and p.
Post-hoc tests
Operate like simple t-tests, telling us whether individual pairs of samples/levels differ. The chance of a Type I error increases with each comparison. E.g. if ANOVA shows a difference somewhere among three groups, post-hoc tests identify which pairs differ from each other.
Residual
Difference between the prediction of your statistical model and an individual data point. In the context of regression, the residual is the distance along the y axis between an individual data point and the line of best fit: variation in y which is not explained by variation in x.
Kruskal-Wallis test
Non-parametric equivalent of one-way ANOVA. Tests the null hypothesis that there is no difference between the mean ranks of two or more groups/samples. Makes few assumptions. Tests one factor at a time.
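A sketch of one-way ANOVA and its non-parametric equivalent on three made-up groups:

```python
from scipy import stats

g1 = [4.1, 4.5, 3.9, 4.3]  # hypothetical groups
g2 = [5.2, 5.6, 5.0, 5.4]
g3 = [6.1, 5.9, 6.4, 6.2]

print(stats.f_oneway(g1, g2, g3))  # one-way ANOVA (parametric)
print(stats.kruskal(g1, g2, g3))   # Kruskal-Wallis (non-parametric)
```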
Interaction
An interaction between two factors in ANOVA occurs when the effect of one factor on the response variable is influenced by the other factor.
Nested ANOVA
Analyses datasets which include some replicates that are not independent.
Repeated-measures ANOVA
Special form of nested ANOVA where non-independent replicate data points are recorded at different times from the same individual.
Linear regression
Parametric test analysing relationships or trends where a pattern of cause and effect is known to exist or is of interest. Tests for an effect of changes in one variable (independent, x) on changes in a second variable (dependent, y). Tests the null that there is no relationship between changes in x and y. F is a measure of the amount of variation in y which is explained by variation in x; large F = small p. Assumes both variables are continuous, the relationship is linear and the residuals are normally distributed.
Line of best fit
Line which represents the most plausible alternative hypothesis (the most plausible relationship between x and y)
r^2
More intuitive measure of the strength of the relationship we are studying (effect size): the proportion of the total variation in y which is explained by variation in x. r^2 = SSregression/SStotal.
Regression equation
The line of best fit, y = mx + c. Predicts values of y for a particular value of x, but only within the range of x values sampled.
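A sketch of simple linear regression in Python on made-up data:

```python
from scipy import stats

x = [1, 2, 3, 4, 5, 6]               # hypothetical independent variable
y = [2.2, 4.1, 5.8, 8.3, 9.9, 12.1]  # hypothetical dependent variable

fit = stats.linregress(x, y)
print(fit.slope, fit.intercept)  # m and c in y = mx + c
print(fit.rvalue ** 2)           # r^2: proportion of variation in y explained by x
print(fit.pvalue)                # test of the null: no relationship between x and y
```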
ANCOVA
Analysis of covariance; combines ANOVA and linear regression. Tests the effects of a mixture of continuous and discrete independent variables on a continuous response variable.
Covariate
Describes a continuous independent variable in situations where there is a mixture of continuous and discrete independent variables.
SStotal, SSresidual, SSregression
Sums of squares used in linear regression which allow us to quantify the amount of variation in y explained by variation in x: SSregression = SStotal - SSresidual (the amount of variability in y that is actually explained by changes in x).
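The same partition written out:

```latex
SS_{total} = SS_{regression} + SS_{residual}, \qquad r^2 = \frac{SS_{regression}}{SS_{total}}
```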