Differences, Frequencies and Relationships Flashcards
Why can't t-tests be used to test for differences between more than 2 groups?
Very time consuming as 4 groups would require 6 t-tests
Multiple testing increases our chance of making type 1 errors
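The arithmetic behind these two answers can be sketched in a few lines. This is a rough illustration only: the function name `familywise_error_rate` is made up for this example, and the formula assumes the tests are independent, which pairwise t-tests on shared groups are not exactly, so treat the result as an approximation.

```python
from math import comb

def familywise_error_rate(n_groups, alpha=0.05):
    """Return (number of pairwise tests, approximate chance of at least one type 1 error)."""
    n_tests = comb(n_groups, 2)          # every pair of groups needs its own t-test
    fwer = 1 - (1 - alpha) ** n_tests    # assumes independent tests (an approximation)
    return n_tests, fwer

n_tests, fwer = familywise_error_rate(4)
print(n_tests, round(fwer, 3))  # prints: 6 0.265
```

With 4 groups and a 5% significance level per test, the chance of at least one false positive across the 6 tests is roughly 26% rather than 5%.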
What is ANOVA?
Analysis of Variance
When do we use ANOVA?
Used to determine differences between more than 2 groups
What does ANOVA look at?
Looks at the variability of the data rather than directly at the means
What are the assumptions of ANOVA?
Continuous data within each group
Equal variance in each group
Samples are independent
Need 3 or more groups to test
What happens if the assumptions of the ANOVA are not met?
P value may be wrong
What should you do if the assumptions of the ANOVA are not met?
Try transforming the data
Use a non-parametric equivalent test
If the samples are not independent, another method is needed (e.g. repeated measures ANOVA)
What does the one way ANOVA do?
Compares the means from 3 or more independent samples, giving an overall P value
Compares the test statistic to an F distribution
What is the null hypothesis for a one way ANOVA?
The samples for each group come from populations with the same mean values.
How does the one way ANOVA separate the variability?
Separates total variability in the data into:
Between group variance (treatment factor; differences between individuals from the different groups)
Within group variance (unexplained residual error; random variation between individuals within each group)
For a one way ANOVA, what would it mean if there are differences between the groups?
Then the between group variance will be larger than the within group variance.
How does the one way ANOVA work?
- For illustration, use just 2 groups: hypothetical weights for 2 groups of fish
- (a) Calculate the overall mean & the group means
- (b) The total variability is the sum of squares of the distances of each point from the overall mean; broken down into between-group variability & within-group variability
- (c) The between-group variability is the sum of squares of the distances from each point’s group mean to the overall mean
- (d) The within-group variability is the sum of squares of the distances from each point to its group mean
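Steps (a) to (d) can be followed in a short sketch. The fish weights are invented for illustration, and the function name `one_way_anova` is hypothetical. The F statistic at the end is each sum of squares divided by its degrees of freedom, then between divided by within.

```python
def one_way_anova(groups):
    """Split total variability into between-group and within-group sums of squares."""
    all_values = [x for g in groups for x in g]
    # (a) overall mean and group means
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]
    # (b) total: squared distance of each point from the overall mean
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    # (c) between: squared distance from each point's group mean to the overall mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # (d) within: squared distance of each point from its own group mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return {"ss_total": ss_total, "ss_between": ss_between,
            "ss_within": ss_within, "f": f_stat}

# Hypothetical fish weights for 2 groups
fish_a = [10.0, 12.0, 14.0]
fish_b = [16.0, 18.0, 20.0]
result = one_way_anova([fish_a, fish_b])
print(result)
# {'ss_total': 70.0, 'ss_between': 54.0, 'ss_within': 16.0, 'f': 13.5}
```

The between-group and within-group sums of squares add up to the total (54 + 16 = 70), and the F statistic is what gets compared to the F distribution.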
What are some other ANOVA tests?
Repeated measures ANOVA
Two way ANOVA
What is the purpose of the repeated measures ANOVA?
Tests whether the means of two or more groups of related measurements are different
What is the purpose of the two way ANOVA?
Tests the effect of two factors at once (e.g. the effect on crop yield of adding different amounts of nitrate and phosphate)
What is the Kruskal-Wallis test?
An extension of the Wilcoxon rank sum test to more than 2 groups; the non-parametric equivalent of the one way ANOVA
What is the Chi-square test used for?
It is used to discover if there is a relationship between categorical variables
What are the assumptions of the Chi-square test?
Variables should be categorical (ordinal or nominal)
Should have two or more groups
All expected frequencies must be greater than 5
What is the purpose of the Chi-square test?
Test whether characteristics are different from expected values
Measures difference between the observed and expected values.
What is the null hypothesis of the Chi-square test?
The observed ratio is equal to the expected ratio
How do you determine the significance probability of the chi-square test?
Compare the value of χ² with the critical value of the χ² statistic for (N-1) df (N is the number of character states, e.g. 2-1 = 1 df for 2 states), at the 5% significance level
How do you decide whether to reject the null hypothesis of the chi-square test?
If χ² ≥ critical value, reject the null hypothesis, concluding that the distribution is significantly different from expected
If χ² < critical value, the null hypothesis cannot be rejected, concluding that no significant difference from expected was found
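The decision rule can be worked through with a small sketch. The counts and the 3:1 expected ratio are invented for illustration, and `chi_square_from_ratio` is a hypothetical helper name. The statistic is the sum of (observed - expected)² / expected over the character states, and 3.841 is the standard tabulated critical value for 1 df at the 5% level.

```python
CRITICAL_5PCT_DF1 = 3.841  # tabulated chi-square critical value, 1 df, 5% level

def chi_square_from_ratio(observed, expected_ratio):
    """Chi-square statistic for observed counts against an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, expected

# Hypothetical counts for 2 character states, expected in a 3:1 ratio
observed = [290, 110]
stat, expected = chi_square_from_ratio(observed, [3, 1])
reject_null = stat >= CRITICAL_5PCT_DF1

print(expected)        # [300.0, 100.0]
print(round(stat, 3))  # 1.333
print(reject_null)     # False
```

Here the statistic is below the critical value, so the null hypothesis cannot be rejected for these made-up counts.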
How can the chi-square test be used for associations?
Tests whether the character frequencies of 2 or more groups are associated in some way
Looks to see if the characteristic is distributed randomly or not
- Determining the expected frequencies is the tricky part
- Uses the same methodology once the expected values have been determined
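One common way to get the expected frequencies for an association test is (row total × column total) / grand total for each cell. The sketch below assumes that approach; the 2 × 2 table is invented and `expected_frequencies` is a hypothetical helper name. For a table like this the degrees of freedom are (rows - 1) × (columns - 1).

```python
def expected_frequencies(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

# Hypothetical counts: rows are 2 groups, columns are 2 character states
observed = [[30, 10],
            [20, 40]]
expected = expected_frequencies(observed)

# Same statistic as before, summed over every cell of the table
chi_sq = sum((o - e) ** 2 / e
             for obs_row, exp_row in zip(observed, expected)
             for o, e in zip(obs_row, exp_row))
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(expected)          # [[20.0, 20.0], [30.0, 30.0]]
print(round(chi_sq, 3))  # 16.667
print(df)                # 1
```

The statistic is then compared with the critical value for the stated degrees of freedom, exactly as in the goodness-of-fit case.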
What are the two main tests for relationships?
Linear regression
Correlation
What is the first thing to do when we want to know if and how two sets of measurements (continuous variables) are associated?
Draw a scatter plot
What is the outcome variable?
The dependent or response variable
What is the predictor variable?
The explanatory or independent variable
How are the scatter plots for a relationship test composed?
Predictor variable on x axis and outcome on the y axis
What is the purpose of the linear regression test?
Quantifies the linear relationship between two sets of paired measurements
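As a rough illustration of what quantifying the relationship means, the sketch below fits a straight line, outcome = intercept + slope × predictor, by ordinary least squares. The paired measurements are invented and `fit_line` is a hypothetical helper name, not a function from any library.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for paired measurements."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical paired data: predictor on x, outcome on y
predictor = [1, 2, 3, 4, 5]
outcome = [2.1, 3.9, 6.2, 7.8, 10.0]
slope, intercept = fit_line(predictor, outcome)
print(round(slope, 2), round(intercept, 2))
```

The slope estimates how much the outcome changes for each unit increase in the predictor.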
What are outliers?
Extreme values found in a data set that can skew the data and affect the mean.
What is the purpose of the correlation test for relationships?
Tests whether there is a linear association between two sets of paired measurements
What are the assumptions of the correlation test?
At least one variable is Normally distributed
Linear relationship between variables
You can try to transform data to fulfil either of these requirements
What test do you use if the correlation assumptions cannot be met?
Rank correlation (the non-parametric alternative to Pearson's correlation)
What are the two types of rank correlation test?
Spearman's
Kendall's
When do we use Spearman's rank correlation test?
If a significance test is required
When do we use Kendall's rank correlation test?
If an estimate of the strength of the correlation is required
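To show how a rank correlation differs from the ordinary (Pearson) correlation, here is a minimal sketch of Spearman's coefficient computed as Pearson's correlation applied to the ranks. The data are invented, the helper names are hypothetical, and no significance test is included.

```python
def average_ranks(values):
    """Rank values from 1 upwards, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over any run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Pearson correlation coefficient for paired measurements."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data with a curved but always-increasing relationship
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(round(pearson_r(x, y), 3))  # 0.981
print(spearman_rho(x, y))         # 1.0
```

Because the ranks of x and y agree perfectly, the rank correlation is 1 even though the relationship is not a straight line.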
How do we tell the difference between testing for an association and quantifying a relationship?
Both address whether two variables are associated
Correlation tests for a linear relationship
Linear regression quantifies the relationship