Tests to know for the final Flashcards
Binomial test
non-parametric
comparing a proportion to a hypothesized value; uses data to test whether a population proportion, p, matches a null expectation for the proportion
H0 = the population proportion of an outcome equals a specific hypothesized value
Assumptions:
1. samples are mutually independent
2. random samples
Examples:
Is there evidence that people prefer one type of food over another?
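Worked sketch: the exact two-sided P-value can be computed by summing binomial probabilities. This is a minimal illustration with made-up counts; the function name binomial_two_sided_p is hypothetical, and it uses the "outcomes no more likely than the observed one" rule, which matches doubling one tail only when the null proportion is 0.5.

```python
from math import comb

def binomial_two_sided_p(successes, n, p0):
    """Exact two-sided P-value: add up the probabilities of every outcome
    that is no more likely under H0 than the observed outcome."""
    probs = [comb(n, k) * p0**k * (1 - p0)**(n - k) for k in range(n + 1)]
    observed = probs[successes]
    # Small tolerance so floating-point ties are counted as equally likely
    return min(1.0, sum(pr for pr in probs if pr <= observed * (1 + 1e-9)))

# Made-up data: 14 of 18 people choose food A; H0: p = 0.5
p_value = binomial_two_sided_p(14, 18, 0.5)
print(p_value)
```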
X^2 Goodness-of-fit test
non-parametric
comparing a proportion to a hypothesized value. Used if sample size is too large for the binomial test. Also used to compare freq. data to a probability distribution
H0 = the data come from the proposed probability distribution; the observed frequencies match the expected frequencies
Assumptions:
1. Random
2. Independent
3. No more than 20% of the categories have an expected frequency < 5
4. No category has an expected frequency < 1
Examples:
Does the month of birth determine if someone can make it into the NHL?
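Worked sketch: the X^2 statistic is the sum of (observed - expected)^2 / expected over categories, and the expected-frequency assumptions can be checked in the same pass. The counts below are made up and the helper names are hypothetical.

```python
def chi_square_gof(observed, expected_props):
    """Return the X^2 statistic, its df, and the expected counts."""
    n = sum(observed)
    expected = [n * p for p in expected_props]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # assumes no parameters were estimated from the data
    return stat, df, expected

def expected_counts_ok(expected):
    """Check: no expected count below 1, and at most 20% below 5."""
    small = sum(1 for e in expected if e < 5)
    return min(expected) >= 1 and small <= 0.2 * len(expected)

# Made-up counts of births in four quarters of the year; H0: equal proportions
observed = [30, 20, 25, 25]
stat, df, expected = chi_square_gof(observed, [0.25] * 4)
print(stat, df, expected_counts_ok(expected))
```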
X^2 contingency analysis
non-parametric
tests the independence of 2 or more categorical variables. Same assumptions apply as for the X^2 goodness-of-fit test
H0 = the categorical variables are independent (no relationship between them), so the observed frequencies match those expected under independence
Assumptions:
1. Random
2. Independent
3. No more than 20% of the cells have an expected frequency < 5
4. No cell has an expected frequency < 1
Examples:
Does being infected with Toxoplasma affect the chance of having a car accident?
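Worked sketch: under independence, the expected count of each cell is (row total x column total) / grand total. The table below is made up and the function name is hypothetical.

```python
def chi_square_contingency(table):
    """X^2 statistic and df for a table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n  # expected under independence
            stat += (obs - exp) ** 2 / exp
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Made-up counts: rows = infected / uninfected, columns = accident / no accident
table = [[10, 20], [20, 10]]
stat, df = chi_square_contingency(table)
print(stat, df)
```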
One-sample t-test
parametric
compares the mean of a random sample from a normal pop. with the pop mean proposed in a null hypothesis
H0 = the population mean equals the specified mean value
Assumptions:
1. The variable is normally distributed
2. The sample is a random sample
3. Data are independent
4. No significant outliers
Examples:
Is the average healthy human body temperature 98.6 F?
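Worked sketch: t is the distance between the sample mean and the null mean, measured in standard errors, with df = n - 1. The temperatures below are made up and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(values, mu0):
    """Return (t, df) for H0: population mean = mu0."""
    n = len(values)
    se = stdev(values) / sqrt(n)  # standard error of the mean
    return (mean(values) - mu0) / se, n - 1

# Made-up body temperatures in degrees F
temps = [98.4, 98.6, 97.8, 98.8, 97.9, 98.0, 98.2, 98.5]
t, df = one_sample_t(temps, 98.6)
print(t, df)
```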
Paired t-test
parametric
compares the mean of the differences to a value given in the null
H0 = the true mean difference between the paired samples is 0
Assumptions:
1. Pairs are chosen at random
2. Subjects must be independent
3. The differences have a normal distr.
* does not assume that the individual values are normally distributed, only the differences
Examples:
Is there a difference in ER visits on 4/20 compared with one week before and one week after 4/20?
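Worked sketch: a paired t-test is a one-sample t-test on the differences within each pair. The paired values below are made up and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second):
    """Return (t, df) for H0: mean of the paired differences = 0."""
    diffs = [b - a for a, b in zip(first, second)]  # one difference per pair
    n = len(diffs)
    se = stdev(diffs) / sqrt(n)
    return mean(diffs) / se, n - 1

# Made-up paired measurements (same unit measured twice)
control_day = [10, 12, 9, 14, 11]
test_day = [12, 15, 9, 17, 12]
t, df = paired_t(control_day, test_day)
print(t, df)
```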
2-sample t-test
parametric
compares the means of a numerical variable between 2 populations
H0 = the means of the two groups are equal
Assumptions:
1. Random and independent sample
2. Data in each population are normally distributed
3. The variance of each population is equal
Examples:
Are mosquito biting rates affected by beer consumption?
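Worked sketch: the equal-variance assumption is what allows the two sample variances to be pooled into one estimate. The data below are made up and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def two_sample_t(g1, g2):
    """Return (t, df) using the pooled variance (assumes equal variances)."""
    n1, n2 = len(g1), len(g2)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(g1) + (n2 - 1) * variance(g2)) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(g1) - mean(g2)) / se, df

# Made-up values for two groups
t, df = two_sample_t([1, 2, 3], [4, 5, 6])
print(t, df)
```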
Sign test
non-parametric equivalent of one-sample t-test and paired t-test
compares the median of a sample to a constant specified in the null. Just want to know if there’s a difference, not the magnitude of the difference
H0 = The median of a distribution is equal to a specific hypothesized value
Assumptions:
1. Don’t have to assume that the data is normally distributed
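Worked sketch: the sign test only counts how many values fall above and below the null median, then runs a binomial test with p = 0.5 on those counts. The values below are made up and the function name is hypothetical.

```python
from math import comb

def sign_test_p(values, median0):
    """Two-sided P-value for H0: median = median0."""
    above = sum(1 for v in values if v > median0)
    below = sum(1 for v in values if v < median0)
    n = above + below          # values exactly equal to median0 are dropped
    k = min(above, below)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n  # P(X <= k) when p = 0.5
    return min(1.0, 2 * tail)

# Made-up measurements; H0: median = 10
values = [12, 15, 9, 14, 11, 13, 16, 10, 17, 18]
p_value = sign_test_p(values, 10)
print(p_value)
```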
Single factor (one-way) ANOVA
parametric
like a t-test such that it compares group means, but it’s able to examine more than 2 means
H0 = there is no difference among group means
HA = at least one group differs significantly
Assumptions:
1. Random samples
2. Samples are independent
3. Normal distr. for each pop.
4. Equal variances for all pops.
Examples:
Does the body temperature of squirrels differ among low, medium, and hot temperatures?
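Worked sketch: F is the ratio of the mean square among groups to the mean square within groups (error). The data below are made up and the function name is hypothetical.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_groups, df_error) for a list of groups of values."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_groups = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_error = sum((v - mean(g)) ** 2 for g in groups for v in g)
    ms_groups = ss_groups / (k - 1)   # variation among group means
    ms_error = ss_error / (n - k)     # variation within groups
    return ms_groups / ms_error, k - 1, n - k

# Made-up measurements for three treatment groups
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_groups, df_error = one_way_anova_f(groups)
print(f_stat, df_groups, df_error)
```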
Pearson correlation (r)
parametric
describes the relationship between 2 numerical variables. “r” is the correlation coefficient
H0 = the population correlation coefficient (ρ) is zero, meaning the two variables are not correlated
Test statistic: t = r / SE(r), compared with t0.05(2), df = n - 2
Assumptions:
1. Random sample
2. X is normally distributed with equal variance for all values of Y
3. Y is normally distributed with equal variance for all values of X
Examples:
Are the males and females in a pair correlated in their arrival dates after migration?
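Worked sketch: r is the sum of products of deviations divided by the square root of the product of the two sums of squares, and it converts to a t statistic with df = n - 2. The data below are made up and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Sample correlation coefficient r."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def r_to_t(r, n):
    """Return (t, df) for H0: population correlation = 0."""
    se = sqrt((1 - r ** 2) / (n - 2))  # standard error of r
    return r / se, n - 2

# Made-up paired measurements
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson_r(x, y)
t, df = r_to_t(r, len(x))
print(r, t, df)
```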
Linear regression
parametric
assumes that the relationship between X and Y can be described by a line (equation: Y = a + bX, where a is the intercept and b is the slope). Usually determines whether we can predict the value of one variable from the value of another
H0 = the population slope (β) of the line relating Y to X is zero
Assumptions:
1. Random sample of Y values for each X
2. Y is normally distributed with equal variance for all values of X
3. Relationship follows a line
Examples:
Is it possible to predict a person’s age based on dental 14C?
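Worked sketch: the least-squares slope b is the sum of products of deviations divided by the sum of squares of X, and the intercept a makes the line pass through the two means. The data below are made up and the function name is hypothetical.

```python
from statistics import mean

def least_squares_line(x, y):
    """Return (a, b) for the fitted line Y = a + bX."""
    mx, my = mean(x), mean(y)
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    a = my - b * mx  # the line passes through (mean of X, mean of Y)
    return a, b

# Made-up explanatory (x) and response (y) values
x = [0, 1, 2, 3, 4]
y = [1, 3, 4, 6, 6]
a, b = least_squares_line(x, y)
predicted = a + b * 5  # predicted Y at X = 5
print(a, b, predicted)
```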
Fisher’s exact test
non-parametric
used for 2x2 contingency analysis to determine associations between 2 categorical variables. Doesn’t make assumptions about the size of expectations and is cumbersome by hand. Good to use if the assumptions of the X^2 contingency analysis aren’t met
H0 = the two categorical variables are independent; no relationship between them
Assumptions:
1. Random samples
2. Independent samples
Examples:
Are western and eastern forms actually reproductively isolated, and therefore separate species?
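Worked sketch: with the row and column totals fixed, the count in one cell follows a hypergeometric distribution, which is why the test is exact but cumbersome by hand. The 2x2 counts below are made up, the function name is hypothetical, and the two-sided rule used (sum every table no more likely than the observed one) is one common convention.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided P-value for the 2x2 table [[a, b], [c, d]]."""
    r1, r2, c1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    lo, hi = max(0, c1 - r2), min(r1, c1)
    p_obs = prob(a)
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)

# Made-up counts for two forms (rows) and two mating outcomes (columns)
p_value = fisher_exact_two_sided(3, 0, 0, 3)
print(p_value)
```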
Welch’s t-test
parametric
compares the means of 2 groups without requiring the assumption of equal variance. Can be used as an alternative to the 2-sample t-test when it doesn’t meet the assumption of equal variance
H0 = means between 2 groups are equal
Assumptions:
1. Normally distributed variables
2. Does not assume equal variance
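Worked sketch: Welch's test keeps the two variances separate in the standard error and adjusts the degrees of freedom, which are usually not a whole number. The data below are made up and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(g1, g2):
    """Return (t, df) without pooling the two variances."""
    n1, n2 = len(g1), len(g2)
    v1, v2 = variance(g1) / n1, variance(g2) / n2
    t = (mean(g1) - mean(g2)) / sqrt(v1 + v2)
    # Approximate degrees of freedom (Welch-Satterthwaite formula)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Made-up groups with clearly different spreads
t, df = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10, 12])
print(t, df)
```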
Shapiro-Wilk test
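not usually classed as parametric or non-parametric; it is a goodness-of-fit test that checks the normality assumption of parametric tests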
used to statistically test whether a set of data comes from a normal distribution
H0 = the data is normally distributed
Mann-Whitney U test
non-parametric
compares the distributions (central tendencies) of only 2 groups USING RANKS. Wants to examine whether 2 samples come from the same population. Can be used as a non-parametric alternative to the 2-sample t-test
H0 = the two samples are drawn from populations with the same distribution (to read a rejection as a difference in central tendency, the 2 populations must have the same shape)
Assumption:
1. Both samples are random samples
Examples:
Garter snake resistance to newt toxins
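Worked sketch: the values from both groups are ranked together, and U is computed from the rank sum of one group. The data below are made up, the function name is hypothetical, and the sketch assumes no tied values. Which of U1 or U2 is compared with the critical value depends on the table being used.

```python
def mann_whitney_u(g1, g2):
    """Return (U1, U2); assumes no tied values (simplification)."""
    pooled = sorted(list(g1) + list(g2))
    assert len(set(pooled)) == len(pooled), "sketch assumes no tied values"
    rank = {v: i + 1 for i, v in enumerate(pooled)}  # rank 1 = smallest value
    n1, n2 = len(g1), len(g2)
    r1 = sum(rank[v] for v in g1)                    # rank sum of group 1
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 - u1
    return u1, u2

# Made-up measurements for two populations
u1, u2 = mann_whitney_u([1.2, 4.5, 2.2], [4.1, 5.0, 3.9, 6.3])
print(u1, u2)
```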
Tukey-Kramer test. Also, why can’t we just use a series of 2-sample t-tests?
parametric (it assumes normality and equal variances, like ANOVA)
compares each group mean to every other group mean. Done only after a single-factor ANOVA has rejected its null hypothesis, i.e. after finding variation among groups
We can’t use a series of 2-sample t-tests because each test carries its own chance of a Type I error, so multiple comparisons would reject too many true null hypotheses. Tukey-Kramer adjusts for the number of tests
H0 = for each pair of groups, the two means are equal
Assumptions:
1. Random samples
2. Independent samples
3. Assumes each group has a normal distribution
4. Equal variances in all groups
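Worked sketch of the multiple-comparisons problem: with k groups there are k(k - 1)/2 pairs, and the chance of at least one false rejection grows quickly with the number of tests. The calculation treats the tests as independent, which is only an approximation for pairwise t-tests; the function name is hypothetical.

```python
from math import comb

def familywise_error(k_groups, alpha=0.05):
    """Number of pairwise tests and the chance of at least one Type I error,
    assuming (as a simplification) that the tests are independent."""
    n_tests = comb(k_groups, 2)
    return n_tests, 1 - (1 - alpha) ** n_tests

# Five groups compared pairwise with separate t-tests at alpha = 0.05
n_tests, error_rate = familywise_error(5)
print(n_tests, error_rate)
```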
Kruskal-Wallis test
non-parametric version of the single-factor (one-way) ANOVA test. It uses the RANKS of the data points. The difference between ANOVA and Kruskal-Wallis is that ANOVA tests the equality of the means of the values, while Kruskal-Wallis (like Mann-Whitney) compares the mean ranks
Applies to 3 or more samples
Test statistic (H) approximately follows a X^2 distribution with df = number of groups - 1
H0 = the mean ranks of the groups are equal
Assumption:
1. Variances between the groups do not have to be equal
2. Does not assume normality
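Worked sketch: all values are ranked together and H is computed from the rank sum of each group. The data below are made up, the function name is hypothetical, and the sketch assumes no tied values.

```python
def kruskal_wallis_h(groups):
    """Return (H, df); assumes no tied values (simplification)."""
    pooled = sorted(v for g in groups for v in g)
    assert len(set(pooled)) == len(pooled), "sketch assumes no tied values"
    rank = {v: i + 1 for i, v in enumerate(pooled)}
    n = len(pooled)
    # Sum over groups of (rank sum squared / group size)
    total = sum(sum(rank[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    return h, len(groups) - 1

# Made-up measurements for three groups
h, df = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(h, df)
```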
Multifactor (2-factor) ANOVA
parametric
ANOVAs can be generalized to look at more than one categorical variable at a time. Allows us to ask whether each categorical variable affects a numerical variable, and if these categorical variables interact to affect the numerical variable
H0 = Factor A has no effect on the mean of Y
H0 = Factor B has no effect on the mean of Y
H0 = Factors A and B don’t interact in their effects on Y
Assumptions:
1. Random samples
2. Samples are independent
3. Normal distr. for each pop.
4. Equal variances for all pops.
Examples:
Plants are taken from 2 regions and given 1 of 2 watering treatments (wet or dry). So, we are looking at whether the mean root length is the same for all regions, whether the mean root length is the same for all watering treatments, and whether there is no interaction between region and watering treatment in determining mean root length
Spearman’s correlation
non-parametric
determines how strong a correlation is between 2 variables (ex. if A inc., does B inc. or dec.?) using ranks, also determines the direction. Alternative test to correlation that doesn’t make so many assumptions
H0 = there is no correlation between the ranks of the 2 variables; ρs (rho) = 0
Assumptions:
1. Does not rely on normality
Examples:
Is the difficulty of describing the rope trick correlated to the time elapsed since it was observed?
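Worked sketch: each variable is converted to ranks, and the coefficient is computed from the differences between paired ranks. The data below are made up, the function name is hypothetical, and the shortcut formula used here is only valid when there are no tied values.

```python
def spearman_rs(x, y):
    """Spearman's rank correlation; assumes no tied values (simplification)."""
    def ranks(values):
        assert len(set(values)) == len(values), "sketch assumes no tied values"
        order = sorted(values)
        return [order.index(v) + 1 for v in values]

    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))  # squared rank differences
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Made-up paired observations
rs = spearman_rs([10, 20, 30, 40, 50], [1, 3, 2, 5, 4])
print(rs)
```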
ANCOVA - analysis of covariance
parametric
combines ANOVA and regression analysis. Basically tests the main and interaction effects of categorical variables on a continuous dependent variable, controlling for the effects of selected other continuous variables, which co-vary with the dependent
At least one variable is numerical
Looks at slope
H0 = the slopes of the regression lines (b) are all equal
Examples:
Do island and/or mainland type affect the number of species?
Polynomial/Quadratic regression
parametric (fitted by least squares, like linear regression)
a form of regression analysis where the relationship between the independent variable and the dependent variable is modelled as a polynomial
Levene’s test
parametric
compares the variances of 2 (or more) groups
H0 = the variance among groups is equal
Assumptions:
1. Random, independent samples; less sensitive to departures from normality than the F-test
Examples:
Is there a lot more variance among males in reproductive success than we think?
Logistic regression
parametric
tests for a relationship between a numerical variable (explanatory variable) and a binary variable (response variable). Response is usually a 0 or 1 (ex. death or survival)
H0 = there is no relationship between the variables
Assumptions:
1. Independence
2. Random sample
3. Does not assume normality; assumes the log-odds of the response change linearly with the explanatory variable
Examples:
Does the dose of a toxin affect the probability of survival?
Permutation tests
non-parametric
used for hypothesis testing on measures of association (of 2 variables). Randomly reshuffles the real data to build the null distribution of the test statistic. Good to use if the assumptions of other tests aren’t met (e.g. the sample isn’t normal)
Done without replacement - all data points are used exactly once in each permuted data set
Examples:
Sage crickets sometimes offer their hind-wings to females to eat during mating. Do females who eat hind-wings wait longer to re-mate?
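Worked sketch: the group labels are shuffled many times, and the P-value is the fraction of shuffles whose difference in means is at least as extreme as the observed one. The waiting times below are made up, and the function name, number of permutations, and seed are arbitrary choices.

```python
import random
from statistics import mean

def permutation_test_mean_diff(g1, g2, n_perm=2000, seed=1):
    """Approximate two-sided P-value for a difference in group means."""
    rng = random.Random(seed)
    observed = mean(g1) - mean(g2)
    pooled = list(g1) + list(g2)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # without replacement: every value used exactly once
        diff = mean(pooled[:len(g1)]) - mean(pooled[len(g1):])
        if abs(diff) >= abs(observed):
            extreme += 1
    return extreme / n_perm

# Made-up waiting times (hours) for two groups of females
fed = [11, 12, 13, 14, 15]
not_fed = [1, 2, 3, 4, 5]
p_value = permutation_test_mean_diff(fed, not_fed)
print(p_value)
```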
F-test
parametric
compares variances of two groups. Because it is very sensitive to its assumption that both distributions are normal, Levene’s test is more often used
Assumption:
1. Data comes from a normal distribution
Bootstrap
non-parametric
a resampling method of estimation (standard errors and confidence intervals) rather than a hypothesis test in the usual sense. It resamples a single dataset to create many simulated samples
Done with replacement: a value can be used repeatedly in the resamples
H0 =
Assumptions:
1. Does not assume a normal distribution
2. Independence
3. Random
Examples:
Can also use the sage cricket example
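Worked sketch: each bootstrap sample is drawn from the data with replacement and has the same size as the data; the spread of the resampled estimates gives a standard error and an approximate 95% percentile interval. The data below are made up, and the function name, number of resamples, and seed are arbitrary choices.

```python
import random
from statistics import median, stdev

def bootstrap_median(values, n_boot=2000, seed=1):
    """Bootstrap standard error and approximate 95% percentile interval
    for the median."""
    rng = random.Random(seed)
    estimates = sorted(
        median(rng.choices(values, k=len(values)))  # sampling with replacement
        for _ in range(n_boot)
    )
    se = stdev(estimates)
    lower = estimates[int(0.025 * n_boot)]
    upper = estimates[int(0.975 * n_boot) - 1]
    return se, (lower, upper)

# Made-up waiting times (hours)
waits = [2, 3, 3, 4, 5, 6, 8, 9, 12, 15]
se, (lower, upper) = bootstrap_median(waits)
print(se, lower, upper)
```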