after midterm Flashcards
Normal distribution
- shows probability density
- takes 2 parameters: the mean and the standard deviation (or variance)
- common in nature and shows up a lot in sampling
- symmetrical
- about 2/3 of random draws are within one standard deviation of the mean
- about 95% of random draws are within 1.96 (~2) standard deviations of the mean
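A quick numerical check of the last two points, as a minimal sketch using Python's standard-library `statistics.NormalDist`. The helper name `prob_within` is made up for illustration; the fractions are the same for any normal distribution, so the mean-0, standard-deviation-1 case is used here.

```python
from statistics import NormalDist

# A normal distribution with mean 0 and standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

def prob_within(k):
    """Probability that a random draw lands within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

within_one_sd = prob_within(1.0)    # roughly 0.68, i.e. about 2/3
within_196_sd = prob_within(1.96)   # roughly 0.95

print(round(within_one_sd, 3), round(within_196_sd, 3))
```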
standard normal distribution
- mean is zero
- standard deviation is one
- its table gives the probability of getting a random draw from a standard normal distribution greater than a given value
How to convert any other normal to standard normal
Z = (Y - mean) / standard deviation
Y = the value we are interested in
Z tells us how many standard deviations from the mean Y is
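A small worked example of the conversion, as a sketch only. The numbers (mean 100, standard deviation 15, value 130) and the function name `z_score` are hypothetical; the last line assumes we want the area above Z under the standard normal curve.

```python
from statistics import NormalDist

def z_score(y, mean, sd):
    """Number of standard deviations that y lies from the mean."""
    return (y - mean) / sd

# Hypothetical numbers: mean 100, standard deviation 15, value of interest 130.
z = z_score(130, mean=100, sd=15)         # (130 - 100) / 15
prob_greater = 1 - NormalDist().cdf(z)    # area above Z under the standard normal curve

print(z, round(prob_greater, 4))
```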
When are sample means normally distributed?
if the variable itself is normally distributed
standard error of an estimate of a mean
the standard deviation of the distribution of sample means
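In practice this is usually estimated from a single sample as the sample standard deviation divided by the square root of the sample size. A minimal sketch with made-up measurements (the data are hypothetical):

```python
import math
import statistics

# Hypothetical sample of measurements.
sample = [4.1, 5.3, 4.8, 5.9, 5.0, 4.6]

s = statistics.stdev(sample)                  # sample standard deviation (n - 1 in the denominator)
standard_error = s / math.sqrt(len(sample))   # estimated standard error of the mean

print(round(standard_error, 3))
```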
Central limit theorem
The sum or mean of a large number of measurements randomly sampled from any population is approximately normally distributed, even if the variable itself doesn’t have a normal distribution
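One way to see this is a small simulation, sketched here under assumptions that are not in the notes: the variable follows a strongly skewed exponential distribution (mean 1, standard deviation 1), each sample has 50 measurements, and 2000 samples are drawn.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

n = 50            # measurements per sample (assumed)
n_samples = 2000  # number of samples drawn (assumed)

# Exponential distribution with rate 1: skewed, with mean 1 and standard deviation 1.
sample_means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(n))
    for _ in range(n_samples)
]

mean_of_means = statistics.fmean(sample_means)  # should be close to the population mean, 1
sd_of_means = statistics.stdev(sample_means)    # should be close to 1 / sqrt(n)
expected_se = 1.0 / math.sqrt(n)

# If the sample means are roughly normal, about 95% fall within 1.96 SD of their mean.
share_within_196 = sum(
    abs(m - mean_of_means) <= 1.96 * sd_of_means for m in sample_means
) / n_samples

print(round(mean_of_means, 3), round(sd_of_means, 3), round(share_within_196, 3))
```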
compares a proportion of a single categorical variable to some hypothesized value of that proportion
Binomial test
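A minimal sketch of an exact binomial test using only the standard library. The counts (14 successes in 18 trials) and the null proportion 0.5 are hypothetical. The two-sided P-value here adds up every outcome that is no more likely than the observed one; doubling the smaller tail is another common convention, and the two agree when the null proportion is 0.5.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binomial_test_two_sided(successes, n, p0):
    """Exact two-sided P-value: total probability of outcomes no more likely than the observed one."""
    observed = binomial_pmf(successes, n, p0)
    tol = 1e-12  # guards against floating-point ties
    return sum(
        binomial_pmf(k, n, p0)
        for k in range(n + 1)
        if binomial_pmf(k, n, p0) <= observed * (1 + tol)
    )

# Hypothetical data: 14 successes in 18 trials, null proportion 0.5.
p_value = binomial_test_two_sided(14, 18, 0.5)
print(round(p_value, 4))
```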
About a single categorical variable with more than 2 possible values, comparing frequency data to some distribution we hypothesize
chi-square goodness of fit test
Numerical data comparing a mean of a group to some hypothesized value of that mean
one-sample t-test
comparing two numerical groups with meaningful pairing, comparing the mean of the paired differences to some hypothesized value of that mean
paired t-test
compare two numerical groups to ask if they have the same mean
two-sample t-test
comparing the means of a numerical variable among multiple groups defined by a categorical variable
single factor ANOVA
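The test statistic is the F ratio: the mean square among groups divided by the mean square within groups (error). A minimal sketch with made-up data; the function name `anova_f` is hypothetical, and turning F into a P-value (which needs the F distribution) is left out.

```python
import statistics

def anova_f(groups):
    """Return (F, df_groups, df_error) for a single-factor ANOVA."""
    all_values = [y for g in groups for y in g]
    grand_mean = statistics.fmean(all_values)
    # Variation of the group means around the grand mean.
    ss_groups = sum(len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each measurement around its own group mean.
    ss_error = sum((y - statistics.fmean(g)) ** 2 for g in groups for y in g)
    df_groups = len(groups) - 1
    df_error = len(all_values) - len(groups)
    return (ss_groups / df_groups) / (ss_error / df_error), df_groups, df_error

# Hypothetical measurements for three groups.
f_stat, df_groups, df_error = anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
print(round(f_stat, 2), df_groups, df_error)
```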
asking if there is an association between two numerical variables and if so how strong is it
correlation - calculate r, the correlation coefficient
Can we predict Y from X assuming linearity of the relationship
linear regression
whether two categorical variables are independent or associated (i.e., comparing two or more groups to ask if they have the same proportion of some response variable) with a large amount of data
chi-square contingency analysis
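A sketch of how the statistic is built: expected counts come from the row and column totals under independence, and the statistic sums (observed - expected)^2 / expected over all cells. The table of counts is hypothetical, and the result is compared with the 5% critical value for 1 degree of freedom (3.841) rather than converted to a P-value.

```python
def chi_square_contingency(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical counts: rows are two groups, columns are the two response categories.
chi2, df = chi_square_contingency([[10, 20], [30, 40]])
reject_at_5_percent = chi2 > 3.841  # 3.841 is the 5% critical value for df = 1

print(round(chi2, 3), df, reject_at_5_percent)
```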
Whether two categorical variables are independent or associated (i.e., comparing two or more groups to ask if they have the same proportion of some response variable) with a small amount of data
Fisher's exact test
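For a 2x2 table the test can be sketched directly from the hypergeometric distribution. The counts below are hypothetical, and the two-sided P-value follows one common convention: add up every table (with the same margins) that is no more likely than the observed one.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def table_prob(x):
        # Hypergeometric probability of x in the top-left cell with all margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = table_prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    tol = 1e-12  # guards against floating-point ties
    return sum(
        table_prob(x)
        for x in range(lowest, highest + 1)
        if table_prob(x) <= observed * (1 + tol)
    )

# Hypothetical small table of counts.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))
```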
compare two numerical groups to ask if they have the same mean, allowing for variances to be different
Welch's t-test
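A sketch of the statistic: each group keeps its own variance in the standard error, and the degrees of freedom come from the Welch-Satterthwaite approximation. The data and the function name `welch_t` are hypothetical, and the P-value step (which needs the t distribution) is omitted.

```python
import math
import statistics

def welch_t(a, b):
    """Return (t, approximate df) for Welch's t-test on two samples."""
    va = statistics.variance(a) / len(a)   # squared standard error of the mean of a
    vb = statistics.variance(b) / len(b)   # squared standard error of the mean of b
    t = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical samples with clearly different variances.
t_stat, df = welch_t([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 6.0, 8.0, 10.0])
print(round(t_stat, 3), round(df, 2))
```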
Is a variable normally distributed within a population or sample?
Shapiro-Wilk test
whether data from a single sample have a particular median, when normality cannot be assumed
sign test (non-parametric, not very powerful)
do two groups have the same distribution? not assuming normality
Mann-Whitney U test
do multiple groups have the same distribution? not assuming normality
Kruskal-Wallis test
after you have rejected the null hypothesis of ANOVA, which pairs of groups have different means?
Tukey-Kramer test
two categorical explanatory variables with one numerical response variable. Asking: does the first affect the mean, does the second affect the mean, and is there an interaction between the two categorical explanatory variables?
two-factor ANOVA
two explanatory variables (one categorical, one numerical) and one numerical response variable. Asking whether the categorical variable influences the response, whether the numerical variable affects the response, and whether there is an interaction between the two explanatory variables in their effect on the response
ANCOVA
relationship between two numerical variables without assumption of linearity
Spearman’s correlation
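Spearman's correlation is the ordinary correlation coefficient computed on the ranks of the data, which is why it does not need a linear relationship. A minimal sketch with made-up data; the helper names are hypothetical, and ties get the average of their ranks.

```python
import math

def ranks(values):
    """Rank from 1 upward, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson_r(x, y):
    """Ordinary (linear) correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman's correlation: the correlation coefficient of the ranks."""
    return pearson_r(ranks(x), ranks(y))

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]   # always increases with x, but not along a line
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, round(r, 3))
```

Here the rank correlation is perfect because y always increases with x, while the linear correlation r is slightly below 1.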
Trying to fit a parabolic function to numerical variables
Quadratic regression
comparing two groups to see if they have the same variance for a numerical variable
Levene's test
use a numerical variable to predict a binary response variable
Logistic regression
a hypothesis-testing method to look for an association between variables, using the data we already have
permutation test
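A minimal sketch of the idea for one case: the association between a grouping variable and a numerical variable, measured by the difference in group means. The data, the number of permutations, and the function name `permutation_test` are all assumptions made for illustration.

```python
import random

def permutation_test(group_a, group_b, n_permutations=10000, seed=0):
    """Approximate two-sided P-value for a difference in means between two groups."""
    rng = random.Random(seed)

    def mean_difference(a, b):
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = mean_difference(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # break any real association between group label and value
        if mean_difference(pooled[:n_a], pooled[n_a:]) >= observed - 1e-12:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_permutations

# Hypothetical measurements for two groups.
p_value = permutation_test([1.1, 2.3, 1.9, 2.8, 2.0], [4.2, 5.1, 3.9, 4.8, 5.5])
print(p_value)
```

Shuffling the pooled values builds the null distribution from the data themselves, so no normal distribution has to be assumed.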
a method to use resampling in a computer to get a confidence interval or estimate, using data we already have
bootstrapping
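A sketch of one common version, the percentile bootstrap for a mean: resample the data with replacement many times, compute the estimate each time, and take the middle 95% of those estimates. The data, the number of resamples, and the function name `bootstrap_ci_mean` are hypothetical.

```python
import random
import statistics

def bootstrap_ci_mean(data, n_resamples=5000, confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean."""
    rng = random.Random(seed)
    means = sorted(
        statistics.fmean(rng.choices(data, k=len(data)))  # resample with replacement
        for _ in range(n_resamples)
    )
    tail = (1 - confidence) / 2
    lower = means[int(tail * n_resamples)]
    upper = means[int((1 - tail) * n_resamples) - 1]
    return lower, upper

# Hypothetical sample of measurements.
data = [4.1, 5.3, 4.8, 5.9, 5.0, 4.6, 5.4, 4.9]
lower, upper = bootstrap_ci_mean(data)
print(round(lower, 2), round(upper, 2))
```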
How do you do a confidence interval for a proportion?
How do you do a confidence interval for the mean of a normally distributed variable?
How do you do a confidence interval for the variance of a normally distributed variable?
How do you do a confidence interval for the regression slope?
Compare proportion to a constant
binomial
compare proportion from two groups
chi-square contingency test; if not enough data, Fisher's exact test
test independence of two categorical variables
chi-square contingency analysis
compare frequency data to a model
chi-square goodness of fit test
compare a mean to a constant, assuming normal distribution
one-sample t-test
compare the means of two groups, assuming normal distribution
two-sample t-test
compare the mean difference of two groups, assuming normal distribution
paired t-test
compare means of more than two groups, assuming normal distribution
single factor ANOVA
compare a median to a constant not assuming normality
sign test
compare the distribution of two groups not assuming normality
Mann-Whitney U test
compare the median difference of two groups that are paired, not assuming normality
sign test
Compare means of more than two groups not assuming normality
Kruskal-Wallis test
test for independence of two numerical variables - with assumptions
correlation
test for independence of two numerical variables - not normal
Spearman's correlation
two numerical variables to predict one from another
linear regression if they fit a line
compare two slopes
ANCOVA
Test the interaction of two categorical factors' effects on a numerical variable
multifactor ANOVA
Compare to a normal distribution
Shapiro-Wilk test
compare the variances of two groups
Levene's test
predict a binary variable from a numerical variable
logistic regression
Do people who receive a vaccine differ from those who do not in whether they get a disease in the next 5 years
chi-square contingency test if there is enough data; with little data, Fisher's exact test
Do people who finish college on average get a higher income than people who do not?
two sample t-test
How can we use the wing length of a bumble bee to predict its maximum flying speed
linear regression
how can we use the wing length of a bumble bee to predict its max flying speed?
linear regression
how much variation in flying speed of bumblebees is predicted by their wing lengths?
coefficient of determination (r^2)
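A minimal least-squares sketch tying the regression line and r^2 together. The wing lengths and speeds are made-up numbers in arbitrary units, and `fit_line` is a hypothetical helper, not a library function.

```python
def fit_line(x, y):
    """Least-squares slope, intercept, and r^2 for predicting y from x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)   # fraction of variation in y predicted by x
    return slope, intercept, r_squared

# Hypothetical wing lengths and maximum flying speeds (made-up numbers, arbitrary units).
wing_length = [1, 2, 3, 4, 5]
max_speed = [2, 4, 5, 4, 5]

slope, intercept, r_squared = fit_line(wing_length, max_speed)
predicted_speed = intercept + slope * 3.5   # prediction for a wing length of 3.5

print(round(slope, 2), round(intercept, 2), round(r_squared, 2), round(predicted_speed, 2))
```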
Is the weight at age 2 different on average for sets of dogs that are fed one of 5 different kinds of dog food?
single factor ANOVA
Does the number of yellow/wrinkled, yellow/smooth, green/wrinkled, and green/smooth peas in a cross fit the 9:3:3:1 ratio predicted by Mendel?
chi-square goodness of fit test
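A sketch of the calculation for this kind of question. The counts are illustrative only, and they are listed in the order that matches the 9:3:3:1 ratio (yellow/smooth first). The statistic is compared with the 5% critical value for 3 degrees of freedom (7.815) instead of being converted to a P-value.

```python
def chi_square_gof(observed, expected_ratio):
    """Return (chi-square statistic, degrees of freedom) for a goodness of fit test."""
    total = sum(observed)
    ratio_total = sum(expected_ratio)
    chi2 = 0.0
    for obs, ratio in zip(observed, expected_ratio):
        expected = total * ratio / ratio_total   # expected count under the hypothesized ratio
        chi2 += (obs - expected) ** 2 / expected
    return chi2, len(observed) - 1

# Illustrative counts, ordered to match the ratio:
# yellow/smooth, yellow/wrinkled, green/smooth, green/wrinkled.
counts = [315, 101, 108, 32]
chi2, df = chi_square_gof(counts, [9, 3, 3, 1])
reject_at_5_percent = chi2 > 7.815   # 5% critical value for df = 3

print(round(chi2, 2), df, reject_at_5_percent)
```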
does the number of green and yellow peas from a cross fit the 3:1 ratio predicted by Mendel? (you only have 17 data points)
binomial
Does the growth rate of sea stars vary with temp?
correlation test if normal; Spearman’s correlation if not normal
Does the relationship of sea star growth rate and temp vary depending on whether calcium is added to the water or not?
ANCOVA
Is the mean length of elephant trunks different in males and females?
two-sample t-test
Are people equally likely to be born on each of the 7 days of the week?
chi-square goodness of fit test
An experiment measured the swimming speed of fish under four treatments: adding calcium to the diet, adding selenium to the diet, adding both, or adding neither. Is there an effect on swimming speed of adding calcium, of adding selenium, or of an interaction between the two?
multifactor ANOVA
Is height normally distributed?
Shapiro-Wilk test
Do trees with and without added fertilizer have the same variance in growth rate?
Levene’s test
Can we predict whether a seed germinates or not from the weight of the seed?
logistic regression
The developmental pathway leading to the formation of spots on butterfly wings has been studied by surgical excision of a small amount of tissue on the left wings of a set of butterflies with the right wings left untouched. The size of the spots on these wings was subsequently measured. How would you test whether the manipulation had an effect on spot size?
paired t-test
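A sketch of the calculation for this design, where each butterfly supplies one pair (treated left wing, untouched right wing). The spot sizes are made-up numbers, and the 5% critical value 2.776 applies only to 4 degrees of freedom, as in this hypothetical sample of 5 butterflies.

```python
import math
import statistics

# Hypothetical spot sizes (made-up numbers): each position is one butterfly.
left_wing = [2.1, 2.5, 1.9, 2.8, 2.4]    # treated wing
right_wing = [2.6, 2.9, 2.5, 3.0, 3.1]   # untouched wing

# The test works on the within-pair differences, not on the two groups separately.
differences = [left - right for left, right in zip(left_wing, right_wing)]
mean_diff = statistics.fmean(differences)
se_diff = statistics.stdev(differences) / math.sqrt(len(differences))

t_stat = mean_diff / se_diff   # null hypothesis: the mean difference is 0
df = len(differences) - 1
significant_at_5_percent = abs(t_stat) > 2.776   # two-sided 5% critical value of t for df = 4

print(round(t_stat, 2), df, significant_at_5_percent)
```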
comparing a mean of a random sample from a normal population with the population mean proposed in a null hypothesis
One sample t-test
compares the means of two groups without requiring the assumption of equal variance
Welch’s t-test
comparing the central tendencies of two groups using ranks
Mann-Whitney U test
used for hypothesis testing on measures of association without assuming normality
permutation tests
a non-parametric test comparing the distributions of multiple groups, using ranks of the data points
Kruskal-Wallis test