stats final Flashcards
t-test variables
independent: categorical w/ 2 levels
dependent: numerical
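A minimal sketch of the statistic behind a 2-sample t-test, assuming the Welch (unequal variance) form. The data and the function name `welch_t` are made up for illustration.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t statistic for two independent samples (illustrative helper)."""
    var_a = statistics.variance(a)  # sample variance, divides by n - 1
    var_b = statistics.variance(b)
    # standard error of the difference between the two sample means
    se = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se


# hypothetical data: one numerical response, one categorical variable w/ 2 levels
control = [4.1, 3.9, 4.4, 4.0, 4.2]
treated = [4.8, 5.1, 4.9, 5.3, 5.0]
t_stat = welch_t(control, treated)  # negative here, since control mean < treated mean
```

The further the statistic is from 0, the less consistent the data are with "no difference between the 2 levels".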
ANOVA variables
independent: categorical w/ 3+ levels
dependent: numerical
paired t-test variables
independent: categorical w/ 2 levels
dependent: numerical
each observation in one category must be matched to one in the other (e.g. same subject measured twice), so there are equal numbers in each category
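A small sketch of the paired t statistic, which works on the per-pair differences. The before/after numbers and the name `paired_t` are assumptions for illustration.

```python
import math
import statistics


def paired_t(before, after):
    """t statistic for paired samples: mean difference / its standard error."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(after, before)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se


# hypothetical measurements on the same 5 subjects, before and after a treatment
before = [10, 12, 9, 11, 13]
after = [12, 13, 11, 12, 16]
t_paired = paired_t(before, after)  # about 4.81
```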
Chi-Squared variables
independent: categorical w/ 2+ levels
dependent: categorical w/ 2+ levels
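A sketch of how the chi-squared statistic is built from a table of counts: compare each observed count to the count expected if the 2 variables were unrelated. The table values and the name `chi_squared` are made up.

```python
def chi_squared(table):
    """Chi-squared statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # expected count if the two variables are independent
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# hypothetical 2x2 table: rows = group, columns = outcome
counts = [[20, 30], [30, 20]]
chi2_stat, chi2_dof = chi_squared(counts)  # every expected count is 25
```

With 1 degree of freedom, the statistic has to exceed about 3.84 to be significant at 0.05.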
linear regression variables
independent: numerical
dependent: numerical
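A minimal least-squares sketch for simple linear regression w/ one numerical predictor. The data and the name `fit_line` are assumptions for illustration.

```python
import statistics


def fit_line(x, y):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    mean_x = statistics.mean(x)
    mean_y = statistics.mean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# hypothetical data lying exactly on the line y = 2x + 1
x_vals = [1, 2, 3, 4, 5]
y_vals = [3, 5, 7, 9, 11]
slope, intercept = fit_line(x_vals, y_vals)
```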
null hypothesis
independent variable has no effect on the dependent variable
alternative hypothesis
independent variable has an effect on the dependent variable
p-value
probability of getting results at least as extreme as the ones observed if the null is true; by convention must be lower than 0.05 (5%) to reject the null
type 1 error
rejecting the null when the null is actually true (false positive)
its probability equals the significance level (alpha, usually 0.05), not the p-value
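A simulation sketch of the type 1 error rate: when the null really is true, a test run at a 0.05 threshold should reject in roughly 5% of repeated experiments. The z-test w/ known standard deviation, the sample size, and the seed are all assumptions chosen to keep the example short.

```python
import math
import random
from statistics import NormalDist


def false_positive_rate(trials=2000, n=30, alpha=0.05, seed=1):
    """Share of experiments that reject a true null (two-sided z-test)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    rejections = 0
    for _ in range(trials):
        # the null is true here: the population mean really is 0, sd is 1
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = (sum(sample) / n) / (1.0 / math.sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


rate = false_positive_rate()  # should land close to alpha
```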
stats relies on 2 fundamental ideas
uncertainty and variation
random sampling
every individual has an equal chance of being picked, which reduces selection bias
systematic sampling
sampling at regular intervals, e.g. transects, collecting data while walking straight thru an area, etc
mixed/stratified sampling
taking samples randomly within certain sections of an overall space
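A sketch of stratified sampling: split the space into sections first, then sample randomly inside each one. The section names, plot IDs, and the function `stratified_sample` are hypothetical.

```python
import random


def stratified_sample(strata, per_stratum, seed=0):
    """Draw the same number of random units from every section (stratum)."""
    rng = random.Random(seed)
    return {
        name: rng.sample(units, per_stratum)  # random, without replacement
        for name, units in strata.items()
    }


# hypothetical study area split into 2 sections of 10 numbered plots each
strata = {"north": list(range(0, 10)), "south": list(range(10, 20))}
picked = stratified_sample(strata, per_stratum=3)
```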
haphazard sampling
sampling whatever is accessible
standard error of the mean
estimate of how far your sample mean is likely to be from the true population mean (sample st dev divided by the square root of n)
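A quick sketch of the calculation, standard error = sample st dev / sqrt(n). The measurements and the name `sem` are made up.

```python
import math
import statistics


def sem(sample):
    """Standard error of the mean: sample st dev over the square root of n."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


# hypothetical measurements: mean 14, sample variance 10, n = 5
measurements = [10, 12, 14, 16, 18]
standard_error = sem(measurements)  # sqrt(10 / 5) = sqrt(2), about 1.414
```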
central limit theorem
if a population with finite variance is sampled repeatedly with a large enough sample size, the mean of the sample means will be abt equal to the mean of the population, and the sample means will approach a normal distribution (even if the population itself is not normal)
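A simulation sketch of the idea: draw many samples from a skewed (exponential) population w/ mean 1 and look at the sample means. The population, sample size, number of draws, and seed are assumptions for illustration.

```python
import random
import statistics


def sample_means(n, draws, seed=0):
    """Means of `draws` samples of size n from an exponential population (mean 1)."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(draws)
    ]


means = sample_means(n=50, draws=2000)
center = statistics.fmean(means)  # close to the population mean, 1
spread = statistics.stdev(means)  # close to 1 / sqrt(50), about 0.14
```

A histogram of `means` would look roughly bell-shaped even though the population is strongly skewed.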
descriptive test
don't carry out a hypothesis test, just describe the situation
histograms, density plot, boxplot
differences test
set out to compare 2 sets of data
barcharts and boxplots
correlation/regression test
emphasis on linking variables
scatterplots and lineplots
association
looking for links between variables that are categorical
percentage of population that lies within 1.96 st dev of a normally distributed mean
95%
% of population that lies within 1 st dev of a normally distributed mean
68%
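Both percentages can be checked from the standard normal curve. A small sketch using Python's built-in `statistics.NormalDist`; the helper name `coverage` is an assumption.

```python
from statistics import NormalDist


def coverage(k):
    """Share of a normal population within k st devs of the mean."""
    z = NormalDist()  # standard normal: mean 0, st dev 1
    return z.cdf(k) - z.cdf(-k)


within_1 = coverage(1)        # about 0.68
within_1_96 = coverage(1.96)  # about 0.95
```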
how is standard error of the mean different than standard deviation
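st dev describes the spread of individual observations; standard error describes the precision of the sample mean and shrinks as sample size grows. a bigger sample makes the mean more precise, but not more accurate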
when is log normal distribution appropriate
positively skewed (right-skewed) data with values above 0
sample standard deviation
square root of the sum of squared differences from the sample mean, divided by n - 1 (not n)
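A sketch of the calculation by hand, compared against Python's built-in `statistics.stdev` (divides by n - 1) and `statistics.pstdev` (divides by n). The data and the name `sample_sd` are made up.

```python
import math
import statistics


def sample_sd(data):
    """Sample st dev: sqrt of squared differences from the mean over n - 1."""
    mean = sum(data) / len(data)
    squared_diffs = sum((x - mean) ** 2 for x in data)
    return math.sqrt(squared_diffs / (len(data) - 1))


# hypothetical data: mean 5, squared differences sum to 32
values = [2, 4, 4, 4, 5, 5, 7, 9]
sd_sample = sample_sd(values)              # sqrt(32 / 7), about 2.14
sd_population = statistics.pstdev(values)  # sqrt(32 / 8) = 2.0
```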
test done if ANOVA results are significant
tukey’s hsd test
non parametric alt to ANOVA
kruskal-wallis test
test done if kruskal wallis test is significant
dunn test
non parametric alt to paired t-test
wilcoxon matched pairs test
f-stat
ratio of variance between groups to variance within groups (used in ANOVA)
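A sketch of the one-way ANOVA F statistic as between-group variance over within-group variance. The 3 groups and the name `f_statistic` are assumptions for illustration.

```python
import statistics


def f_statistic(groups):
    """One-way ANOVA F: mean square between groups / mean square within groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [statistics.fmean(g) for g in groups]
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    ss_within = sum(
        sum((x - m) ** 2 for x in g) for g, m in zip(groups, group_means)
    )
    ms_between = ss_between / (k - 1)       # k - 1 degrees of freedom
    ms_within = ss_within / (n_total - k)   # n - k degrees of freedom
    return ms_between / ms_within


# hypothetical data: 3 levels of a categorical variable, numerical response
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat = f_statistic(groups)  # 26 / 2 over 6 / 6 = 13
```

A large F means the group means differ more than the scatter inside the groups would explain.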
non parametric alt to 2-sample t-test
mann whitney u-test
non parametric alt to correlation/regression
spearman’s rank test
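A sketch of Spearman's rank correlation: replace each value with its rank (ties get the average rank), then compute the usual correlation on the ranks. The helper names `ranks` and `spearman` and the data are made up.

```python
import math


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over a run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Pearson correlation computed on the ranks of x and y."""
    rx, ry = ranks(x), ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)


# hypothetical data: curved but always increasing, so the ranks agree perfectly
x_data = [1, 2, 3, 4, 5]
y_data = [1, 4, 9, 16, 25]
rho = spearman(x_data, y_data)
```

Because only the ranks matter, a curved but consistently increasing relationship still gives a correlation of 1.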