Exam 2 Flashcards
why would you use t instead of z?
z assumes the population standard deviation (σ) is known; use t when σ is unknown and has to be estimated with s
df
n-1
how do you add more uncertainty?
replace σ (the funky o) with s in the SEM equation, so SEM = s/√n
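A quick sketch of the estimated SEM, in case it helps to see the arithmetic. The data and the function name `estimated_sem` are made up for illustration; nothing here comes from the course materials.

```python
import math
import statistics

def estimated_sem(sample):
    """Standard error of the mean, using the sample SD (s) in place of sigma."""
    s = statistics.stdev(sample)       # sample SD, divides by n - 1
    return s / math.sqrt(len(sample))  # SEM = s / sqrt(n)

# Hypothetical data, for illustration only
scores = [4.0, 6.0, 8.0, 10.0, 12.0]
sem = estimated_sem(scores)
df = len(scores) - 1                   # degrees of freedom for a single sample
print(f"SEM = {sem:.4f}, df = {df}")
```

Because s is itself an estimate, the test statistic follows a t-distribution with n-1 degrees of freedom instead of z.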
what’s the other name for a t-distribution?
Student’s t-distribution
how are t and z-distributions different?
t has more area in the tails (and less in the middle) to accommodate more uncertainty
how are t and z-distributions alike?
both are symmetric, bell-shaped curves centered on 0
as df gets bigger how does that affect a t distribution?
looks more like a z distribution
how is a t table organized?
rows are df
columns are probabilities
single samples
no control group, one group of people, used to establish norms
paired samples
one group of people but use two different treatments
independent t-test
2 groups with different treatment but doesn’t assume equal variance
point estimator
difference between sample means (x̄1 - x̄2)
what are the 2 ways to calculate degrees of freedom?
Welch method and conservative method
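A rough sketch of the two df calculations, assuming the Welch method means the usual Welch-Satterthwaite formula and the conservative method means the smaller of n1-1 and n2-1. The data and function names are hypothetical.

```python
import statistics

def welch_df(x1, x2):
    """Welch-Satterthwaite approximation to the degrees of freedom."""
    v1 = statistics.variance(x1) / len(x1)  # s1^2 / n1
    v2 = statistics.variance(x2) / len(x2)  # s2^2 / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (len(x1) - 1) + v2 ** 2 / (len(x2) - 1))

def conservative_df(x1, x2):
    """Conservative choice: the smaller of n1 - 1 and n2 - 1."""
    return min(len(x1), len(x2)) - 1

# Hypothetical data, for illustration only
group_a = [4.0, 6.0, 8.0, 10.0, 12.0]
group_b = [5.0, 6.0, 7.0, 8.0]
print(welch_df(group_a, group_b), conservative_df(group_a, group_b))
```

The Welch value always lands between the conservative df and n1+n2-2, so the conservative method is the easier hand calculation at the cost of some power.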
f-test or Levene’s test
variance test to see if two samples have similar variances
what do you do if the two sample variances are similar?
use an equation that uses a combined variance estimate (gives more degrees of freedom)
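A minimal sketch of the combined (pooled) variance estimate, assuming the standard weighted-average formula. The data and the name `pooled_variance` are made up for illustration.

```python
import statistics

def pooled_variance(x1, x2):
    """Pooled variance: each sample variance weighted by its own df."""
    df1, df2 = len(x1) - 1, len(x2) - 1
    s2_pooled = (df1 * statistics.variance(x1) + df2 * statistics.variance(x2)) / (df1 + df2)
    return s2_pooled, df1 + df2  # pooled df = n1 + n2 - 2

# Hypothetical data, for illustration only
group_a = [4.0, 6.0, 8.0, 10.0, 12.0]
group_b = [3.0, 6.0, 9.0, 12.0]
s2_pooled, df_pooled = pooled_variance(group_a, group_b)
print(s2_pooled, df_pooled)
```

The pooled df is n1+n2-2, which is where the "more degrees of freedom" comes from.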
what do you do if the two sample variances aren’t similar?
use the unequal-variance (Welch) procedure, which gives fewer degrees of freedom
ANOVA
one-way analysis of variance
ANOVA definition
test group means for a significant difference
2 components of ANOVA
variance between groups and variance within groups
MSB
mean square between
MSB definition
quantifies the variance of group means around the grand mean (variance between groups)
MSW
mean square within
MSW definition
quantifies the variability of data points in a group around its mean (estimate of the variance within groups)
f-statistic
ratio of MSB to MSW (F = MSB / MSW)
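A small sketch that puts MSB, MSW, and F together for a one-way ANOVA, assuming the standard sums-of-squares definitions. The three groups are hypothetical and the function name is just for illustration.

```python
import statistics

def one_way_anova(groups):
    """Return (MSB, MSW, F) for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between: group means around the grand mean, weighted by group size
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Within: data points around their own group mean
    ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups)
    msb = ss_between / (k - 1)        # df between = k - 1
    msw = ss_within / (n_total - k)   # df within = N - k
    return msb, msw, msb / msw

# Hypothetical data, for illustration only
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
msb, msw, f_stat = one_way_anova(groups)
print(msb, msw, f_stat)
```

A large F means the group means vary more than you would expect from the variability inside the groups.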
post hoc hypothesis
formal tests that are used after ANOVA in delineating which group means differ
2 methods of post hoc hypothesis
least significant difference (LSD) method and Bonferroni method
LSD method
only used after a significant ANOVA test and planned comparisons
Bonferroni’s method
ensures that the family-wise error rate is less than or equal to alpha after all possible pair-wise comparisons have been made
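One common way to apply Bonferroni is to divide alpha by the number of pair-wise comparisons. This is a sketch under that assumption; the function name is hypothetical.

```python
from math import comb

def bonferroni_alpha(alpha, k_groups):
    """Per-comparison alpha when every pair of k groups is compared."""
    n_comparisons = comb(k_groups, 2)  # k(k-1)/2 pair-wise comparisons
    return alpha / n_comparisons, n_comparisons

per_test_alpha, n_comparisons = bonferroni_alpha(0.05, 4)
print(n_comparisons, per_test_alpha)
```

With 4 groups there are 6 pairs, so each comparison is tested at 0.05/6 to keep the family-wise rate at or below 0.05.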
homoscedastic
equal in variance
heteroscedastic
unequal in variance
three methods of assessing group variances
- graphical exploration
- summary statistics
- hypothesis tests of variance
scedastic
variance of a random variable
nonparametric tests
encompass a broad array of statistical techniques for analyzing data without assuming a particular distributional shape (such as normality)
rank tests
class of nonparametric tests that make fewer assumptions about distributional shape
Kruskal-Wallis test
nonparametric analogue of one-way ANOVA
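A bare-bones sketch of the Kruskal-Wallis H statistic, to show how it swaps the raw data for ranks. It assumes the textbook formula without the tie correction, and the data and function name are made up for illustration.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction applied)."""
    pooled = sorted(x for g in groups for x in g)
    # Tied values share the average of the rank positions they occupy
    rank_of = {}
    for value in set(pooled):
        first = pooled.index(value) + 1   # 1-based position of first occurrence
        count = pooled.count(value)
        rank_of[value] = first + (count - 1) / 2
    n_total = len(pooled)
    # Sum over groups of (rank sum)^2 / group size
    rank_term = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    return 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)

# Hypothetical data, for illustration only
groups = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
h_stat = kruskal_wallis_h(groups)
print(h_stat)
```

H is compared against a chi-square distribution with (number of groups - 1) degrees of freedom.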
family-wise error rate
probability of at least one false rejection of null hypothesis
how does the alpha (type 1) error rate get inflated?
running multiple tests
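A short sketch of why multiple tests inflate alpha. It assumes the tests are independent, which real pair-wise comparisons usually are not, so treat the numbers as a rough guide.

```python
def family_wise_error_rate(alpha, n_tests):
    """P(at least one false rejection) across n independent tests."""
    return 1 - (1 - alpha) ** n_tests

for n_tests in (1, 3, 6, 10):
    print(n_tests, round(family_wise_error_rate(0.05, n_tests), 3))
```

With one test the rate is just alpha, but by 6 tests at alpha = 0.05 it is already above 0.26.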
when do you reject the null hypothesis
p < 0.05 (p is below alpha) or the confidence interval doesn’t include the null value
what test do you use if you don’t know the direction of the alternative hypothesis?
two-tailed
confounded correlation
looks like correlation but there’s a 3rd thing that causes the correlation
regression
how much x explains y
LINE
linearity, independent observations, normality, equal variance
what’s the slope if there’s no correlation?
0
what does correlation only apply to?
linear relationships
what do you split the Y value into?
residual and predicted
explanatory variable (x)
- independent variable
- factor
- treatment
- exposure
response variable (y)
- dependent variable
- outcome
- response
- disease
correlation coefficient
r
least squares regression line
ŷ = a + bx (a = intercept, b = slope)
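A minimal sketch of fitting the least squares line by hand and splitting each y into predicted plus residual. The data and function name are hypothetical.

```python
import statistics

def least_squares_line(x, y):
    """Return intercept a and slope b of the least squares line."""
    x_bar, y_bar = statistics.mean(x), statistics.mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx        # slope: change in y per unit of x
    a = y_bar - b * x_bar  # intercept: the line passes through (x-bar, y-bar)
    return a, b

# Hypothetical data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
a, b = least_squares_line(x, y)
predicted = [a + b * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]  # observed - predicted
print(a, b, residuals)
```

The residuals from a least squares line always sum to zero (up to rounding), which is a handy check on hand calculations.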
simple regression
single explanatory variable (X) and response variable (Y)
multiple regression
multiple explanatory variables (X1, X2 etc) in relation to a response variable (Y)
k
number of explanatory variables
standardized coefficients
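predicted change in Y, in standard deviations, per one-standard-deviation increase in X (the unstandardized coefficient is the change in Y per unit increase in X)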
residual
difference between observed response and response predicted by regression model
why do we use multiple regression models?
helps to “adjust out” the effects of lurking variables
what types of variables does an ANOVA use?
categorical explanatory variable, quantitative response variable
does correlation mean causation?
hell nah
coefficient of determination
r^2
CoD
proportion of the variance in y that is explained by x
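A quick sketch of r and r^2 computed from sums of squares, assuming the usual Pearson formula. The data and function name are made up for illustration.

```python
import math
import statistics

def correlation(x, y):
    """Pearson correlation coefficient r."""
    x_bar, y_bar = statistics.mean(x), statistics.mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
r = correlation(x, y)
r_squared = r ** 2  # coefficient of determination
print(r, r_squared)
```

Here r^2 comes out to about 0.6, meaning roughly 60% of the variance in y is explained by x.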
distance of point to the line
residual error
slope
change in y per unit of x
when do you create dummy variables?
when a categorical explanatory variable has 3+ levels
how many dummy variables should there be?
number of levels - 1
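A small sketch of dummy coding, assuming one level is picked as the reference group and gets all zeros. The variable, its levels, and the function name are hypothetical.

```python
def dummy_code(values, reference):
    """Code a categorical variable as (number of levels - 1) 0/1 dummy variables."""
    levels = sorted(set(values) - {reference})  # every level except the reference
    return [{f"is_{lvl}": int(v == lvl) for lvl in levels} for v in values]

# Hypothetical categorical variable with 3 levels, for illustration only
smoking = ["never", "former", "current", "never"]
coded = dummy_code(smoking, reference="never")
for value, row in zip(smoking, coded):
    print(value, row)
```

Three levels produce two dummy variables, and the reference group is the row where both are 0.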
SEM
standard error of the mean (the standard deviation of the sampling distribution of x-bars)
does the 95% CI get smaller as n increases?
ya
when do you use a two-tailed test?
when you don’t know the direction of the alternative
when is it easier to reject the null?
when the variances are equal
when can you use a t-test?
when the data are approximately normal or the n is large
family-wise error rate
probability of making at least one type 1 error across a family of tests