distribution for CI
distribution for minimal sample size
2E and E
- 2E = full range of CI
- E = half of CI range = margin of error
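A minimal sketch of how E links to the minimal sample size, assuming a normal-based CI with known sigma (the notes do not say which distribution to use, so the normal quantile is an assumption). The function names are illustrative, and Python is used only as a neutral scratchpad.

```python
import math
from statistics import NormalDist

def margin_of_error(sigma, n, conf=0.95):
    """E = z * sigma / sqrt(n); the full CI has length 2E."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return z * sigma / math.sqrt(n)

def min_sample_size(sigma, E, conf=0.95):
    """Smallest n whose margin of error is at most E."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return math.ceil((z * sigma / E) ** 2)

n = min_sample_size(sigma=10, E=2)
print(n, margin_of_error(10, n))
```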
power
- probability of correctly rejecting H0 when it is false
- higher sample sizes yield higher power
influence of sample size
the same deviation from H0 with more data yields a lower p-value
bootstrap CI
- sample from the original dataset
- enlarging B (the number of bootstrap samples) reduces the simulation variation of the interval
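A rough sketch of a bootstrap CI for the mean. It uses the simple percentile interval, which is an assumption: a course may define the bootstrap interval differently (for example reflected around the original estimate). The function name and defaults are illustrative.

```python
import random
from statistics import mean

def bootstrap_ci(data, stat=mean, B=1000, conf=0.95, seed=0):
    rng = random.Random(seed)
    n = len(data)
    # resample the original dataset with replacement, B times
    t_star = sorted(stat(rng.choices(data, k=n)) for _ in range(B))
    alpha = 1 - conf
    lo = t_star[int((alpha / 2) * B)]
    hi = t_star[min(B - 1, int((1 - alpha / 2) * B))]
    return lo, hi

data = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4]
print(bootstrap_ci(data))
```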
bootstrap test
- sample from H0 distribution
- compare t-value of original data to surrogate T* values
- p-value is determined by proportion of T*-values exceeding the t-value of the data
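A minimal sketch of the bootstrap test, assuming (purely for illustration) that H0 says the data come from an exponential distribution with rate 1 and that the test statistic is the sample maximum. Both choices, and the names used, are hypothetical.

```python
import random

def bootstrap_test(data, sample_h0, stat, B=1000, seed=0):
    rng = random.Random(seed)
    n = len(data)
    t = stat(data)  # statistic of the original data
    # surrogate T* values: statistic of samples drawn from the H0 distribution
    t_star = [stat([sample_h0(rng) for _ in range(n)]) for _ in range(B)]
    # right-sided p-value: proportion of T* values at least as large as t
    p_right = sum(ts >= t for ts in t_star) / B
    return t, p_right

data = [0.2, 1.1, 0.7, 2.5, 0.4, 1.8, 0.9, 3.0]
print(bootstrap_test(data, lambda rng: rng.expovariate(1.0), max))
```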
sign test: test statistic
- number of observations that are larger than m0
- binomial test is done on this outcome
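A small sketch of the sign test, taking the statistic to be the number of observations above m0 and comparing it with Binomial(n, 0.5). Dropping observations equal to m0 and doubling the smaller tail for a two-sided p-value are common conventions, assumed here.

```python
from math import comb

def sign_test(data, m0):
    kept = [x for x in data if x != m0]   # assumption: ties with m0 are dropped
    n = len(kept)
    k = sum(x > m0 for x in kept)         # test statistic
    pmf = [comb(n, i) * 0.5 ** n for i in range(n + 1)]
    p_low = sum(pmf[: k + 1])             # P(X <= k)
    p_high = sum(pmf[k:])                 # P(X >= k)
    return k, n, min(1.0, 2 * min(p_low, p_high))

print(sign_test([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], m0=2.5))
```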
wilcoxon signed rank test
- requires symmetric population
- one sample or (difference between) matched pairs (wilcox.test() with 1 argument)
- loses a lot of information but is really robust
paired sample permutation test:
- what is permuted
- what is logic behind this
- permute original (x,y) labels
- under H0 of no difference between distributions of X and Y within pairs, permuting the labels should not change the distribution of T
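The logic above can be sketched as follows, assuming T is the mean of the within-pair differences (the notes do not fix T). Swapping the (x, y) labels inside a pair just flips the sign of that pair's difference. The function name is illustrative.

```python
import random
from statistics import mean

def paired_permutation_test(x, y, B=2000, seed=0):
    rng = random.Random(seed)
    d = [a - b for a, b in zip(x, y)]
    t = mean(d)
    count = 0
    for _ in range(B):
        # randomly swap labels within each pair: the difference changes sign
        d_star = [di if rng.random() < 0.5 else -di for di in d]
        if abs(mean(d_star)) >= abs(t):
            count += 1
    return t, count / B   # two-sided Monte Carlo p-value

x = [12.1, 11.4, 13.0, 12.7, 11.9, 12.5, 13.2, 12.0]
y = [10.9, 11.0, 12.1, 11.8, 11.2, 11.6, 12.4, 11.1]
print(paired_permutation_test(x, y))
```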
how to test dependence in two paired samples
- pearson’s correlation test
- spearman’s rank correlation test
two paired samples tests
- sign test
- wilcoxon signed rank test
–> uses wilcox.test() with 1 argument
- permutation test
- t.test(x,y,paired=TRUE)
two independent samples tests
- mann-whitney test
- kolmogorov-smirnov test
- t.test(x, y)
mann whitney test
- based on ranks
- uses wilcox.test() with 2 arguments
kolmogorov-smirnov test
- tests if the distributions of the two samples are the same
- based on differences in histograms
- T = maximum vertical difference between the summed (cumulative) histograms
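A short sketch of the statistic only (no p-value), reading "summed histograms" as the empirical distribution functions of the two samples. The function name is illustrative.

```python
def ks_statistic(x, y):
    def ecdf(sample, v):
        # fraction of the sample that is <= v
        return sum(s <= v for s in sample) / len(sample)
    # the largest gap can only occur at an observed value
    points = sorted(set(x) | set(y))
    return max(abs(ecdf(x, v) - ecdf(y, v)) for v in points)

print(ks_statistic([1, 2, 3, 4], [3, 4, 5, 6]))
```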
one-way ANOVA
- NI experimental units
- with I = 2, equivalent to the two-sample t-test (with equal variances)
- always right sided
SSa and RSS
- SSa: variation (sum of squares) due to the factor
- RSS: variation (sum of squares) not explained by the factor in the model
kruskal wallis test
- nonparametric anova
- based on ranks
- distribution of W under H0 is approximately X^2(I-1) (chi-squared with I-1 degrees of freedom)
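A sketch of the rank-based statistic W, assuming the usual form 12/(N(N+1)) * sum of n_i * (mean rank of group i - (N+1)/2)^2 with average ranks for ties and no tie correction. The helper names are illustrative.

```python
def average_ranks(values):
    """Ranks 1..N, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def kruskal_wallis(groups):
    pooled = [v for g in groups for v in g]
    N = len(pooled)
    ranks = average_ranks(pooled)
    w, start = 0.0, 0
    for g in groups:
        r_bar = sum(ranks[start:start + len(g)]) / len(g)  # mean rank of the group
        w += len(g) * (r_bar - (N + 1) / 2) ** 2
        start += len(g)
    return 12 / (N * (N + 1)) * w

print(kruskal_wallis([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```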
independent samples permutation test
- what is permuted
- what is logic behind this
- 1way ANOVA
1. group labels are permuted
2. permutation of groups should not affect group means if there is no effect
two way ANOVA
- NIJ experimental units
- main and interaction effects are tested
- I + J + 1 linear restrictions: treatment and sum parametrizations
F statistic
- always right sided
- explained variance/unexplained variance
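For one-way ANOVA the ratio can be sketched as below: SSa and RSS are each divided by their degrees of freedom (I-1 and N-I) before taking the quotient. The function name is illustrative.

```python
from statistics import mean

def one_way_anova(groups):
    all_obs = [v for g in groups for v in g]
    grand = mean(all_obs)
    # SSa: variation of the group means around the grand mean
    ss_a = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # RSS: variation of the observations around their own group mean
    rss = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_a = len(groups) - 1
    df_res = len(all_obs) - len(groups)
    f = (ss_a / df_a) / (rss / df_res)
    return ss_a, rss, f

print(one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]]))
```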
interaction plot
interaction shows up as nonparallel curves
testing interaction
- model that includes interaction –> only significance of interaction effect is relevant
- model without interaction –> additive model. check for presence of main effect
block designs in 2way anova
- randomized block design
–> block = variable not of interest
–> don't look at significance of block variable in output
- repeated measures
–> block = ID
–> exchangeable case: errors within a single unit are exchangeable, meaning that ordering is irrelevant
–> lack of exchangeability makes the block design invalid
- friedman test
–> nonparametric for 2 designs above
block designs for random effects
- crossover design
–> 2 outcomes per experimental unit (paired samples)
–> apply treatment in opposite orders between conditions
–> treatment, learning, and sequence effects
- split plot design
–> 2 treatment factors (independent samples)
–> subplot and whole plot
- to get p-values, anova(reduced model, full model)
- (1|f) for random effect block
unbalanced design
- order of variables in the model matters
- variable of interest goes last
- otherwise, p-values are unreliable
difference RBD and split-plot design
- RBD: 1 level of blocks, fixed effects
- split-plot: 2 levels of (randomized) blocks (whole plots and subplots), mixed effects
fixed and mixed designs
- fixed effects
1. one way ANOVA
2. two way ANOVA
3. randomized block design
4. repeated measures block design
- mixed effects
1. crossover design (paired)
2. split-plot design (independent)
contingency tables
- count of units in cross categories
- test statistic: sum over all cells of (observed - expected)^2 / expected
- always right sided (1-pchisq())
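A sketch of the statistic for an r x c table, with expected counts taken as row total * column total / N under independence. Only the statistic and its degrees of freedom are computed. The right-sided p-value is left out because the Python standard library has no chi-squared distribution function.

```python
def chi_squared_statistic(table):
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_tot[i] * col_tot[j] / n   # expected count under independence
            stat += (obs - exp) ** 2 / exp
    df = (len(row_tot) - 1) * (len(col_tot) - 1)
    return stat, df

print(chi_squared_statistic([[10, 20], [20, 10]]))
```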
fisher’s exact test
- for 2x2 tables
- odds ratio is used
simple linear regression
comparable to pearson’s correlation test
- will give exactly the same t-score and p-value
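A small numerical check of this equivalence: the t-score of the slope (estimate divided by its standard error) matches r * sqrt(n-2) / sqrt(1-r^2) from the correlation test. The data are made up for illustration.

```python
import math

def slope_and_correlation_t(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    b1 = sxy / sxx                      # least-squares slope
    b0 = my - b1 * mx                   # least-squares intercept
    rss = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    se_b1 = math.sqrt(rss / (n - 2) / sxx)
    t_slope = b1 / se_b1
    r = sxy / math.sqrt(sxx * syy)      # Pearson correlation
    t_corr = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
    return t_slope, t_corr

print(slope_and_correlation_t([1, 2, 3, 4, 5, 6], [2.0, 2.9, 5.1, 4.8, 7.2, 6.9]))
```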
multiple linear regression
- multiple explanatory variables
- to find the best parameters, we minimize the sum of squared differences (SSE)
global model fit
- sigma hat: residual standard error (sigma hat squared is the estimated error variance)
- R^2: proportion of explained variance compared to base model Y = B0 + e
- F-statistic and overall p-value
- all of these are found at the bottom of the output
coefficients in multiple linear regression
- not all variables have explanatory power
- we need to find the relevant ones by testing for individual coefficients
- these are found in the individual rows of the output
step up and step down method
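step down: start from the full model; repeatedly remove the variable with the highest p-value, as long as it is nonsignificant
step up: start from the base model; repeatedly add the variable that yields the maximum increase in R^2, as long as it is significant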
preferred linear model has
- least variables
- highest R^2 (or only slight decrease)
- interpretability
confidence interval
- for the population mean of Y at the new X value
prediction interval
- for individual observation of Ynew
- wider interval than the CI, as the error of the individual observation is taken into account
model assumption linear regression
- linearity of the relationship
- normality of the errors
outlier
extremely low or high observation on the response variable
leverage point
extremely low or high observation on the explanatory variable
effect of leverage point
- can be studied by testing model fit with and without the leverage point
- if parameters change drastically by deleting this point, it’s called an influence point
- cook’s distance quantifies the influence of an observation on predictions (a value > 1 indicates an influence point)
mean shift outlier model
- dummy vector with all 0s but 1 at outlier index
- include as variable in the model
- if variable is significant, the outlier is significant
collinearity
- linear relations between explanatory variables, meaning they explain the same variation in the response
- shows up as a straight line in the scatterplot of two explanatory variables
- reflected in large variances and large CIs –> unreliable estimates
how to investigate collinearity
- pairwise linear correlations
- VIF (variance inflation factor): > 5 = concern
ANCOVA
- extends ANOVA by including one or more variables that are expected to influence the dependent variable, but are not of primary interest
- adjusts the DV for the covariates by holding them constant
- variable not of interest is continuous (unlike RBD)
- the only relevant p-value is for the variable of interest
summary() parameter estimates
gives coefficient estimates as difference between ai and a1
gives us p-values, t-statistics, etc
interaction between relevant factor and irrelevant variable (ANCOVA)
- H0: B1 = … = BI (the same slope in all I groups)
- parallel lines = no interaction
- modeled with B_i instead of gamma
- look at the interaction p-value in the output; the other values should be calculated separately
order of factors
- does not matter in balanced ANOVA
- matters in unbalanced ANOVA
- matters in ANCOVA (always)
- matters in logistic regression (always)
family wise error rate
- probability of making at least one Type I error (false positive) when multiple comparisons are being tested
- to keep FWER <= 0.05, we use the bonferroni correction (alpha_ind = 0.05/m, with m the number of tests)
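A minimal sketch of the Bonferroni correction: each of the m tests is run at level alpha/m, which is the same as comparing the adjusted p-values min(1, m*p) with alpha. The p-values below are made up.

```python
def bonferroni(p_values, alpha=0.05):
    m = len(p_values)
    alpha_ind = alpha / m                            # level for each individual test
    reject = [p <= alpha_ind for p in p_values]
    adjusted = [min(1.0, m * p) for p in p_values]   # adjusted p-values
    return alpha_ind, reject, adjusted

print(bonferroni([0.01, 0.02, 0.04, 0.30]))
```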
multiple testing arises when
- there are many parameters of interest
- investigating all differences between factors of a set of effects in ANOVA
simultaneous testing
- usually everything is compared to B1 or a1. this is not simultaneous testing
- tukey etc. show adjusted p-values for simultaneous testing of all Bs
logistic regression
- binary outcome
- linear model for the log odds
- probability of success
log odds
- log odds = log(p(success)/p(failure)) = model
- odds = e^model
a change delta in the linear predictor
multiplies the odds by e^delta
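A quick numerical illustration, writing eta for the value of the linear predictor (the symbol is chosen here, not taken from the notes): the odds are e^eta, the probability of success is 1/(1 + e^(-eta)), and raising eta by delta multiplies the odds by e^delta.

```python
import math

def odds(eta):
    return math.exp(eta)

def probability(eta):
    # inverse of the log odds: p = odds / (1 + odds)
    return 1 / (1 + math.exp(-eta))

eta, delta = 0.3, 0.7
print(odds(eta + delta) / odds(eta), math.exp(delta))  # the two numbers agree
print(probability(eta), probability(eta + delta))
```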
linear predictor
coefficient or additive model
poisson regression: lambda
- if Y ~ poisson(lambda), then E(Y) = var(Y) = lambda
- the larger the lambda parameter, the larger the values of Y on average, and the larger the spread in the values of Y
- for very large values of lambda, the poisson distribution is approximately normal
lambda is modelled as
- log(lambda) = model
- lambda = e^model
- QQplot is not useful here
survival analysis
- analysis of lifetimes
- survival function: probability of survival until time t
hazard function
- rate of dying within a short interval
- how likely the event is to happen at a particular moment in time
censoring
- incomplete observation of the survival time of a unit
- di = 1 if Ti <= Ci (event observed); di = 0 means censored: the event has not happened yet at time Ci
Kaplan-Meier estimator of the survival function
- only categorical IVs
- survival probabilities for specific times
Nelson Aalen estimator of the cumulative hazard function
- step function increases only at times where events occur
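A sketch of both estimators for a single group, assuming the standard product and sum forms: at each event time with d events among n units still at risk, the Kaplan-Meier estimate is multiplied by (1 - d/n) and the Nelson-Aalen estimate grows by d/n. The function name and data are illustrative.

```python
def km_and_na(times, events):
    """events[i] = 1 if the event was observed at times[i], 0 if censored."""
    s, h = 1.0, 0.0
    out = []
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        at_risk = sum(ti >= t for ti in times)
        deaths = sum(ti == t and e == 1 for ti, e in zip(times, events))
        s *= 1 - deaths / at_risk   # Kaplan-Meier survival estimate
        h += deaths / at_risk       # Nelson-Aalen cumulative hazard estimate
        out.append((t, s, h))
    return out

# the unit with time 2 is censored, so nothing changes at t = 2
print(km_and_na([1, 2, 3, 4], [1, 0, 1, 1]))
```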
log rank test
- tests whether 2+ survival curves are identical
- can only deal with grouped data
proportional hazards model
- unlike KM model, can take many kinds of predictors
- main feature: coefficients can be estimated by maximizing the partial likelihood
treatment parametrization
- 1 group is a reference group
- ai are expressed as difference between a1 and ai
- can be set with ‘contrasts’ command
sum parametrization
- ai are expressed as deviations from the mean
- the ai sum to 0 (their average is 0)
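A small sketch contrasting the two parametrizations for a one-way layout, starting from the group means and assuming a balanced design, so the overall mean is the plain average of the group means. The function names are illustrative.

```python
def treatment_parametrization(group_means):
    mu = group_means[0]                          # group 1 is the reference group
    return mu, [m - mu for m in group_means]     # a1 = 0, ai = difference with group 1

def sum_parametrization(group_means):
    mu = sum(group_means) / len(group_means)     # overall mean
    return mu, [m - mu for m in group_means]     # ai = deviation from the mean, sum is 0

print(treatment_parametrization([2, 3, 7]))
print(sum_parametrization([2, 3, 7]))
```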