Two Sample Problems & Bivariate Distributions Flashcards
What is the estimator for the difference in population means?
The difference in sample means, $\bar{x}_a - \bar{x}_b$. For sufficiently large independent samples, its sampling distribution is approximately normal, with mean equal to the true difference $\mu_a - \mu_b$ and variance $\sigma_a^2/n_a + \sigma_b^2/n_b$
What are the null and alternate hypotheses for a two sample hypothesis test?
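Null: $H_0: \mu_a = \mu_b$ (equivalently $\mu_a - \mu_b = 0$). Alternative: $H_1: \mu_a - \mu_b \neq 0$ (two-sided), or $\mu_a - \mu_b > 0$ or $\mu_a - \mu_b < 0$ (one-sided)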
What is the test statistic for a two sample problem if both samples have n > 20?
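The difference in sample means divided by its estimated standard error, where each population variance is replaced by the corresponding sample variance: $z = (\bar{x}_a - \bar{x}_b) / \sqrt{s_a^2/n_a + s_b^2/n_b}$. Under the null hypothesis, $z$ is approximately standard normal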
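As a rough illustration of the test statistic, the sketch below computes $z$ and a two-sided p-value using only the Python standard library. The function name `two_sample_z` and the simulated samples are assumptions made for this example, not part of the flashcards.

```python
import math
import random
from statistics import NormalDist, mean, variance


def two_sample_z(a, b):
    """Return (z, two-sided p-value) for H0: mu_a == mu_b.

    Assumes two independent samples that are large enough for the
    normal approximation to be reasonable.
    """
    diff = mean(a) - mean(b)
    # statistics.variance is the sample variance (divides by n - 1).
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = diff / se
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided


# Simulated data, for illustration only.
rng = random.Random(0)
sample_a = [rng.gauss(10.0, 2.0) for _ in range(50)]
sample_b = [rng.gauss(9.0, 2.0) for _ in range(60)]
z, p = two_sample_z(sample_a, sample_b)
print(f"z = {z:.3f}, two-sided p = {p:.4f}")
```

For a one-sided alternative, the p-value would instead be a single tail probability of the standard normal.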
What is the confidence interval for the effect of a change?
The difference in sample means ± $z_{1-\alpha/2}$ times the standard error of the estimator, for a two-sided $100(1-\alpha)\%$ interval
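A minimal sketch of this interval, assuming a two-sided interval (so the critical value is the $1 - \alpha/2$ quantile of the standard normal) and the same standard error as in the test statistic. The name `diff_means_ci` and the example data are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def diff_means_ci(a, b, alpha=0.05):
    """Approximate two-sided 100(1 - alpha)% CI for mu_a - mu_b."""
    diff = mean(a) - mean(b)
    # Estimated standard error of the difference in sample means.
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    # Critical value: the 1 - alpha/2 quantile of the standard normal.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return diff - z_crit * se, diff + z_crit * se


# Made-up measurements before and after a change, for illustration only.
before = [12.1, 11.8, 12.5, 12.0, 11.9, 12.3, 12.2, 11.7]
after = [12.6, 12.4, 12.9, 12.7, 12.2, 12.8, 12.5, 12.3]
low, high = diff_means_ci(after, before)
print(f"95% CI for the effect: ({low:.3f}, {high:.3f})")
```

The example samples are smaller than the flashcards' large-sample condition, so the printed interval only demonstrates the mechanics.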
What is bootstrapping?
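Repeatedly resampling the observed dataset with replacement (each resample the same size as the original) and recomputing the statistic on every resample, so that the spread of the recomputed values approximates the statistic's sampling distribution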
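The resampling idea can be sketched as follows for the difference in sample means. This is one possible version (a percentile bootstrap), and the function names, the number of resamples, and the data are assumptions chosen for illustration.

```python
import random
from statistics import mean, quantiles


def bootstrap_diff_means(a, b, n_boot=2000, seed=0):
    """Return n_boot bootstrap replicates of mean(a) - mean(b)."""
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_boot):
        # Draw with replacement, keeping each resample the original size.
        resample_a = rng.choices(a, k=len(a))
        resample_b = rng.choices(b, k=len(b))
        replicates.append(mean(resample_a) - mean(resample_b))
    return replicates


def percentile_interval_95(replicates):
    """Approximate 95% percentile interval from bootstrap replicates."""
    # n=40 gives cut points at 2.5%, 5%, ..., 97.5%.
    cuts = quantiles(replicates, n=40)
    return cuts[0], cuts[-1]


# Made-up data, for illustration only.
group_a = [5.1, 4.8, 6.0, 5.5, 5.9, 4.7, 5.3, 5.6]
group_b = [4.2, 4.9, 4.4, 5.0, 4.1, 4.6, 4.8, 4.3]
reps = bootstrap_diff_means(group_a, group_b)
print(percentile_interval_95(reps))
```

Fixing the seed makes the resampling reproducible; in practice the seed and the number of resamples are choices left to the analyst.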
What is the joint (bivariate) distribution of 2 random variables X and Y?
f(Y, X) = f1(Y|X) f2(X) where f1 is the conditional distribution and f2 is the marginal distribution
What is the conditional expectation of discrete Y given X = Xj?
$\mu_{Y|X_j} = E(Y \mid X = X_j) = \sum_k Y_k \, p_{k|j}$
$p_{k|j}$ is the conditional probability of $Y = Y_k$ given $X = X_j$
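To make the conditional expectation concrete, the sketch below starts from a small joint probability table, divides by the marginal probability of $X = X_j$ to get the conditional probabilities, and then takes the weighted sum over the values of $Y$. The table and the name `conditional_expectation` are invented for this example.

```python
def conditional_expectation(joint, x_j):
    """E(Y | X = x_j) for a discrete joint pmf.

    `joint` maps (x, y) pairs to P(X = x, Y = y).
    """
    # Marginal probability of X = x_j: sum the joint pmf over y.
    p_x = sum(p for (x, _), p in joint.items() if x == x_j)
    if p_x == 0:
        raise ValueError("P(X = x_j) is zero; conditional expectation undefined")
    # p / p_x is the conditional probability of Y = y given X = x_j.
    return sum(y * p / p_x for (x, y), p in joint.items() if x == x_j)


# Hypothetical joint distribution of (X, Y); probabilities sum to 1.
joint = {
    (0, 0): 0.1,
    (0, 1): 0.3,
    (1, 0): 0.4,
    (1, 1): 0.2,
}
print(conditional_expectation(joint, 0))
print(conditional_expectation(joint, 1))
```

Here the conditional mean of Y differs between the two values of X, which is what a dependence between X and Y looks like in this summary.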
What can be used to characterise a linear relationship between X and Y?
The covariance cov(X, Y) = E((X - E(X))(Y - E(Y)))
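The sign of the covariance indicates whether the linear trend is positive, negative, or absent (positive covariance means X and Y tend to be above their means together and below their means together; negative covariance means one tends to be above its mean when the other is below)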
What is the expectation of some function h of random variables X and Y?
$E(h(X, Y)) = \sum_x \sum_y h(x, y) \, f(x, y)$
For continuous x, y, replace Σ with ∫
Covariance is a specific case of this formula with h(X, Y) = (X - E(X))(Y - E(Y))
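A small sketch of the discrete case: a generic expectation over a joint probability table, with covariance obtained by plugging in $h(X, Y) = (X - E(X))(Y - E(Y))$. The table and function names are assumptions made for illustration.

```python
def expectation(joint, h):
    """E(h(X, Y)) for a discrete joint pmf mapping (x, y) -> probability."""
    return sum(h(x, y) * p for (x, y), p in joint.items())


def covariance(joint):
    """cov(X, Y) as the expectation of (X - E(X)) * (Y - E(Y))."""
    mean_x = expectation(joint, lambda x, y: x)
    mean_y = expectation(joint, lambda x, y: y)
    return expectation(joint, lambda x, y: (x - mean_x) * (y - mean_y))


# Hypothetical joint distribution of (X, Y); probabilities sum to 1.
joint = {
    (0, 0): 0.1,
    (0, 1): 0.3,
    (1, 0): 0.4,
    (1, 1): 0.2,
}
print(expectation(joint, lambda x, y: x * y))
print(covariance(joint))
```

For this table the covariance is negative (about -0.1, up to floating-point rounding), reflecting that larger X goes with smaller Y on average.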
What is the correlation coefficient?
A scale-free measure of the linear relationship between two variables, taking values between -1 and 1
ρ = cov(X, Y)/sqrt(V(X)V(Y))
What is the sample analogue of covariance?
$S_{XY} = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$
What is the sample analogue of the correlation coefficient?
$r = S_{XY} / (S_X S_Y)$, where $S_X$ and $S_Y$ are the sample standard deviations of x and y
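The two sample formulas can be checked numerically with a short sketch. The function names and the paired data are made up for illustration; the sample standard deviations are obtained as the square roots of the sample covariance of each variable with itself.

```python
import math
from statistics import mean


def sample_cov(xs, ys):
    """Sample covariance with the 1/(n - 1) divisor."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two paired samples with at least 2 points")
    x_bar, y_bar = mean(xs), mean(ys)
    total = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return total / (len(xs) - 1)


def sample_corr(xs, ys):
    """Sample correlation: covariance over the product of sample std devs."""
    s_x = math.sqrt(sample_cov(xs, xs))  # sample standard deviation of x
    s_y = math.sqrt(sample_cov(ys, ys))  # sample standard deviation of y
    return sample_cov(xs, ys) / (s_x * s_y)


# Made-up paired observations, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
print(sample_cov(xs, ys), sample_corr(xs, ys))
```

Rescaling either variable (for example, changing units) changes the sample covariance but leaves r unchanged, which is the scale-free property in practice.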