Statistics & Properties Flashcards
Sample variance formula
s^2 = Sum of (Xi - X bar)^2 / (n - 1)
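The formula above can be sketched in a few lines of Python (the sample `data` is a made-up example):

```python
def sample_variance(data):
    # Sample variance: squared deviations from the mean, divided by n - 1.
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]
print(sample_variance(data))  # 32/7, same as statistics.variance(data)
```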
Skewness formula & what does it measure?
Measure of asymmetry. Skewness = Sum of (Xi - X bar)^3 / (n s^3)
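A stdlib sketch of the card's formula (divisor conventions vary slightly between textbooks; this uses the sample standard deviation s with the n - 1 divisor):

```python
import math

def skewness(data):
    # Sum of cubed deviations divided by n * s^3; sign indicates skew direction.
    n = len(data)
    mean = sum(data) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    return sum((x - mean) ** 3 for x in data) / (n * s ** 3)

print(skewness([1, 2, 3, 4, 100]))  # positive: one large value pulls the right tail out
```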
Is skewness scale invariant?
Yes
Values of skewness for no/positive/negative skews
Skewness = 0: symmetric (e.g. normal distribution)
Skewness > 0: positive / right skew
Skewness < 0: negative / left skew
Relation between mean, median and mode for positive skew
Mode < median < mean
What is Kurtosis & formula
Measure of how much of the distribution lies in its tails. Always non-negative (not signed).
Kurtosis = Sum of (Xi - X bar)^4 / (n s^4)
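The kurtosis formula in the same stdlib style as the skewness sketch (again using the n - 1 divisor for s):

```python
import math

def kurtosis(data):
    # Sum of fourth-power deviations divided by n * s^4.
    # Heavy tails push the value above 3, the normal benchmark.
    n = len(data)
    mean = sum(data) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    return sum((x - mean) ** 4 for x in data) / (n * s ** 4)

print(kurtosis([1, 2, 3, 4, 5]))  # below 3: flat, thin-tailed (platykurtic)
```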
Values of kurtosis
Kurtosis = 3: normal distribution (mesokurtic)
Kurtosis < 3: platykurtic (flat-topped)
Kurtosis > 3: leptokurtic (peaked)
Sum of (Xi - X bar)^2 can be simplified to
Sum of Xi^2 - n (X bar)^2
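A quick numerical check of this identity on made-up data:

```python
# Verify: Sum of (Xi - X bar)^2 == Sum of Xi^2 - n * (X bar)^2
data = [3.0, 7.0, 8.0, 12.0]
n = len(data)
xbar = sum(data) / n
lhs = sum((x - xbar) ** 2 for x in data)
rhs = sum(x ** 2 for x in data) - n * xbar ** 2
print(abs(lhs - rhs) < 1e-9)  # True
```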
Is covariance scale invariant?
NO - Cov(2X, Y) = 2Cov(X, Y)
Formula for correlation in terms of other measures
rXY = Cov(X, Y) / sqrt(V(X) V(Y))
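A stdlib sketch of correlation built from covariance and the two variances (the n - 1 divisors cancel, so the convention does not matter as long as it is consistent):

```python
import math

def correlation(xs, ys):
    # rXY = Cov(X, Y) / sqrt(V(X) * V(Y))
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    cov = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / (n - 1)
    vx = sum((x - xbar) ** 2 for x in xs) / (n - 1)
    vy = sum((y - ybar) ** 2 for y in ys) / (n - 1)
    return cov / math.sqrt(vx * vy)

print(correlation([1, 2, 3], [2, 4, 6]))  # 1.0: perfect linear relation
```

Note that rescaling X leaves the result unchanged (scale invariance), unlike covariance.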
Is correlation scale invariant?
YES
What values does rxy lie between?
-1 and +1
Define estimator
A random variable that is a function of the data
Define estimate
The actual value the estimator takes for a given sample, e.g. sample mean = 2
What is an unbiased estimator?
E(Theta hat) = Theta
- the sampling distribution of theta hat is centred on the true parameter value theta
Two examples of unbiased estimators
Sample mean X bar: E(X bar) = mu
Sample variance S^2: E(S^2) = sigma^2
What is an efficient estimator?
Of a set of unbiased estimators, the efficient one is the one with the smallest variance.
Compare the efficiency of one individual vs sample mean
V(X bar) = sigma^2 / n < sigma^2 = V(Xi), so the sample mean is more efficient than any one individual observation.
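A small simulation sketch of this comparison (sample size 25 and the N(0, 1) draws are illustrative choices, not from the card):

```python
import random
import statistics

random.seed(0)

# Spread of single observations vs spread of means of n = 25 observations.
singles = [random.gauss(0, 1) for _ in range(2000)]
means = [statistics.fmean(random.gauss(0, 1) for _ in range(25))
         for _ in range(2000)]

print(statistics.variance(singles))  # near sigma^2 = 1
print(statistics.variance(means))    # near sigma^2 / n = 0.04
```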
What is central limit theorem?
Whatever the distribution of X, provided that sigma^2 is finite, the distribution of the sample mean X bar tends towards a normal distribution as n becomes large. Rule of thumb: n > 25-30.
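A simulation sketch of the CLT in action (the exponential distribution and n = 30 are illustrative choices): the starting distribution is heavily right-skewed, yet the sample means cluster around the true mean of 1.

```python
import random
import statistics

random.seed(1)

# Means of n = 30 draws from Exp(1), a very non-normal distribution.
means = [statistics.fmean(random.expovariate(1.0) for _ in range(30))
         for _ in range(5000)]

print(statistics.fmean(means))  # close to the true mean, 1
```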
When using CLT for discrete distributions, what MUST we remember to do?
Using the continuous normal as an approximation to a discrete distribution requires a continuity correction.
E.g. P(X <= 21) is approximated by P(X <= 21.5), then convert to Z.
P(X > 100) = P(X >= 101), approximated by P(X >= 100.5).
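A sketch of the first correction above, using a hypothetical Binomial(50, 0.4) so that mu = 20 is near the cut-off of 21:

```python
from statistics import NormalDist

# Approximate P(X <= 21) for X ~ Binomial(50, 0.4) by a normal with the
# same mean and variance, evaluated at 21.5 (continuity correction).
n, p = 50, 0.4
mu = n * p                         # 20
sigma = (n * p * (1 - p)) ** 0.5   # sqrt(12)
approx = NormalDist(mu, sigma).cdf(21.5)
print(approx)  # close to the exact binomial P(X <= 21)
```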
What is max likelihood estimation?
A method of estimation that chooses the parameter value under which the observed data are most probable. E.g. having observed 20 heads and 30 tails in 50 trials, we pick the binomial p that maximises the likelihood of that outcome, giving p hat = 20/50 = 0.4.
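The coin example can be checked numerically: a grid search over the (log-)likelihood peaks at the sample proportion.

```python
import math

heads, n = 20, 50

def log_likelihood(p):
    # Binomial log-likelihood up to a constant (the nCk term doesn't depend on p).
    return heads * math.log(p) + (n - heads) * math.log(1 - p)

# Search p over a fine grid in (0, 1); the maximum is at heads / n.
best = max((i / 1000 for i in range(1, 1000)), key=log_likelihood)
print(best)  # 0.4
```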
What is a consistent estimator?
The probability limit of theta hat = theta.
The probability that the difference between theta hat and theta exceeds any allowed error goes to zero as n gets bigger.
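A simulation sketch of consistency for the sample mean (N(0, 1) draws, error tolerance 0.1, and the sample sizes are all illustrative choices): the chance that X bar misses the true mean by more than the tolerance shrinks as n grows.

```python
import random
import statistics

random.seed(2)

def miss_rate(n, eps=0.1, trials=500):
    # Fraction of samples whose mean misses the true mean (0) by more than eps.
    return sum(abs(statistics.fmean(random.gauss(0, 1) for _ in range(n))) > eps
               for _ in range(trials)) / trials

print(miss_rate(10), miss_rate(1000))  # the second is far smaller
```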