Sampling Flashcards
Recall:
(i) sampling is necessitated by
the desire to use sample values to estimate population parameters
(ii) Parameters are
summary measures that describe the population
(iii) Statistics are
summary measures computed from a sample, used to estimate parameters
Definitions
Estimation: Estimation is the process of
using sample statistics to make estimates of the population parameter
Estimator: A statistic which is used to
estimate a population parameter is an estimator. To this end, x̄ is an estimator of X̄ and, in the same vein, X̄ is an estimator of µ
Where
x̄ = sample mean, X̄ = finite population mean and µ = population mean (infinite population)
Estimate: The numerical value of
the estimator
If x̄ is used to estimate X̄ and x̄ = 45,
then x̄ is the estimator and 45 is the estimate
Types of Estimates
There are basically two types of estimates: point estimates and interval estimates.
(i) Point Estimates: A point estimate is a single value used to estimate the parameter under consideration. For example, an education inspector wishes to know the average performance of Accounting students in CBS 211. He collects the scores of 5 students from the class of 75 as follows: 90, 55, 70, 50, 45
x̄ = (90 + 55 + 70 + 50 + 45)/5 = 310/5 = 62
Here, x̄ = 62 is a point estimate because it is a single value
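As a quick illustration, a minimal Python sketch of this point estimate, using the five scores from the example above and only the standard-library statistics module:

```python
import statistics

# Scores of the 5 sampled students from the class of 75 (example above)
scores = [90, 55, 70, 50, 45]

# The sample mean is the point estimate of the class average
x_bar = statistics.mean(scores)
print(x_bar)  # -> 62, a single value, hence a point estimate
```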
(ii) Interval Estimates: An estimate is called an interval estimate if it lies within a range. Thus, if the value of x̄ lies within a range, say between 59 and 65 (59 ≤ x̄ ≤ 65), the estimate is called an interval estimate because its value lies between two points
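For contrast, a hedged Python sketch of one common way such an interval can be constructed: a normal-theory 95% t-interval on the same five scores. The 59–65 range above is purely illustrative and does not come from this construction, and the t critical value 2.776 (for 4 degrees of freedom) is an assumption of the sketch:

```python
import math
import statistics

# Five sampled scores from the point-estimate example
scores = [90, 55, 70, 50, 45]
n = len(scores)

x_bar = statistics.mean(scores)   # point estimate: 62
s = statistics.stdev(scores)      # sample standard deviation (n - 1 divisor)

# t critical value for a 95% interval with n - 1 = 4 degrees of freedom
t_crit = 2.776
margin = t_crit * s / math.sqrt(n)

print(f"95% interval estimate: {x_bar - margin:.1f} <= mu <= {x_bar + margin:.1f}")
```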
A pertinent concern to statisticians is finding an estimator that can be used to
obtain the best estimate of the population parameter.
Addressing this concern requires examining the properties of estimators. There are four main properties of estimators. They are:
Unbiasedness, consistency, sufficiency and efficiency
Unbiasedness: An unbiased estimator is one whose expected
value equals that of the population parameter. Thus, a statistic s is an unbiased estimator of the population parameter µ if E(s) = µ, i.e. the average value of the statistic, taken over all possible samples of a specified size, equals the population parameter.
e.g. the sample mean x̄ is an unbiased estimator of the population mean µ.
Proof:
x̄ = (1/n) ∑_{i=1}^{n} x_i and µ = (1/N) ∑_{i=1}^{N} X_i
E(x̄) = E[(1/n) ∑_{i=1}^{n} x_i] = (1/n) E(∑_{i=1}^{n} x_i) . . . (i)
Rewriting the sample sum over the whole population: = (1/n) E(∑_{i=1}^{N} a_i X_i), where a_i = 1 if X_i is a member of the sample
and a_i = 0 otherwise
Under simple random sampling, P(X_i is a member of the sample) = n/N, so E(a_i) = n/N
Therefore, E(x̄) = (1/n) ∑_{i=1}^{N} E(a_i) X_i = (1/n) × (n/N) × ∑_{i=1}^{N} X_i = (1/N) ∑_{i=1}^{N} X_i = µ
QED
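The identity E(x̄) = µ can also be checked numerically. A minimal Python sketch, assuming a made-up finite population of N = 75 scores (the population values are illustrative, not from the text):

```python
import random
import statistics

# Hypothetical finite population of N = 75 scores (illustrative values only)
random.seed(1)
population = [random.randint(40, 95) for _ in range(75)]
mu = statistics.mean(population)  # population mean

# Average the sample mean over many simple random samples of size n = 5:
# by unbiasedness, this average should settle near mu
n, reps = 5, 100_000
avg_of_means = statistics.mean(
    statistics.mean(random.sample(population, n)) for _ in range(reps)
)
print(f"mu = {mu:.3f}, average of sample means = {avg_of_means:.3f}")
```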
Also, the sample variance s² = (1/(n − 1)) ∑_{i=1}^{n} (x_i − x̄)² is an unbiased estimate of the finite population variance S² = (1/(N − 1)) ∑_{i=1}^{N} (X_i − X̄)² . . . (iii)
Exercise: show that (iii) is true.
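The claim in (iii) can likewise be checked by simulation before proving it. A sketch under the same made-up population assumption as above; note that Python's statistics.variance uses the n − 1 divisor:

```python
import random
import statistics

# Same kind of hypothetical finite population as in the previous sketch
random.seed(2)
population = [random.randint(40, 95) for _ in range(75)]

# Finite-population variance with the N - 1 divisor, as in (iii)
S2 = statistics.variance(population)

# Estimate E(s^2) by averaging s^2 over many simple random samples
n, reps = 5, 100_000
avg_s2 = statistics.mean(
    statistics.variance(random.sample(population, n)) for _ in range(reps)
)
print(f"S^2 = {S2:.2f}, average of s^2 = {avg_s2:.2f}")  # the two should be close
```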
Consistency: An estimator is said to be a consistent estimator of the population parameter if the
probability that its estimate will approach the true value of the population parameter increases as the sample size increases. Thus, the statistic S_n is a consistent estimator of the population parameter θ if, for any small positive value ε, lim_{n→∞} Pr(|S_n − θ| < ε) = 1
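A quick numerical illustration of this definition: a minimal Python sketch, assuming a Uniform(0, 1) population (so µ = 0.5) and an arbitrary ε = 0.05, neither of which comes from the text:

```python
import random
import statistics

# Consistency of the sample mean: Pr(|x_bar - mu| < eps) rises toward 1 as n grows.
# The Uniform(0, 1) population and the value of eps are illustrative assumptions.
random.seed(3)
mu, eps, reps = 0.5, 0.05, 5_000

for n in (10, 100, 1000):
    hits = sum(
        abs(statistics.mean(random.random() for _ in range(n)) - mu) < eps
        for _ in range(reps)
    )
    print(f"n = {n:4d}: Pr(|x_bar - mu| < {eps}) ~ {hits / reps:.3f}")
```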
Efficiency: If we have two estimators which have little or no bias, it is reasonable to
prefer the estimator that has the smaller variance for the given sample size because its outcomes will tend to lie closer to the population parameter. This leads us to the property of relative efficiency.
If two estimators S_1 and S_2 based on the same sample size are unbiased, the one with the smaller variance is said to have greater efficiency than the other. That is, if
Var(S_1) < Var(S_2) and E(S_1) = E(S_2) = θ, then S_1 is relatively more efficient than S_2 in estimating θ.
e.g. for a random sample from a normal population, both the sample mean x̄ and the sample median m_d are estimators of the population mean µ. However, it can be shown that:
σ²(m_d) ≈ 1.57 σ²/n (where 1.57 ≈ π/2), while σ²(x̄) = σ²/n, for random sampling from a normal population when the sample size n is large. Hence, for a given n, σ²(x̄) < σ²(m_d). Consequently, x̄ is relatively more efficient than m_d as an estimator of µ.
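The 1.57 factor can be reproduced by simulation. A minimal Python sketch, assuming a standard normal population and an arbitrary odd sample size, both chosen only for illustration:

```python
import random
import statistics

# Relative efficiency of mean vs. median for normal data: the simulated
# variance ratio should land near 1.57 (pi/2). Parameters are illustrative.
random.seed(4)
n, reps = 101, 20_000  # odd n so the median is a single order statistic

means, medians = [], []
for _ in range(reps):
    sample = [random.gauss(0, 1) for _ in range(n)]
    means.append(statistics.mean(sample))
    medians.append(statistics.median(sample))

ratio = statistics.variance(medians) / statistics.variance(means)
print(f"var(median) / var(mean) ~ {ratio:.2f}")  # expect ~1.57 for large n
```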
Sufficiency
A sufficient estimator is one that
gives as much information as possible about the population parameter from the sample. It is a statistic that captures all information in the data that is relevant to the population parameter
E.g. it can be shown that for SRS from a normal population, the sample mean x̄ is a sufficient estimator of the population mean µ because it utilizes all the information about µ (= (1/N) ∑_{i=1}^{N} X_i).
Thus, once the sample mean is known, any other statistic computed from the sample data, such as the sample median or the sample mid-range, will provide no further information about µ.
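One way to make this precise, not spelled out in the text, is the Fisher–Neyman factorization criterion. A sketch in LaTeX for a normal population with known variance σ²: the joint density factors into a part g that depends on µ only through x̄ and a part h that does not involve µ at all.

```latex
% Fisher-Neyman factorization sketch: the normal likelihood depends on mu
% only through the sample mean, so x-bar is sufficient for mu (sigma known).
\begin{align*}
f(x_1,\dots,x_n;\mu)
  &= \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma}
     \exp\!\Big(-\frac{(x_i-\mu)^2}{2\sigma^2}\Big) \\
  &= \underbrace{\exp\!\Big(-\frac{n(\bar{x}-\mu)^2}{2\sigma^2}\Big)}_{g(\bar{x};\,\mu)}
     \cdot
     \underbrace{(2\pi\sigma^2)^{-n/2}
     \exp\!\Big(-\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{2\sigma^2}\Big)}_{h(x_1,\dots,x_n)}
\end{align*}
```

The split uses the identity ∑(x_i − µ)² = ∑(x_i − x̄)² + n(x̄ − µ)².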