Applied Economics & Statistics: Topic 5 - Estimation and Confidence Intervals Flashcards
Why do we estimate?
- We usually don’t have access to data for the whole population.
- Inferential Statistics estimates population parameters from sample statistics.
Example: Determine the mean income of UK residents.
We don’t have access to everyone’s income values, and it’s too costly
and time consuming to collect this information.
Instead, take a random representative sample of UK residents (using
an appropriate sampling technique).
Use an estimator, the sample mean, to estimate the population value
for mean income: a point estimate
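A minimal sketch of the idea, assuming a simulated population (the income figures below are generated, not real UK data, and the distribution parameters are hypothetical). Because the population is simulated, we can compare the point estimate with the true mean, which is not possible in practice.

```python
import random
import statistics

# Hypothetical population: 100,000 simulated incomes (not real UK data).
rng = random.Random(0)
population = [rng.lognormvariate(10.2, 0.6) for _ in range(100_000)]

# In practice the whole population is not observed; here it is simulated,
# so the estimate can be compared with the true value.
population_mean = statistics.fmean(population)

# Simple random sample of 1,000 residents, then the point estimate.
sample = rng.sample(population, 1_000)
point_estimate = statistics.fmean(sample)

print(f"population mean: {population_mean:.0f}")
print(f"sample mean:     {point_estimate:.0f}")
```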
Describe what makes a good estimator in statistics
- We want our estimator to be accurate.
- If we don’t know the population parameter, how do we know the
estimator is close to it?
- The estimator must have “good properties.”
Specifically, it must be unbiased and efficient.
When is an estimator biased?
An estimator is biased if the mean of its sampling distribution is not
equal to the population value of interest.
Explain whether the ‘sample mean’ is biased or not
The sample mean is an unbiased estimator of the population mean: its
expected value equals μ, so its sampling distribution is centred on the
population mean. (The Central Limit Theorem adds that this sampling
distribution is approximately normal for large samples.)
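One way to see unbiasedness is a small simulation: draw many samples, compute the sample mean of each, and check that the average of those sample means sits close to μ. The values of μ, σ and n below are hypothetical, chosen only for illustration.

```python
import random
import statistics

rng = random.Random(1)
mu, sigma, n = 50.0, 10.0, 25   # hypothetical population parameters and sample size

# Draw many samples and record the sample mean of each one.
sample_means = [
    statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
    for _ in range(5_000)
]

# The average of the sample means should sit very close to mu.
mean_of_means = statistics.fmean(sample_means)
print(f"mean of sample means: {mean_of_means:.3f} (mu = {mu})")
```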
When choosing between multiple unbiased estimators, which one is preferred?
The one with the least variance (the more efficient one) is preferred.
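As an illustration (not part of the original notes), for a normal population both the sample mean and the sample median are unbiased estimators of the centre, but the mean varies less from sample to sample. The sketch below compares them by simulation; the population parameters are hypothetical.

```python
import random
import statistics

rng = random.Random(2)
mu, sigma, n = 0.0, 1.0, 25   # hypothetical normal population

means, medians = [], []
for _ in range(5_000):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    means.append(statistics.fmean(sample))
    medians.append(statistics.median(sample))

# Variance of each estimator across the simulated samples.
var_mean = statistics.pvariance(means)
var_median = statistics.pvariance(medians)
print(f"variance of sample mean:   {var_mean:.4f}")
print(f"variance of sample median: {var_median:.4f}")
```

In this setting the sample mean is the more efficient of the two, so it would be the preferred estimator.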
Describe how we can tell when an estimator has a high efficiency
An estimator with high efficiency will have a low standard deviation in
its sampling distribution, so its estimates are tightly clustered.
What does it mean when an estimator has low standard deviation?
It has a high efficiency
Describe how the efficiency of estimators can be improved
- We can improve the efficiency of estimators by increasing the size of
the sample used.
- Recall the standard deviation of the sample mean is σ/√n.
- So as n increases, the standard deviation gets smaller.
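The effect of n on σ/√n can be tabulated directly. A short sketch, assuming a hypothetical population standard deviation of 10:

```python
import math

sigma = 10.0  # hypothetical population standard deviation

# Standard deviation of the sample mean, sigma/sqrt(n), for growing n.
standard_errors = {n: sigma / math.sqrt(n) for n in (25, 100, 400, 1600)}

for n, se in standard_errors.items():
    print(f"n = {n:>4}: sigma/sqrt(n) = {se:.2f}")
```

Each fourfold increase in the sample size halves the standard deviation of the sample mean.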
What does increasing sample size do to an estimator?
It increases its efficiency
What does an unbiased and efficient estimator do?
It gives us the best chance of
getting an estimate close to the true value
Which estimator gives us the best chance of
getting an estimate close to the true value?
Unbiased and efficient estimator
What does a biased but efficient estimator do?
It would provide estimates that are
clustered but around the wrong value
Which estimator would provide estimates that are
clustered but around the wrong value?
A biased but efficient estimator
What does a biased and inefficient estimator do?
We wouldn’t even get close on
average, let alone with a single sample.
What type of estimator wouldn’t even get close on
average, let alone with a single sample?
A biased and inefficient estimator
In practice, describe which estimators we use
In practice, we might be willing to use an estimator that is biased but the most efficient, if its bias is small. But of course we always prefer
to use the most efficient (minimum variance) and unbiased (on average correct) estimator
What are ‘estimates’?
Values derived from a sample and used to approximate
population values
What’s a ‘point estimate’?
The statistic, computed from sample information, that estimates a
population parameter.
Example: the maximum temperature tomorrow will be 15°C. We can think
of this as our “best guess” as to what tomorrow’s temperature will be.
What’s the name for ‘the statistic, computed from sample information, that estimates a
population parameter’?
Point Estimate
What are ‘confidence intervals’?
A range of values constructed from sample data so that the population parameter is likely to occur within that range at a specified probability. The specified probability is called the confidence level.
Example: the maximum temperature tomorrow will be between 13°C and 17°C.
What’s the name for ‘a range of values constructed from sample data so that the population parameter is likely to occur within that range at a specified probability’?
Confidence Intervals
What are confidence intervals often stated as?
- Confidence intervals are often stated as CI = point estimate ± margin of
error:
1. We find a point estimate by using the sample mean, x̄, as we have
already seen in previous topics.
2. But we also state this as a CI for which we provide a range of
values and a measure of how certain we are that this range contains
the true population mean value.
Example: “At a confidence level of 95%, between 52% and
58% of Americans favour the death penalty for people convicted of
murder.” i.e. “Support for the death penalty is currently 55% (±3%).”
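The arithmetic behind “point estimate ± margin of error” for this example can be written out as a short sketch (the variable names are illustrative):

```python
point_estimate = 0.55    # 55% support, from the example above
margin_of_error = 0.03   # plus or minus 3 percentage points

# CI = point estimate ± margin of error
lower = point_estimate - margin_of_error
upper = point_estimate + margin_of_error
print(f"95% CI: {lower:.0%} to {upper:.0%}")
```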
What are the 2 possible situations for calculating confidence intervals for a population mean?
- We use sample data to estimate μ with x̄ and the population standard deviation σ is known.
- We use sample data to estimate μ with x̄ and the population standard deviation σ is unknown. In this case, we use the sample standard deviation s.
Describe & explain how we find CIs when σ is known
- The distribution of the sample mean is:
x̄ ∼ N(μ, σ²/n),
and we know that
z = (x̄ − μ) / (σ/√n) ∼ N(0, 1).
- We use this information to determine probabilities associated with x̄
taking values in certain ranges, and from that build CIs.
- We often want to find a 95% confidence interval, i.e. one in which we are
95% confident the true population mean lies.
Between which values of z does 95% of the area under the standard
normal distribution curve lie? You can read this from a z table and you
should find that the value is 1.96. So, 95% of the area lies in the interval
(−1.96, +1.96), or:…
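As a check on the 1.96 value, the critical z can be computed instead of read from a table. The sketch below also builds a 95% interval as x̄ ± z·σ/√n, which follows from rearranging the z expression for μ; that step is not shown in the notes above, so treat it as an assumption of this sketch. The sample figures are hypothetical.

```python
import math
from statistics import NormalDist

# Critical value leaving 2.5% in each tail of the standard normal.
z = NormalDist().inv_cdf(0.975)

# Hypothetical sample summary with sigma known: illustrative numbers only.
x_bar, sigma, n = 30_000.0, 8_000.0, 400

# Margin of error and the resulting interval around the sample mean.
margin = z * sigma / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin
print(f"z = {z:.2f}")
print(f"95% CI: ({lower:.0f}, {upper:.0f})")
```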