Misc econometrics Flashcards
p-value
(In statistical hypothesis testing) The probability of obtaining test results at least as extreme as the results observed, assuming that the null hypothesis is correct.
In other words, the p-value is the smallest significance level at which we would reject H₀ (at any significance level below the p-value, we fail to reject H₀)
In still other words, it is the probability, computed under the distribution implied by H₀, of obtaining a test statistic (e.g. the Z-statistic corresponding to our observed value) at least as extreme as the one we calculated
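A minimal Python sketch of the calculation (the Z value here is hypothetical, chosen only for illustration):

```python
# A minimal sketch (hypothetical numbers): two-sided p-value for a Z-test.
from scipy.stats import norm

z = 2.1                                # hypothetical calculated Z-statistic
p_value = 2 * (1 - norm.cdf(abs(z)))   # P(|Z| >= |z|), assuming H0 is true
print(p_value)                         # ~0.036: reject at the 5% level, fail to reject at 1%
```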
Z-statistic
Z-statistic (or Z-score or Standard score) is a number representing how many standard deviations an observed value (raw score) is away from the mean (of what is being observed).
Raw scores above the mean have positive standard scores, while those below the mean have negative standard scores.
The Z-statistic is distributed normally with mean = 0 and variance = 1 (ie, it has a Standard Normal Distribution)
Z ~ N(0, 1)
So Z = (Observed Sample Value − Assumed Population Mean) / Standard Deviation of the Sampling Distribution
(Note: in hypothesis tests, the observed value is often the mean observed from our sample; in other words, we are testing whether the sample mean is consistent with the assumed population mean. The Z-statistic may also be used to estimate the probability that X could take a value at least as extreme as the observed value, x, given the assumed population mean.)
(Note: Calculating z using this formula requires the population mean and the population standard deviation, not the sample mean or sample standard deviation. But knowing the true mean and standard deviation of a population is often unrealistic except in cases such as standardized testing, where the entire population is measured.
When the population mean and the population standard deviation are unknown, the standard score may be calculated using the sample mean and sample standard deviation as estimates of the population values.)
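A minimal sketch of this formula in Python (all numbers are hypothetical):

```python
# A minimal sketch (hypothetical numbers): z-score of a single raw score.
mu = 100.0     # assumed population mean
sigma = 15.0   # assumed population standard deviation
x = 130.0      # observed raw score

z = (x - mu) / sigma
print(z)       # 2.0: the raw score lies 2 standard deviations above the mean
```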
Z-statistic (conversion)
If X ~ N (μ, σ²) then
Z = (X − μ)/σ ~ N(0, 1)
In other words: for any normally distributed random variable X with mean μ and variance σ² (X ~ N(μ, σ²)), all probabilities can be converted to the Standard Normal Distribution using the Z Normal(0, 1) transformation: Z = (X − μ)/σ
The Z-statistic is distributed normally with mean = 0 and variance = 1 (ie, it has a Standard Normal Distribution)
Therefore the Z Normal transformation, Z = (X − μ)/σ, converts the variable X into a Standard Normal distribution. We can thus use standard normal tables to find relevant probabilities such as P(X ≤ x).
Note: the Z Normal(0, 1) transformation is so called because it transforms the distribution of X into a normal distribution centred on 0 with a variance of 1, by shifting the distribution leftward by μ units (centring the mean on 0) and compressing it horizontally by a factor of σ (stretching, if σ < 1), which sets the variance to 1
Thus, Z ~ N(0, 1)
So, to convert any value of X to its corresponding Z value, subtract the value of the mean and divide by the standard deviation.
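A minimal sketch of the conversion in Python (μ, σ and x are hypothetical); the second call shows the same probability obtained without converting:

```python
# A minimal sketch (hypothetical numbers): P(X <= x) via the Z transformation.
from scipy.stats import norm

mu, sigma, x = 50.0, 10.0, 65.0
z = (x - mu) / sigma                      # convert x to its Z value: 1.5
print(norm.cdf(z))                        # P(Z <= 1.5) ~ 0.9332 (standard normal table)
print(norm.cdf(x, loc=mu, scale=sigma))   # same probability, without converting
```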
Z-statistic (hypothesis tests)
A number representing how many (of the sampling distribution's) standard deviations the observed (sample) value is away from the assumed (population) mean.
Random Sample
A sample of n observations of a RV Y, denoted Y₁, Y₂, …, Yₙ, is said to be a random sample if the n observations are drawn independently from the same population and each element in the population is equally likely to be selected
Random Sample as ‘A set of Independently and Identically Distributed (IID) RVs’
We describe such a sample as being a set of Independent and Identically distributed (IID) Random Variables (RVs)
So, if a random sample of n elements is taken,
the sample elements constitute a set of IID RVs, Y₁, Y₂, …, Yₙ, each of which has the same PDF as that of Y
The random nature of Y₁, Y₂, …, Yₙ reflects the fact that many different outcomes are possible before the sampling is actually carried out. In other words, each element of the sample is an IID RV with the same PDF as Y (the population) precisely because it is randomly and independently selected from that population.
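A minimal sketch of drawing such a sample in Python (the population parameters are hypothetical):

```python
# A minimal sketch (hypothetical parameters): drawing an IID sample of n
# observations from a population modelled as Y ~ N(mu, sigma^2).
import numpy as np

rng = np.random.default_rng(0)
n, mu, sigma = 10, 5.0, 2.0
sample = rng.normal(mu, sigma, size=n)   # Y1, ..., Yn: IID, same PDF as Y
print(sample)
```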
Sample data
Once the sample is obtained, we have a set of numbers, say y₁, y₂, …, yₙ which constitute the data we work with.
There are different types of data:
• Cross-sectional data
• Time-series data
• Panel data
Sample Statistics
A sample statistic is any quantity computed from values in a sample that is used for a statistical purpose.
(Statistical purposes include estimating a population parameter, describing a sample, or evaluating a hypothesis)
The two most often used sample statistics are the sample mean, denoted by Y̅, and the sample variance, denoted by S².
Sampling Distribution
A sample statistic (eg the sample mean) will have its own probability distribution called the sampling distribution.
Since each observation in a random sample is itself a RV, any statistic calculated from the sample, called a sample statistic, is also a RV.
And since the sample statistic is a RV, it has its own probability distribution
The sampling distribution reflects the fact that a random sample (of size n) drawn from the population could materialise in a range of different ways, each with a corresponding probability. It is this probability distribution, over all the possible samples of size n that we could draw from the population, that we call the sampling distribution (and we will see shortly that, for the sample mean, it is normal with mean μ and variance σ²/n)
Population > Sampling Distributions
So, to tie sampling distributions in with their wider context,
- There is a POPULATION (of size N)
- Y is a RV representing this population, with a PDF
- θ is an unknown population parameter (such as the expected value E(Y) = μ or the variance V(Y) = σ²)
- Note: these population parameters are unknown, fixed values
- A random sample (of n observations) of the RV Y is drawn, denoted Y₁, Y₂, …, Yₙ
- (Once the sample is obtained, we have a set of numbers, say y₁, y₂, …, yₙ, which constitute the data we work with)
- Each Yᵢ has a PDF (identical to the PDF of Y)
- From the sample we can calculate sample statistics
- (Two sample statistics of interest: sample mean, Y̅ and the sample variance, S²)
- Note: these sample statistics are RVs, with their own probability distribution, the SAMPLING DISTRIBUTIONS
The sampling distribution of the sample mean (Y̅)
Suppose Y ~ N (μ, σ²) and we have an IID sample of n observations from it: {Y₁, Y₂, …, Yₙ},
Then we say that Yᵢ ~ IIDN (μ, σ²)
In other words, each element of the sample is a RV with the same PDF as Y.
From these observations we can calculate the sample mean, Y̅, as: Y̅ = (1/n) Σ Yᵢ
Since Y̅ is a RV itself, it has a probability distribution.
It turns out that the sampling distribution of the sample mean is: Y̅ ~ N (μ, σ²/n)
(We’ll break this down in the next three cards)
The mean (or expected value) of the sampling distribution of Y̅
The mean of the sampling distribution of Y̅ is:
E[Y̅] = μ
Interpretation:
If random samples of n independent observations are repeatedly and independently drawn from a population, then as the number of samples becomes very large (approaches infinity), the mean of the sample means (Y̅) approaches the population mean, μ
The variance of the sampling distribution of Y̅
The variance of the sampling distribution of Y̅ is:
V[Y̅] = σ²/n
Interpretation:
As the sample size (n) increases, the variance of Y̅ decreases. So the sampling distribution of the sample mean will have lower variance the larger the sample size.
The Sampling Distribution of Y̅
Thus, if we assume that the samples are taken from a normal RV, Y, we can deduce that:
Y̅ ~ N (μ, σ²/n)
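A minimal simulation sketch of this result in Python (population parameters, sample size and number of replications are all hypothetical):

```python
# A minimal sketch (hypothetical parameters): simulating the sampling
# distribution of the sample mean and checking Ybar ~ N(mu, sigma^2/n).
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 5.0, 2.0, 25, 100_000

# draw many independent samples of size n; keep each sample's mean
means = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)

print(means.mean())   # ~5.00 = mu
print(means.var())    # ~0.16 = sigma^2 / n = 4 / 25
```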
Standardisation of Y̅
We can standardise Y̅ and use the standard normal distribution to calculate probabilities:
Z = [ ( Y̅ - μ ) / ( σ/√n ) ] ~ N(0, 1)
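A minimal sketch of using this standardisation in Python (all numbers hypothetical):

```python
# A minimal sketch (hypothetical numbers): a probability about Ybar
# via standardisation.
from math import sqrt
from scipy.stats import norm

mu, sigma, n = 5.0, 2.0, 25
ybar = 5.6
z = (ybar - mu) / (sigma / sqrt(n))   # standard error = 2/5 = 0.4, so z = 1.5
print(1 - norm.cdf(z))                # P(Ybar >= 5.6) ~ 0.0668
```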
The Central Limit Theorem
What about the shape of the sampling distribution of Y̅ if the population from which it is constructed is not normally distributed?
Use the Central Limit Theorem (CLT): as the sample size gets large enough, the sampling distribution of Y̅ can be approximated by the normal distribution even if the population itself is not normal.
Therefore, given the CLT, we can apply rules about normal distribution to the sampling distribution of the sample mean even when the population is not distributed normally
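A minimal simulation sketch of the CLT in Python, using a deliberately skewed population (the exponential distribution; all parameters hypothetical):

```python
# A minimal sketch of the CLT: sample means from a skewed (exponential)
# population are already roughly normal for moderate n.
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(0)
samples = rng.exponential(scale=1.0, size=(100_000, 50))  # n = 50 per sample

means = samples.mean(axis=1)
print(skew(means))                 # ~0.28, far below the population skewness of 2
print(means.mean(), means.var())   # ~1.0 and ~0.02 = sigma^2/n (here sigma^2 = 1)
```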
Population variance/sample variance unknown
It thus follows that we can make inferences about the population mean based on the sample mean using the standard normal distribution (Z-statistic)
Z = [ ( Y̅ - μ ) / ( σ/√n ) ] ~ N(0, 1)
However, notice how the distribution of the sample mean depends on the population mean but also on the population variance (divided by the sample size).
It is quite likely that we will not know the population variance
If this is the case we can use the sample variance, S², as an approximation, and it can be shown that:
T = [ ( Y̅ - μ ) / ( S/√n ) ] ~ t (n - 1)
Thus, we can use the sample variance and the tables from the t distribution to make inferences when σ² is unknown.
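A minimal sketch of this in Python (the data and the H₀ value are hypothetical):

```python
# A minimal sketch (hypothetical data): a t-test of H0: mu = 4 when
# sigma^2 is unknown, using the sample variance S^2.
import numpy as np
from scipy.stats import t

sample = np.array([4.1, 5.3, 4.8, 5.9, 4.4, 5.1])   # hypothetical data
n = len(sample)
ybar, s = sample.mean(), sample.std(ddof=1)         # ddof=1 gives the sample std dev

mu0 = 4.0                                           # value of mu under H0
T = (ybar - mu0) / (s / np.sqrt(n))
p_value = 2 * (1 - t.cdf(abs(T), df=n - 1))         # two-sided p-value, t(n-1)
print(T, p_value)
```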
Estimator
A sample statistic which is constructed to provide information about the unknown population parameters of a probability distribution is called an estimator and we denote it by θ̂ (theta-hat)
(To place in context:)
Let Y be a RV representing a population with a PDF, f (y; θ), which depends on the unknown population parameter θ
Example: if Y ~ N(μ, σ²), then θ = (μ, σ²)
Note: we will generally assume that there is only one parameter
If we can obtain certain random samples, then we can learn something about θ
(Refer to first point)
So a sample statistic used in this way is an estimator, and the probability distribution of the estimator is its sampling distribution
Estimator as a rule
More generally, an estimator θ̂ (theta-hat) of a population parameter θ can be expressed as a mathematical formula (rule):
θ̂ = g(Y₁, Y₂, …, Yₙ)
In other words, regardless of the outcome of the RVs (the sample that happens to be drawn from the population), we apply this same rule to estimate the population parameter
An estimator of θ is a rule that assigns each possible outcome of the sample a value of θ̂ (an estimate of θ)
(remember any sample drawn is one manifestation of the many possible samples that could have been drawn from the population (with corresponding probabilities))
For example: a natural estimator of µ (population mean) is Y̅ (sample mean)
where Y̅ = (1/n) Σ Yᵢ
- Given any outcome of the RVs {Y₁, Y₂, … , Yₙ } (ie: the sample drawn) the rule to estimate the population mean is the same: we simply take the average of {Y₁, Y₂, … , Yₙ }
- For a particular outcome of the RVs {y₁, y₂, …, yₙ}, the estimate is just the average of the sample: y̅ = (1/n) Σ yᵢ
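A minimal sketch of the rule-vs-estimate distinction in Python (population parameters hypothetical):

```python
# A minimal sketch (hypothetical parameters): the estimator is the rule;
# each sample outcome yields a different estimate from the same rule.
import numpy as np

rng = np.random.default_rng(0)

def estimator(ys):          # the rule g(Y1, ..., Yn) for estimating mu
    return ys.mean()

for _ in range(3):
    sample = rng.normal(5.0, 2.0, size=20)   # a new outcome of the RVs
    print(estimator(sample))                 # a different estimate each draw
```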
Quality of Estimate vs Quality of Estimator
Question:
Suppose that we want to estimate the average salary of university graduates in the UK. Suppose that we take one sample from the population and use the sample mean to estimate the average population salary. Suppose that we find that the sample mean is y̅ = £15,000. How close is this value (estimate) to the true population mean, µ?
Answer:
We don’t know, as µ is unknown!
➢ Instead of asking about the quality of the estimate, we should ask about the quality of the estimation procedure or estimator!
➢ ie How good is the sample mean as an estimator of the population mean?
➢ What are some (desirable) properties that an estimator may (or may not) possess?
Such properties are most often divided into:
• small sample (or finite) properties - desirable properties for when the sample size is finite
• large sample (or asymptotic) properties - desirable properties as the sample size tends to infinity
We will briefly consider the two main properties for estimators of ‘Finite or Small Samples’:
1) Unbiasedness
2) Minimum variance
Unbiasedness
An estimator is unbiased if:
E[θ̂] = θ
So if the mean of the sampling distribution of the estimator (which reflects all the different possible values that the sample statistic could take when the estimation procedure is applied to whatever sample happens to be drawn, with corresponding probabilities) is equal to the population parameter θ, then the estimator is unbiased.
In other words, if you independently draw a large number of random samples from the population, compute the sample statistic for each, and then take the mean of these sample statistics, for an unbiased estimator this mean will (as the number of samples grows) equal the population parameter.
(Part 1, Topic 5 shows a really clear graph demonstrating this)
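A minimal simulation sketch of unbiasedness in Python (all parameters hypothetical); the last line shows, for contrast, a biased variance estimator:

```python
# A minimal sketch (hypothetical parameters): checking unbiasedness by
# simulation; averaging the estimator over many samples should recover theta.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 5.0, 2.0, 10, 200_000
samples = rng.normal(mu, sigma, size=(reps, n))

print(samples.mean(axis=1).mean())           # ~5.0: Ybar is unbiased for mu
print(samples.var(axis=1, ddof=1).mean())    # ~4.0: S^2 (ddof=1) is unbiased for sigma^2
print(samples.var(axis=1, ddof=0).mean())    # ~3.6: the ddof=0 version is biased downward
```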
Minimum Variance Unbiased Estimator
Consider the set of all possible unbiased estimators for θ, which we will label θ̂₁, θ̂₂, …, θ̂ₖ. One of these, θ̂ⱼ, is said to be the Minimum Variance Unbiased Estimator (MVUE) if:
V(θ̂ⱼ) ≤ V(θ̂ᵢ)
for all i = 1, …, k with i ≠ j
(Part 1, Topic 5 shows a really clear graph demonstrating this)
Efficient
If an estimator is unbiased AND has minimum variance, we say it is efficient (or 'best')
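A minimal simulation sketch of relative efficiency in Python (all parameters hypothetical):

```python
# A minimal sketch (hypothetical parameters): for normal data both the sample
# mean and the sample median are unbiased for mu, but the mean has the
# smaller variance, so it is the more efficient estimator here.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 0.0, 1.0, 25, 100_000
samples = rng.normal(mu, sigma, size=(reps, n))

print(np.var(samples.mean(axis=1)))        # ~0.040 = sigma^2/n
print(np.var(np.median(samples, axis=1)))  # ~0.063 ~ (pi/2) * sigma^2/n
```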
How to construct estimators with good properties for unknown parameters?
There are various approaches based on observed samples. Three common methods are:
• Least Squares
• Method of Moments
• Maximum Likelihood
➢ In this course we will focus on Least Squares estimation.