Stats. T2 Flashcards
What is the difference between a parametric and non-parametric statistical model?
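A parametric model assumes the data come from a parametric family of distributions, such as the binomial or the Poisson, indexed by a finite number of parameters. A non-parametric model makes no such assumption.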
How do you estimate parameters using the method of moments?
1. Identify the parameters to estimate.
2. Calculate the necessary sample moments, where the formula for the kth sample moment is $m_k = \frac{1}{n} \sum_{i=1}^{n} x_i^k$.
3. Set the sample moments equal to the theoretical moments and solve for the parameters. For a distribution with mean $\mu$ and variance $\sigma^2$, for example:
$m_1 = \mu$
$m_2 = \mu^2 + \sigma^2$
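The steps above can be sketched in a few lines of Python. This is only an illustration, assuming a model with mean μ and variance σ²; the function names and the data values are made up for the example.

```python
def sample_moment(xs, k):
    """k-th sample moment: (1/n) * sum of x_i^k."""
    return sum(x ** k for x in xs) / len(xs)

def method_of_moments_mean_variance(xs):
    """Solve m1 = mu and m2 = mu^2 + sigma^2 for mu and sigma^2."""
    m1 = sample_moment(xs, 1)
    m2 = sample_moment(xs, 2)
    mu_hat = m1                 # from m1 = mu
    sigma2_hat = m2 - m1 ** 2   # from m2 = mu^2 + sigma^2
    return mu_hat, sigma2_hat

data = [2.0, 4.0, 4.0, 6.0]  # illustrative values, not real data
mu_hat, sigma2_hat = method_of_moments_mean_variance(data)
print(mu_hat, sigma2_hat)
```

Note that the resulting variance estimate uses the divisor n, not n - 1.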
What is a prediction interval?
A prediction interval is an interval of values that has a specified probability of containing a future data value.
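What is the change of variable formula?
If $Y = h(X)$, where $h$ is strictly monotonic and differentiable, then the density of $Y$ is $f_Y(y) = f_X(h^{-1}(y)) \left| \frac{d}{dy} h^{-1}(y) \right|$.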
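A short worked example may help here. It assumes the usual form of the formula for a strictly monotonic, differentiable $h$, namely $f_Y(y) = f_X(h^{-1}(y)) \, |\frac{d}{dy} h^{-1}(y)|$; the choice of $X$ and $h$ is purely illustrative.

```latex
\begin{align*}
X &\sim \mathrm{Uniform}(0,1), \qquad Y = h(X) = -\ln X \\
h^{-1}(y) &= e^{-y}, \qquad \frac{d}{dy} h^{-1}(y) = -e^{-y} \\
f_Y(y) &= f_X(e^{-y}) \left| -e^{-y} \right| = 1 \cdot e^{-y} = e^{-y}, \qquad y > 0
\end{align*}
```

So $Y$ has the exponential distribution with rate 1.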
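What is the moment generating function?
The moment generating function (MGF) of a random variable $X$ is $M_X(t) = E(e^{tX})$, defined for the values of $t$ where this expectation is finite. It characterizes the probability distribution of $X$ and can be used to generate its moments.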
When do random variables X and Y have the same distribution?
When their MGFs exist and are equal for all $t$ in an open interval around 0.
What is the theorem that links the MGF and derivatives?
If the MGF $M_X(t)$ exists in an open interval around 0, then the kth moment $E(X^k)$ can be found by taking the kth derivative of $M_X(t)$ and evaluating it at $t = 0$, that is, $E(X^k) = M_X^{(k)}(0)$.
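As a worked illustration of this theorem, take $X$ to be exponential with rate $\lambda$ (a distribution chosen here only as an example), whose MGF is $\lambda / (\lambda - t)$ for $t < \lambda$.

```latex
\begin{align*}
M_X(t) &= \frac{\lambda}{\lambda - t}, \qquad t < \lambda \\
M_X'(t) &= \frac{\lambda}{(\lambda - t)^2} \quad\Rightarrow\quad E(X) = M_X'(0) = \frac{1}{\lambda} \\
M_X''(t) &= \frac{2\lambda}{(\lambda - t)^3} \quad\Rightarrow\quad E(X^2) = M_X''(0) = \frac{2}{\lambda^2} \\
\mathrm{Var}(X) &= E(X^2) - E(X)^2 = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}
\end{align*}
```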
What is the central limit theorem?
If you take sufficiently large random samples from any population (with finite mean and variance) and compute their means, the distribution of those sample means will be approximately normal (Gaussian), regardless of the original population’s distribution.
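A small simulation can make this concrete. The sketch below assumes an Exponential(1) population (mean 1, variance 1), which is clearly non-normal; the sample size, number of repetitions, seed, and function name are arbitrary choices for illustration.

```python
import random
import statistics

def simulate_sample_means(n, reps, seed=0):
    """Draw `reps` samples of size `n` from Exponential(1); return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]

means = simulate_sample_means(n=50, reps=2000)
centre = statistics.fmean(means)   # expected to be near the population mean, 1
spread = statistics.stdev(means)   # expected to be near 1 / sqrt(50)

# Fraction of sample means within 1.96 standard errors of the population mean;
# under an exact normal distribution this would be about 0.95.
se = 1 / 50 ** 0.5
within = sum(abs(m - 1.0) <= 1.96 * se for m in means) / len(means)
print(centre, spread, within)
```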
What is the sampling distribution?
The sampling distribution is the probability distribution of a given statistic (e.g., mean, variance, proportion) computed from multiple random samples of the same size drawn from a population.
What is the standard error?
The standard error of a statistic is the standard deviation of its sampling distribution. An estimate of this standard deviation, computed from the data, is called the estimated standard error.
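For the sample mean, the standard error is σ/√n, and the usual estimate replaces σ with the sample standard deviation s. A minimal sketch, with a hypothetical function name and made-up data:

```python
import math
import statistics

def estimated_standard_error(xs):
    """Estimated standard error of the sample mean: s / sqrt(n)."""
    s = statistics.stdev(xs)  # sample standard deviation (divisor n - 1)
    return s / math.sqrt(len(xs))

data = [2.0, 4.0, 4.0, 6.0]  # illustrative values, not real data
se_hat = estimated_standard_error(data)
print(se_hat)
```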
What is the mean squared error of an estimator?
Let $\hat{\theta}$ be an estimator for $\theta$. Its mean squared error is
$\mathrm{MSE}(\hat{\theta}) = E\{(\hat{\theta} - \theta)^2\}$.
The MSE is the expected squared distance between the estimate, $\hat{\theta}$, and the true value, $\theta$. A large MSE means that estimates tend to be a long way from the true value; a small MSE means that estimates tend to be close to the true value. So we prefer estimators with small MSE.
What is the bias of an estimator?
The bias of $\hat{\theta}$ is $\mathrm{Bias}(\hat{\theta}) = E(\hat{\theta}) - \theta$. An estimator with zero bias is called unbiased.
Given two unbiased estimators, we would prefer the one with smaller variance (and hence smaller mean squared error). If one estimator has a smaller bias but a larger variance than another estimator, then we have to decide whether bias or variance is more important, or just prefer the one with smaller mean squared error.
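The trade-off can be seen in a simulation, using the standard decomposition MSE = variance + bias². The sketch below compares two estimators of the variance of normal data, one dividing by n - 1 (unbiased) and one dividing by n (biased). The sample size, seed, and function names are arbitrary choices for illustration.

```python
import random

def simulate_variance_estimators(n=5, reps=20000, sigma2=1.0, seed=1):
    """Return (bias, mse) for the n-1 and n divisor variance estimators."""
    rng = random.Random(seed)
    est_unbiased, est_biased = [], []
    for _ in range(reps):
        xs = [rng.gauss(0.0, sigma2 ** 0.5) for _ in range(n)]
        xbar = sum(xs) / n
        ss = sum((x - xbar) ** 2 for x in xs)  # sum of squared deviations
        est_unbiased.append(ss / (n - 1))
        est_biased.append(ss / n)

    def summarise(ests):
        bias = sum(ests) / len(ests) - sigma2
        mse = sum((e - sigma2) ** 2 for e in ests) / len(ests)
        return bias, mse

    return summarise(est_unbiased), summarise(est_biased)

(bias_u, mse_u), (bias_b, mse_b) = simulate_variance_estimators()
print("divisor n-1:", bias_u, mse_u)
print("divisor n:  ", bias_b, mse_b)
```

In this setting the biased estimator tends to have the smaller mean squared error, so preferring the unbiased one is a choice rather than a given.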
What is a confidence interval?
An interval estimator that contains 𝜃 with probability 𝑝 for all true values of 𝜃 is called a 𝑝-confidence interval, and 𝑝 is called the confidence level or coverage of the confidence interval.
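As one concrete case, here is a sketch of a confidence interval for a population mean. It assumes normally distributed data with a known standard deviation, which is an assumption of this example rather than part of the definition; the function name and data are illustrative.

```python
import math
from statistics import NormalDist, fmean

def z_confidence_interval(xs, sigma, level=0.95):
    """Interval xbar +/- z * sigma / sqrt(n) for the mean, with sigma known."""
    n = len(xs)
    xbar = fmean(xs)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 when level = 0.95
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width

data = [2.0, 4.0, 4.0, 6.0]  # illustrative values, not real data
lower, upper = z_confidence_interval(data, sigma=2.0)
print(lower, upper)
```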
What are simple and composite hypotheses?
A hypothesis that specifies the model distribution completely is known as a simple hypothesis. A hypothesis that does not specify the model distribution completely is called a composite hypothesis.
What is the critical region in a hypothesis test?
The set of outcomes for which you will reject 𝐻0 is called the critical region.
What are type I and type II errors?
Rejecting the null hypothesis when it is true is known as a type I error; failing to reject the null hypothesis when it is false is known as a type II error.
What is the power of a hypothesis test?
The power of a test is the probability that we shall (correctly) reject the null hypothesis when the alternative hypothesis is true.
How do you conduct a hypothesis test?
- Write down your null and alternative hypotheses, 𝐻0 and 𝐻1, about the population quantity.
- Choose the significance level, 𝛼, and power, 𝛽, at which you wish to conduct the test.
- Find a critical region, 𝐶, for which the significance level is 𝛼.
- Find the sample size, 𝑛, for which the power is 𝛽.
- Collect your data.
- If the data fall inside the critical region then reject 𝐻0 in favour of 𝐻1.
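The procedure above can be sketched for one specific case: a one-sided test of 𝐻0: μ = μ0 against 𝐻1: μ = μ1 > μ0, with normal data and known standard deviation. All of those modelling choices, the numbers, and the function names are assumptions made for this illustration.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def critical_value(mu0, sigma, n, alpha):
    """Reject H0 when the sample mean exceeds this value."""
    return mu0 + STD_NORMAL.inv_cdf(1 - alpha) * sigma / math.sqrt(n)

def power(mu0, mu1, sigma, n, alpha):
    """Probability the sample mean lands in the critical region when the mean is mu1."""
    c = critical_value(mu0, sigma, n, alpha)
    return 1 - NormalDist(mu1, sigma / math.sqrt(n)).cdf(c)

def sample_size(mu0, mu1, sigma, alpha, target_power):
    """Smallest n whose power reaches target_power (power increases with n here)."""
    n = 1
    while power(mu0, mu1, sigma, n, alpha) < target_power:
        n += 1
    return n

# Choose significance level and power, then find the critical region and n.
n = sample_size(mu0=0.0, mu1=0.5, sigma=1.0, alpha=0.05, target_power=0.8)
c = critical_value(mu0=0.0, sigma=1.0, n=n, alpha=0.05)

xbar = 0.4         # hypothetical observed sample mean
reject = xbar > c  # reject H0 if the data fall inside the critical region
print(n, c, reject)
```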
What is a test statistic (T)?
A function of the sample data whose value is used to decide whether to reject the null hypothesis.
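What is a z-test?
A hypothesis test about a population mean, used when the population standard deviation $\sigma$ is known and the sample is large enough. The test statistic is
$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$,
where $\mu$ is the mean specified by the null hypothesis.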
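What is the test statistic when the variance is unknown but the sample is large?
$T = \frac{\bar{Y} - \mu}{S / \sqrt{n}}$, where $S$ is the sample standard deviation and $\mu$ is the mean specified by the null hypothesis.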
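Both statistics divide the gap between the sample mean and the hypothesised mean by a standard error: σ/√n when σ is known, and S/√n when it has to be estimated. A minimal sketch follows, with hypothetical function names and made-up numbers; the second data set is far too small for the large-sample approximation and is there only to show the arithmetic.

```python
import math
import statistics

def z_statistic(xbar, mu0, sigma, n):
    """(xbar - mu0) / (sigma / sqrt(n)), with the population SD sigma known."""
    return (xbar - mu0) / (sigma / math.sqrt(n))

def studentized_statistic(ys, mu0):
    """(ybar - mu0) / (s / sqrt(n)), with s the sample SD (divisor n - 1)."""
    n = len(ys)
    ybar = statistics.fmean(ys)
    s = statistics.stdev(ys)
    return (ybar - mu0) / (s / math.sqrt(n))

z = z_statistic(xbar=52.0, mu0=50.0, sigma=6.0, n=36)
t = studentized_statistic([2.0, 4.0, 4.0, 6.0], mu0=3.0)  # tiny sample, arithmetic only
print(z, t)
```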
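What is the p-value?
The probability, assuming the null hypothesis is true, of obtaining a test statistic at least as extreme as the one observed. It measures how improbable our observed test statistic would be if the null hypothesis were true.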
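For a test statistic that is standard normal under the null hypothesis, a two-sided p-value can be computed as follows. This is a sketch under that assumption, and the function name is made up for the example.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """P(|Z| >= |z|) for Z standard normal under the null hypothesis."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

p = two_sided_p_value(2.0)
print(p)  # roughly 0.0455
```

A p-value below the chosen significance level 𝛼 corresponds to the test statistic falling in the critical region.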