week 2 Flashcards
Suppose that the population distribution is normal with µ = 120 and σ² = 100.
If samples of size n = 25 are drawn at random, over repeated studies, what is the distribution of the sample mean?
X̄ ~ N(120, 4)
1st part (mean): µ = 120
2nd part (variance): σ²/n = 100/25 = 4
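As a quick numerical check of this card, here is a minimal sketch using only the Python standard library. The variable names are illustrative, and the probability computed at the end is an added illustration rather than part of the card.

```python
import math
from statistics import NormalDist

mu, sigma2, n = 120, 100, 25

# Variance and standard error of the sample mean: sigma^2 / n and its square root
var_xbar = sigma2 / n
se = math.sqrt(var_xbar)

# Sampling distribution of the sample mean (NormalDist takes the SD, not the variance)
xbar_dist = NormalDist(mu=mu, sigma=se)

# Illustration: chance that a sample mean lands within 2 standard errors of mu
prob_within_2se = xbar_dist.cdf(mu + 2 * se) - xbar_dist.cdf(mu - 2 * se)
print(var_xbar, se, round(prob_within_2se, 4))
```

Note that N(120, 4) is written with the variance as its second argument, while `NormalDist` expects the standard deviation (here 2).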
There are two types of estimation.
- point estimation
- interval estimation
point estimation
A point estimate is a single-value estimate of something (e.g., a population parameter).
§ e.g., using the sample mean to estimate the population mean.
interval estimate
An interval estimate uses a range of plausible values to estimate something.
The frequentist and Bayesian perspectives use different ways to compute an interval estimate for a population parameter.
The frequentist perspective uses _________________________ as an interval estimate for a parameter.
What is its interpretation?
confidence interval
Interpretation: over repeated studies, 95% of (the endpoints of) the confidence intervals contain the true population parameter.
The endpoints of the CIs are considered random variables but the population parameter is considered constant.
The Bayesian perspective uses ______________________________ as an interval estimate for a parameter.
credible interval
§ Interpretation: There is a 95% chance that the population parameter is in the computed credible interval.
§ The population parameter is considered a random variable.
CI formula (general form)
95% CI = [Lower Endpoint, Upper Endpoint]
95% CI = Sample Statistic ± t_critical × SE
P(Lower Endpoint (random) < Parameter (constant) < Upper Endpoint (random)) = 0.95
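The "over repeated studies" reading of this probability statement can be illustrated with a small simulation. This is a sketch under stated assumptions, not a definitive implementation: it reuses µ = 120 and σ = 10 from the first card, treats σ as known (so it swaps in a z critical value in place of the t critical value in the formula above), and the seed and number of repetitions are arbitrary choices.

```python
import random
from statistics import NormalDist, fmean

random.seed(0)  # arbitrary seed so the run is reproducible

mu, sigma, n, reps = 120, 10, 25, 5000
z_crit = NormalDist().inv_cdf(0.975)  # about 1.96 for a 95% interval
se = sigma / n ** 0.5                 # standard error with sigma assumed known

hits = 0
for _ in range(reps):
    # One "study": draw a sample and build its interval
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = fmean(sample)
    lower, upper = xbar - z_crit * se, xbar + z_crit * se
    # The endpoints change from study to study; mu stays fixed
    hits += lower < mu < upper

coverage = hits / reps
print(f"Share of intervals containing mu: {coverage:.3f}")
```

The share printed should land close to 0.95, with some simulation noise.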
main purpose of the CI
tells us the precision of our point estimate.
CIs with narrower widths indicate more precision than those with wider widths.
Recall, the standard error affects the width of the CI.
More precision means that the sample statistic will fluctuate less over repeated samples.
In other words, the width of the CI (or the standard error of the estimate) matters more than the obtained endpoints of the CI.
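To see how the standard error drives the width, the short sketch below computes the width of a 95% interval for several sample sizes. It assumes σ = 10 is known and uses a z critical value. The sample sizes are hypothetical.

```python
import math
from statistics import NormalDist

sigma = 10                            # assumed known population SD
z_crit = NormalDist().inv_cdf(0.975)  # critical value for a 95% interval

def ci_width(n):
    """Width of the 95% CI for the mean: 2 * critical value * SE."""
    se = sigma / math.sqrt(n)
    return 2 * z_crit * se

widths = {n: ci_width(n) for n in (25, 100, 400)}
for n, w in widths.items():
    print(f"n = {n:3d}: width = {w:.2f}")
```

Quadrupling n halves the standard error, and therefore halves the width.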
In hypothesis testing, we first assume that H0 is___________
TRUE
And then we are trying to look for evidence against the H0.
If we have enough evidence against H0, we reject H0 and endorse H1.
If we don’t have enough evidence against H0, we fail to reject H0 and continue to assume H0 is true.
Hypothesis testing is rooted in the ___________ perspective
What does it involve imagining?
Hypothesis testing is rooted in the Frequentist Perspective.
It involves imagining the experiment being repeated over and over again.
Bayesian statistics does not use hypothesis testing to make conclusions.
One-sample z-test and one-sample t-test are used when…
One-sample z-test and one-sample t-test are used when we want to test whether the mean of the population from which a single sample was drawn is equal to a specific value.
e.g., Suppose µ is the population mean of IQ scores.
H0: µ = 100; H1: µ ≠ 100
Whether we should use z-test or t-test depends on what?
Whether we should use the z-test or the t-test depends on whether we know the value of the population variance σ² or not.
If we know the value of σ², we use the z-test.
If we don’t know the value of σ², we use the t-test.
p-value is computed by…
The p-value is computed by comparing our obtained or realized sample mean with the sampling distribution of X̄ assuming H0 is true.
What is the P value?
Assuming H0 is true, the p-value is the probability of obtaining a sample mean that is as extreme as or more extreme than the one we obtained or realized with our sample.
The p-value is a way to quantify the evidence against H0.
A smaller p-value indicates more evidence against H0.
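A small worked example may help make the p-value concrete. The sketch below runs a non-directional one-sample z-test for H0: µ = 100. The values σ = 15, n = 36, and x̄ = 105 are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist

mu0 = 100    # value of the population mean under H0
sigma = 15   # hypothetical known population SD
n = 36       # hypothetical sample size
xbar = 105   # hypothetical obtained sample mean

se = sigma / math.sqrt(n)   # standard error of the sample mean under H0
z = (xbar - mu0) / se       # obtained z-statistic

# Two-sided p-value: probability of a z at least this extreme in either tail
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
print(f"z = {z:.2f}, p = {p_value:.4f}")
```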
How is the theoretical sampling distribution of the sample mean (or z-stats), assuming H0 is true, derived?
It is derived based on the central limit theorem.
The t random variable is derived based on ….
The t random variable is derived based on a sequence of independently and identically distributed normal random variables
T stat formula
t = (x̄ − µ) / (s / √n), where µ is the value of the population mean stated in H0
with n − 1 df
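The t-statistic can be computed from sample data plus the hypothesized mean. The following is a minimal sketch. The eight scores are made up for illustration, and `mu0` is simply the name used here for the mean stated in H0.

```python
import math
from statistics import fmean, stdev

data = [102, 104, 104, 104, 105, 105, 107, 109]  # hypothetical IQ scores
mu0 = 100                                        # mean stated in H0

n = len(data)
xbar = fmean(data)   # sample mean
s = stdev(data)      # sample SD (divides by n - 1)
t = (xbar - mu0) / (s / math.sqrt(n))
df = n - 1
print(f"t = {t:.3f} with {df} df")
```

`statistics.stdev` uses the n − 1 denominator, which matches the sample variance used in the t-statistic.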
what is the difference between z-statistic and t-statistic?
The difference between z-statistic and t-statistic is that in z-statistic, we use the population variance but in t-statistic, we use the sample variance.
In other words, to compute t-statistic, we only need sample data.
Comparing T and z distribution
Both t-distribution and z-distribution are symmetric and bell-shaped
§ A t-distribution with a small df has fatter tails than the z-distribution.
§ The t-distribution approaches the standard normal distribution as df (or n) goes to infinity (a.k.a. convergence in distribution).
§ Recall df = n − 1.
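The fatter tails can be seen in a small simulation. The sketch below draws normal samples, computes a t-statistic for each, and counts how often |t| exceeds 1.96, the cutoff that leaves 5% in the tails of the z-distribution. The sample sizes, seed, and number of repetitions are arbitrary choices for illustration.

```python
import math
import random
from statistics import fmean, stdev

random.seed(1)  # arbitrary seed so the run is reproducible

def tail_fraction(n, reps=4000, cutoff=1.96):
    """Share of simulated t-statistics (df = n - 1) with |t| beyond the cutoff."""
    count = 0
    for _ in range(reps):
        sample = [random.gauss(0, 1) for _ in range(n)]
        t = fmean(sample) / (stdev(sample) / math.sqrt(n))  # true mean is 0
        count += abs(t) > cutoff
    return count / reps

frac_small = tail_fraction(4)    # df = 3: noticeably more than 5% in the tails
frac_large = tail_fraction(100)  # df = 99: close to the z-distribution's 5%
print(frac_small, frac_large)
```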
Hypothesis Testing Steps
- Set up the hypotheses.
§ must be about a population parameter
§ e.g., µD for paired-samples t-test; µ1 − µ2 for independent-samples t-test; β1 for regression coefficient’s t-test
- Figure out the test statistic to use.
§ z-stats, t-stats, or F-stats
- Assuming H0 is true, construct the sampling distribution of the test statistic.
§ standard normal distribution for z-stats
§ t-distribution for t-stats
§ F-distribution for F-stats
- Compute the obtained/realized test statistic using the sample data.
§ e.g., compute the obtained z-stats, t-stats, or F-stats
- Compute the p-value by comparing the obtained test statistic with the sampling distribution of the test statistic under H0.
§ A smaller p-value indicates more evidence against H0.
§ Non-directional p-value: assuming H0 is true, the probability of obtaining a test statistic that is as extreme as or more extreme than the one you obtained with your sample.
- Compare the obtained p-value with the α level and decide whether or not to reject H0.
§ Researchers usually set α = 0.05.
§ Reject H0 if p ≤ α.
§ a.k.a., the result is (statistically) significant.
§ Fail to reject (or retain) H0 if p > α.
§ a.k.a., the result is not (statistically) significant.
- Make a conclusion regarding whether to endorse the statement stated in H0 or H1.
§ If H0 is rejected, there is enough evidence to endorse H1.
§ If H0 is not rejected, there is not enough evidence to endorse H1.
§ need to put H1 in context.
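The steps above can be sketched end to end for a non-directional one-sample t-test. This is an illustrative sketch, not a definitive implementation. The helpers `t_pdf`, `two_sided_p`, and `one_sample_t_test` are hypothetical names written for this example. The p-value is approximated by numerically integrating the t density with Simpson's rule so that only the standard library is needed. The data are made up.

```python
import math
from statistics import fmean, stdev

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t_obs, df, steps=2000):
    """Approximate P(|T| >= |t_obs|) under H0 via Simpson's rule (steps must be even)."""
    b = abs(t_obs)
    if b == 0:
        return 1.0
    h = b / steps
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # P(-|t_obs| < T < |t_obs|), by symmetry
    return max(0.0, 1 - central_area)

def one_sample_t_test(data, mu0, alpha=0.05):
    # Steps 1-3: H0: mu = mu0 vs H1: mu != mu0; t-stat; t-distribution with n - 1 df
    n = len(data)
    df = n - 1
    # Step 4: obtained test statistic
    t_obs = (fmean(data) - mu0) / (stdev(data) / math.sqrt(n))
    # Step 5: p-value under H0
    p = two_sided_p(t_obs, df)
    # Step 6: compare with alpha
    decision = "reject H0" if p <= alpha else "fail to reject H0"
    return t_obs, df, p, decision

scores = [102, 104, 104, 104, 105, 105, 107, 109]  # hypothetical IQ scores
t_obs, df, p, decision = one_sample_t_test(scores, mu0=100)
print(f"t({df}) = {t_obs:.3f}, p = {p:.4f}: {decision}")
```

Step 7, stating the conclusion in context, is left to the reader: here a rejection would mean there is enough evidence that the population mean IQ differs from 100.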
One of the limitations of hypothesis testing (or science in general) is that…
it is impossible for us to prove that something is true in the population.
This is related to the fact that science uses inductive reasoning.
§ In contrast, math uses deductive reasoning; therefore, a math theorem can be proven true.
Therefore, it is possible that the final decision of a hypothesis test is wrong.
Conditional on the world where H0 is true (H1 is false) AND H0 is retained
Correct Decision
§ The probability of making this correct decision is 1 − α.
Conditional on the world where H0 is true (H1 is false) AND H0 is rejected
Type I Error
§ The probability of making a type I error is called the type I error rate, denoted by α (a.k.a. significance level).
Conditional on the world where H0 is false (H1 is true) AND H0 is retained
Type II Error
§ The probability of making a type II error is called the type II error rate, denoted by β.
Conditional on the world where H0 is false (H1 is true) AND H0 is rejected
Correct Decision
§ The probability of making this correct decision is called power, which equals 1 − β.
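The four outcomes can be tied together numerically. The sketch below computes the rejection probability of a non-directional one-sample z-test, first in a world where H0 is true and then in a world where it is false. The function name `z_test_power` and the values (µ0 = 100, true µ = 105, σ = 15, n = 36) are hypothetical and used only for illustration.

```python
import math
from statistics import NormalDist

def z_test_power(mu_true, mu0, sigma, n, alpha=0.05):
    """Probability of rejecting H0: mu = mu0 with a two-sided z-test when the true mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # When the true mean is mu_true, the z-statistic is normal with this mean and SD 1
    shift = (mu_true - mu0) / (sigma / math.sqrt(n))
    upper_tail = 1 - std_normal.cdf(z_crit - shift)  # P(z >= z_crit)
    lower_tail = std_normal.cdf(-z_crit - shift)     # P(z <= -z_crit)
    return upper_tail + lower_tail

alpha = 0.05
type_1_rate = z_test_power(100, 100, 15, 36, alpha)  # H0 true: rejection rate equals alpha
power = z_test_power(105, 100, 15, 36, alpha)        # H0 false: rejection rate is power
beta = 1 - power                                     # type II error rate
print(f"alpha = {type_1_rate:.3f}, power = {power:.3f}, beta = {beta:.3f}")
```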