DECK 7 Confidence Intervals and Hypothesis Tests part A Flashcards
notation: what is mu
true population mean (average)
notation: what is p
true population proportion (percent in the population)
notation: what is x-bar
mean of your sample
notation: what is p-hat
sample proportion (percent in our sample)
notation: what is a p-value
At the end of a hypothesis test, it is the probability of getting results at least as extreme as yours if the null were true.
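A minimal sketch of that idea, assuming a made-up example that is not from the deck: 100 coin flips, 60 heads, and a null of "the coin is fair". The names n_flips, observed_heads, and null_p are illustrative, and the test is one-sided (only "this many heads or more" counts as extreme).

```python
from math import comb

n_flips = 100          # hypothetical sample size
observed_heads = 60    # hypothetical observed result
null_p = 0.5           # null hypothesis: the coin is fair

# Probability of seeing observed_heads or more heads if the null were true,
# added up exactly from the binomial model.
p_value = sum(
    comb(n_flips, k) * null_p**k * (1 - null_p) ** (n_flips - k)
    for k in range(observed_heads, n_flips + 1)
)
print(f"one-sided p-value: {p_value:.4f}")
```

A small p-value here means a result like 60 heads would be rare if the coin really were fair.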
notation: what is z*
critical z, how many SE you are reaching up and down in a confidence interval for proportions
notation: what is t*
critical t, how many SE you are reaching up and down in a confidence interval for means
notation: what is mu - mu
true difference between two population means
notation: what is p - p
true difference between two population proportions (percents).
notation: what is x-bar - x-bar
difference between two sample means
notation: what is p-hat - p-hat
difference between two sample proportions
notation: what is Ho
The NULL, the dull, the “things haven’t changed” hypothesis
notation: What is Ha
The alternative. This is what you are trying to find evidence for.
What is the difference between the distribution of a sample and a sampling distribution?
A distribution of a sample is just a histogram of the DATA in one sample. A sampling distribution is made from a bunch of sample STATISTICS. It is the distribution of the statistic that was calculated from each of many, many samples.
What is a sampLING distribution?
a pile of statistics. A pile of p-hats or x-bars.
Are models what really happen?
No. A model train is not a real train. We use models to say what kind of happens.
What is “statistically significant?”
When our sample statistic is so far away from what we were expecting that we don’t think it was due to random sampling error, it is statistically significant. When the p-value is below alpha, we say “statistically significant”. Low p-values are statistically significant.
What is the difference between standard error and standard deviation?
Standard error is the typical distance a STATISTIC is from the mean in a sampling distribution (a pile of statistics from a bunch of samples), and standard deviation is the typical distance a DATUM is from the mean in a pile of raw data.
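A rough simulation sketch of the two ideas, assuming a made-up uniform population and a sample size of 50 (none of these numbers come from the deck). It measures the spread of one pile of raw data and the spread of a pile of sample means.

```python
import random
import statistics

random.seed(0)

# Hypothetical population: 10,000 values spread evenly between 0 and 100.
population = [random.uniform(0, 100) for _ in range(10_000)]

# Standard deviation: spread of the raw DATA in one sample.
one_sample = random.sample(population, 50)
sd_of_data = statistics.stdev(one_sample)

# Standard error: spread of a STATISTIC (here the mean) across many samples.
sample_means = [
    statistics.mean(random.sample(population, 50)) for _ in range(2000)
]
se_of_mean = statistics.stdev(sample_means)

print(f"SD of raw data in one sample: {sd_of_data:.2f}")
print(f"SE (spread of sample means):  {se_of_mean:.2f}")
```

The pile of means should come out much tighter than the pile of raw data, because averaging smooths out individual values.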
What does CLT say about the distribution of the population?
Not much… just that it doesn’t matter what it is. With large samples, the SAMPLING dist will be approx normal (dist of stats, NOT DATA).
What are the mean and standard deviation of a sampling distribution for a proportion?
mean is p and standard deviation is root(pq/n), where q = 1 - p (look at formula sheet): N(p, root(pq/n))
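One way to check that formula is to simulate a pile of p-hats and compare. This is only a sketch: p = 0.3, n = 100, and the helper sample_proportion are assumptions chosen for illustration, not values from the deck.

```python
import math
import random
import statistics

random.seed(1)

p = 0.3      # hypothetical true population proportion
n = 100      # hypothetical sample size
q = 1 - p

# Standard deviation of the sampling distribution from the formula.
formula_sd = math.sqrt(p * q / n)

def sample_proportion(p, n):
    """Draw one sample of size n and return its p-hat."""
    return sum(random.random() < p for _ in range(n)) / n

# A pile of p-hats from many samples.
p_hats = [sample_proportion(p, n) for _ in range(5000)]
sim_mean = statistics.mean(p_hats)
sim_sd = statistics.stdev(p_hats)

print(f"formula:    mean = {p:.3f}, sd = {formula_sd:.4f}")
print(f"simulation: mean = {sim_mean:.3f}, sd = {sim_sd:.4f}")
```

The simulated mean and spread should land close to p and root(pq/n), though never exactly on them, since the normal curve is only a model.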
What does Central Limit Theorem Say?
It basically says: NO MATTER WHAT SHAPE THE POPULATION IS (normal, bimodal, uniform, skewed, crazy), if you make a histogram of a bunch of means taken from a bunch of samples, that histogram will be unimodal and symmetric WITH LARGE ENOUGH SAMPLES. Close to normal. So a nerdy way to say it is: the sampling distribution of means is approximately normal no matter what the population is shaped like. The larger the sample size, the closer to normal. (The normal curve is just a model. The sampling distribution is close to it, but not it! We use the model anyway!)
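A small simulation sketch of that claim, assuming a strongly right-skewed population (exponential) picked only for illustration. The helpers skewness and pile_of_means are hypothetical names; skewness near 0 means roughly symmetric.

```python
import random
import statistics

random.seed(2)

def skewness(values):
    """Simple skewness measure: 0 for a symmetric pile, positive for right skew."""
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

def pile_of_means(sample_size, n_samples=3000):
    """Take many samples from a skewed population and keep each sample's mean."""
    return [
        statistics.mean([random.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(n_samples)
    ]

skew_small = skewness(pile_of_means(2))    # tiny samples
skew_large = skewness(pile_of_means(50))   # larger samples

print(f"skewness of x-bars, n = 2:  {skew_small:.2f}")
print(f"skewness of x-bars, n = 50: {skew_large:.2f}")
```

With tiny samples the pile of x-bars should still look skewed like the population; with larger samples it should be much closer to symmetric.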
What is difference between population of interest and parameter of interest?
Population is the WHO (the subjects you measure: beads, people). Parameter is the actual number you want (like a % or an AVG).
What happens to a pile of statistics if you take larger samples?
All of the x-bars or all of the p-hats will get closer to each other, and closer to the parameter (mu or p).
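A quick sketch of this, assuming a hypothetical true proportion of 0.5 and sample sizes of 25, 100, and 400 (illustrative choices, not from the deck). For each sample size it builds a pile of p-hats and measures how spread out the pile is.

```python
import random
import statistics

random.seed(3)

true_p = 0.5   # hypothetical population proportion

def pile_of_p_hats(sample_size, n_samples=1000):
    """Return the p-hat from each of n_samples samples of the given size."""
    return [
        sum(random.random() < true_p for _ in range(sample_size)) / sample_size
        for _ in range(n_samples)
    ]

# Spread of the pile of p-hats at each sample size.
spreads = {n: statistics.stdev(pile_of_p_hats(n)) for n in (25, 100, 400)}

for n, spread in spreads.items():
    print(f"n = {n:>3}: spread of p-hats = {spread:.3f}")
```

The spread should shrink as the sample size grows, so the p-hats bunch up closer to each other and to p.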
What does the CLT say about the distribution of actual sample data?
Nothing. The sample will be distributed similarly to the population. Bimodal populations have bimodal samples. The CLT only talks about distributions (histograms) of sample statistics, of summaries such as means, NOT OF INDIVIDUALS!!!! NOT DATA