DECK 12: INFERENCE PART A (1 samp hyp tests and intervals) Flashcards
notation: what is mu
true population mean (average)
notation: what is p
true population proportion (percent in the population)
notation: what is x-bar
mean of your sample
notation: what is p-hat
sample proportion (percent in our sample)
notation: what is a p-value
At the end of a hypothesis test, it is the probability of getting results at least as extreme as yours if the null were true.
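A minimal sketch of how a p-value might be computed for a one-proportion z-test, using only Python's standard library. The numbers (a null of 0.50 and 60 successes in 100 trials) are made up for illustration, and the two-sided alternative is an assumption.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical example values (not from the deck)
p0 = 0.50        # proportion claimed by the null hypothesis
n = 100          # sample size
successes = 60

p_hat = successes / n
sd_under_null = sqrt(p0 * (1 - p0) / n)   # SD of p-hat if the null is true
z = (p_hat - p0) / sd_under_null          # how many SDs p-hat sits from p0

# Two-sided p-value: chance of a p-hat at least this far from p0, in either direction
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
print(round(z, 2), round(p_value, 4))
```

With an alpha of 0.05, a p-value below 0.05 would be called statistically significant.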
notation: what is z*
critical z, how many SE you are reaching up and down in a confidence interval for proportions
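As a rough sketch of where z* comes from and how it is used, the snippet below finds z* for 95% confidence from the standard normal model and builds a one-proportion interval. The sample values (p-hat of 0.60 with n = 100) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

confidence = 0.95
# z* cuts off the middle 95% of the standard normal model
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Hypothetical sample: 60 successes out of 100
n = 100
p_hat = 0.60
se = sqrt(p_hat * (1 - p_hat) / n)   # standard error uses p-hat, since p is unknown

margin = z_star * se                 # reach z* standard errors up and down
interval = (p_hat - margin, p_hat + margin)
print(round(z_star, 3), interval)
```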
notation: what is t*
critical t, how many SE you are reaching up and down in a confidence interval for means
notation: what is mu - mu
true difference between two population means
notation: what is p - p
true difference between two population proportions (percents).
notation: what is xbar - xbar
difference between two sample means
notation: what is phat - phat
difference between two sample proportions
notation: what is Ho
The NULL, the dull, the “things haven’t changed” hypothesis
notation: What is Ha
The alternative. This is what you are trying to prove.
What is the difference between the distribution of a sample and a sampling distribution?
A distribution of a sample is just a histogram of the DATA in a sample. A sampling distribution is made from a bunch of sample STATISTICS. It is the distribution of the statistic that was calculated from those many, many samples.
What is a sampLING distribution?
a pile of statistics. A pile of p-hats or x-bars.
Are models what really happen?
No. A model train is not a real train. We use models to say what kind of happens.
What is “statistically significant?”
When our sample statistic is so far away from what we were expecting that we don’t think it was due to random sampling error, it is statistically significant. When the p-value is below alpha, we say “statistically significant.” Low p-values are statistically significant.
What is the difference between standard error and standard deviation?
Standard error is the typical distance a STATISTIC is from the mean in a sampling distribution (a pile of a bunch of samples’ statistics), and standard deviation is the typical distance a DATUM is from the mean in a pile of raw data.
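The contrast between the spread of raw data and the spread of statistics can be seen in a small simulation. This is only a sketch: the uniform(0, 100) population, the sample size of 25, and the 2000 repetitions are arbitrary choices, not values from the deck.

```python
import random
from statistics import mean, stdev

random.seed(1)   # fixed seed so the run is repeatable
n = 25           # size of each sample

# Spread of the raw DATA in one sample (a standard deviation)
one_sample = [random.uniform(0, 100) for _ in range(n)]
sd_of_data = stdev(one_sample)

# Spread of the STATISTICS (x-bars) from many samples (what standard error estimates)
x_bars = [mean([random.uniform(0, 100) for _ in range(n)]) for _ in range(2000)]
sd_of_x_bars = stdev(x_bars)

print(sd_of_data, sd_of_x_bars)
```

The pile of x-bars should come out much tighter than the raw data, by roughly a factor of root n.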
What does CLT say about the distribution of the population?
Not much… just that it doesn’t matter what it is. With large samples, the SAMPLING distribution will be approximately normal (the distribution of STATS, NOT DATA).
What are the mean and standard deviation of a sampling distribution for a proportion?
mean is p and standard deviation is root(pq/n), where q = 1 - p (look at formula sheet). The model is N(p, root(pq/n)).
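One way to check the formula is to build a pile of p-hats by simulation and compare its center and spread with p and root(pq/n). The sketch below assumes a hypothetical true proportion of 0.3 and samples of size 200; `sample_proportion` is a helper invented for this example.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(2)       # fixed seed so the run is repeatable
p = 0.3              # hypothetical true proportion
n = 200              # sample size
num_samples = 3000   # how many p-hats go in the pile

def sample_proportion(p, n):
    """Draw one sample of size n and return its p-hat."""
    return sum(random.random() < p for _ in range(n)) / n

p_hats = [sample_proportion(p, n) for _ in range(num_samples)]

model_sd = sqrt(p * (1 - p) / n)   # root(pq/n) from the formula
print(mean(p_hats), stdev(p_hats), model_sd)
```

The mean of the p-hats should land near p, and their standard deviation near root(pq/n).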
What does Central Limit Theorem Say?
It basically says.. NO MATTER WHAT SHAPE THE POPULATION IS (normal, bimodal, uniform, skewed, crazy.. ) If you make a histogram of a bunch of means taken from a bunch of samples, that histogram will be unimodal and symmetric WITH LARGE ENOUGH SAMPLES.. Close to normal. So.. A nerdy way to say it is: The sampling distribution of means is approximately normal no matter what the population is shaped like. The larger the sample size, the closer to normal. (the normal curve is just a model.. the sampling distribution is close to it, but not it! we use the model anyway!)
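The idea can be illustrated with a simulation. This sketch assumes a strongly right-skewed population (an exponential model, chosen arbitrarily) and uses one rough normality check: for a normal model, about 68% of values fall within one standard deviation of the mean. The helper `fraction_within_one_sd` is made up for this example.

```python
import random
from statistics import mean, stdev

random.seed(3)   # fixed seed so the run is repeatable

def fraction_within_one_sd(values):
    """Share of values within one SD of their mean (about 0.68 for a normal model)."""
    m, s = mean(values), stdev(values)
    return sum(abs(v - m) <= s for v in values) / len(values)

# Raw data from a right-skewed population (exponential; an arbitrary choice)
raw_data = [random.expovariate(1.0) for _ in range(5000)]

# Pile of x-bars: each is the mean of a sample of 50 from the same population
x_bars = [mean([random.expovariate(1.0) for _ in range(50)]) for _ in range(5000)]

print(fraction_within_one_sd(raw_data))   # skewed data: noticeably above 0.68
print(fraction_within_one_sd(x_bars))     # pile of means: close to 0.68
```

The raw data keep the population's skewed shape, while the pile of means behaves much more like a normal model.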
What is difference between population of interest and parameter of interest?
Population is the WHO (the subjects you measure: beads, people). Parameter is the actual number you want (like a % or an AVG).
What happens to a pile of statistics if you take larger samples?
All of the x-bars or all of the p-hats will get closer to each other, and closer to the parameter (mu or p).
What does the CLT say about the distribution of actual sample data?
Nothing. The sample will be distributed similarly to the population. Bimodal populations have bimodal samples. The CLT only talks about distributions (histograms) of sample statistics, of summaries such as means, NOT OF INDIVIDUALS!!!! NOT DATA
N ( ?1 , ?2 ) what does this mean?
it means a NORMAL model centered at ?1 with a standard deviation of ?2
Describe the distribution of a sample
It will look like the population. The distribution of a sample is a histogram made from the sample, which will look kind of like the population. If the population is bimodal, then the distribution of the sample is bimodal. The SAMPLING distribution of a bunch of means, however, will look normalish.
What is a standard error?
The typical, or expected, error. It is how far off you are expecting your statistic to be from the parameter. It is calculated like the standard deviation, but using sample statistics. We don’t know the true parameters, so we estimate them with statistics, which adds error to our calculation.
How do statistics from big samples compare to small? (notice this doesn’t ask about DATA)
Statistics from larger samples have less variability, so they are closer to each other and to the parameter. Statistics from smaller samples are more spread out, farther away from the true parameter.
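A quick simulation sketch of this effect: the spread of a pile of x-bars is measured for three sample sizes. The normal population with mean 50 and standard deviation 10 is a hypothetical choice, and `spread_of_x_bars` is a helper invented for this example.

```python
import random
from statistics import mean, stdev

random.seed(4)   # fixed seed so the run is repeatable

def spread_of_x_bars(n, num_samples=2000):
    """Standard deviation of num_samples sample means, each from a sample of size n."""
    x_bars = [mean([random.gauss(50, 10) for _ in range(n)])
              for _ in range(num_samples)]
    return stdev(x_bars)

# Larger n should give a tighter pile of x-bars
for n in (4, 25, 100):
    print(n, round(spread_of_x_bars(n), 2))
```

The spread should shrink as n grows, roughly like 10 / root n for this population.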
What is statistical inference?
Using a statistic to infer something about a parameter.. Basically, using a sample to say something about a population.
what is a statistic
some numerical summary of a sample.. Could be the mean of a sample, the standard deviation of a sample, the proportion of successes in a sample, the slope calculated from a sample, a difference of 2 means from 2 samples, a difference of 2 proportions from 2 samples, a difference of 2 slopes from 2 samples.. you can make sampling distributions for any of these, and they will all be centered around the parameter…
what is a parameter?
some numerical summary of a population. Often called “the parameter of interest.” It is what we are often trying to find. It doesn’t vary. It is out there and STUCK at some value, it is the truth, and you’ll probably not ever know it! We try to catch them in our confidence intervals, but sometimes we don’t (and we don’t know it!). It could be the mean of a population, the standard deviation of a population, the proportion of successes in a population, the slope calculated from a population, a difference of 2 means from 2 populations, a difference of 2 proportions from 2 populations.
What is the Fundamental Theorem of Statistics?
The CLT!! The Central Limit Theorem!
What is sampling variability?
same as sampling error. The natural variation of sample statistics, NOT DATA. Samples vary, and so do their statistics. Parameters do not vary!
What is sampling error?
same as sampling variability. The natural variability between STATISTICS, NOT DATA!!! We call it error EVEN THOUGH YOU MADE NO MISTAKES!!!
what happens to t models as n gets larger?
The models look more like the normal model. An infinite sample size would give a t model identical to the normal model.
What is an unbiased estimator?
An estimator is unbiased when its sampling distribution (pile of sample stats) is centered on the true population parameter.