UNIT 5 Flashcards
Are models what really happens?
No. A model train is not a real train. We use models to describe roughly what happens.
Can you accept a null hypothesis?
No.
Never accept a null.
You can only fail to reject it.
Can you decrease alpha while increasing power
(even though they move together)?
***** THINK OF ALPHA BETA POWER DIAGRAM*****
INCREASE SAMPLE SIZE
Yes.. They move up and down together with constant sample size.
If you increase the sample size, you can decrease alpha and increase the power.
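A rough sketch of this idea, assuming an upper-tail one-proportion z-test and a normal model. The function name power_one_sided and the values (null 0.50, true proportion 0.60) are made up for illustration; in real life you don't know the true proportion.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal model

def power_one_sided(p0, p_true, n, alpha):
    """Approximate power of an upper-tail one-proportion z-test (hypothetical helper)."""
    # Rejection cutoff for p-hat, found on the NULL model.
    cutoff = p0 + Z.inv_cdf(1 - alpha) * sqrt(p0 * (1 - p0) / n)
    # Chance that p-hat lands past the cutoff when the truth is p_true.
    se_true = sqrt(p_true * (1 - p_true) / n)
    return 1 - Z.cdf((cutoff - p_true) / se_true)

# Assumed values for illustration only.
small = power_one_sided(p0=0.50, p_true=0.60, n=100, alpha=0.05)
big = power_one_sided(p0=0.50, p_true=0.60, n=400, alpha=0.01)
same_n = power_one_sided(p0=0.50, p_true=0.60, n=100, alpha=0.01)

print(f"n=100, alpha=0.05: power = {small:.3f}")
print(f"n=100, alpha=0.01: power = {same_n:.3f}")
print(f"n=400, alpha=0.01: power = {big:.3f}")
```

With n stuck at 100, dropping alpha drops the power too. With the bigger sample, alpha is smaller AND power is larger.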
Can you draw the alpha/beta/power diagram?
***** THINK OF ALPHA BETA POWER DIAGRAM*****
See page 486. Be able to draw and label this the way we do in class with the box and “RETAIN REJECT” up top and “Ho TRUE, Ho FALSE” on left.
Can you make a 100% confidence interval?
Sure, I’m 100% confident that it will snow between 0 and 500 feet tomorrow.
or I AM 100% confident that between 0% and 100% of people smoke.
tells you nothing
Can you prove a null hypothesis true?
NO
Evidence is for or not for the alternative
Null is just waiting to be rejected
Describe the distribution of a sample
It will look like the population. The distribution of a sample is a histogram made from the sample, which will look kind of like the population. If the population is bimodal, then the distribution of the sample is bimodal.
The SAMPLING distribution of a bunch of means, however, will look normalish.
Do parameters vary?
NO!!!
Statistics do.
statistics vary from sample to sample
PARAMETERS DO NOT VARY!
they are just stuck there. Over time they may change, but at a moment, they are stuck.
Do you use p-hat or p-null when you check the success/failure condition?
use p null
how are alpha and beta related?
***** THINK OF ALPHA BETA POWER DIAGRAM*****
as one increases, the other decreases, and vice versa
THEY DO NOT add up to one. We don’t know what they add to, just that they are on opposite sides of rejection threshold.
how are beta and power related
***** THINK OF ALPHA BETA POWER DIAGRAM*****
as one increases, the other decreases, and vice versa.
They have to because they BOTH ADD TO ONE!!!
Power + Beta = 1
How are power and alpha related?
***** THINK OF ALPHA BETA POWER DIAGRAM*****
they go up and down together
If you are testing to see if more students use tobacco now, and you find that there was not enough evidence to say that more do, even though more actually do now, what type of error did you make?
***** THINK OF ALPHA BETA POWER DIAGRAM*****
Type 2 error
If you are testing to see if more students use tobacco now, and you find that there was enough evidence to say that more do, but actually, there was not an increase, what type of error did you make?
***** THINK OF ALPHA BETA POWER DIAGRAM*****
Type 1 error
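A small simulation sketch of a Type 1 error, assuming an upper-tail one-proportion z-test. The null proportion 0.30, n = 200, and alpha = 0.05 are made-up values. Every sample here really does come from the null, so every rejection is a Type 1 error.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(4)  # fixed seed so the run is repeatable
p_null, n, alpha, reps = 0.30, 200, 0.05, 4000  # assumed values for illustration
sd_null = sqrt(p_null * (1 - p_null) / n)
Z = NormalDist()

rejections = 0
for _ in range(reps):
    # The null is TRUE here: each sample is drawn with p = p_null.
    p_hat = sum(random.random() < p_null for _ in range(n)) / n
    p_val = 1 - Z.cdf((p_hat - p_null) / sd_null)
    if p_val < alpha:
        rejections += 1  # rejecting a true null = Type 1 error

type1_rate = rejections / reps
print(f"Type 1 error rate: {type1_rate:.3f} (alpha was {alpha})")
```

The rate should land near alpha, not exactly on it, because counts are whole numbers and the normal model is only approximate.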
How can you decrease alpha and beta at the same time?
***** THINK OF ALPHA BETA POWER DIAGRAM*****
increase sample size.
this will also increase power
How can you increase power?
***** THINK OF ALPHA BETA POWER DIAGRAM*****
Increase alpha
or
increase sample size..
How do statistics from big samples compare to small?
Larger samples give statistics with less variability, so their statistics are closer to the parameter and to each other (the sampling distribution has a smaller standard error).
Statistics from smaller samples are more likely to be far away from true parameter.
How do you write conclusion if you fail to reject?
With a p-value this high, I fail to reject the null. There is not enough evidence to say that more students like eggs now.
How do you write conclusion if you reject?
With such a low p-value, I reject the null hypothesis. There is strong evidence that the proportion of students who eat rice has changed.
How else can you explain power?
The likelihood you correctly reject a false null.
likelihood you detect something that is there.
How is a confidence interval made?
statistic +- margin of error
Statistic +- (crit * SE)
Stand at the statistic, reach up and down a margin of error, and hope that you catch the parameter.
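Here is a minimal sketch of that recipe for one proportion, assuming a z critical value. The function name proportion_interval and the numbers (p-hat = 0.30, n = 400) are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(p_hat, n, confidence=0.95):
    """statistic +- (critical value * standard error), for one proportion."""
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)  # critical value
    se = sqrt(p_hat * (1 - p_hat) / n)                   # standard error of p-hat
    me = z_star * se                                     # margin of error
    return p_hat - me, p_hat + me

# Assumed sample: p-hat = 0.30 from n = 400.
low, high = proportion_interval(p_hat=0.30, n=400)
print(f"95% interval: ({low:.3f}, {high:.3f})")
```

The statistic sits in the middle, and the interval is 2 margins of error wide.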
How wide is a confidence interval?
It is 2 margins of error wide.
If the null is false, what is the only error you could make?
***** THINK OF ALPHA BETA POWER DIAGRAM*****
Type 2
If the null is true, what is the only error you could make?
***** THINK OF ALPHA BETA POWER DIAGRAM*****
Type 1
If you fail to reject, what is the only type of error you could make?
***** THINK OF ALPHA BETA POWER DIAGRAM*****
Type 2
If you reject, what is the only type of error you could make?
***** THINK OF ALPHA BETA POWER DIAGRAM*****
Type 1
N ( 15 , 8 ) what does this mean?
It means a NORMAL model centered at 15 with a standard deviation of 8.
One tail or 2 tailed? How do you tell?
if it just says “changed” or “different”..
Then it is 2 sided. DOUBLE THE P VALUE after normcdf!
If it says “more” “less than” “greater” etc.. Then it is just one sided..
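A quick sketch of the doubling rule, using Python's NormalDist in place of the calculator's normcdf. The helper name p_value is made up, and it assumes the statistic landed on the side that Ha points to.

```python
from statistics import NormalDist

def p_value(z, tails):
    """P-value for a z test statistic; tails is 1 or 2 (hypothetical helper)."""
    # Area beyond |z| in ONE tail of the standard normal model.
    # Assumption: for a one-sided test, z fell in the direction of Ha.
    one_tail = 1 - NormalDist().cdf(abs(z))
    return one_tail * tails

one_sided = p_value(1.8, tails=1)
two_sided = p_value(1.8, tails=2)
print(f"one-sided: {one_sided:.4f}   two-sided: {two_sided:.4f}")
```

With z = 1.8 and alpha = 0.05, the one-sided test rejects but the two-sided test does not, so reading the wording of the question matters.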
What are confidence intervals for?
They are an attempt to say what the true population parameter is..
It is our best guess. A parameter catcher.
“We think that there will be between 6 and 12 inches of snow?”
we may be wrong
What are the 3 steps in hypothesis testing AFTER YOU CHECK CONDITIONS?
- Make your Ho and Ha
- Make a Null Model (centered at null, use your Ho as center and in calculations, use your sample size).. This is a sampling distribution for the statistics if the null were true.
- THINK and DO MATH. See if your statistic may have come from the null model..
(p-hat, x-bar, phat1-phat2, xbar1-xbar2)
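The three steps can be sketched for one proportion like this. It is only an illustration: the function name one_proportion_z_test and the numbers (null 0.20, p-hat 0.26, n 200) are made up, and it assumes Ha says "more".

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(p_null, p_hat, n):
    """Upper-tail test of Ho: p = p_null against Ha: p > p_null."""
    # Step 2: null model N(p_null, root(p_null * q_null / n)). Uses p_null, not p-hat.
    sd_null = sqrt(p_null * (1 - p_null) / n)
    # Step 3: how many SDs above the null center did the statistic land?
    z = (p_hat - p_null) / sd_null
    p_val = 1 - NormalDist().cdf(z)  # chance of a p-hat this high or higher under the null
    return z, p_val

# Step 1 (assumed for illustration): Ho: p = 0.20, Ha: p > 0.20.
z, p_val = one_proportion_z_test(p_null=0.20, p_hat=0.26, n=200)
print(f"z = {z:.2f}, p-value = {p_val:.4f}")
```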
What are the conditions that have to be met in order to use a normal model for the distribution of sample proportions? (sampling distribution of proportions).. (the distribution of p-hats)..
- Randomization (this helps with the assumption of independence)
- SMALL ENOUGH SAMPLE … 10% condition (this is the upper limit of our sample size. above 10% of the population, the sampled values are no longer close to independent, so the real sampling distribution is thinner and taller than the normal model we would use)
- LARGE ENOUGH SAMPLE.. success/failure: np and nq > 10. this is the lower limit of our sample size. It is when the sampling distribution starts looking normal.
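The two conditions that are just arithmetic can be checked with a sketch like this (randomization can't be checked with math; you have to read how the data were collected). The function name is made up, and treating "at least 10" as the success/failure cutoff is an assumption.

```python
def conditions_for_normal_model(n, p, population_size):
    """Return (10% condition ok?, success/failure ok?) for a sample of size n."""
    # Upper limit on sample size: sample no more than 10% of the population.
    ten_percent_ok = n <= 0.10 * population_size
    # Lower limit on sample size: at least 10 expected successes and failures (assumed cutoff).
    success_failure_ok = n * p >= 10 and n * (1 - p) >= 10
    return ten_percent_ok, success_failure_ok

# Made-up scenarios for illustration.
print(conditions_for_normal_model(n=200, p=0.30, population_size=5000))
print(conditions_for_normal_model(n=50, p=0.10, population_size=10000))
print(conditions_for_normal_model(n=200, p=0.30, population_size=1500))
```

In a hypothesis test, the p passed in would be the null proportion.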
What are the mean and standard deviation of a sampling distribution for a mean?
mean is mu and
standard deviation is sigma/root n
(look at formula sheet)
N(mu, sigma/rootn)
What are the mean and standard deviation of a sampling distribution for a proportion?
mean is p
and standard deviation is root(pq/n)
(look at formula sheet)
N(p, root (pq/n) )
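One way to convince yourself of N(p, root(pq/n)) is to make a pile of p-hats and look at it. This is just a simulation sketch with made-up values (p = 0.30, n = 100, 5000 samples).

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(1)  # fixed seed so the run is repeatable
p, n, reps = 0.30, 100, 5000  # assumed values for illustration

def sample_p_hat(p, n):
    """Proportion of successes in one random sample of size n."""
    return sum(random.random() < p for _ in range(n)) / n

# A pile of statistics (p-hats), NOT a pile of data.
p_hats = [sample_p_hat(p, n) for _ in range(reps)]

model_sd = sqrt(p * (1 - p) / n)
print(f"mean of p-hats: {mean(p_hats):.4f}  (model says {p})")
print(f"SD of p-hats:   {stdev(p_hats):.4f}  (model says {model_sd:.4f})")
```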
What is a point estimate?
Your statistic.
You stand at the point estimate and reach up and down to make an interval
What are we confident in?
our confidence lies in our interval.
if we took another sample.. We’d have a different interval..
Your confidence interval is (.25, .35).
What is your margin of error?
0.05
Your confidence interval is (.25, .35). What is your statistic?
0.3 (UB+LB) / 2,
avg of the numbers,
in the middle
Your confidence interval is (.25, .35). What is your point estimate?
0.3
(UB+LB) / 2,
avg of the numbers,
in the middle
Your confidence interval is (.25, .35). What is your standard error?
it depends on the critical value.
need more info!
WHAT EQUATION HAS
INTERVAL WIDTH, Z CRIT and SE IN IT?
INTERVAL WIDTH = 2 (Z*) (SE)
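A sketch of working backwards from the interval (.25, .35). The point estimate and margin of error come straight from the endpoints, but the standard error needs a confidence level. The 95% used here is an assumption, and unpack_interval is a made-up name.

```python
from statistics import NormalDist

def unpack_interval(low, high, confidence):
    """Recover point estimate, margin of error, and SE from a z-interval."""
    point_estimate = (low + high) / 2       # the middle
    margin_of_error = (high - low) / 2      # half the width
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    standard_error = (high - low) / (2 * z_star)  # from WIDTH = 2 (Z*) (SE)
    return point_estimate, margin_of_error, standard_error

# ASSUMED confidence level of 95%; a different level gives a different SE.
est, me, se = unpack_interval(0.25, 0.35, confidence=0.95)
print(f"point estimate = {est:.2f}, margin of error = {me:.2f}, SE = {se:.4f}")
```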
what does 95% confidence interval mean?
(this is how you interpret the confidence LEVEL)
It means if we took a ton of samples, and made confidence intervals from each of them, ABOUT 95% of the intervals would contain the parameter, 5% would not.
It means 95% of the intervals produced by this process would catch the true parameter.
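A simulation sketch of "95% of the intervals produced by this process." The true proportion 0.40 and n = 200 are made up; we only get to know the parameter here because we invented the population.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(2)  # fixed seed so the run is repeatable
p_true, n, reps = 0.40, 200, 2000  # assumed values; in real life p_true is unknown
z_star = NormalDist().inv_cdf(0.975)  # critical value for 95% confidence

caught = 0
for _ in range(reps):
    # Take a new sample, build a new interval around its own p-hat.
    p_hat = sum(random.random() < p_true for _ in range(n)) / n
    me = z_star * sqrt(p_hat * (1 - p_hat) / n)
    if p_hat - me <= p_true <= p_hat + me:
        caught += 1  # this interval caught the parameter

coverage = caught / reps
print(f"{coverage:.1%} of the intervals caught the parameter")
```

Each sample gives a different interval. The parameter never moves; the intervals do.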
What does Central Limit Theorem Say?
It basically says.. NO MATTER WHAT SHAPE THE POPULATION IS (normal, bimodal, uniform, skewed, crazy.. ) If you make a histogram of a bunch of means taken from a bunch of samples, that histogram will be unimodal and symmetric WITH LARGE ENOUGH SAMPLES.. Close to normal. So.. A nerdy way to say it is: The sampling distribution of means is approximately normal no matter what the population is shaped like. The larger the sample size, the closer to normal. (the normal curve is just a model.. the sampling distribution is close to it, but not it! we use the model anyway!)
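A simulation sketch of the CLT, assuming a strongly right-skewed population (an exponential one with mu = 1 and sigma = 1, picked only for illustration) and samples of size 40.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(3)  # fixed seed so the run is repeatable
n, reps = 40, 5000  # assumed sample size and number of samples

def one_sample_mean(n):
    """Mean of one random sample from a right-skewed population (mu = 1, sigma = 1)."""
    return sum(random.expovariate(1.0) for _ in range(n)) / n

# A pile of MEANS, not a pile of data.
means = [one_sample_mean(n) for _ in range(reps)]
center, spread = mean(means), stdev(means)
# For a normal model, about 68% of values fall within 1 SD of the center.
within_one_sd = sum(abs(m - center) <= spread for m in means) / reps

print(f"center of the pile: {center:.3f}  (mu = 1)")
print(f"spread of the pile: {spread:.3f}  (sigma/root n = {1 / sqrt(n):.3f})")
print(f"share within 1 SD:  {within_one_sd:.1%}")
```

The population is nowhere near normal, but the pile of means behaves a lot like N(mu, sigma/root n).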
What does the CLT say about the distribution of actual sample data?
Nothing.
The sample will be distributed similar to the population.
The CLT only talks about SAMPLING distributions (histograms of sample statistics, like a pile of means), NOT OF INDIVIDUALS!!!! NOT DATA
What does CLT say about the distribution of the population?
Not much.
just that it doesn’t matter what it is..
It talks about sampLING distributions.
With large samples.. The SAMPLING dist will be approx normal (dist of stats.. NOT DATA)
What if you want more confidence with same size interval?
increase your sample size
What if you want more confidence?
get a bigger net.. (wider confidence interval) or increase sample size
What is “statistically significant” in hypothesis testing?
When the p-value is below alpha, we say “statistically significant”.. Low p-values are statistically significant. It means a result like ours would be unlikely to happen by random chance alone if the null were true.
What is a confidence interval?
it is a parameter catcher.. Like a fishing net.
We stand at our statistic, and reach up and down a margin of error, and hope to CATCH the parameter
sometimes we do, sometimes we don’t
but we never know..
Mooo hooo hooo haaaa haaa haaa (evil laugh)
What is a critical value?
It is the number of standard errors you’ll reach out, depending on your confidence (a t or z).
Example.. 68% crit z = 1
For 95% crit z = 2 (well, 1.96)..
For means.. Use t crits
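A short sketch for looking up z critical values with Python's standard library. The helper name z_star is made up. It only covers z; t critical values also depend on degrees of freedom and are not part of this sketch.

```python
from statistics import NormalDist

def z_star(confidence):
    """z critical value that leaves the middle `confidence` area under the normal model."""
    return NormalDist().inv_cdf((1 + confidence) / 2)

for level in (0.68, 0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z* = {z_star(level):.3f}")
```

More confidence means a bigger critical value, which means a longer reach.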
What is a margin of error?
critical value * standard error
It is how far you reach out UP AND DOWN a confidence interval..
You reach up and down one of these,
so the interval is actually 2 margins of error wide.
What is a null model?
It is a sampling distribution if the null were true.
It tells us how sample statistics would pile up and vary if the null were true.
It is centered at the null.
what is a parameter?
some numerical summary of a population. Often called “the parameter of interest.” It is what we are often trying to find.. It doesn’t vary. It is out there and STUCK at some value, it is the truth, and you’ll probably not ever know it!
We try to catch them in our confidence intervals, but sometimes we don’t (and we don’t know it!). It could be the mean of a population, the standard deviation of a population, the proportion of successes in a population, the slope calculated from a population, a difference of 2 means from 2 populations, a difference of 2 proportions from 2 populations
what is error?
distance from statistic to parameter, how far your sample statistic is off from the truth.
What is a p-value
It is the probability of getting your sample statistic, or a stranger (more extreme) one, by random chance if the null were true. Basically, how surprising your statistic would be if it came from the Null Model.
What is a standard error?
typical distance a statistic is from the parameter. Your expected sampling error. The average distance to the middle in a sampling distribution (pile of stats, not data). Called standard error because it is the typical error you would expect in a sample.
what is a statistic
some numerical summary of a sample.. Could be the mean of a sample, the standard deviation of a sample, the proportion of successes in a sample, the slope calculated from a sample, a difference of 2 means from 2 samples, a difference of 2 proportions from 2 samples, a difference of 2 slopes from 2 samples.. you can make sampling distributions for any of these, and they will all be centered around the parameter…
what is a test statistic?
a Z score, T score, (or chi squared) that you use to find a p value
What is alpha?
***THINK OF ALPHA BETA POWER DIAGRAM****
alpha = P(Type I error)
It is the rejection threshold. You reject p-values below it.. It is how willing you are to make a Type 1 error
What is beta?
***THINK OF ALPHA BETA POWER DIAGRAM****
P(Type II error)
It is probability that you’ll make a Type II error..
what is difference between assumptions and conditions?
Assumptions must be made in order to perform inference. We need to assume independent sample values and a large enough sample (but not too large). We check conditions to help support our assumptions.
What is power?
***THINK OF ALPHA BETA POWER DIAGRAM****
The probability that you will correctly reject a false null.
You detected something that was there.
What is sampling error?
same as sampling variability.
The natural variability between STATISTICS.
NOT DATA!!! .
We call it error EVEN THOUGH YOU MADE NO MISTAKES!!!
What is effect size?
difference between null and true parameter
something we don’t know
(but may be given in a tricky problem)
What is sampling variability?
The natural variation of sample statistics.. NOT DATA.. Samples vary
so do their statistics..
Parameters do not vary!
What is statistical inference?
Using a statistic to infer something about a parameter.. Basically, using a sample to say something about a population.
What is the difference between the distribution of a sample and a sampling distribution?
A distribution of a sample is just a histogram of the DATA in a sample. A sampling distribution is made from a bunch of sample STATISTICS. It is the distribution of the statistic that was calculated from those many many samples.
What is the Fundamental Theorem of Statistics?
The CLT!!
The Central Limit Theorem!
THINK OF Type 1 error?
“BUT I THOUGHT THINGS CHANGED”
or “BUT I THOUGHT IT WORKED”
or “BUT I THOUGHT YOU WERE SICK”
THINK OF Type 2 error?
“MISSED OPPORTUNITY”
“YOU ARE SICK, BUT WE MISSED IT”
“THE PROGRAM WORKED, BUT WE DIDN’T NOTICE”
THINK OF Power?
ability to detect change, or to detect what test was designed to detect.
POWER + BETA =
1
When we are looking at differences of proportions, what is the sampling distribution a distribution of?
You have to imagine taking a pair of samples, say.. Of girls and boys, subtracting phat girl-phat boy, and then writing that difference down. Do this over and over again, and you will have a list of differences. Now make a histogram of that list of differences, and that is your sampling distribution. A PILE OF DIFFERENCES. It is an imagined distribution of an infinite number of differences (of sample pairs)..
Where did the s.d. of differences of proportions that is on the formula sheet come from?
From the square root of the added variances of the sampling distributions of the 2 proportions
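A tiny sketch of "add the variances, then take the square root." The function name se_difference and the sample values are made up for illustration.

```python
from math import sqrt

def se_difference(p1, n1, p2, n2):
    """SD of the sampling distribution of (p-hat 1 - p-hat 2)."""
    var1 = p1 * (1 - p1) / n1  # variance of the first pile of p-hats
    var2 = p2 * (1 - p2) / n2  # variance of the second pile of p-hats
    return sqrt(var1 + var2)   # variances add; standard deviations do not

# Assumed values for illustration.
se = se_difference(p1=0.60, n1=100, p2=0.50, n2=100)
sum_of_sds = sqrt(0.60 * 0.40 / 100) + sqrt(0.50 * 0.50 / 100)
print(f"SD of the differences:       {se:.4f}")
print(f"(wrong) adding the two SDs:  {sum_of_sds:.4f}")
```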
How do you POOL with MEANS?
YOU DON’T in this class.
Will 95% of other statistics be within my interval?
NO!!! You have no idea where your interval is in regards to true parameter.
If you miraculously got a statistic that was EXACTLY the same as the parameter, then yeah.. but remember.. YOUR STATISTIC IS NOT SPECIAL.. IT IS ONE OF THE RANDOM P-HATS IN THE BIG PILE.. YOU DON’T KNOW WHICH ONE, YOU ARE REACHING OUT TRYING TO CATCH THE MIDDLE OF THE PILE!!
Is a confidence interval a PROBABILITY?
NO.
Don’t say “chance” or “probability” when interpreting confidence level.
What is difference between population of interest and parameter of interest?
Population is the subjects you are interested in (like goats)
Parameter is the actual number you want (like AVG barks per hour)
when do you need crits?
in confidence intervals
(and old fashioned hyp tests.)
Why does the book use ybar instead of xbar?
I don’t know
What is advantage of pooling?
Pooling allows you to increase your sample size sort of, and decrease your sampling variability. more precise.
It is your best guess at what is in the bucket also.
What is ALPHA + POWER = ?
dunno.
not 1 !!!
Difference between:
interpret confidence interval
interpret confidence level
To interpret the interval, you say “I’m 98% confident that between ___% and ___% of donkeys eat muffins now” (fill in the endpoints of your interval)
To interpret the LEVEL, you say “This process would make intervals that would catch the true % of donkeys that eat muffins now 98% of the time”
Logic behind rejecting a low p value
We are saying the following:
“This statistic was so unlikely to have happened randomly if the null was true, that I actually don’t think this deviation is random. I think the null is wrong, so I am rejecting it. I think things have changed”
When using intervals to do hypothesis tests:
If the null is in your interval, what can you say and why?
Not enough evidence. Fail to reject. The null is still a possibility based on my interval.
When looking at an interval of differences of two proportions, what do we look for inside?
We look for ZERO.
If ZERO is in the interval, we are saying we think that the difference may also be ZERO, so basically we are saying there is NO SIGNIFICANT DIFFERENCE!
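A sketch of "look for ZERO" with a two-proportion z-interval. The function names and the sample numbers are made up, and it uses the unpooled standard error.

```python
from math import sqrt
from statistics import NormalDist

def difference_interval(p1_hat, n1, p2_hat, n2, confidence=0.95):
    """Interval for p1 - p2: difference +- (z* times SE of the difference)."""
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - z_star * se, diff + z_star * se

def zero_inside(interval):
    """True if a difference of ZERO is still plausible."""
    low, high = interval
    return low <= 0 <= high

# Same sample proportions, two assumed sample sizes.
small = difference_interval(0.60, 100, 0.50, 100)
large = difference_interval(0.60, 400, 0.50, 400)
print(f"n = 100 each: ({small[0]:.3f}, {small[1]:.3f})  zero inside? {zero_inside(small)}")
print(f"n = 400 each: ({large[0]:.3f}, {large[1]:.3f})  zero inside? {zero_inside(large)}")
```

With the small samples zero is in the net, so no significant difference. With the larger samples the same 0.10 gap leaves zero outside.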
Why is it called “standard error?” How is it different from “standard deviation?”
The standard deviation is the typical deviation a data value is from the mean. Like an expected distance to the mean. We are talking about data with standard deviation. The individual data “deviates” from the mean. PILE OF DATA
Standard Error is the typical error a statistic will be from the parameter. Like an expected error from sampling. We are talking about statistics with the standard error. A sample statistic has “error” from the true parameter. PILE OF STATISTICS
When using intervals to do hypothesis tests:
If the null is NOT in your interval, what can you say and why?
You can reject the null.
You are pretty darn sure that it is not an option any more.
Think about it.. You are “95% confident that it is something else!”