DECK 12 INFERENCE MIXED Flashcards
what is a biased estimator?
When the sampling distribution (pile of sample stats, x bars or p hats) is NOT centered on the true population parameter. If you were weighing people and there was a 1 pound weight on the scale, the pile would be centered 1 pound higher. Biased.
What are the three chi-squared models?
goodness of fit, test for homogeneity, test for independence
What is a critical value?
It is the number of standard errors you’ll reach out, depending on your confidence (a t or z). Example.. 68% crit z = 1 .. For 95% crit z = 2 (well, 1.96).. For means.. Use t crits
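A quick sketch of where those z crits come from, using only Python's standard library. The helper name `z_crit` is made up for this card; it just converts a confidence level into the area left in each tail and looks up the matching z.

```python
from statistics import NormalDist

def z_crit(confidence):
    """Critical z for a two-sided interval at the given confidence level."""
    tail = (1 - confidence) / 2            # area left over in EACH tail
    return NormalDist().inv_cdf(1 - tail)  # z with that much area above it

for level in (0.68, 0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence -> z* = {z_crit(level):.3f}")
```

This only covers z crits. A t crit also needs degrees of freedom, so it comes from a t table or calculator instead.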
you reject when _____________ evidence
you reject when YOU HAVE EVIDENCE
In order to reject a null hypothesis, you need ___________
evidence
What is the missed opportunity error? (the “I didn’t notice” error)
Type 2
What is alpha?
It is the rejection area. Generally, we use .05. The significance level.
What is a sampLING distribution?
a pile of statistics. A pile of p-hats or x-bars.
What is the null model (the sampling distribution) in a 2 sample mean t-test?
a pile of differences of TWO MEANS samples, taken from a bunch of PAIRS of samples. Take two samples, calculate two means, subtract to get a difference, PUT THE DIFFERENCE IN THE PILE.
What is the NULL HYPOTHESIS?
The DULL HYPOTHESIS, the nothing changed hypothesis, the no-difference hypothesis, the “he’s telling the truth” hypothesis, the “No trickery” hypothesis
How is a margin of error different from a standard error?
A margin of error is a NUMBER OF STANDARD ERRORS (crit value times SE). It is how far up or down you go in a confidence interval. A standard error tells you about the spread of a pile of statistics (a sampling distribution).
What is the difference between standard error and standard deviation?
Standard error is the typical distance a STATISTIC is from the mean in a sampling distribution (pile of a bunch of samples’ statistics) and Standard Deviation is the typical distance a DATUM is from the mean in a pile of raw data.
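A small sketch of the two calculations side by side. The data list is hypothetical (made-up heights in inches), and the standard error shown is the usual estimate for a sample mean, s divided by the square root of n.

```python
from math import sqrt
from statistics import stdev

data = [61, 64, 66, 67, 69, 70, 72, 75]   # hypothetical raw data (heights, inches)
n = len(data)

s = stdev(data)       # standard deviation: typical distance of a DATUM from the mean
se = s / sqrt(n)      # standard error: estimated spread of the pile of x-bars

print(f"s = {s:.2f}   SE = {se:.2f}")
```

The SE is always smaller than s for n > 1, because averages bounce around less than individuals do.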
For both a chi squared test for independence and a regression t test you are looking for an association, how do the null hypotheses differ?
The chi squared will be in words.. Ho: the variables are not associated. The regression t test will be with symbols (and words). Ho: Beta = 0 (the slope is zero) . Saying “beta=0” is the same as saying “there is no association”
when do you need crits?
in confidence intervals (and old fashioned hyp tests.. We look at Z to see if greater than crit.)
you fail to reject when ____________ evidence
you fail to reject when you DON’T HAVE EVIDENCE
notation: what is x-bar
mean of your sample
What is the null for a chi squared test for homogeneity?
The [samples of —] are similarly distributed.
How is a paired T test different from a 2 sample mean T test?
A paired test talks about an AVERAGE OF DIFFERENCES from one list, whereas a 2 sample mean t-test talks about a DIFFERENCE OF AVERAGES between two samples.
when is data “paired”
when you have 2 measurements of the same variable on the same subject (or matched subjects)
What is the null for a 2 sample mean T?
mu1=mu2 OR mu1-mu2=0 there is no diff
When you are doing PAIRED or MATCHED or BLOCKED tests.. What are you finding?
The average difference.. You are doing 1 sample procedures on a NEW THIRD LIST OF DIFFERENCES
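Here is a sketch of building that third list and running the one-sample arithmetic on it. The `before` and `after` numbers are invented for illustration. The last lines show what the standard error would look like if the same numbers were (wrongly) treated as two independent samples.

```python
from math import sqrt
from statistics import mean, stdev

before = [200, 185, 172, 210, 195, 188]   # hypothetical first measurements
after = [194, 181, 170, 201, 190, 185]    # same subjects, measured again

diffs = [b - a for b, a in zip(before, after)]    # the NEW THIRD LIST
d_bar = mean(diffs)                               # average of differences
se_paired = stdev(diffs) / sqrt(len(diffs))       # SE from the one list of differences
t_paired = d_bar / se_paired                      # t statistic for Ho: mean difference = 0
df = len(diffs) - 1                               # paired: n - 1

# Same data treated as two independent samples (not appropriate here, shown for contrast)
se_two_sample = sqrt(stdev(before) ** 2 / len(before) + stdev(after) ** 2 / len(after))

print(f"d-bar = {d_bar:.2f}, paired SE = {se_paired:.2f}, t = {t_paired:.2f}, df = {df}")
print(f"two-sample SE would be {se_two_sample:.2f}")
```

The average of the differences equals the difference of the averages, but in this made-up data the paired SE is much smaller, because pairing removes the subject-to-subject variation.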
What is a confidence interval?
it is a parameter catcher.. Like a fishing net. We stand at our statistic, and reach up and down a margin of error, and hope to CATCH the parameter.. sometimes we do, sometimes we don’t.. but we never know.. Mooo hooo hooo haaaa haaa haaa (evil laugh)
What does CLT say about the distribution of the population?
Not much… just that it doesn’t matter what it is.. With large samples.. The SAMPLING dist will be approx normal (dist of stats.. NOT DATA)
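A simulation sketch of this idea, under made-up assumptions: the population is right-skewed (exponential with mean 10), and the pile is 2,000 sample means from samples of size 40. The function name `pile_of_means` is invented for this card.

```python
import random
from statistics import mean, stdev

random.seed(1)

# Hypothetical right-skewed population: exponential with mean 10 (NOT normal)
population = [random.expovariate(1 / 10) for _ in range(50_000)]

def pile_of_means(sample_size, num_samples=2_000):
    """Take many samples, keep only each sample's mean: a pile of x-bars."""
    return [mean(random.sample(population, sample_size)) for _ in range(num_samples)]

pile = pile_of_means(40)

print(f"population: mean = {mean(population):.2f}, sd = {stdev(population):.2f}")
print(f"pile of x-bars: mean = {mean(pile):.2f}, sd = {stdev(pile):.2f}")
```

The pile is centered near the population mean and is much narrower than the population. A histogram of `pile` should look roughly normal even though a histogram of `population` does not.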
What is difference between population of interest and parameter of interest?
Population is the WHO (the subjects you measure: beads, people). Parameter is the actual number you want (like % of or AVG)
what is a parameter?
some numerical summary of a population. Often called “the parameter of interest.” It is what we are often trying to find.. It doesn’t vary. It is out there and STUCK at some value, it is the truth, and you’ll probably not ever know it! We try to catch them in our confidence intervals, but sometimes we don’t (and we don’t know it!). It could be the mean of a population, the standard deviation of a population, the proportion of successes in a population, the slope calculated from a population, a difference of 2 means from 2 populations, a difference of 2 proportions from 2 populations
How can you tell if it is a T or a Z procedure?
YES-NO-PROP-Z. Remember t for means, z for proportions. Think of the subjects. Could you get the info in a yes/no fashion? if so, then z-props. Do you need to get a number from each subject? if so, then t-means.
Can you make a 100% confidence interval?
Sure, I’m 100% confident that it will snow between 0 and 500 feet tomorrow.
How do you find Margin of Error from an interval?
It is half the width.. (HI-LO divided by 2) Remember you stand at statistic (point estimate) and reach up and down a Margin of Error. So an interval is always exactly 2 margins of error wide.
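The arithmetic as a tiny sketch. The function name `unpack_interval` and the example interval (0.42, 0.50) are made up for illustration.

```python
def unpack_interval(lo, hi):
    """Recover the point estimate and margin of error from an interval's endpoints."""
    point_estimate = (lo + hi) / 2    # dead center of the interval
    margin_of_error = (hi - lo) / 2   # half the width
    return point_estimate, margin_of_error

est, me = unpack_interval(0.42, 0.50)
print(f"point estimate = {est:.2f}, margin of error = {me:.2f}")
```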
what does “95% confidence” in a 95% confidence interval mean? (explain the confidence level)
It means if we took a ton of samples, and made confidence intervals from each of them, ABOUT 95% of the intervals would contain the parameter, about 5% would not.
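A simulation sketch of "a ton of samples." Everything here is assumed for illustration: the true proportion is set to 0.30 (we only know it because we are simulating), each sample has 200 subjects, and each interval is the usual one-proportion z interval.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(2)

TRUE_P = 0.30                           # hypothetical parameter, known only in a simulation
N = 200                                 # subjects per sample
Z_STAR = NormalDist().inv_cdf(0.975)    # crit z for 95% confidence

def one_interval():
    """Draw one sample and build one 95% z interval for the proportion."""
    successes = sum(random.random() < TRUE_P for _ in range(N))
    p_hat = successes / N
    me = Z_STAR * sqrt(p_hat * (1 - p_hat) / N)
    return p_hat - me, p_hat + me

intervals = [one_interval() for _ in range(2_000)]
caught = sum(lo <= TRUE_P <= hi for lo, hi in intervals)
coverage = caught / len(intervals)

print(f"{coverage:.1%} of the intervals caught the parameter")
```

The coverage should land near 95%, not exactly on it. Any single interval either caught the parameter or missed it.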
sample size calcs FOR PROP AND MEANS
Props: n = (z^2 * p * q) / (ME^2). Means: n = (t * s / ME)^2 (start with Z then do T)
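A sketch of both formulas in code. The function names are invented for this card. The proportion version defaults to p = 0.5, which is the cautious choice when you have no prior guess. The mean version is only the first pass with z; the t crit depends on n, so you would redo it with t once you have a rough n.

```python
from math import ceil
from statistics import NormalDist

def _z_star(confidence):
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def n_for_proportion(me, confidence=0.95, p=0.5):
    """n = z^2 * p * q / ME^2, rounded UP to a whole subject."""
    z = _z_star(confidence)
    return ceil(z ** 2 * p * (1 - p) / me ** 2)

def n_for_mean_first_pass(me, s, confidence=0.95):
    """n = (z * s / ME)^2, a first pass using z in place of t."""
    z = _z_star(confidence)
    return ceil((z * s / me) ** 2)

print(n_for_proportion(me=0.03))              # 3% margin of error at 95% confidence
print(n_for_mean_first_pass(me=2, s=10))      # s = 10 is a hypothetical guess at the SD
```

Always round up, since rounding down would leave the margin of error slightly too big.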
What does the CLT say about the distribution of actual sample data?
Nothing. The sample will be distributed similarly to the population. Bimodal populations have bimodal samples. The CLT only talks about distributions (histograms) of sample statistics (summaries, like piles of means), NOT OF INDIVIDUALS!!!! NOT DATA
Which hypothesis shows what you are trying to prove?
The alternative.
What happens to a pile of statistics if you take larger samples?
All of the x-bars or all of the p-hats will get closer to each other, and closer to the parameter (mu or p)
If you are doing a 2 tailed test with alpha=.05.. What confidence interval goes with that?
95% confidence interval (there is .025 in each tail)
who invented the t model?
Bill Gosset, Guinness brewing company.
how do you find deg freedom?
n-1 for one sample, for 2 samples you must use calculator. For PAIRED use n-1, REGRESSION IS n-2
What is beta?
It is probability that you’ll make a Type II error.. P(Type II error)
What is “statistically significant?”
When our sample statistic is so far away from what we were expecting that we don’t think that it was due to random sampling error. Then it is statistically significant. When p-value is below the alpha, we say “statistically significant”.. Low p-values are statistically significant.
how are t models like Normal models?
both are unimodal and symmetric. T models aren’t as high and have more area in tails, that’s why you have to reach out a little further than z for same confidence.
What is a way to think about the three conditions?
1. Sample is random. 2. Sample is small enough (<10% of the population). 3. Sample is large enough (np & nq > 10 for props, n > 30 for means or the histogram is normalish). EXTRAS: chi squared expected counts at least 5 in each cell; regression: random residuals.
What is the null model(the sampling distribution) in a 1-sample mean T test?
A pile of means from a bunch of samples.
notation: what is mu
true population mean (average)
notation: what is z*
critical z, how many SE you are reaching up and down in a confidence interval for proportions
What is the null for a chi squared GOF test?
The distribution fits [the expected distribution]
How do you find point estimate from an interval?
It is in the dead center of interval, so take the average of the upper and lower bounds.
How do you find df in 2 samples?
USE CALCULATOR (or smaller sample - 1). You have to run an interval or a test on your TI and read the output (unless you want to use the equation).
What if you want more confidence with same size interval?
increase your sample size
Can you draw the alpha/beta/power diagram?
BE ABLE TO SKETCH THE ALPHA BETA POWER DIAGRAM from the original pregnancy worksheet. Know where everything is. This helps you understand how alpha, beta and power interact.
How else can you explain power?
The likelihood you correctly reject a false null.. The likelihood you correctly detect what you were trying to detect
What is a null model?
It is a sampling distribution. It tells us how sample statistics would vary if the null were true. It is centered at the null. A pile of p-hats or x-bars.
notation: what is t*
critical t, how many SE you are reaching up and down in a confidence interval for means
Describe the distribution of a sample
It will look like the population. The distribution of a sample is a histogram made from the sample, which will look kind of like the population. If the population is bimodal, then the distribution of the sample is bimodal. The SAMPLING distribution of a bunch of means, however, will look normalish.
What is an unbiased estimator?
When the sampling distribution (pile of sample stats) is centered on the true population parameter.
What is the “you think it worked but it didn’t” error?
Type 1
What is a point estimate?
Your p-hat or your x-bar. Your best guess. What you got in your sample. It is in the middle of the interval.
how can you decide the right test? What are the 3 questions?
1 or 2 samples? Proportions (z) or Means (t)? Test or Interval?(YES/NO/PROP/Z)
If the p-value is low, (below alpha), how do you write conclusion?
With p-value this low (show p value < alpha) I reject the null hypothesis. There is strong evidence that the proportion of students who eat rice has changed.
What is a t-crit?
It is the same idea as z crit, but from a t model. It is the number of SE you reach out in your CI. To find it, do INVT(area in one tail, degrees of freedom)
How are power and alpha related?
they go up and down together
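A sketch of why, using the simplest case: a one-sided z test for a mean with a known population SD. All the numbers (true effect of 2, sigma of 10, n of 50) are made up, and `power_one_sided` is a name invented for this card.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()

def power_one_sided(alpha, effect, sigma, n):
    """Power of a one-sided z test (Ha: mu > mu0) when the true mean is mu0 + effect."""
    se = sigma / sqrt(n)
    cutoff = Z.inv_cdf(1 - alpha)             # reject when the z statistic is above this
    return 1 - Z.cdf(cutoff - effect / se)    # chance of landing in the rejection area

for alpha in (0.01, 0.05, 0.10):
    power = power_one_sided(alpha, effect=2, sigma=10, n=50)
    print(f"alpha = {alpha:.2f}  power = {power:.3f}  beta = {1 - power:.3f}")
```

A bigger alpha moves the cutoff closer to the null, so more of the true distribution lands in the rejection area: power goes up and beta goes down.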
What are we confident in?
our confidence lies in our interval. if we took another sample.. We’d have a different interval..