CHAPTER 7 and 8 VOCAB Flashcards

1
Q

What is the difference between the distribution of a sample and a sampling distribution?

A

A distribution of a sample is just a histogram of the DATA in a sample. A sampling distribution is made from a bunch of sample STATISTICS. It is the distribution of the statistic calculated from each of many, many samples.

2
Q

Are models what really happens?

A

No. A model train is not a real train. We use models to describe approximately what happens.

3
Q

What does CLT say about the distribution of the population?

A

Just that it doesn’t matter what it is. With large samples, the SAMPLING distribution (the distribution of statistics, NOT DATA) will be approximately normal for ANY SHAPED population distribution.

4
Q

What are the mean and standard deviation of a sampling distribution for a proportion?

A

The mean is p and the standard deviation is root(pq/n), where q = 1 - p (look at formula sheet): N(p, root(pq/n))
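Here is a minimal sketch of that formula in Python. The function name and the example values (p = 0.5, n = 100) are made up for illustration.

```python
import math

def proportion_sampling_sd(p, n):
    """SD of the sampling distribution of p-hat: root(pq/n)."""
    q = 1 - p  # q is the proportion of failures
    return math.sqrt(p * q / n)

# Hypothetical example: true proportion 0.5, samples of size 100
sd = proportion_sampling_sd(0.5, 100)
print(sd)  # the model is then N(0.5, sd)
```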

5
Q

What does Central Limit Theorem Say?

A

It basically says: NO MATTER WHAT SHAPE THE POPULATION IS (normal, bimodal, uniform, skewed, crazy...), if you make a histogram of a bunch of means taken from a bunch of samples, that histogram will be unimodal and symmetric, close to normal, WITH LARGE ENOUGH SAMPLES. So a nerdy way to say it is: The sampling distribution of means is approximately normal no matter what the population is shaped like. The larger the sample size, the closer to normal. (The normal curve is just a model. The sampling distribution is close to it, but not it! We use the model anyway!)
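A quick simulation can make this concrete. The sketch below assumes a strongly right-skewed population (exponential with mean 1 and SD 1), samples of size 40, and 5000 repetitions. All of those choices are illustrative, not part of the theorem.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

def sample_means(n, reps):
    """Take `reps` samples of size n from a skewed population; return each sample's mean."""
    means = []
    for _ in range(reps):
        sample = [random.expovariate(1.0) for _ in range(n)]  # skewed DATA
        means.append(statistics.fmean(sample))                # one STATISTIC per sample
    return means

means = sample_means(n=40, reps=5000)
center = statistics.fmean(means)   # should land near mu = 1
spread = statistics.stdev(means)   # should land near sigma/root(n) = 1/root(40)

# In a normal model about 68% of values fall within 1 SD of the center
within_one_sd = sum(abs(m - center) <= spread for m in means) / len(means)
print(center, spread, within_one_sd)
```

The pile of means comes out centered near 1 with a spread near 0.158, even though the individual data values are badly skewed.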

6
Q

What is difference between population of interest and parameter of interest?

A

Population is the WHO (the subjects you measure: beads, trees, people). Parameter is the actual number you want (like % of red beads, avg height of trees, or % brown eyes).

7
Q

What does the CLT say about the distribution of actual sample data?

A

Nothing. The sample will be distributed similarly to the population. The CLT only talks about samplING distributions, the distributions (histograms) of sample statistics such as groups of means, NOT OF INDIVIDUALS!!!! NOT DATA

8
Q

N ( ?1 , ?2 ) what does this mean?

A

it means NORMAL model centered at ?1 With a standard deviation of ?2

9
Q

Describe the distribution of a sample

A

It will look like the population. The distribution of a sample is a histogram made from the sample DATA, which will look kind of like the population. If the population is bimodal, then the distribution of the sample is bimodal. The larger the sample, the more it will look like the population. The SAMPLING distribution of a bunch of means, however, will look normalish.

10
Q

Why is randomization a condition?

A

Because we understand randomness. We study it. We call it probability.

11
Q

How do statistics from big samples compare to small? (notice this doesn’t ask about DATA)

A

Statistics from larger samples have less variability, so they are closer to the parameter and to each other. Statistics from smaller samples are more variable and more likely to be far away from the true parameter.
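A small simulation sketch shows the idea. It assumes a made-up true proportion of 0.5 and compares piles of p-hats from samples of size 25 and size 400. The helper names are hypothetical.

```python
import random
import statistics

random.seed(2)  # fixed seed so the run is repeatable

def p_hat(p, n):
    """Proportion of successes in one random sample of size n."""
    return sum(random.random() < p for _ in range(n)) / n

def spread_of_p_hats(p, n, reps=2000):
    """SD of a pile of p-hats, each from its own sample of size n."""
    return statistics.stdev([p_hat(p, n) for _ in range(reps)])

spread_small = spread_of_p_hats(0.5, 25)    # theory: root(.25/25) = 0.1
spread_large = spread_of_p_hats(0.5, 400)   # theory: root(.25/400) = 0.025
print(spread_small, spread_large)
```

The parameter (0.5) never moves. Only the statistics bounce around, and they bounce less with the bigger samples.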

12
Q

Do parameters vary?

A

NO!!! There is only one. Statistics vary. they vary from sample to sample. PARAMETERS DO NOT VARY!

13
Q

What are the conditions that have to be met in order to use a normal model for the distribution of sample proportions? (sampling distribution of proportions).. (the distribution of p-hats)..

A
  1. RANDOMIZATION: (this helps with the assumption of independence)
  2. SMALL ENOUGH SAMPLE: 10% condition (this is the upper limit of our sample size. Above this, draws made without replacement are no longer close to independent, and the actual sampling distribution is narrower than the root(pq/n) formula says)
  3. LARGE ENOUGH SAMPLE: success/failure: np and nq > 10. This is the lower limit of our sample size. This is when the sampling distribution starts looking normal. FOR 2 SAMPLES YOU NEED BOTH SAMPLES TO MEET THESE REQUIREMENTS!
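The three checks above can be sketched as a small helper. The function name, its arguments, and the example numbers are hypothetical, and randomization is something you judge from the study design, so it is simply passed in as True or False. Some textbooks use "at least 10" instead of "> 10" for success/failure.

```python
def check_proportion_conditions(n, p, population_size, randomized):
    """Return which conditions for the normal model of p-hat are met."""
    q = 1 - p
    return {
        "randomization": randomized,                      # judged from the design
        "ten_percent": n <= 0.10 * population_size,       # sample is small enough
        "success_failure": n * p > 10 and n * q > 10,     # sample is large enough
    }

# Hypothetical example: 200 people sampled from 10,000, expected p = 0.3
ok = check_proportion_conditions(n=200, p=0.3, population_size=10_000, randomized=True)
# Too small a sample for a rare outcome: np = 2
too_small = check_proportion_conditions(n=20, p=0.1, population_size=10_000, randomized=True)
print(ok)
print(too_small)
```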
14
Q

what is a statistic

A

Some numerical summary of a sample. Could be the mean of a sample, the standard deviation of a sample, the proportion of successes in a sample, the slope calculated from a sample, a difference of 2 means from 2 samples, a difference of 2 proportions from 2 samples, a difference of 2 slopes from 2 samples. You can make sampling distributions for any of these, and for unbiased statistics they will be centered on the parameter.

15
Q

what is a parameter?

A

Some numerical summary of a population. Often called “the parameter of interest.” It is what we are often trying to find. It doesn’t vary. It is out there and STUCK at some value, it is the truth, and you’ll probably not ever know it! We try to catch them in our confidence intervals, but sometimes we don’t (and we don’t know it!). It could be the mean of a population, the standard deviation of a population, the proportion of successes in a population, the slope calculated from a population, a difference of 2 means from 2 populations, a difference of 2 proportions from 2 populations.

16
Q

What is the Fundamental Theorem of Statistics? (just the name of it)

A

The CLT.

The Central Limit Theorem!

17
Q

What is sampling variability?

A

same as sampling error.

The natural variation of sample statistics.. NOT DATA.. Samples vary and so do their statistics.. Parameters do not vary.

18
Q

What is sampling error?

A

same as sampling variability..

The natural variability between STATISTICS, NOT DATA!!! We call it error EVEN THOUGH YOU MADE NO MISTAKES!!!

19
Q

What is an unbiased estimator?

A

When the sampling distribution (pile of sample stats) is centered on the true population parameter.

20
Q

what is a biased estimator?

A

When the sampling distribution (pile of sample stats) is NOT centered on the true population parameter. Like if you only weighed students in the men’s room to find average weight of all students. That would be a biased estimator. Or if you use the population SD (divide by n) formula when you have a sample. It will underestimate the true parameter. That’s why we divide by n-1.
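The n versus n-1 point can be seen in a simulation. This sketch works with the variance (SD squared), since that is where dividing by n-1 removes the bias exactly. The population (normal with variance 1), the sample size of 5, and the number of repetitions are all assumptions chosen for illustration.

```python
import random
import statistics

random.seed(4)  # fixed seed so the run is repeatable

N_PER_SAMPLE = 5
REPS = 10_000

divide_by_n = []
divide_by_n_minus_1 = []
for _ in range(REPS):
    sample = [random.gauss(0, 1) for _ in range(N_PER_SAMPLE)]  # true variance is 1
    divide_by_n.append(statistics.pvariance(sample))            # population formula (divide by n)
    divide_by_n_minus_1.append(statistics.variance(sample))     # sample formula (divide by n-1)

avg_n = statistics.fmean(divide_by_n)                   # piles up below 1 (biased low)
avg_n_minus_1 = statistics.fmean(divide_by_n_minus_1)   # piles up around 1 (unbiased)
print(avg_n, avg_n_minus_1)
```

With samples of 5, the divide-by-n pile centers near 0.8 instead of 1, which is the underestimate the card describes.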

21
Q

What are the mean and standard deviation of a sampling distribution for a mean?

A

The mean is mu and the standard deviation is sigma/root(n) (look at formula sheet): N(mu, sigma/root(n))

22
Q

What if you want more confidence in your interval?

A

Get a bigger net. Increasing your confidence makes the interval wider (or you could increase the sample size and keep the same net).

23
Q

What is “statistically significant?”

A

When our observed statistic was so far from what we were expecting that we think something weird is going on. Wow factor. When you are like “WOW, that’s strange.” When the p-value is below alpha, we say “statistically significant.” Low p-values are statistically significant. When a result like ours would be very unlikely to happen by random chance alone, that is statistically significant.

24
Q

when do you need crits?

A

in confidence intervals (and old fashioned hyp tests.. We look at Z to see if greater than crit.)

25
Q

what is difference between assumptions and conditions?

A

Assumptions must be made in order to perform inference. We need to assume independent sample values and appropriate sample size. We check conditions to help support our assumptions.

26
Q

how do you find z and t crit?

A

for z crit.. INVNORM(area in 1 tail)

for t crit. INVT(area in 1 tail, deg freedom)
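Off the calculator, the z crit can be sketched with Python's standard library, where `NormalDist().inv_cdf` plays the role of INVNORM. The function name `z_crit` is made up here. The standard library has no inverse t function, so the t crit would need a calculator or a stats package.

```python
from statistics import NormalDist

def z_crit(confidence):
    """Critical z for a two-sided interval at the given confidence level."""
    tail = (1 - confidence) / 2           # area in ONE tail
    return -NormalDist().inv_cdf(tail)    # inv_cdf of the lower tail is negative, so flip the sign

z95 = z_crit(0.95)
z99 = z_crit(0.99)
print(z95, z99)
```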

27
Q

What is a margin of error?

A

critical * SE.

It is how far you reach out in a confidence interval.. You reach up and down one of these, so the interval is actually 2 margins of error wide.

28
Q

How many SD wide is a margin of error?

A

It depends on the level of confidence. The margin of error is one critical value’s worth of SE (z crit * SE). Because you reach up that much and down that much, the whole INTERVAL is DOUBLE Z CRIT standard errors wide!!

29
Q

How is a confidence interval made?

A

statistic +- margin of error OR THINK Statistic +- (crit * SE). Stand at the statistic, reach out up and down a margin of error, and hope that you catch the parameter.
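As a worked sketch for a proportion: the example below assumes a made-up sample with p-hat = 0.6 and n = 100, and uses a z crit because it is a proportion. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def proportion_interval(p_hat, n, confidence=0.95):
    """statistic +- (crit * SE) for one sample proportion."""
    crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z crit
    se = math.sqrt(p_hat * (1 - p_hat) / n)                # standard error of p-hat
    me = crit * se                                         # margin of error
    return p_hat - me, p_hat + me

low, high = proportion_interval(0.6, 100)
print(low, high)  # roughly 0.504 to 0.696
```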

30
Q

What is a standard error?

A

It is the guessed standard deviation of the pile of statistics. It can be thought of as “the average error” or the typical distance your statistic will be from the true parameter.

The SD of a sampLING distribution.

31
Q

What is a critical value?

A

It is the number of standard errors you’ll reach out, depending on your confidence (t or z). Example.. 68% crit z = 1 .. For 95% crit z = 2 (well, 1.96).. For means.. Use t crits

32
Q

what does 95% confidence interval mean?

A

It means if we took a ton of samples, and made confidence intervals from each of them, ABOUT 95% of the intervals would contain the parameter, 5% would not.
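A simulation sketch of that idea: it pretends we know the true proportion (0.3, a made-up value), takes 2000 samples of size 200, builds a 95% interval from each, and counts how many catch the parameter.

```python
import math
import random
from statistics import NormalDist

random.seed(3)  # fixed seed so the run is repeatable

TRUE_P = 0.3    # the parameter (normally unknown!)
N = 200
TRIALS = 2000
z = NormalDist().inv_cdf(0.975)  # z crit for 95%

caught = 0
for _ in range(TRIALS):
    successes = sum(random.random() < TRUE_P for _ in range(N))
    p_hat = successes / N
    me = z * math.sqrt(p_hat * (1 - p_hat) / N)
    if p_hat - me <= TRUE_P <= p_hat + me:
        caught += 1   # this interval caught the parameter

coverage = caught / TRIALS
print(coverage)  # should be in the neighborhood of 0.95
```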

33
Q

What is statistical inference?

A

Using a statistic to infer something about a parameter.. Basically, using a sample to say something about a population.

34
Q

How wide is a confidence interval? (how many ME?)

A

It is 2 margins of error wide ALWAYS

(DON’T CONFUSE WITH NUMBER OF SE)

35
Q

What is a confidence interval?

A

It is a parameter catcher. Like a fishing net. We stand at our statistic, and reach up and down a margin of error, and hope to CATCH the parameter. Sometimes we do, sometimes we don’t, but we never know. Mooo hooo hooo haaaa haaa haaa (evil laugh)

36
Q

Can you make a 100% confidence interval?

A

Sure, I’m 100% confident that it will snow between 0 and 500 feet tomorrow.

37
Q

What is the problem with a large confidence interval?

A

We lose precision. It doesn’t say much. I’m 99.9% confident that between 12% and 86% of people like oreos.

38
Q

Is a confidence interval a PROBABILITY?

A

NO

39
Q

What are we confident in?

A

our confidence lies in our interval. if we took another sample. We’d have a different interval..

40
Q

What is sample size formula for proportions?

A

n= (z^2 * p * q )/ (ME ^2)
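A sketch of the formula in code. It assumes p = 0.5 when no earlier estimate is available (a common conservative choice, not something from this card) and always rounds UP to a whole subject. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(me, confidence=0.95, p=0.5):
    """n = (z^2 * p * q) / ME^2, rounded up."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    q = 1 - p
    return math.ceil(z ** 2 * p * q / me ** 2)

# Hypothetical example: margin of error of 3 percentage points at 95% confidence
n_needed = sample_size_for_proportion(0.03)
print(n_needed)
```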

41
Q

Will 95% of other statistics be within my interval?

A

NO!!! You have no idea where your interval is in regards to true parameter or the pile of p-hats or x-bars. You don’t know what statistic you have!

42
Q

What does PANIC A stand for?

A

Parameter, Assumptions, Name the test, Interval, Conclusion in context, Answer the question

43
Q

What if you want more confidence with the same size interval?

A

increase your sample size

44
Q

What are confidence intervals for?

A

PARAMETER CATCHERS. They are an attempt to say what the true population parameter is.. It is our best guess. “We think that there will be between 8 and 12 inches of snow”

45
Q

What are the conditions that have to be met in order to use a t-model for the distribution of sample means? (sampling distribution of means).. (the distribution of x-bars).. REMEMBER THAT 3rd condition has 2 ways to meet it.

A
  1. Randomization (this helps with assumption of independence)
  2. SMALL ENOUGH SAMPLE. 10% condition (this is the upper limit of our sample size. Above this, draws made without replacement are no longer close to independent, and the actual sampling distribution is narrower than the sigma/root(n) formula says)
  3. LARGE ENOUGH SAMPLE (n>30) OR NORMALISH. If n<30, the sample has to be sort of normal-ish. If n>30, odd shapes are fine.
46
Q

For inferences for means, what are the sample size conditions?

A

Either n>30 or

FOR SMALL SAMPLES: It has to look normalish because we need to think it came from a normalish population. Look at histogram of sample.

47
Q

What is a point estimate?

A

Your p-hat or your x-bar.. What you got in your sample. It is your best guess.

48
Q

how do you check nearly normal.

A

Histogram should be unimodalish and symmetricalish,

or normal prob plot on calculator, should be straightish and diagonal.

or Boxplot should be symmetrical

49
Q

what happens to t models as n gets larger?

A

The models look more like the normal model. An infinite sample size would give a t model identical to the normal model.

50
Q

What is a t-crit?

A

It does the same job as z crit. It is the number of SE you reach out in your CI. To find it, do INVT(area in one tail, degrees of freedom)

51
Q

What is the normal enough condition?

A

for smaller sample sizes, it must be plausible that the sample may have kind of come from a normalish population.

52
Q

what are the conditions (quick summary) that have to be met for t procedures?

A

independent

random

<10%,

nearly normal or n>30

53
Q

how are t models like Normal models?

A

Both are unimodal and symmetric. T models aren’t as tall in the middle and have more area in the tails. That’s why you have to reach out a little further than z for the same confidence.

54
Q

How do you find Margin of Error from an interval?

A

It is half the width (HI-LO divided by 2). Remember you stand at the statistic (point estimate) and reach up and down a Margin of Error. So an interval is always exactly 2 margins of error wide.

55
Q

how do you find deg freedom?

A

n-1 for one sample,

for 2 samples you must use calculator.

For PAIRED use n-1,

REGRESSION IS n-2

56
Q

sample size formulas FOR PROP AND MEANS

A

n= (z^2 * p * q )/ (ME ^2)

and

n = ( t*s / ME) ^ 2

(start with Z then do T)

57
Q

How do you estimate sample size formula for means?

A

It is a process. Start with a z crit to estimate the t crit.. Then use the corresponding t based on the n you get. Do it again. n = ( t*s / ME) ^ 2
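A sketch of the first pass only, using a z crit. The values s = 15 and ME = 3 are made up and the function name is hypothetical. The second pass would swap in the t crit for n-1 degrees of freedom (from INVT on a calculator), which Python's standard library does not provide.

```python
import math
from statistics import NormalDist

def first_pass_n_for_mean(s, me, confidence=0.95):
    """First pass at n = (crit * s / ME)^2 using z in place of t, rounded up."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * s / me) ** 2)

# Hypothetical example: s = 15, want a margin of error of 3
n_first = first_pass_n_for_mean(s=15, me=3)
print(n_first)  # next: look up t crit with n_first - 1 degrees of freedom and redo the formula
```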

58
Q

How do you find point estimate from an interval?

A

It is in the dead center of interval, so take the average of the upper and lower bounds.
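Both "working backwards" tricks (point estimate and margin of error) fit in a few lines. The interval (0.42, 0.58) and the function name are made up for illustration.

```python
def unpack_interval(low, high):
    """Recover the point estimate and margin of error from an interval."""
    point_estimate = (low + high) / 2    # dead center of the interval
    margin_of_error = (high - low) / 2   # half the width
    return point_estimate, margin_of_error

estimate, me = unpack_interval(0.42, 0.58)
print(estimate, me)  # about 0.50 and 0.08
```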

59
Q

who invented the t model?

A

“Wild Bill” Gosset (William Sealy Gosset), Guinness brewing company.
