week 5 & 6 -- Sampling distributions, Standard Error, & CLT Flashcards

Question 1

Q

Ultimate goal of our quest?

Answer

A

To find IMPROBABLE data that supports our hypothesis

Question 2

Q

Population distribution

Answer

A

The distribution of a value in our population of interest

Question 3

Q

Sample distribution

Answer

A

The distribution of that value within the sample we took

Question 4

Q

Sample statistic

Answer

A

A statistic we calculate from our sample

Question 5

Q

Sampling distribution

Answer

A

The theoretical distribution of a sample statistic if we take many more random samples of the same size

a simulation of the distribution of the proportions from all possible samples

NOT the distribution of the sample (a display of the actual data collected) but a display of a theoretical summary statistic (like p-hat) across many different samples
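
The idea of taking many more random samples of the same size can be sketched as a small simulation. This is only an illustration: the population proportion 0.3, the sample size 100, the number of samples 2000, and the function name simulate_p_hats are all assumptions chosen for the example, not values from the course.

```python
import random
import statistics

def simulate_p_hats(p, n, num_samples, seed=0):
    """Draw num_samples random samples of size n; return each sample's p-hat."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    p_hats = []
    for _ in range(num_samples):
        # Count how many of the n individuals have the attribute.
        successes = sum(1 for _ in range(n) if rng.random() < p)
        p_hats.append(successes / n)
    return p_hats

# Assumed values, for illustration only.
p_hats = simulate_p_hats(p=0.3, n=100, num_samples=2000)
center = statistics.mean(p_hats)   # should land near p
spread = statistics.stdev(p_hats)  # should land near sqrt(pq/n)
print(center, spread)
```

Each entry in p_hats is one sample statistic, so a histogram of p_hats would be a picture of the sampling distribution, not of any single sample.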

Question 6

Q

IMAGINE the results from all the random samples we didn’t take

Answer

A

We SEE only the sample that we actually drew, but by simulating or modeling, we can IMAGINE what we might have seen had we drawn other possible random samples

Question 7

Q

Proportion, p

p-hat = observed proportion in a sample

Answer

A

PARAMETER!
(not a Greek letter, because the symbol would be pi, and pi is already used for the constant 3.14159...)

the fraction of the total that possesses a certain attribute

proportion x 100 = percentage

Question 8

Q

q

q-hat

Answer

A

the fraction of the total that DOESN’T possess a certain attribute
q = 1 - p

Question 9

Q

Proportions come with a freebie!

Answer

A

Once we know the mean p, we automatically know the standard deviation
–> as long as n is “large enough”, we can model the distribution of the sample proportions with a Normal model centered at p with a standard deviation of the square root of pq/n
(ch 18, page 414)
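
As a quick worked example of the freebie, the standard deviation of the sample proportions follows from p and n alone. The values p = 0.5 and n = 100 and the function name sd_of_p_hat are assumptions made up for this sketch.

```python
import math

def sd_of_p_hat(p, n):
    """Standard deviation of the sample proportion: sqrt(p * q / n), with q = 1 - p."""
    q = 1 - p
    return math.sqrt(p * q / n)

# Assumed values, for illustration only.
sd_small_n = sd_of_p_hat(p=0.5, n=100)
sd_large_n = sd_of_p_hat(p=0.5, n=400)
print(sd_small_n, sd_large_n)
```

Because n sits under the square root, quadrupling the sample size only halves the standard deviation.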

Question 10

Q

letter-hat

Answer

A

indicates that the hatted letter – the observed proportion in our data – is our ESTIMATE of the parameter letter (no hat is the probability of having the attribute according to our model)

Question 11

Q

theoretical gist of the Normal model

Answer

A

If we draw repeated random samples of the same size, n, from some population and measure the proportion, p-hat, we see in each sample, then the collection of these proportions will pile up around the underlying population proportion, p, and a histogram of the sample proportions can be modeled well by a Normal model

Question 12

Q

Central Limit Theorem

Answer

A

If we take repeated samples of a certain size and plot the distribution of their means, then for larger and larger samples
- extreme means become rare
- middling means become more common (means close to the true population mean)
- the distribution of the means becomes more like the Normal distribution
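
The three points above can be checked with a rough simulation. This sketch assumes a strongly skewed Exponential population with mean 1, sample sizes 2 and 50, and 3000 samples of each size; those choices and the helper names are illustrative only.

```python
import random
import statistics

def sample_means(n, num_samples, rng):
    """Means of num_samples random samples of size n from an Exponential(1) population."""
    means = []
    for _ in range(num_samples):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.mean(sample))
    return means

def skewness(values):
    """Rough skewness measure: about 0 for a symmetric, Normal-looking pile."""
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

rng = random.Random(1)  # fixed seed so the run is repeatable
small = sample_means(n=2, num_samples=3000, rng=rng)
large = sample_means(n=50, num_samples=3000, rng=rng)

for label, means in (("n=2", small), ("n=50", large)):
    # Print the center, spread, and skewness of each pile of sample means.
    print(label,
          round(statistics.mean(means), 3),
          round(statistics.stdev(means), 3),
          round(skewness(means), 3))
```

With the larger sample size the means should cluster more tightly around the population mean of 1, and the skewness should move toward 0, even though the population itself is far from Normal.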

Question 13

Q

Central Limit theorem assumptions (Laplace)

The sampling distribution of ANY mean becomes more nearly Normal as the sample size grows
We don’t even care about the shape of the population distribution
(this is unintuitive, surprising and weird)

Answer

A

samples must be large
observations in sample must be independent
population distribution must have a well-defined center and spread

Question 14

Q

standard error

Answer

A

Sample –> standard deviation
sampling distribution –> standard error

SE (ȳ) = SD (y) / √n
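
A minimal sketch of the formula, using the sample's own standard deviation for SD (y). The nine data values are made up for illustration, and standard_error is a hypothetical helper name.

```python
import math
import statistics

def standard_error(sample):
    """Standard error of the sample mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

# Made-up measurements, for illustration only.
sample = [4.0, 7.0, 6.0, 5.0, 8.0, 6.0, 5.0, 7.0, 6.0]
sd = statistics.stdev(sample)  # spread of the data in the sample
se = standard_error(sample)    # spread of the sample mean across samples
print(sd, se)
```

The standard deviation describes how much individual values vary within the sample, while the standard error describes how much the mean would vary from sample to sample.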

Question 15

Q

Normal, log-Normal, Exponential

Answer

A

week 6 ????

the mean is a good summary only for the Normal (not for the other two); their parameters are different

Question 16

Q

Power Law

Answer

A

When most observations have small impact but some rare, high-impact observations occur (mean is NOT a good summary)

Question 17

Q

Sampling Distribution model for a proportion – goal

Answer

A

an attempt to show the distribution from ALL the random samples

Question 18

Q

think of the sample proportion as a random variable taking on a different value in each random sample

Answer

A

then we can say something about the distribution of those values – this is the fundamental insight of statistics! Sampling models are what make Statistics work! They inform us about the amount of variation we should expect when we sample

The sampling model quantifies the variability, telling us how surprising any sample proportion is.
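
One way to put a number on "how surprising" is to measure how many standard deviations the observed p-hat sits from p under the Normal sampling model. This is a sketch under assumed values (p = 0.50, n = 100, observed p-hat = 0.60); the function name surprise and the two-sided tail probability are illustrative choices, not something stated on the card.

```python
import math
from statistics import NormalDist

def surprise(p_hat, p, n):
    """Return (z, tail): distance of p_hat from p in SDs, and the two-sided Normal tail area."""
    sd = math.sqrt(p * (1 - p) / n)  # SD of the sampling distribution of p-hat
    z = (p_hat - p) / sd
    tail = 2 * (1 - NormalDist().cdf(abs(z)))  # chance of a p-hat at least this far from p
    return z, tail

# Assumed values, for illustration only.
z, tail = surprise(p_hat=0.60, p=0.50, n=100)
print(z, tail)
```

Here the observed proportion sits about 2 standard deviations from p, which the Normal model says happens in roughly 5% of random samples.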

Question 19

Q

Sampling distribution models act as a bridge from the real world of data to the imaginary model of the statistic

Answer

A

This is the huge leap of statistics: these models allow us to say something about the ENTIRE population when all we have is data from the REAL WORLD SAMPLE

Our data is just a variable – any given value is just one of many we might have seen had we chosen a different random sample

Question 20

Q

centering

Answer

A

for proportions: sampling distribution is centered at the population proportion

for means: centered at population mean

Question 21

Q

remember the difference between the real world of data and a magical mathematical model world

Answer

A

real: we draw random samples of data (HISTOGRAM)
magic: we describe how the sample means and proportions behave as random variables in all the random samples we might have drawn (SAMP. DISTRIB MODEL, Normal based on CLT)

(21 cards)