week 5 & 6 -- Sampling distributions, Standard Error, & CLT Flashcards

1
Q

Ultimate goal of our quest?

A

To find IMPROBABLE data that supports our hypothesis

2
Q

Population distribution

A

The distribution of a value in our population of interest

3
Q

Sample distribution

A

The distribution of that value within the sample we took

4
Q

Sample statistic

A

A statistic we calculate from our sample

5
Q

Sampling distribution

A

The theoretical distribution of a sample statistic if we take many more random samples of the same size

A simulation of the distribution of the proportions from all possible samples

NOT distribution of sample (display of actual data collected) but a display of theoretical summary statistics (like p-hat) for many different samples
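
The idea of "many different samples" can be sketched with a small simulation. This is only an illustration under assumed values: the population proportion p = 0.3, the sample size n = 100, the number of samples, and the helper name `simulate_p_hats` are all hypothetical, not taken from the course material.

```python
import random
import statistics

def simulate_p_hats(p, n, num_samples, seed=0):
    """Draw num_samples random samples of size n; return each sample's p-hat."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    p_hats = []
    for _ in range(num_samples):
        # Each individual has the attribute with probability p.
        successes = sum(1 for _ in range(n) if rng.random() < p)
        p_hats.append(successes / n)
    return p_hats

# Hypothetical population proportion and sample size.
p_hats = simulate_p_hats(p=0.3, n=100, num_samples=2000)
center = statistics.mean(p_hats)
spread = statistics.pstdev(p_hats)
print(f"mean of the p-hats: {center:.3f}")
print(f"SD of the p-hats:   {spread:.3f}")
```

Each entry of `p_hats` is a summary statistic from one sample, so a histogram of `p_hats` would display the (simulated) sampling distribution, not the data from any single sample.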

6
Q

IMAGINE the results from all the random samples we didn’t take

A

We SEE only the sample that we actually drew, but by simulating or modeling, we can IMAGINE what we might have seen had we drawn other possible random samples

7
Q

Proportion, p

p-hat = observed proportion in a sample

A

PARAMETER!
(not a Greek letter, because the symbol would be π, and π already stands for the constant 3.14159)

the fraction of the total that possesses a certain attribute

proportion × 100 = percentage

8
Q

q

q-hat

A

the fraction of the total that DOESN’T possess a certain attribute
q = 1 - p

9
Q

Proportions come with a freebie!

A

Once we know the mean p, we automatically know the standard deviation
–> as long as n is “large enough”, we can model the distribution of the sample proportions with a Normal model centered at p with a standard deviation of √(pq/n), the square root of pq/n
(ch 18, page 414)
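
A short worked example of the freebie, as a sketch only. The values p = 0.3 and n = 100 are hypothetical, and the "large enough" check uses a commonly taught rule of thumb (at least 10 expected successes and 10 expected failures) that is an assumption here, not something stated on the card.

```python
import math

def sd_of_sample_proportion(p, n):
    """Standard deviation of p-hat under the Normal model: sqrt(p * q / n)."""
    q = 1 - p  # fraction that does NOT have the attribute
    return math.sqrt(p * q / n)

def is_large_enough(p, n, threshold=10):
    """Assumed rule of thumb (not from the card): expect at least
    `threshold` successes and `threshold` failures in the sample."""
    return n * p >= threshold and n * (1 - p) >= threshold

p, n = 0.3, 100  # hypothetical values
sd = sd_of_sample_proportion(p, n)
print(f"p-hat is modeled as Normal(mean={p}, SD={sd:.4f})")  # SD is about 0.0458
print("large enough:", is_large_enough(p, n))
```

Knowing p alone fixes both the center and the spread of the model, which is why no separate standard deviation has to be estimated.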

10
Q

letter-hat

A

indicates that the hatted letter – the observed proportion in our data – is our ESTIMATE of the parameter letter (no hat is the probability of having the attribute according to our model)

11
Q

theoretical gist of the Normal model

A

If we draw repeated random samples of the same size, n, from some population and measure the proportion, p-hat, we see in each sample, then the collection of these proportions will pile up around the underlying population proportion, p, and a histogram of the sample proportions can be modeled well by a Normal model

12
Q

Central Limit Theorem

A

If we take repeated samples of a certain size and plot the distribution of their means, then for larger and larger samples:
- extreme means become rare
- middling means (those close to the true population mean) become more common
- the distribution of the means becomes more like the Normal distribution
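
The three effects can be checked with a rough simulation. This sketch assumes a strongly skewed Exponential population with true mean 1.0. The sample sizes, the cutoff of 0.5 for calling a mean "extreme", and the helper names are hypothetical choices for illustration.

```python
import random
import statistics

def sample_means(n, num_samples, rng):
    """Means of num_samples random samples of size n from a skewed
    Exponential population whose true mean is 1.0."""
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n
            for _ in range(num_samples)]

def skewness(values):
    """Moment-based skewness; close to 0 for a symmetric shape like the Normal."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

rng = random.Random(42)  # fixed seed so the run is repeatable
results = {}
for n in (1, 10, 100):
    means = sample_means(n, 4000, rng)
    # "Extreme" here means more than 0.5 away from the true mean (an assumed cutoff).
    extreme = sum(1 for m in means if abs(m - 1.0) > 0.5) / len(means)
    results[n] = (statistics.pstdev(means), extreme, skewness(means))
    print(f"n={n:>3}  SD of means={results[n][0]:.3f}  "
          f"share of extreme means={extreme:.3f}  skewness={results[n][2]:.2f}")
```

As n grows, the share of extreme means and the spread of the means should both shrink, and the skewness should move toward 0, even though the population itself is far from Normal.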

13
Q

Central Limit theorem assumptions (Laplace)

The sampling distribution of ANY mean becomes more nearly Normal as the sample size grows
We don’t even care about the shape of the population distribution
(this is unintuitive, surprising and weird)

A
  • samples must be large
  • observations in the sample must be independent
  • the population distribution must have a well-defined center and spread
14
Q

standard error

A

Sample –> standard deviation
sampling distribution –> standard error

SE(ȳ) = SD(y) / √n
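
A minimal sketch of the formula in use. The data values are made up, and using the sample standard deviation (the n - 1 version from `statistics.stdev`) for SD(y) is an assumption about which SD is meant.

```python
import math
import statistics

def standard_error_of_mean(sample):
    """SE(y-bar) = SD(y) / sqrt(n), with SD(y) estimated from the sample."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations, n = 8
sd = statistics.stdev(sample)          # spread of the data in this one sample
se = standard_error_of_mean(sample)    # spread of the sample mean across samples
print(f"SD of the sample: {sd:.3f}")   # about 2.138
print(f"SE of the mean:   {se:.3f}")   # about 0.756
```

The SD describes the sample itself. The SE is smaller by a factor of √n because it describes how much the mean would vary from one random sample to the next.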

15
Q

Normal, log-Normal, Exponential

A

week 6 ????

The mean is a good summary only for the Normal (not for the other two); the parameters of each distribution are different

16
Q

Power Law

A

When most observations have small impact but some rare, high-impact observations occur (mean is NOT a good summary)
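
A rough illustration of why the mean misleads here, assuming the observations follow a Pareto distribution (one common power-law model). The tail exponent 1.1, the sample size, and the variable names are hypothetical.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the run is repeatable
alpha = 1.1             # hypothetical tail exponent; smaller means a heavier tail
impacts = [rng.paretovariate(alpha) for _ in range(10_000)]

mean_impact = statistics.fmean(impacts)
median_impact = statistics.median(impacts)
# Share of the total impact contributed by the largest 1% of observations.
top_share = sum(sorted(impacts)[-100:]) / sum(impacts)

print(f"median impact: {median_impact:.2f}")
print(f"mean impact:   {mean_impact:.2f}")
print(f"share of total from the top 1%: {top_share:.0%}")
```

The typical (median) observation is small, while a handful of rare, huge observations pull the mean far above it, so the mean describes almost no actual observation.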

17
Q

Sampling Distribution model for a proportion – goal

A

an attempt to show the distribution from ALL the random samples

18
Q

think of the sample proportion as a random variable taking on a different value in each random sample

A

then we can say something about the distribution of those values. This is the fundamental insight about statistics! Sampling models are what make Statistics work! They inform us about the amount of variation we should expect when we sample.

The sampling model quantifies the variability and tells us how surprising any sample proportion is.

19
Q

Sampling distribution models act as a bridge from the real world of data to the imaginary model of the statistic

A

This is the huge leap of statistics: these models allow us to say something about the ENTIRE population when all we have is data from the REAL WORLD SAMPLE

Our data is just a variable – any given value is just one of many we might have seen had we chosen a different random sample

20
Q

centering

A

for proportions: the sampling distribution is centered at the population proportion

for means: centered at population mean

21
Q

remember the difference between the real world of data and a magical mathematical model world

A

real: we draw random samples of data (HISTOGRAM)
magic: we describe how the sample means and proportions behave as random variables in all the random samples we might have drawn. (SAMP. DISTRIB MODEL, Normal based on CLT)