Module 2 Flashcards by Sidney Rucker

Q

What is the central limit theorem?

A

When the sample size n is sufficiently large, the sampling distribution of a statistic (e.g., the sample mean) tends towards a normal distribution, even if the underlying population distribution is NOT Gaussian. As the sample size increases, the distribution of sample means becomes closer to normal and narrower.
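
The behaviour described above can be checked with a small simulation. This is only a sketch: the exponential population, the sample sizes, and the helper name `sample_mean_spread` are assumptions chosen for illustration, not part of the course material.

```python
import random
import statistics

def sample_mean_spread(n, reps=1000, seed=0):
    """Draw `reps` samples of size n from a skewed (exponential) population
    and return the mean and standard deviation of the sample means."""
    rng = random.Random(seed)
    means = []
    for _ in range(reps):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return statistics.fmean(means), statistics.stdev(means)

# The spread of the sample means should shrink as n grows.
for n in (5, 50, 500):
    centre, spread = sample_mean_spread(n)
    print(f"n={n:>3}  mean of sample means={centre:.3f}  sd of sample means={spread:.3f}")
```

Plotting a histogram of the sample means for each n would also show the shape becoming more bell-like, even though the population itself is strongly skewed.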

Q

What’s the difference between the distributions of Gaussian observations and sample means?

A

Q

What is the (95%) confidence interval?

A

95% of the time, any given _X (“x bar”) should be within 2 standard errors (SEs) of the true population mean (mu).

Interpreted as “We are 95% confident that the true mean cholesterol level among persons with MSCD in the population could be as low as 231.8 mg/dL or as high as 268.2 mg/dL.”
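
A minimal sketch of the "mean plus or minus 2 standard errors" rule follows. The function name `ci95` is hypothetical, and the inputs (a sample mean of 250.0 and a standard error of 9.1) are back-calculated assumptions chosen so that the result matches the interval quoted on the card; the card itself does not state them.

```python
def ci95(sample_mean, standard_error):
    """Approximate 95% CI: sample mean plus or minus 2 standard errors."""
    margin = 2 * standard_error
    return sample_mean - margin, sample_mean + margin

# Assumed inputs, back-calculated from the interval quoted on the card.
low, high = ci95(250.0, 9.1)
print(f"95% CI: {low:.1f} to {high:.1f} mg/dL")
```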

Q

How do you calculate sample mean and variance?

A

Additional notes for variance: Subtract the sample mean from every value in the sample, square each difference, add up the squares, then divide the total by (n-1).
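
The recipe above can be sketched in a few lines of Python. The data values and the function name `sample_mean_and_variance` are made up for illustration; the standard library's `statistics.variance` uses the same (n-1) divisor and is shown as a cross-check.

```python
import statistics

def sample_mean_and_variance(values):
    """Return the sample mean and the sample variance (divisor n - 1)."""
    n = len(values)
    mean = sum(values) / n
    # Subtract the mean from every value and square each difference.
    squared_deviations = [(x - mean) ** 2 for x in values]
    # Add up the squares and divide by n - 1.
    variance = sum(squared_deviations) / (n - 1)
    return mean, variance

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up values
mean, variance = sample_mean_and_variance(data)
print(mean, variance, statistics.variance(data))
```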

Q

How does the mean of the logs relate to the median of the logs?

A

When the logged data are roughly symmetric, they’re almost equal. As a result, when you back-transform the mean of log10($) (raise 10 to that power) to return to the raw scale, you get approximately the median of the raw data (NOT the mean).
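
A small sketch of this back-transformation, using made-up spending values whose logs are perfectly symmetric (an assumption chosen so the agreement is easy to see; with real data it is only approximate):

```python
import math
import statistics

spending = [10, 100, 1000, 10000, 100000]  # made-up dollar amounts

logs = [math.log10(x) for x in spending]
mean_of_logs = statistics.fmean(logs)

# Raising 10 to the mean of the logs returns to the dollar scale.
back_transformed = 10 ** mean_of_logs

print("back-transformed mean of logs:", back_transformed)
print("median of raw data:           ", statistics.median(spending))
print("mean of raw data:             ", statistics.fmean(spending))
```

The back-transformed value lands on the raw median, while the raw mean is pulled far upward by the largest values.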

Q

How does the log of the median relate to the median of the logs?

A

They’re equal
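
This holds because taking logs preserves the ordering of the values, so the middle value stays in the middle. A quick check with made-up numbers (an odd sample size is assumed here; with an even sample size the median averages the two middle values, so the match is only approximate):

```python
import math
import statistics

values = [3, 8, 20, 45, 900]  # made-up data, odd sample size

log_of_median = math.log10(statistics.median(values))
median_of_logs = statistics.median(math.log10(x) for x in values)

print(log_of_median, median_of_logs)
```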

Q

How does the log of the mean relate to the mean of the logs?

A

They’re not equal
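
A short numerical illustration with made-up, right-skewed costs. For positive data that are not all identical, the log of the mean comes out larger than the mean of the logs:

```python
import math
import statistics

costs = [50, 200, 5000]  # made-up, right-skewed values

log_of_mean = math.log10(statistics.fmean(costs))
mean_of_logs = statistics.fmean(math.log10(x) for x in costs)

print("log of the mean: ", round(log_of_mean, 3))
print("mean of the logs:", round(mean_of_logs, 3))
```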

Q

What Greek letter represents the mean in the CLT?

A

lowercase mu

Q

What Greek letter represents the standard deviation in the CLT?

A

lowercase sigma

Q

What is the standard error (SE) of _X (x bar)?

A

The standard deviation of the sampling distribution of _X

Q

How do you find variance?

A

Var(_X) = sigma^2 / n
Where sigma = standard deviation of the population
and n = size of the sample

Q

How do you find standard error?

A

SE(_X) = sigma / sqrt(n)
Where sigma = standard deviation of the population
and n = sample size
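
A minimal sketch of the usual formula, SE = sigma / sqrt(n), using the card's symbols. The function name `standard_error` and the example numbers are assumptions for illustration only.

```python
import math

def standard_error(sigma, n):
    """Standard error of the sample mean: sigma divided by the square root of n."""
    return sigma / math.sqrt(n)

# Hypothetical population SD of 10 and a sample of 25 people.
print(standard_error(10.0, 25))
```

Squaring this value gives the variance of the sampling distribution of the mean, sigma^2 / n.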

Q

How can the mean of the logs affect the interpretation of the confidence interval?

A

When you back-transform the mean of log10($) (raise 10 to that power) to return to the raw scale, you get approximately the median of the raw data (NOT the mean). So in terms of CI interpretation, the back-transformed interval is for the true median, not the true mean.

Q

Does the width of your confidence interval increase or decrease with an increase in sample size?

A

As sample size (n) increases, the confidence interval gets narrower because your standard error goes down (you’re more sure; think CLT).
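
A quick sketch of how the interval width shrinks, assuming a hypothetical population standard deviation of 10 and the "plus or minus 2 standard errors" rule from the earlier cards (so the full width is 4 standard errors). The function name `ci95_width` is made up for this example.

```python
import math

def ci95_width(sigma, n):
    """Full width of an approximate 95% CI: 2 standard errors on each side."""
    standard_error = sigma / math.sqrt(n)
    return 2 * (2 * standard_error)

# Quadrupling the sample size halves the width.
for n in (25, 100, 400):
    print(f"n={n:>3}  width={ci95_width(10.0, n)}")
```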

Q

How do you calculate margin of error (MOE)?

A

2*standard error

Q

What happens to the width of the confidence interval as n increases?

A

As sample size (n) increases, the width of the 95% CI decreases.

Q

Remember the question of interest for the module is: How much more did people with MSCD spend on medical care than (otherwise similar) people without MSCD?

How can you actually go about calculating that difference?

A

We estimate that the true value of the difference between population means is _X - _Y (the difference in the sample means). And we can find the 95% confidence interval with (_X-_Y) +/- 2SE(_X-_Y)

We can estimate SE(_X-_Y) using…
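
The interval (_X-_Y) +/- 2SE(_X-_Y) can be sketched as below. Because the card's expression for SE(_X-_Y) is cut off, this sketch takes that standard error as an input rather than computing it. The function name `diff_ci95` and all numbers are hypothetical.

```python
def diff_ci95(mean_x, mean_y, se_diff):
    """Approximate 95% CI for a difference in population means.

    se_diff is SE(X-bar minus Y-bar), supplied by the caller.
    """
    diff = mean_x - mean_y
    return diff - 2 * se_diff, diff + 2 * se_diff

# Hypothetical mean spending with and without MSCD, and a hypothetical SE.
low, high = diff_ci95(mean_x=9000.0, mean_y=4000.0, se_diff=500.0)
print(f"Estimated difference: {9000.0 - 4000.0:.0f}, 95% CI: {low:.0f} to {high:.0f}")
```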

Q

How can you calculate the 95% CI for a difference in population means?

A

Q

How do you interpret a 95% CI for a difference in population medians?

A

Q

How do you calculate standard deviation?

A

Q

What is a null hypothesis?

A

A null hypothesis is a statement of no effect, no relationship, or no difference between groups.

Q

How do you perform a hypothesis test?

A

To perform a hypothesis test, we rephrase our question of interest in terms of a precise null hypothesis about the population. When we perform a hypothesis test, we make a decision about this null hypothesis: should we reject it or fail to reject it?

1. Specify a precise null hypothesis about the population. EX: Patients with MSCD in the population do not have higher cholesterol, on average, as compared to patients without MSCD.
2. Specify the exact outcome variable. EX: Each individual patient’s reported cholesterol level.
3. Specify the allowable Type 1 error rate (lowercase alpha), the probability of rejecting the null hypothesis when it is in fact true; usually alpha = 0.05 (or 5%). EX: Concluding MSCD patients have higher cholesterol, on average, when in fact there is no difference in cholesterol levels.
4. Choose an estimator (use the mean for a continuous outcome) from your data relevant to the hypothesis. EX: Difference in means comparing cholesterol in patients with MSCD to cholesterol in patients without MSCD.
5. Calculate a (1 - alpha)*100% CI; when alpha = 0.05, this is a 95% CI. EX: A 95% CI for the difference in population means.
6. If the interval does not include the value in the null hypothesis, then reject the null hypothesis. If the interval includes the null value, then “fail to reject” the null hypothesis.
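
The last two steps can be sketched as a small decision function. This is only an illustration under stated assumptions: alpha = 0.05 with the "plus or minus 2 standard errors" rule, a null value of 0 for a difference in means, and made-up estimates. The name `decide_from_ci` is hypothetical.

```python
def decide_from_ci(estimate, standard_error, null_value=0.0):
    """Build an approximate 95% CI and check whether it contains the null value."""
    low = estimate - 2 * standard_error
    high = estimate + 2 * standard_error
    reject = not (low <= null_value <= high)
    return low, high, reject

# Hypothetical difference in mean cholesterol (mg/dL) and its standard error.
low, high, reject = decide_from_ci(estimate=18.0, standard_error=6.0)
decision = "reject the null" if reject else "fail to reject the null"
print(f"95% CI: {low} to {high} -> {decision}")
```

With a smaller estimate, say 5.0 with the same standard error, the interval would include 0 and the function would report a failure to reject.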

Q

What is a Type 1 error?

A

Rejecting the null hypothesis when it is true (a false positive).

Q

What is a simulation?

A

Conduct an experiment repeatedly and then summarize the outcomes.
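
As a toy illustration of "repeat, then summarize", the sketch below flips a fair coin 10 times, repeats that experiment many times, and summarizes the head counts. The coin experiment, the seed, and the function name `simulate_heads` are all assumptions chosen for this example.

```python
import random

def simulate_heads(n_flips=10, reps=10000, seed=1):
    """Repeat an experiment of n_flips fair coin flips and record the heads count each time."""
    rng = random.Random(seed)
    return [
        sum(rng.random() < 0.5 for _ in range(n_flips))
        for _ in range(reps)
    ]

counts = simulate_heads()

# Summarize the outcomes of the repeated experiment.
average_heads = sum(counts) / len(counts)
share_nine_or_more = sum(c >= 9 for c in counts) / len(counts)
print("average number of heads:", average_heads)
print("share of experiments with 9+ heads:", share_nine_or_more)
```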

Q

What is a p-value?

A

A measure of consistency between the null hypothesis and the data. It is the probability (between 0 and 1) of observing a sample statistic (e.g., a sample relative risk) as large as, or larger than, the one actually observed when the null hypothesis is true.

Q

When can you reject the null hypothesis based on the p-value and a selected alpha level?

A

Reject the null hypothesis if the p-value is less than the selected alpha level (usually alpha = 0.05).
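
A small worked example of both ideas, using a fair-coin null hypothesis as an assumed stand-in for the course's examples: the p-value is the probability of seeing at least as many heads as were observed if the coin is fair, and it is then compared with alpha. The function name `p_value_at_least` and the observed count are hypothetical.

```python
from math import comb

def p_value_at_least(observed_heads, n_flips):
    """Probability of observed_heads or more heads in n_flips flips of a fair coin."""
    favourable = sum(comb(n_flips, k) for k in range(observed_heads, n_flips + 1))
    return favourable / 2 ** n_flips

alpha = 0.05
p_value = p_value_at_least(observed_heads=9, n_flips=10)  # hypothetical observation
print("p-value:", p_value)
print("reject the null" if p_value < alpha else "fail to reject the null")
```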

Q

What is a Type 2 error?

A

Failing to reject the null hypothesis when it is false (a false negative).

Q

What is the general form of a t-statistic?