2 - Statistical Inference Flashcards

1
Q

What is a confidence interval?

A

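A range of plausible values for an unknown population parameter, used to convey the uncertainty in an estimate

95% CI = if the study were repeated many times, about 95% of the intervals built this way would contain the true population mean (it is a statement about the population mean, not about the mean of YOUR dataset, which is known exactly)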
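The repeated-sampling reading of a 95% CI can be checked with a small simulation. This is only a sketch: the population mean (10), SD (3), sample size, number of repeats, and the function name `ci_coverage` are all hypothetical choices, and the interval uses the "mean +/- 2 x SE" rule of thumb from these cards.

```python
import random
import statistics

def ci_coverage(true_mean=10.0, true_sd=3.0, n=100, repeats=2000, seed=0):
    """Fraction of approximate 95% CIs (mean +/- 2*SE) that contain true_mean."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(repeats):
        # Draw one sample from an assumed normal population
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        mean = statistics.fmean(sample)
        se = statistics.stdev(sample) / n ** 0.5  # SE = SD / sqrt(n)
        if mean - 2 * se <= true_mean <= mean + 2 * se:
            hits += 1
    return hits / repeats

coverage = ci_coverage()
print(f"share of intervals containing the true mean: {coverage:.3f}")
```

The printed share should land close to 0.95, which is what "95% confidence" refers to.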

2
Q

Introduce the concept of hypothesis testing.

A

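x = average of the test group (sample mean), u = true mean under the null hypothesis

  1. Define the null hypothesis (usually that the test group comes from a population with mean u, written x = u)
  2. Compute a test statistic: t = (x - u)/SE(x)
  3. Draw a conclusion
    - If the sample size is greater than ~50, reject the null hypothesis if t is less than -2 or greater than 2 (about a 5% chance of this when the null hypothesis is true)

Type I error: rejecting the null hypothesis when it is true (false positive)

Type II error: not rejecting the null hypothesis when it is false (false negative)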
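The three steps can be sketched in a few lines of Python. The function names and the example numbers (sample mean 5.5, null mean 5.0, SE 0.2) are hypothetical; the cutoff of 2 is the large-sample rule of thumb from this card.

```python
def t_statistic(sample_mean, null_mean, standard_error):
    """Step 2: t = (x - u) / SE(x)."""
    return (sample_mean - null_mean) / standard_error

def reject_null(t, cutoff=2.0):
    """Step 3: reject when t falls outside [-cutoff, cutoff]."""
    return abs(t) > cutoff

# Hypothetical example values, chosen only for illustration
t = t_statistic(sample_mean=5.5, null_mean=5.0, standard_error=0.2)
decision = "reject" if reject_null(t) else "cannot reject"
print(f"t = {t:.2f}: {decision} the null hypothesis")
```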

3
Q

Interpret a P-value for an effect.

A

p = 0.05 means that, if the null hypothesis were true (no real effect), a result at least as extreme as the one observed would occur by random chance only 5% of the time

4
Q

Define central limit theorem.

A

A mathematical result stating that for a sufficiently large sample size, the sampling distribution of the mean will be approximately normal regardless of the underlying distribution of the data
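A quick simulation can make this concrete. The sketch below assumes a strongly skewed population (exponential with mean 1 and SD 1), a sample size of 50, and 2000 repeated samples; all of these choices and the function name `sample_means` are hypothetical.

```python
import math
import random
import statistics

def sample_means(n=50, repeats=2000, seed=0):
    """Means of repeated samples from a skewed (exponential) population."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(repeats)
    ]

means = sample_means()
center = statistics.fmean(means)   # should be near the population mean, 1
spread = statistics.stdev(means)   # should be near SD / sqrt(n)
expected_se = 1.0 / math.sqrt(50)

# Share of sample means within 2 SE of the population mean
within_2se = sum(abs(m - 1.0) <= 2 * expected_se for m in means) / len(means)

print(f"mean of sample means: {center:.3f}")
print(f"SD of sample means:   {spread:.3f} (SD/sqrt(n) = {expected_se:.3f})")
print(f"share within 2 SE:    {within_2se:.3f}")
```

Even though individual values are far from normal, roughly 95% of the sample means should fall within 2 SE of the population mean, as a normal distribution predicts.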

5
Q

Define effect.

A

The magnitude of a difference or relationship

6
Q

Define event.

A

A clinical outcome of importance
–Ex: onset of a disease (such as cancer or heart disease), onset of a particular symptom (such as bleeding or depression), disease recurrence, or death

7
Q

Define hypothesis test.

A

A statistical analysis used to decide whether or not to reject a null hypothesis

8
Q

Define null hypothesis.

A

The hypothesis being tested about a population
Null = “no difference;” refers to a situation in which there is no difference (e.g., between the means in a treatment group and a control group)

9
Q

Define parameter.

A

An unknown summary value for an entire population

The purpose of a statistical analysis is to estimate and make inferences about a parameter

10
Q

Define power.

A

The power of a statistical test is the probability that it correctly rejects the null hypothesis when the null hypothesis is false (i.e. the probability of not committing a Type II error)

11
Q

Define p-value.

A

The probability of observing a result as extreme as or more extreme than the one actually observed based on chance alone (i.e., if the null hypothesis is true)

12
Q

Define random sample.

A

A subset of the population obtained by random selection

13
Q

Define sampling distribution.

A

The theoretical distribution of a statistic (e.g., the sample mean) across all possible random samples of a given size from the population

14
Q

Define statistical significance level.

A

The probability of making a type I error in a hypothesis test

15
Q

Define test statistic.

A

The specific statistic used to test the null hypothesis (e.g., the t statistic)

16
Q

Define type I error.

A

The error that results when one rejects the null hypothesis when it is true or when one concludes that there is a difference when there is none (“false positive”)
–Saying there IS an effect when there isn’t

17
Q

Define type II error.

A

The error that results when one does not reject the null hypothesis when it is false or when one does not detect a difference when there is a difference (“false negative”).
–Saying there is no effect when there is

18
Q

How does standard deviation change from a sample of 100 to 1000? How is standard error different?

A
Standard deviation (SD) will be about the SAME in a sample of 1000 vs 100 (individual variation does not shrink as more people are sampled)
- SD is a measure of SPREAD in a population, does NOT depend on sample size

Standard error = SD/sqrt(n)

  • A measure of PRECISION in a sample, depends on N and SD
  • This is the standard deviation of the AVERAGE rather than the individual
  • Going from 100 to 1000 shrinks the SE by a factor of sqrt(10), about 3.2
19
Q

Interscalene blocks have a mean duration of 24 hours with an SD of 8 hours.
The duration based on 100 blocks will have a mean duration of ____ and an SE of ____
The duration based on 1000 blocks will have a mean duration of ____ and an SE of ____

A

The duration based on 100 blocks will have a mean duration of 24 hours and an SE of 8/sqrt(100) = 8/10 = 0.8 hours

The duration based on 1000 blocks will have a mean duration of 24 hours and an SE of 8/sqrt(1000) = 8/31.6, or about 0.25 hours
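The same arithmetic as a short Python check. The helper name `standard_error` is hypothetical; the numbers come from the card.

```python
import math

def standard_error(sd, n):
    """SE = SD / sqrt(n)."""
    return sd / math.sqrt(n)

sd_hours = 8  # SD of block duration from the card

se_100 = standard_error(sd_hours, 100)
se_1000 = standard_error(sd_hours, 1000)

print(f"n = 100:  SE = {se_100:.2f} hours")   # 0.80
print(f"n = 1000: SE = {se_1000:.2f} hours")  # 0.25
```

The expected mean stays at 24 hours in both cases; only the precision of the average changes.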

20
Q

Looking at average length of stay (LOS):

  1. First 100 patients: average LOS = 4.2 days, SD = 1.3 days: what do you think the average length of stay is?
  2. First 1000 patients: average LOS = 4.2 days, SD = 1.3 days: what do you think the average length of stay is?

A

  1. Average = 4.2 days, SE = 1.3 days/sqrt(100) = 0.13 days
    - True value guess = 4.2 +/- (2 x 0.13) days = 4.2 +/- 0.26 days
    - Why multiply by 2? It creates a 95% confidence interval from the normal distribution
  2. Average = 4.2 days, SE = 1.3 days/sqrt(1000) = 0.04 days
    - True value guess = 4.2 +/- (2 x 0.04) days = 4.2 +/- 0.08 days (95% CI)
    - This provides better information: the interval is narrower, so the estimate is more precise
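A minimal sketch of the same calculation, using the "mean +/- 2 x SE" rule of thumb from this card. The function name `approx_95_ci` is hypothetical.

```python
import math

def approx_95_ci(mean, sd, n):
    """Approximate 95% CI: mean +/- 2 * SD / sqrt(n)."""
    se = sd / math.sqrt(n)
    return mean - 2 * se, mean + 2 * se

low_100, high_100 = approx_95_ci(4.2, 1.3, 100)
low_1000, high_1000 = approx_95_ci(4.2, 1.3, 1000)

print(f"n = 100:  {low_100:.2f} to {high_100:.2f} days")    # 3.94 to 4.46
print(f"n = 1000: {low_1000:.2f} to {high_1000:.2f} days")  # 4.12 to 4.28
```

With ten times as many patients the interval is roughly a third as wide, even though the average and SD are unchanged.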
21
Q

Long-standing operative mortality from vascular surgery has averaged 3% at DHMC. Dr. X was recently hired at DHMC. In her last 220 cases, her operative mortality was 6% (SE = 1.6%).

What do you think? Would you send your mother to Dr. X?

A

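Null hypothesis is that she is not dangerous (her true operative mortality is the long-standing 3%)

t = (0.06 - 0.03)/(0.016) = 1.875
- This value is between -2 and 2, so the null hypothesis cannot be rejected at the 5% level; these 220 cases do not show that her mortality differs from 3%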
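The quoted SE of 1.6% can be reproduced from the card's numbers. The sketch below assumes the usual standard error of a proportion, sqrt(p(1 - p)/n), evaluated at the observed 6%; the card does not say how its SE was computed, so treat that formula as an assumption. The variable names are hypothetical.

```python
import math

n_cases = 220
observed = 0.06   # Dr. X's operative mortality
baseline = 0.03   # long-standing mortality at the hospital

# Assumed formula: SE of a proportion = sqrt(p * (1 - p) / n)
se = math.sqrt(observed * (1 - observed) / n_cases)
t = (observed - baseline) / se

print(f"SE = {se:.4f}")  # about 0.016, matching the card
print(f"t  = {t:.2f}")   # about 1.87, inside the -2 to 2 range
```

Because t falls between -2 and 2, the rule of thumb used in these cards does not reject the null hypothesis.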

22
Q

T/F: If p > 0.05 (t is between -2 and 2), you accept the null hypothesis.

A

False. You say you CANNOT REJECT the null hypothesis. This doesn’t mean it’s true or that the two groups are equal, so you can’t ACCEPT the null hypothesis

23
Q

Proportion dead under new treatment: 42% with SE of 5%
Standard of care = 30%
–What’s the likelihood that this new treatment is more deadly than the standard of care?

A

Null hypothesis: the new treatment has the same death rate as the standard of care

Calculate: t = (42-30)/5 = 2.4, so p = 0.016 (two-sided)
- If the new treatment truly had a 30% death rate (the same as the standard of care), a difference at least this large would occur by chance only ~1.6% of the time; therefore we can reject the null hypothesis and conclude that the new treatment is more dangerous
- The p-value is not the probability that the null hypothesis is true
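The p-value can be reproduced with the standard library alone. This sketch assumes the test statistic is compared against a standard normal distribution (the large-sample approximation used throughout these cards); the function name `two_sided_p` is hypothetical.

```python
import math

def two_sided_p(t):
    """Two-sided p-value for t under a standard normal distribution."""
    # P(|Z| >= |t|) = erfc(|t| / sqrt(2))
    return math.erfc(abs(t) / math.sqrt(2))

t = (42 - 30) / 5   # difference in percentage points divided by the SE
p = two_sided_p(t)

print(f"t = {t:.1f}")  # 2.4
print(f"p = {p:.3f}")  # 0.016
```

A one-sided test (asking only whether the new treatment is worse) would halve this to about 0.008.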