Statistics Flashcards

Cement your grasp of certain fundamental concepts in binary classification and inferential statistics: true and false positives and negatives, positive and negative predictive values, ROC curves and AUC, Bayes' Theorem, and p-values.

1
Q

In an experiment with a high p-value, your data are highly ____ given a true null hypothesis.

A

likely

2
Q

In an experiment with a low p-value, your data are highly ____ given a true null hypothesis.

A

unlikely

3
Q

The p-value is the probability of…

A

obtaining an effect AT LEAST as extreme as the one observed assuming that the null hypothesis is true

4
Q

A study found a difference between two means with a p-value of 0.02. Interpret this p-value in terms of many repetitions of identical studies.

A

If the null hypothesis were true and you repeated the study many times, you would find differences at least as large as the one observed in this study 2% of the time.
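The repeated-studies reading can be checked with a small simulation. This is only a sketch under assumptions that are not in the card: two groups of 50 with known standard deviation 1, a two-sided z-test, and no true difference between the groups. The function names, group size, and seed are all illustrative.

```python
import random
from statistics import NormalDist

def two_sided_p(z):
    """Two-sided p-value for a z statistic under a standard normal null."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

def fraction_at_least_as_extreme(z_obs, n_per_group=50, sigma=1.0,
                                 reps=20000, seed=0):
    """Repeat the study many times with the null TRUE (no real difference)
    and return the share of repetitions whose |z| reaches |z_obs|."""
    rng = random.Random(seed)
    se_mean = sigma / n_per_group ** 0.5   # standard error of one group mean
    se_diff = (2 ** 0.5) * se_mean         # standard error of the difference
    hits = 0
    for _ in range(reps):
        # Each group mean is drawn directly from its sampling distribution.
        diff = rng.gauss(0, se_mean) - rng.gauss(0, se_mean)
        if abs(diff / se_diff) >= abs(z_obs):
            hits += 1
    return hits / reps

z_obs = NormalDist().inv_cdf(1 - 0.02 / 2)  # z whose two-sided p-value is 0.02
p = two_sided_p(z_obs)
frac = fraction_at_least_as_extreme(z_obs)
print(f"p = {p:.3f}, simulated fraction = {frac:.3f}")
```

The simulated fraction should land close to 0.02, but only because every repetition is generated with the null hypothesis true.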

5
Q

The p-value answers what question?

A

How likely are data at least as extreme as yours, given that the null hypothesis is true?

6
Q

Rejecting a null hypothesis that is actually true is called…

A

Type I error = false alarm = false positive

7
Q

What two factors determine the probabilities of Type 1 and Type 2 errors?

A

The desired level of significance and the power of the test

8
Q

Failing to reject a null hypothesis that is actually false is called…

A

Type II error = missed detection = false negative

9
Q

Does type 1 error reject or accept the null hypothesis?

A

Mnemonic: a Type 1 error (odd number) rejects the null hypothesis H_0, whose subscript 0 is an “even” number.

10
Q

Does type 2 error reject or accept the null hypothesis?

A

Mnemonic: a Type 2 error (even number) accepts (fails to reject) the null hypothesis H_0, whose subscript 0 is also an even number.

11
Q

If you commit a type I error, what do you do to the null hypothesis?

A

Type I error = reject the null hypothesis even though it’s actually true = false positive

12
Q

If you commit a type II error, what do you do to the null hypothesis?

A

Type II error = fail to reject (“accept”) the null hypothesis even though it’s actually false = false negative

13
Q

A false positive is also known as what kind of error?

A

Type I: the P in “false Positive” has one vertical line, like the numeral I

14
Q

A false negative is also known as what kind of error?

A

Type II: the N in “false Negative” has two vertical lines, like the numeral II

15
Q

What’s the formula for statistical power in terms of α and/or β?

A

power = 1 - β

16
Q

If an experiment’s probability of type II error increases, then the statistical power ____

A

decreases; power = the ability to correctly reject a false null hypothesis = 1 - β
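To make power = 1 - β concrete, here is a minimal sketch for one specific test that the cards do not name: a one-sided, one-sample z-test with known standard deviation. The function names are illustrative, and `effect_size` is assumed to be the standardized shift (μ1 - μ0)/σ.

```python
from statistics import NormalDist

def beta_one_sided_z(effect_size, n, alpha=0.05):
    """Type II error probability for a one-sided, one-sample z-test
    (assumed setting: known sigma, standardized effect size)."""
    z_crit = NormalDist().inv_cdf(1 - alpha)   # rejection threshold
    return NormalDist().cdf(z_crit - effect_size * n ** 0.5)

def power_one_sided_z(effect_size, n, alpha=0.05):
    """Power is the complement of the Type II error probability."""
    return 1 - beta_one_sided_z(effect_size, n, alpha)

beta_small_n = beta_one_sided_z(0.5, 10)
beta_large_n = beta_one_sided_z(0.5, 25)
power_large_n = power_one_sided_z(0.5, 25)
print(f"n=10: beta={beta_small_n:.3f}, power={1 - beta_small_n:.3f}")
print(f"n=25: beta={beta_large_n:.3f}, power={power_large_n:.3f}")
```

In this setting the smaller sample has the larger β and therefore the lower power.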

17
Q

The likelihood that a study will detect an effect when there really is one to be detected is…

A

statistical power

18
Q

A study reports no effect when in fact there was one. What kind of error is this?

A

Type II error = β = false negative

19
Q

A study reports an effect when in fact there was no effect. What kind of error is this?

A

Type I error = α = false positive

20
Q

What four interrelated quantities does a power analysis involve?

A
  1. effect size
  2. sample size
  3. desired α (type I error)
  4. the chosen or implied β, or equivalently, the statistical power 1 – β

Given any three of these, you can find the fourth.
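As an illustration of solving for the fourth quantity, the sketch below finds the sample size from effect size, α, and target power. It assumes a one-sided, one-sample z-test with known standard deviation, which the card does not specify; the function names are hypothetical.

```python
import math
from statistics import NormalDist

def required_n(effect_size, alpha, power):
    """Smallest n that reaches the target power in a one-sided,
    one-sample z-test (assumed setting, standardized effect size)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)       # power = 1 - beta
    return math.ceil(((z_alpha + z_beta) / effect_size) ** 2)

def detectable_effect(n, alpha, power):
    """Same relation solved for the effect size instead of n."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    return (z_alpha + z_beta) / n ** 0.5

n_needed = required_n(effect_size=0.5, alpha=0.05, power=0.80)
d_min = detectable_effect(n=n_needed, alpha=0.05, power=0.80)
print(n_needed, round(d_min, 3))
```

Other tests (two-sided, t-tests, two-sample designs) use different formulas, but the same three-determine-the-fourth structure applies.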

21
Q

What are the two families of effect size indexes?

A
  1. differences between groups (risk ratio, odds ratio, Cohen’s d, Glass’s delta, etc.)
  2. measures of association (correlation coefficient r, r^2, Spearman’s rho, Cohen’s f, etc.)
22
Q

T or F: the p-value is the probability of getting a false positive.

A

False! p = the probability of seeing an effect at least that large, assuming the null hypothesis is true. It is actually impossible to calculate the probability that the null hypothesis is true solely from sample statistics.

23
Q

In general, which is greater: the probability that the null hypothesis is true, or the p-value?

A

The probability that the null hypothesis is true tends to be greater than the p-value, often by a large margin.

24
Q

T or F: a confidence interval is a range of values that is likely to contain an unknown population parameter.

A

True

25
Q

If you draw a random sample many times, a certain percentage of the confidence intervals will contain the population mean. What is the name for this percentage?

A

The confidence level
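The long-run reading of the confidence level can be sketched with a simulation. The population (normal, mean 10, known standard deviation 2), the sample size, and the seed are assumptions chosen only for illustration.

```python
import random
from statistics import NormalDist

def coverage(mu=10.0, sigma=2.0, n=30, level=0.95, reps=2000, seed=1):
    """Share of z-based confidence intervals that contain the true mean
    across many repeated random samples (sigma assumed known)."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(reps):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / reps

observed_coverage = coverage()
print(f"share of intervals containing the mean: {observed_coverage:.3f}")
```

The share should come out near 0.95, the confidence level used to build each interval.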

26
Q

T or F: the confidence LEVEL is the probability that a specific confidence interval contains the population parameter.

A

False! For any given study, the confidence interval either contains or does not contain the population parameter of interest.

27
Q

Express confidence level in terms of α and/or β.

A

Confidence level = 1 – α

28
Q

If the confidence interval does not contain your null hypothesis value, what can you say about statistical significance?

A

If CI does not contain the H_0 value, the result is statistically significant.

29
Q

If p < α, what do you know about the confidence interval?

A

If p < α, the confidence interval will not contain the null hypothesis value.
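The correspondence between p < α and a confidence interval that excludes the null value can be checked numerically. This sketch assumes a two-sided, one-sample z-test with known standard deviation and a (1 - α) interval built from the same standard error; the names and numbers are illustrative.

```python
from statistics import NormalDist

def z_test_and_ci(sample_mean, null_value, sigma, n, alpha=0.05):
    """Return the two-sided p-value and the (1 - alpha) confidence
    interval for a one-sample z-test (sigma assumed known)."""
    se = sigma / n ** 0.5
    z = (sample_mean - null_value) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    half_width = NormalDist().inv_cdf(1 - alpha / 2) * se
    return p, (sample_mean - half_width, sample_mean + half_width)

results = []
for sample_mean in (0.1, 0.3, 0.5):
    p, (low, high) = z_test_and_ci(sample_mean, null_value=0.0, sigma=1.0, n=25)
    excludes_null = not (low <= 0.0 <= high)
    results.append((p < 0.05, excludes_null))
    print(f"mean={sample_mean}: p={p:.3f}, CI=({low:.3f}, {high:.3f})")
```

In every row the two flags agree: the interval excludes 0 exactly when p < 0.05.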

30
Q

If 95% confidence intervals for two independent sample means overlap, could there be a statistically significant difference between them?

A

Yes! Non-overlapping CIs always imply a significant difference. But 95% CIs can sometimes overlap even when p < 0.05.
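A quick numeric example of overlapping intervals with a significant difference, assuming two independent sample means of 0 and 3, each with a known standard error of 1 (made-up numbers), and a z-test on the difference:

```python
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)   # about 1.96

def ci95(mean, se):
    """95% z-based confidence interval for a single mean."""
    return (mean - Z95 * se, mean + Z95 * se)

def p_difference(mean_a, se_a, mean_b, se_b):
    """Two-sided p-value for the difference of two independent means."""
    se_diff = (se_a ** 2 + se_b ** 2) ** 0.5
    z = (mean_a - mean_b) / se_diff
    return 2 * (1 - NormalDist().cdf(abs(z)))

ci_a = ci95(0.0, 1.0)
ci_b = ci95(3.0, 1.0)
intervals_overlap = ci_a[1] >= ci_b[0] and ci_b[1] >= ci_a[0]
p = p_difference(0.0, 1.0, 3.0, 1.0)
print(intervals_overlap, round(p, 4))
```

The intervals overlap, yet the test on the difference gives p below 0.05, because the standard error of the difference is smaller than the sum of the two individual standard errors.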

31
Q

What is the definition of positive predictive value?

A

PPV is the probability that a significant result represents a true effect. In terms of disease screening, PPV = P(disease | positive).

32
Q

In the context of disease screening, what is PPV expressed as a conditional probability?

A

PPV = P( disease | positive).

33
Q

In the context of disease screening, what is NPV expressed as a conditional probability?

A

NPV = P(healthy | negative).
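PPV and NPV follow from sensitivity, specificity, and prevalence via Bayes' Theorem. The sketch below uses made-up screening numbers (90% sensitivity, 95% specificity, 1% prevalence); the function name is illustrative.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Apply Bayes' Theorem to get PPV = P(disease | positive)
    and NPV = P(healthy | negative)."""
    # Total probability of a positive result (true + false positives).
    p_positive = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
    p_negative = 1 - p_positive
    ppv = sensitivity * prevalence / p_positive
    npv = specificity * (1 - prevalence) / p_negative
    return ppv, npv

ppv, npv = predictive_values(sensitivity=0.90, specificity=0.95, prevalence=0.01)
print(f"PPV = {ppv:.3f}, NPV = {npv:.4f}")
```

With a rare condition, PPV here is only about 15% even though the test itself is fairly accurate, while NPV stays above 99%.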

34
Q

In the context of disease screening, what is the sensitivity expressed as a conditional probability?

A

Sensitivity = P(positive | disease).

Recall: sensitivity = true positive rate

35
Q

In the context of disease screening, what is specificity expressed as a conditional probability?

A

Specificity = P(negative | healthy).

Recall: specificity = true negative rate

36
Q

T or F: specificity is synonymous with negative predictive value (NPV).

A

False.

Specificity = P(negative | healthy), whereas
NPV = P(healthy | negative).
37
Q

T or F: sensitivity is the same thing as positive predictive value (PPV).

A

False.

Sensitivity = P(positive | disease), whereas
PPV = P(disease | positive).
38
Q

What does a ROC curve have on its x- and y-axes?

A

x-axis: FPR = 1 – specificity = α
y-axis: TPR = sensitivity

(Sensitivity = true positive rate, and
Specificity = true negative rate)
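A ROC curve can be traced by sweeping a decision threshold over the classifier's scores and recording (FPR, TPR) at each step; AUC is then the area under those points. This is a minimal sketch on six made-up labeled scores, with hypothetical function names and the trapezoid rule assumed for the area.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) pairs, predicting positive when score >= threshold,
    for thresholds swept from the highest score to the lowest."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]                      # threshold above every score
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC points by the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

labels = [1, 1, 0, 1, 0, 0]                    # 1 = disease, 0 = healthy
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]        # higher = more likely positive
points = roc_points(labels, scores)
area = auc(points)
print(points)
print(round(area, 4))
```

The area equals 8/9 here, which matches the share of (positive, negative) pairs in which the positive case gets the higher score.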
39
Q

What is another term for the sensitivity of a binary classifier?

A

True positive rate

40
Q

What is another term for the true positive rate of a binary classifier?

A

Sensitivity

41
Q

What is another term for the specificity of a binary classifier?

A

True negative rate

42
Q

What is another term for the true negative rate of a binary classifier?

A

Specificity

43
Q

Given the true positive rate, how do you calculate the false negative rate?

A

FNR = 1 – TPR

Think: true positives (TPR) + positives falsely labeled as negatives (FNR) = 1.

44
Q

How do you calculate the true positive rate if you know the false negative rate?

A

TPR = 1 – FNR

Think: true positives (TPR) + positives falsely labeled as negatives (FNR) = 1.

45
Q

What is the false positive rate in terms of specificity?

A

FPR = 1 – specificity.

FPR = 1 – TNR, since specificity = true negative rate.
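The four rates and their complements can be read off a confusion matrix. The counts below are invented for illustration, and the function name is hypothetical.

```python
def rates(tp, fn, fp, tn):
    """Compute the four rates of a binary classifier from raw counts."""
    tpr = tp / (tp + fn)   # sensitivity: share of actual positives caught
    fnr = fn / (tp + fn)   # share of actual positives missed
    tnr = tn / (tn + fp)   # specificity: share of actual negatives cleared
    fpr = fp / (tn + fp)   # share of actual negatives flagged
    return {"TPR": tpr, "FNR": fnr, "TNR": tnr, "FPR": fpr}

r = rates(tp=80, fn=20, fp=10, tn=90)
print(r)
```

TPR and FNR share the denominator of actual positives, and TNR and FPR share the denominator of actual negatives, which is why each pair sums to 1.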