CHAPTER 6 Samples, Uncertainty, and Statistical Inference Flashcards

1
Q

What are the three terms that all quantitative estimates consist of?

A

The true quantity of interest, bias, and noise.
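One common way to write this decomposition, using the deck's own terms (the equation form is a notational convenience, not quoted from the cards):

```latex
\text{Estimate} = \text{Estimand} + \text{Bias} + \text{Noise}
```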

2
Q

What does statistical hypothesis testing allow analysts to assess?

A

Whether an estimate was likely to have arisen from noise.

3
Q

True or False: Statistical significance and substantive significance are the same.

A

False.

4
Q

What is the purpose of using relationships between variables in a sample?

A

To make inferences about relationships in the larger population.

5
Q

What is the term for the true quantity of interest in a population?

A

Estimand.

6
Q

What is an estimate?

A

The number we get as a result of our analysis.

7
Q

What symbol is commonly used to denote an estimate?

A

A letter with a hat over it (e.g., q-hat).

8
Q

What are the two reasons an estimate can differ from the estimand?

A
  • Bias
  • Noise
9
Q

Fill in the blank: Bias refers to errors that occur for ______ reasons.

A

systematic

10
Q

Fill in the blank: Noise refers to errors that occur because of ______.

A

chance

11
Q

What analogy is used to explain the difference between bias and noise?

A

Curling.

12
Q

What is bias in the context of estimators?

A

A systematic error that causes estimates to differ from the estimand.

13
Q

What does it mean for an estimator to be unbiased?

A

The average value of the estimates it generates equals the estimand.

14
Q

What is sampling variation?

A

Natural variability that results from sampling.

15
Q

What is the desired quality of a good estimator?

A

It should be both unbiased and precise.

16
Q

What happens to estimates if the estimator is unbiased but imprecise?

A

Estimates will typically differ from the estimand because of noise.

17
Q

What occurs if an estimator is biased but precise?

A

Estimates will differ from the estimand because they are estimating the wrong quantity.

18
Q

What is the relationship between bias and precision in estimators?

A

There can be trade-offs between bias and precision.

19
Q

What do gray dots represent in the illustration of estimators?

A

Various estimates from repeated applications of a given estimator.

20
Q

What do black diamonds represent in the illustration of estimators?

A

The estimand: the true value in the world we are interested in.

21
Q

Fill in the blank: An estimator that yields similar estimates with each iteration is considered ______.

A

precise

22
Q

What can lead to bias in political polling?

A
  • Voters systematically lie to pollsters
  • Different turnout rates between parties
  • Differences in who is contacted by pollsters
23
Q

What does it mean if an estimator is precise?

A

It means there is very little noise, yielding similar estimates with each iteration.

24
Q

What is the relationship between bias and precision in estimators?

A

There are trade-offs between bias and precision; sometimes a certain amount of bias is acceptable for a gain in precision.

25
Q

What is a common example of trade-offs between bias and precision?

A

Polling, where a larger convenience sample may be more precise but biased, while a smaller professional sample may be less biased but less precise.

26
Q

What is the standard error?

A

The standard deviation of the sampling distribution, quantifying the precision of an estimator.

27
Q

What does a large standard error indicate?

A

Estimates are spread out and the estimator is relatively imprecise.

28
Q

What does a small standard error indicate?

A

Estimates are close together and the estimator is relatively precise.

29
Q

What factors affect the standard error in polling?

A
  • Sample size (N)
  • True proportion (q) of the population
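For a poll estimating a proportion, the usual textbook formula for the standard error is sqrt(q(1 - q) / N). The sketch below assumes simple random sampling; the function name `standard_error` is illustrative and does not come from the cards.

```python
import math

def standard_error(q: float, n: int) -> float:
    """Standard error of a sample proportion under simple random sampling."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(q * (1.0 - q) / n)

# Larger samples shrink the standard error; q near 0.5 makes it largest.
for q in (0.1, 0.5, 0.9):
    for n in (100, 1000):
        print(f"q={q:.1f}  N={n:5d}  SE={standard_error(q, n):.4f}")
```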
30
Q

How does sample size affect standard error?

A

As sample size increases, standard error decreases.

31
Q

What is the relationship between true proportion (q) and standard error?

A

Standard error is largest when q = 0.5 and shrinks as q approaches 0 or 1, because a nearly unanimous population leaves less room for sampling error.

32
Q

What is the implication of diminishing returns in sample size?

A

Because standard error shrinks with the square root of the sample size, increasing the sample size by a factor of 10 improves precision only about threefold.
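A quick check of the arithmetic, assuming the simple-random-sampling formula sqrt(q(1 - q) / N) for a proportion. The values q = 0.5 and the two sample sizes are hypothetical.

```python
import math

def standard_error(q: float, n: int) -> float:
    """Standard error of a sample proportion under simple random sampling."""
    return math.sqrt(q * (1.0 - q) / n)

se_small_sample = standard_error(0.5, 1_000)
se_large_sample = standard_error(0.5, 10_000)

# Ten times the sample size divides the standard error by sqrt(10).
ratio = se_small_sample / se_large_sample
print(f"Standard error shrinks by a factor of {ratio:.2f}")
```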

33
Q

What can lead to misleading conclusions regarding standard error?

A

Small sample sizes or extreme values of q can produce misleading approximations.

34
Q

Why are small towns often overrepresented in lists of extreme cancer rates?

A

Small sample sizes lead to less precision and more variability in estimates.
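A small simulation can illustrate the point. It is only a sketch: every town shares the same hypothetical true rate (1%), and the population sizes are made up. Any spread in observed rates is therefore pure noise.

```python
import random

def simulated_rates(population: int, true_rate: float, towns: int,
                    rng: random.Random) -> list[float]:
    """Observed rate in each town when every resident has the same risk."""
    rates = []
    for _ in range(towns):
        cases = sum(1 for _ in range(population) if rng.random() < true_rate)
        rates.append(cases / population)
    return rates

rng = random.Random(0)  # fixed seed so the run is repeatable
true_rate = 0.01        # hypothetical rate shared by all towns

small_towns = simulated_rates(100, true_rate, 100, rng)
large_towns = simulated_rates(10_000, true_rate, 100, rng)

small_spread = max(small_towns) - min(small_towns)
large_spread = max(large_towns) - min(large_towns)
print(f"Spread of rates, small towns: {small_spread:.4f}")
print(f"Spread of rates, large towns: {large_spread:.4f}")
```

The highest and lowest observed rates both tend to come from the small towns, even though no town differs from any other in underlying risk.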

35
Q

What is a confidence interval?

A

A range that estimates the true value with a specified level of confidence, often 95%.

36
Q

What does the 95% confidence interval represent?

A

If the estimation procedure were repeated an infinite number of times, the resulting intervals would contain the true estimand 95% of the time.

37
Q

How does the width of a confidence interval relate to the confidence level?

A

A higher confidence level (e.g., 99%) results in a wider confidence interval compared to a lower level (e.g., 95%).

38
Q

What is the Law of Large Numbers?

A

As sample size increases, the sample average gets closer to its expected value; that is, the noise in estimates decreases.

39
Q

What is the Central Limit Theorem?

A

In large samples, the sampling distribution of an average is approximately normal. As a result, for unbiased polls, approximately 95% of estimates will be within 2 standard errors of the true population value.

40
Q

What is the margin of error in polling?

A

Twice the standard error.

41
Q

What is the relationship between the 95% and 99% confidence intervals?

A

The 99% confidence interval is wider than the 95% confidence interval.

42
Q

What is the primary question of statistical inference?

A

How do we make inferences about populations using estimates from samples?

43
Q

What does hypothesis testing assess?

A

Whether a particular hypothesis is reasonable based on sample data.

44
Q

In hypothesis testing, what is the null hypothesis?

A

The assumption that there is no relationship or difference between groups.

45
Q

What is a p-value?

A

The probability of obtaining a result at least as extreme as the observed result, assuming the null hypothesis is true.
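As a rough sketch, a two-sided p-value can be computed from an estimate and its standard error. This assumes the sampling distribution is approximately normal; the function name and the numbers in the example are hypothetical.

```python
import math

def two_sided_p_value(estimate: float, null_value: float,
                      standard_error: float) -> float:
    """P(result at least this far from the null value), normal approximation."""
    z = (estimate - null_value) / standard_error
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical poll: 54% support, standard error of 2 points, null of 50%.
p = two_sided_p_value(0.54, 0.50, 0.02)
print(f"p-value: {p:.4f}")
```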

46
Q

What does it mean if a p-value is below a pre-specified threshold (e.g., .05)?

A

We reject the null hypothesis and conclude that there is statistically significant evidence for the alternative hypothesis.

47
Q

True or False: A low p-value indicates that the null hypothesis is true.

A

False.

48
Q

What is the standard error?

A

A measure of the variability of an estimator, indicating how far the estimate is likely to be from the true parameter.

49
Q

What is the purpose of constructing a confidence interval?

A

To estimate the range within which the true parameter is likely to fall.

50
Q

What does it signify if a regression coefficient is statistically significant?

A

There is evidence that a relationship exists between the explanatory and outcome variables in the population.

51
Q

What is the difference between estimand and estimate?

A

Estimand is the true parameter we want to know; estimate is the calculation derived from sample data.

52
Q

Fill in the blank: Statistical hypothesis testing helps determine if an observed phenomenon is likely due to _______.

A

noise

53
Q

What is substantive significance?

A

The practical importance of a result, beyond just its statistical significance.

54
Q

What is the Central Limit Theorem’s implication for unbiased polls?

A

95% of estimates will fall within 2 standard errors of the truth.

55
Q

What happens if we conduct hypothesis testing on a population with complete data?

A

We can still assess the likelihood of observing a result under the null hypothesis.

56
Q

What is the significance of testing the null hypothesis that the true relationship between income and education is zero?

A

It helps determine if there is a statistically significant correlation between the two variables.

57
Q

What can be a common error in interpreting p-values?

A

Assuming the p-value indicates the probability that the null hypothesis is true.

58
Q

What do we need to consider when making inferences from sample data?

A

Both bias and noise.

59
Q

True or False: Statistical significance guarantees a large effect size.

A

False.

60
Q

What is a common threshold for statistical significance?

A

A p-value below .05.

61
Q

What does the standard error allow us to do with regression estimates?

A

Construct confidence intervals and conduct hypothesis tests.

62
Q

Fill in the blank: Statistical inference can help identify if observed relationships are genuine or simply due to _______.

A

chance

63
Q

What can statistical inference tell us even with complete population data?

A

Whether observed differences are likely due to chance.

64
Q

What is the question of substantive significance?

A

It asks how large an effect is (for example, how much marketing affects sales), rather than just whether there is an effect.

65
Q

What was the main finding of the 2012 study published in Nature regarding Facebook and voting?

A

People were more likely to vote if their Facebook pages displayed a banner indicating which of their friends voted.

66
Q

What was the estimated effect of Facebook banners on voter turnout?

A

Less than 0.4 percentage points.

67
Q

What do the researchers conclude about strong ties in social networks?

A

They are instrumental for spreading both online and real-world behavior.

68
Q

What does a large sample size allow researchers to do?

A

Detect genuine relationships more reliably.

69
Q

What did Berlinski and Dewan find regarding the Second Reform Act of 1867?

A

There was little effect on election outcomes despite the doubling of the eligible electorate.

70
Q

What does statistical insignificance imply about the Reform Act’s effects?

A

It does not mean the effects were non-existent; estimates suggested a large effect despite imprecision.

71
Q

What two reasons can cause estimates to differ from estimands?

A
  • Bias
  • Noise
72
Q

What is bias in the context of estimates?

A

Differences between the estimand and estimate that arise for systematic reasons.

73
Q

What is noise?

A

Differences between the estimand and estimate that arise due to idiosyncratic facts about the sample.

74
Q

What is unbiasedness in estimation?

A

An estimator is unbiased if the average of its repeated estimates equals the estimand.

75
Q

What is the expected value?

A

The average value of an infinite number of draws of a variable.

76
Q

Define precision in the context of estimation.

A

An estimator is precise if repeated estimations yield similar results.

77
Q

What is a sampling distribution?

A

The distribution of estimates from repeated applications of an estimator on new samples.
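The idea can be made concrete by simulating many polls from the same population. In this sketch the true proportion (0.5), the sample size (400), and the number of repetitions (1,000) are all hypothetical. The estimator is the sample proportion, and each poll is a simple random sample.

```python
import random
import statistics

def poll_estimate(q: float, n: int, rng: random.Random) -> float:
    """One application of the estimator: the share of supporters in a new sample."""
    supporters = sum(1 for _ in range(n) if rng.random() < q)
    return supporters / n

rng = random.Random(42)  # fixed seed so the run is repeatable
q, n, repetitions = 0.5, 400, 1000

# The collection of estimates approximates the sampling distribution.
estimates = [poll_estimate(q, n, rng) for _ in range(repetitions)]

mean_estimate = statistics.mean(estimates)    # near q if the estimator is unbiased
simulated_se = statistics.pstdev(estimates)   # spread of the sampling distribution
print(f"Mean of estimates: {mean_estimate:.4f}")
print(f"Standard deviation of estimates: {simulated_se:.4f}")
```

The standard deviation of the simulated estimates should land close to the formula value sqrt(0.5 × 0.5 / 400) = 0.025.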

78
Q

What is the standard error?

A

The standard deviation of the sampling distribution.

79
Q

What is the margin of error in polling?

A

Typically, the standard error multiplied by 2.

80
Q

What does a 95% confidence interval indicate?

A

The estimand would be contained in the interval 95% of the time if the estimation procedure is repeated.
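A minimal sketch of how such an interval is often built for a poll: the estimate plus or minus a margin of error of 2 standard errors. It assumes simple random sampling and plugs the estimate in for the unknown true proportion when computing the standard error. The function name and the poll numbers are hypothetical.

```python
import math

def confidence_interval_95(estimate: float, n: int) -> tuple[float, float]:
    """Approximate 95% interval: estimate +/- 2 standard errors."""
    # Plug-in standard error: the estimate stands in for the unknown q.
    standard_error = math.sqrt(estimate * (1.0 - estimate) / n)
    margin_of_error = 2.0 * standard_error
    return estimate - margin_of_error, estimate + margin_of_error

# Hypothetical poll: 52% support among 1,000 respondents.
low, high = confidence_interval_95(0.52, 1000)
print(f"95% confidence interval: [{low:.3f}, {high:.3f}]")
```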

81
Q

What is hypothesis testing?

A

Statistical techniques for assessing confidence that a data feature reflects a real feature rather than noise.

82
Q

What is the null hypothesis?

A

The hypothesis that a feature of the data is entirely the result of noise.

83
Q

What is statistical significance?

A

Evidence for a hypothesis when the null hypothesis can be rejected at a pre-specified level of confidence.

84
Q

What does a p-value represent?

A

The probability of finding a relationship as strong as or stronger than the observed relationship if the null hypothesis is true.

85
Q

True or False: A p-value indicates the probability that the null hypothesis is true.

86
Q

What is the role of noise in statistical studies?

A

It creates uncertainty and can lead to statistically significant results that are not indicative of real relationships.

87
Q

What can happen if only statistically significant findings are reported?

A

It may lead to systematically incorrect conclusions.

88
Q

What phenomenon can noise create that leads to misinterpretations?

A

Reversion to the mean.

89
Q

What is the population in statistical terms?

A

The units in the world we are trying to learn about.

90
Q

What is a sample?

A

A subset of the population for which we have data.

91
Q

What is an estimand?

A

The unobserved quantity we are trying to learn about with our data analysis.

92
Q

What is an estimator?

A

The procedure applied to data to generate a numerical result.

93
Q

What is an estimate?

A

The numerical result from applying an estimator to a specific set of data.