Module 5 - Manley Ch 6 - Generalizations Flashcards

1
Q

What is a statistical inference?

A

Using specific observations as evidence for general claims about a larger group, or vice versa.

2
Q

What is the difference between a statistical generalization and a statistical instantiation?

A

Generalization: Moves from a sample to conclusions about the whole population.

Instantiation: Moves from known facts about the population to conclusions about a sample.

3
Q

Why is forming appropriate generalizations difficult?

A

It requires careful sampling, avoiding biases, and understanding probability, which is why the field of statistics exists.

4
Q

When does a sample provide strong evidence for a hypothesis?

A

When the likelihood of observing the sample is much greater if the hypothesis is true than if it is false.
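Note: this comparison can be expressed as a likelihood ratio. The sketch below is only an illustration under assumed numbers: the coin scenario and the function names `binomial_likelihood` and `likelihood_ratio` are hypothetical and do not come from the chapter.

```python
from math import comb

def binomial_likelihood(successes, trials, p):
    """Probability of exactly `successes` in `trials` draws if the true rate is p."""
    return comb(trials, successes) * p**successes * (1 - p)**(trials - successes)

def likelihood_ratio(successes, trials, p_if_true, p_if_false):
    """How many times more likely the sample is if the hypothesis is true than if it is false."""
    return (binomial_likelihood(successes, trials, p_if_true)
            / binomial_likelihood(successes, trials, p_if_false))

# Hypothetical sample: 8 heads in 10 flips.
# Hypothesis: the coin lands heads 80% of the time. Rival: the coin is fair.
ratio = likelihood_ratio(8, 10, p_if_true=0.8, p_if_false=0.5)
print(round(ratio, 2))
```

A ratio well above 1 means the sample favors the hypothesis; a ratio near 1 means the sample barely distinguishes the two.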

5
Q

What factors weaken the reliability of a sample?

A

Small sample size

Sampling bias (e.g., non-random or convenience sampling)

6
Q

What is sampling bias?

A

A selection effect where the method of choosing the sample skews the results, making them unrepresentative of the population.

7
Q

How can experiments be designed to reduce sampling bias?

A

By selecting the sample at random, so that it is likely to be representative of the population.

8
Q

Why is a larger sample size important?

A

It increases the likelihood that the sample’s proportions closely resemble those of the whole population.

9
Q

What is the law of large numbers?

A

The principle that larger samples tend to reflect the true distribution of the population more accurately.
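Note: a small simulation can illustrate the principle. This is a sketch under assumptions, not material from the chapter: the population rate of 0.3, the seed, and the helper `sample_proportion` are all chosen only for illustration.

```python
import random

def sample_proportion(n, true_rate, rng):
    """Draw n independent yes/no observations and return the share of 'yes'."""
    hits = sum(1 for _ in range(n) if rng.random() < true_rate)
    return hits / n

rng = random.Random(0)  # fixed seed so the run is repeatable
TRUE_RATE = 0.3         # assumed population proportion (illustrative)

for n in (10, 100, 10_000, 100_000):
    p_hat = sample_proportion(n, TRUE_RATE, rng)
    print(f"n={n:>7}  sample proportion={p_hat:.4f}  error={abs(p_hat - TRUE_RATE):.4f}")
```

The error does not shrink on every single step, but it tends toward zero as n grows.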

10
Q

Why do smaller samples often have more extreme proportions?

A

Because random variation has a larger effect on smaller groups, leading to greater deviations from the population average.
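Note: the effect can be computed exactly for a simple case. The sketch below assumes a population where the true proportion is 50% and asks how often a random sample misses that by more than 10 percentage points; the function name `prob_extreme` and the chosen numbers are illustrative.

```python
from math import comb

def prob_extreme(n, p=0.5, tolerance=0.1):
    """Exact probability that a sample of size n has a proportion
    more than `tolerance` away from the true proportion p."""
    total = 0.0
    for k in range(n + 1):
        # Small epsilon guards against floating-point noise at the boundary.
        if abs(k / n - p) > tolerance + 1e-12:
            total += comb(n, k) * p**k * (1 - p)**(n - k)
    return total

for n in (10, 100, 1000):
    print(n, prob_extreme(n))
```

With 10 observations, an "extreme" proportion turns up in roughly a third of samples; with 1,000 it is vanishingly rare.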

11
Q

Why might small counties show extreme rates of a disease compared to larger ones?

A

Small samples are more prone to random fluctuations, leading to unusually high or low rates.

12
Q

Why are the hospitals or schools with the “best” or “worst” rates often small ones?

A

Small sample sizes magnify the effects of random variation, making extreme outcomes more likely.

13
Q

What should you consider when interpreting extreme results in small samples?

A

Whether the results could be explained by random variation rather than true differences.

14
Q

How do you test the strength of evidence?

A

By comparing the likelihood of observing the evidence if the hypothesis is true versus if it is false.

15
Q

Why is it important to design experiments where evidence strongly distinguishes between hypotheses?

A

To ensure that the observations are far more likely under one hypothesis than the other, making the evidence meaningful.

16
Q

What is stratified sampling?

A

A method of sampling where the population is divided into subgroups (strata) and the sample is drawn so that each subgroup’s share of the sample matches its share of the population for specific characteristics.

17
Q

Why does sample size matter in statistical generalizations?

A

Larger samples are more likely to reflect the true characteristics of the population and reduce the margin of error.
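Note: for a sample proportion, the link between sample size and margin of error can be sketched with the usual normal-approximation formula. This assumes a simple random sample and a 95% level (z = 1.96); the function name `margin_of_error` is illustrative.

```python
from math import sqrt

def margin_of_error(p_hat, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion
    (normal approximation, simple random sample assumed)."""
    return z * sqrt(p_hat * (1 - p_hat) / n)

for n in (100, 400, 1600):
    print(n, round(margin_of_error(0.5, n), 4))
```

Because the sample size sits under a square root, quadrupling the sample only halves the margin of error.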

18
Q

What does a 95% confidence interval mean?

A

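If the same sampling procedure were repeated many times, about 95% of the intervals it produces would contain the true population value. Put another way, if the true value lay outside the interval, a result like the one observed would occur no more than about 5% of the time.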
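Note: one common reading of a 95% interval is that the procedure captures the true value in roughly 95% of repeated samples. The simulation below is a sketch of that reading under assumptions: a true rate of 0.3, samples of 400, and the normal-approximation interval. The helper names are illustrative.

```python
import random
from math import sqrt

def confidence_interval(hits, n, z=1.96):
    """Normal-approximation 95% interval for a sample proportion."""
    p_hat = hits / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

def coverage(true_rate, n, trials, rng):
    """Share of simulated samples whose interval contains the true rate."""
    captured = 0
    for _ in range(trials):
        hits = sum(1 for _ in range(n) if rng.random() < true_rate)
        low, high = confidence_interval(hits, n)
        if low <= true_rate <= high:
            captured += 1
    return captured / trials

rng = random.Random(42)  # fixed seed so the run is repeatable
print(coverage(true_rate=0.3, n=400, trials=2000, rng=rng))
```

The printed share should land close to 0.95, though the approximation makes it slightly off in practice.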

19
Q

Why is random sampling important?

A

It minimizes selection effects and makes it likely that the sample is representative of the population.

20
Q

What are some common issues with non-random sampling methods?

A

Oversampling certain groups (e.g., cars on low-traffic roads).

Failing to account for relevant subgroups.

Bias from convenience sampling.

21
Q

How does stratified sampling reduce bias?

A

By ensuring the sample mirrors the population in terms of characteristics like age, gender, or car type.

22
Q

Why does stratification not replace randomization?

A

There may be unconsidered subgroups or hidden biases, so random sampling within strata is still essential.
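Note: combining the two ideas might look like the sketch below, which fixes each stratum's quota in proportion to the population and then draws at random within the stratum. The vehicle population and the function name `stratified_sample` are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, sample_size, rng):
    """Proportional stratified sample with random draws inside each stratum."""
    strata = defaultdict(list)
    for unit in population:
        strata[stratum_of(unit)].append(unit)

    sample = []
    for members in strata.values():
        # Quota matches the stratum's share of the population.
        # Rounding can make the total differ slightly from sample_size.
        quota = round(sample_size * len(members) / len(population))
        sample.extend(rng.sample(members, quota))  # random within the stratum
    return sample

# Hypothetical population: 600 sedans, 300 trucks, 100 motorcycles.
population = ([("sedan", i) for i in range(600)]
              + [("truck", i) for i in range(300)]
              + [("motorcycle", i) for i in range(100)])
sample = stratified_sample(population, lambda unit: unit[0], 50, random.Random(0))
print(len(sample))
```

The quotas handle the characteristics that were thought of in advance; the random draw inside each stratum guards against the ones that were not.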

23
Q

What is participation bias in surveys?

A

When certain groups are more likely to participate than others, skewing the sample.

24
Q

How can offering cash incentives for survey participation still lead to bias?

A

The fixed amount may motivate some groups more than others, leaving the sample unrepresentative.

25
Q

What is response bias?

A

When participants give inaccurate answers, for example to seem socially acceptable, to avoid embarrassment, or because they do not actually know the answer.

26
Q

How can the wording of a survey question introduce bias?

A

Different terms or phrasings can evoke different emotional responses, leading to skewed results (e.g., “death tax” vs. “estate tax”).

27
Q

What should you do if you cannot eliminate all selection effects in sampling?

A

Use randomization and stratification to minimize bias as much as possible.

28
Q

Why are voluntary surveys particularly prone to bias?

A

They attract participants with strong opinions, skewing results toward those who care most about the topic.

29
Q

What are the two key questions we must answer when summarizing statistical data?

A

What features of the data are most important to us?

What’s the clearest way to present those features?

30
Q

Why can statistical summaries be misleading?

A

They omit some facts and can be selectively presented to make true but misleading claims.

31
Q

What are the three main measures of central tendency?

A

Mean, median, and mode.

32
Q

When is the mean most useful?

A

When the total of a quantitative feature matters (the mean multiplied by the number of items gives the total), or when outliers do not significantly skew the data.

33
Q

Why might the median be preferred over the mean?

A

The median resists being skewed by outliers and better reflects the “typical” value in some cases.
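Note: a quick illustration with made-up income figures (the numbers are hypothetical), using Python's standard `statistics` module.

```python
from statistics import mean, median

incomes = [30_000, 32_000, 35_000, 38_000, 40_000]  # invented data
with_outlier = incomes + [5_000_000]                # add one very high earner

print(mean(incomes), median(incomes))
print(mean(with_outlier), median(with_outlier))
```

One extreme value drags the mean far above what anyone in the original group earns, while the median barely moves.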

34
Q

What is the mode?

A

The value that appears most frequently in a dataset.
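Note: a minimal sketch using the standard `statistics` module (Python 3.8 or later), with invented data. Unlike the mean and median, the mode also works for non-numeric categories.

```python
from statistics import mode, multimode

colors = ["red", "blue", "red", "green"]  # invented data
print(mode(colors))

# A dataset can have more than one mode; multimode returns all of them.
print(multimode([1, 1, 2, 2, 3]))
```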

35
Q

What is an outlier?

A

A data point that is significantly different from other values in the dataset.

36
Q

How does a truncated mean handle outliers?

A

It calculates the mean after discarding a set fraction of the highest and lowest values, so extreme outliers are excluded.
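Note: one common way to compute a truncated mean is to drop a fixed fraction of values from each end of the sorted data. The sketch below assumes a 10% trim on each side; the function name `truncated_mean` and the data are illustrative.

```python
def truncated_mean(values, trim_fraction=0.1):
    """Mean after dropping the lowest and highest `trim_fraction` of the values.
    Assumes trim_fraction is below 0.5 so that some values remain."""
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)  # number of values cut from each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

data = [1, 12, 13, 14, 15, 16, 17, 18, 19, 200]  # invented data with two outliers
print(sum(data) / len(data))   # ordinary mean
print(truncated_mean(data))    # mean of the middle 80%
```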

37
Q

Why is visualizing the shape of data important?

A

It helps identify patterns, distributions, and inequalities that measures like the mean or median cannot capture.

38
Q

What does the standard deviation measure?

A

Roughly, the typical distance of data points from the mean (formally, the square root of the average squared deviation), giving a sense of variability in the dataset.
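Note: strictly speaking, the standard deviation is the square root of the average squared deviation, which is close to but not the same as the plain average distance from the mean. The sketch below compares the two on invented data; `mean_absolute_deviation` is a helper written for this illustration.

```python
from statistics import mean, pstdev

def mean_absolute_deviation(values):
    """Plain average distance of the values from their mean."""
    m = mean(values)
    return sum(abs(x - m) for x in values) / len(values)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # invented data with mean 5

print(pstdev(data))                   # population standard deviation
print(mean_absolute_deviation(data))  # average absolute distance
```

Squaring gives extra weight to points far from the mean, so the standard deviation is never smaller than the average absolute distance.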

39
Q

What is cherry-picking data, and why is it misleading?

A

Selecting specific data points to create a false impression of a trend while ignoring the full dataset.

40
Q

What are loose generalizations?

A

Vague or unclear generalizations, often expressed as “Most Fs are Gs” or “Many Fs are Gs,” without clear evidence or definition.

41
Q

What is a stereotype?

A

A widely held loose generalization about a social group, often influenced by in-group bias.

42
Q

Why are loose generalizations problematic?

A

They can smuggle false or misleading ideas under the guise of truth and are often vague enough to evade scrutiny.

43
Q

How can even true generalizations be misleading?

A

They might imply a causal or explanatory relationship where none exists, confusing the true cause of a phenomenon.

44
Q

Give an example of a true but misleading generalization.

A

“Older women are dangerous drivers.” While true on average, it misrepresents the real cause: visual and cognitive decline that affects some seniors, regardless of gender.

45
Q

What is the representativeness heuristic?

A

A cognitive shortcut where people estimate probabilities based on how strongly two features are associated in their minds, rather than actual statistical relationships.

46
Q

What is an example of the representativeness heuristic in action?

A

Judging “Linda is a bank teller and active in the feminist movement” as more probable than “Linda is a bank teller,” even though a conjunction can never be more probable than either of its parts.
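Note: the underlying rule is that P(A and B) can never exceed P(A). The counts below are invented purely to illustrate it.

```python
# Hypothetical group of 1,000 people who fit Linda's description.
total = 1000
bank_tellers = 50
feminist_bank_tellers = 40  # necessarily a subset of the bank tellers

p_teller = bank_tellers / total
p_teller_and_feminist = feminist_bank_tellers / total

print(p_teller, p_teller_and_feminist)
```

However the counts are chosen, the second group sits inside the first, so its probability cannot be larger.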

47
Q

What is base rate neglect?

A

Ignoring the general prevalence of events or conditions when assessing probabilities.

48
Q

How does base rate neglect affect decision-making?

A

It leads to errors by focusing on specific details or similarities while ignoring how common each possibility is overall.
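Note: Bayes' theorem shows how much the base rate matters. The sketch below assumes a hypothetical screening test for a condition that affects 1% of people; the rates and the function name `posterior` are illustrative.

```python
def posterior(base_rate, hit_rate, false_alarm_rate):
    """P(condition | positive result) via Bayes' theorem."""
    true_positives = base_rate * hit_rate
    false_positives = (1 - base_rate) * false_alarm_rate
    return true_positives / (true_positives + false_positives)

# Hypothetical test: detects 90% of real cases, wrongly flags 9% of healthy people.
print(round(posterior(base_rate=0.01, hit_rate=0.9, false_alarm_rate=0.09), 3))
```

Even with a fairly accurate test, most positives here are false alarms, because healthy people vastly outnumber those with the condition.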

49
Q

How can we make better generalizations?

A

Ensure clarity by defining terms and specifying proportions.

Avoid causal implications unless they are supported by evidence.

Consider base rates and broader context before drawing conclusions.

50
Q

Why is clarity important in generalizations?

A

It prevents vague or misleading interpretations and forces accountability for claims.