Attewell Chapter 2 Flashcards

1
Q

Is the statement, “… [the] conventional statistical approach focuses on the individual coefficients for the predictors, and doesn’t care as much about predictive power” true or false?

A

True

2
Q

Does the likelihood of finding a statistically significant result increase as more predictors are entered into a model?

A

Yes! On average, about one in twenty predictors with no real effect will appear significant at p ≤ 0.05 through chance alone.
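
The arithmetic behind this answer can be sketched in a few lines. This is a minimal illustration, not taken from the chapter: it assumes the tests are independent and that every null hypothesis is true, and the function names are hypothetical.

```python
def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive among n_tests independent tests,
    assuming every null hypothesis is true (an assumption of this sketch)."""
    return 1.0 - (1.0 - alpha) ** n_tests


def expected_false_positives(n_tests: int, alpha: float = 0.05) -> float:
    """Average number of predictors that look significant by chance alone."""
    return n_tests * alpha


# The chance of at least one spurious finding grows quickly with the number of tests.
for k in (1, 5, 20, 100):
    print(k, round(familywise_error_rate(k), 3), expected_false_positives(k))
```

With 20 predictors the expected number of chance findings is one, which is the "one in twenty" in the answer above.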

3
Q

How could a researcher solve the problem of multiplicity?

A

By using a Bonferroni correction

4
Q

How does the Bonferroni correction for multiple comparisons work?

A

By simply dividing the conventional value of 0.05 by the number of predictors. (If there are 5 predictors, then the new significance threshold would be 0.01.)
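
As a rough sketch of that division, the snippet below computes the corrected threshold and applies it to a list of p-values. The function names and the p-values are hypothetical, chosen only for illustration.

```python
def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Divide the conventional alpha by the number of tests (predictors)."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def significant_after_bonferroni(p_values, alpha: float = 0.05):
    """Flag each p-value that is at or below the corrected threshold."""
    threshold = bonferroni_threshold(len(p_values), alpha)
    return [p <= threshold for p in p_values]


# Hypothetical p-values for 5 predictors; the corrected threshold is 0.05 / 5.
p_values = [0.003, 0.020, 0.040, 0.300, 0.800]
print(bonferroni_threshold(len(p_values)))
print(significant_after_bonferroni(p_values))
```

Three of these p-values fall below the uncorrected 0.05, but only the first one survives the corrected threshold.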

5
Q

In what field is the problem of multiplicity especially prominent?

A

The problem of multiplicity has grown ever more acute in medical research and in analyses of gene sequences, because it is increasingly common for thousands of significance tests to be tried before the significant ones are reported.

6
Q

How does the Data Mining approach avoid the multiplicity problem entirely?

A

By using a form of replication known as Cross-Validation
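
One way to picture cross-validation is the k-fold sketch below: the model is fitted on part of the data and scored on the held-out part, so it is judged by out-of-sample error rather than by per-coefficient significance tests. This is a minimal illustration using only the standard library; the simple one-predictor line fit, the synthetic data, and all function names are assumptions of the sketch, not the chapter's procedure.

```python
import random


def k_fold_indices(n_rows: int, k: int, seed: int = 0):
    """Shuffle the row indices and deal them into k nearly equal folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def cross_validated_mse(xs, ys, k: int = 5, seed: int = 0) -> float:
    """Average held-out mean squared error across k folds."""
    fold_errors = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        # Fit on the training rows only, then score on the unseen rows.
        intercept, slope = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors = [(ys[i] - (intercept + slope * xs[i])) ** 2 for i in held_out]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / k


# Synthetic data: a straight line plus Gaussian noise (illustrative only).
rng = random.Random(42)
xs = [i / 10 for i in range(100)]
ys = [2.0 + 0.5 * x + rng.gauss(0, 1) for x in xs]
print(cross_validated_mse(xs, ys))
```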

7
Q

When are “residuals” said to be “homoscedastic”?

A

When they have a constant variance. Ideally, residuals are also normally distributed with a mean of zero and independent of one another, but those are separate conditions.

8
Q

What does the Greek word “homoscedastic” mean?

A

Having equal variances

9
Q

What does the Greek word “heteroscedastic” mean?

A

Having unequal variances
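
To make equal versus unequal variances concrete, the sketch below compares the spread of residuals in the lower and upper halves of the data. It is a crude illustrative check, not a formal test, and the residual series are made up for the example.

```python
import statistics


def variance_ratio(residuals):
    """Variance of the second half of the residuals divided by the first half.
    A ratio near 1 suggests equal spread; a ratio far from 1 suggests unequal spread."""
    half = len(residuals) // 2
    first, second = residuals[:half], residuals[half:]
    return statistics.pvariance(second) / statistics.pvariance(first)


# Made-up residuals, ordered by the predictor value.
# Homoscedastic: the spread stays the same along the predictor.
homoscedastic = [(-1) ** i * 1.0 for i in range(20)]
# Heteroscedastic: the spread grows as the predictor grows.
heteroscedastic = [(-1) ** i * (1.0 + 0.5 * i) for i in range(20)]

print(variance_ratio(homoscedastic))
print(variance_ratio(heteroscedastic))
```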

10
Q

Does Data Mining provide ways for circumventing the problem of heteroscedasticity?

A

Yes

11
Q

Are many Data Mining methods nonparametric?

A

Yes, because, “…they do not require the kinds of statistical assumptions about the distribution of error terms that underlie many conventional modeling methods”.

12
Q

Is bootstrapping a type of nonparametric technique?

A

Yes

13
Q

Does bootstrapping make assumptions about the shape of the sampling distribution?

A

No
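
A percentile bootstrap shows how this works in practice: the sampling distribution is built by resampling the observed data with replacement, not assumed to be normal. The sketch below is one common variant, written with the standard library only; the sample values and the function name are hypothetical.

```python
import random
import statistics


def bootstrap_ci(sample, stat=statistics.mean, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for a statistic."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample with replacement and recompute the statistic each time.
    estimates = sorted(stat(rng.choices(sample, k=n)) for _ in range(n_resamples))
    tail = (1.0 - level) / 2.0
    lower_index = round(tail * n_resamples)
    upper_index = min(n_resamples - 1, round((1.0 - tail) * n_resamples))
    return estimates[lower_index], estimates[upper_index]


# Hypothetical, visibly skewed sample; no normality assumption is needed.
sample = [1, 2, 2, 3, 3, 3, 4, 5, 8, 21]
lower, upper = bootstrap_ci(sample)
print(statistics.mean(sample), lower, upper)
```

Passing a different `stat`, such as `statistics.median`, reuses the same resampling loop for a statistic whose sampling distribution has no convenient formula.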

14
Q

Does significance testing play a crucial role in the conventional statistical approach?

A

Yes

15
Q

When do the most serious problems with significance testing occur?

A

When modelers add many predictors to models, and especially when they search through hundreds of predictors before deciding which to include in a model

16
Q

Can Data Mining models outperform conventional statistical models?

A

Yes, they mostly do, because many conventional statistical models neglect interactions between predictors.