Attewell Chapter 3 Flashcards

1
Q

“…searching through data until one finds statistically significant relations” is called ______.

A

Data dredging
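
Illustration (not from the source): a minimal sketch of why data dredging misleads. It correlates many pure-noise predictors with a pure-noise outcome and counts how many pass a rough 5% cutoff. The function names, the sample sizes, and the large-sample cutoff of 1.96 / sqrt(n) are assumptions chosen for this example.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def count_spurious_hits(n_obs=100, n_predictors=1000, seed=0):
    """Count pure-noise predictors that look 'significant' at roughly the 5% level."""
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_obs)]
    # Large-sample approximation (an assumption): with no true relation,
    # r is roughly normal with standard deviation 1 / sqrt(n).
    cutoff = 1.96 / math.sqrt(n_obs)
    hits = 0
    for _ in range(n_predictors):
        noise = [rng.gauss(0, 1) for _ in range(n_obs)]
        if abs(pearson_r(noise, outcome)) > cutoff:
            hits += 1
    return hits

hits = count_spurious_hits()
print(f"{hits} of 1000 unrelated predictors passed the 5% cutoff")
```

Roughly 5% of the unrelated predictors should pass the cutoff by chance alone, which is why relations found by searching need to be checked on data that was not used in the search.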

2
Q

“Researchers should generate their hypothesis before beginning their statistical analyses.” True or False?

A

True

3
Q

What is the randomly selected group of observations used to fit a model known as?

A

the training sample

4
Q

“Cross-validation can be thought of as a type of quality control for DM models”. True or False?

A

True

5
Q

Does cross-validation require a very large dataset?

A

No, it does not strictly require one.
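
Illustration (not from the source): one common way to cross-validate without a very large dataset is k-fold splitting, where every observation serves as test data exactly once. The sketch below only builds the index splits; the function name and the choice of k = 5 are assumptions for this example.

```python
import random

def k_fold_splits(n_obs, k=5, seed=0):
    """Return a list of (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_obs))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        # Training data is everything outside the current test fold.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits

splits = k_fold_splits(n_obs=23, k=5)
for train, test in splits:
    print(len(train), len(test))
```

A model would be fitted on each training set and scored on the matching test set, and the k scores averaged.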

6
Q

A plot of predicted values against observed values should be a _______ line, if the model is calibrated.

A

straight
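
Illustration (not from the source): a rough calibration check, assuming cases are grouped into equal-sized bins by predicted value and the mean prediction in each bin is compared with the mean observed value. The function name and the toy data are hypothetical.

```python
def calibration_table(predicted, observed, n_bins=5):
    """Return (mean predicted, mean observed) per bin, with bins ordered by predicted value."""
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    rows = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        mean_obs = sum(o for _, o in chunk) / len(chunk)
        rows.append((mean_pred, mean_obs))
    return rows

predicted = [i / 20 for i in range(1, 21)]

# Calibrated toy case: observed values equal the predictions, so the points lie on a straight line.
calibrated = calibration_table(predicted, predicted)

# Uncalibrated toy case: observed values follow the square of the prediction, giving a curve.
curved = calibration_table(predicted, [p ** 2 for p in predicted])

for row in curved:
    print(f"mean predicted {row[0]:.3f}  mean observed {row[1]:.3f}")
```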

7
Q

For an uncalibrated model, the plot of predicted values against observed values resembles a _______ line.

A

curved

8
Q

“A researcher tries to identify variables that produce the curved pattern, adding those to the regression model in order to correct the curvature” True or False?

A

True

9
Q

____ refers to the accuracy of a predictive model.

A

fit

10
Q

An ideal ROC curve “closely follows the Y-axis on the left and then sharply turns parallel to the X axis”. True or False?

A

True
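
Illustration (not from the source): a small sketch that traces ROC points for a classifier that ranks every positive case above every negative one. The helper names and toy scores are assumptions, and the code expects 0/1 labels with both classes present.

```python
def roc_points(scores, labels):
    """Return ROC points as (false positive rate, true positive rate), starting at (0, 0)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last case of a tied score.
        is_last_of_tie = i + 1 == len(ranked) or ranked[i + 1][0] != score
        if is_last_of_tie:
            points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:]))

scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
perfect = roc_points(scores, [1, 1, 1, 0, 0, 0])
print(perfect)
print(auc(perfect))
```

For the perfectly ranked toy data, the points climb the Y-axis to (0, 1) before moving right along the top of the plot, which matches the shape described in the card.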

11
Q

Ensemble learning refers to the act of combining several predictive models to provide the best possible prediction. True or False?

A

True

12
Q

Binning treats a dataset as if it were a population, rather than a sample. True or False?

A

False (it's bagging, not binning!)
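
Illustration (not from the source): a minimal sketch of bagging, assuming a simple least-squares line as the base model. Each model is fitted to a bootstrap resample drawn with replacement from the dataset itself, as if the dataset were the population, and the predictions are averaged. All names and the toy data are hypothetical.

```python
import random

def fit_line(points):
    """Least-squares line through (x, y) points; returns (intercept, slope)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:  # degenerate resample: every x identical
        return mean_y, 0.0
    slope = sum((x - mean_x) * (y - mean_y) for x, y in points) / sxx
    return mean_y - slope * mean_x, slope

def bagged_predict(data, x_new, n_models=50, seed=0):
    """Average the predictions of models fitted to bootstrap resamples of data."""
    rng = random.Random(seed)
    predictions = []
    for _ in range(n_models):
        # Bootstrap: draw len(data) cases with replacement from the dataset itself.
        resample = [rng.choice(data) for _ in data]
        intercept, slope = fit_line(resample)
        predictions.append(intercept + slope * x_new)
    return sum(predictions) / n_models

# Toy data on the exact line y = 2x + 1.
data = [(x, 2 * x + 1) for x in range(20)]
print(bagged_predict(data, x_new=10))
```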

13
Q

Averaging several generated tree models to obtain the best prediction refers to random forests. True or False?

A

True

14
Q

Are large datasets enough to allow for a comprehensive/exhaustive search for structure?

A

No… “no big data is big enough”
