8.4 Predictive Analytics Data Flow Flashcards

Question 1

Q

What is the first step in the data flow for predictive analytics?

Answer

A

Load the data.

Question 2

Q

After loading the data, what should you do next?

Answer

A

Split the data into training and testing sets.
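
As a minimal sketch of the split step, the snippet below shuffles a small stand-in dataset and holds out a fraction for testing. The helper name `split_train_test`, the 20% test fraction, and the seed are illustrative assumptions, not part of the course material.

```python
import random

def split_train_test(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and return (train, test) lists.

    Hypothetical helper for illustration; libraries such as scikit-learn
    provide their own splitting utilities.
    """
    shuffled = list(rows)                   # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)   # seeded shuffle for repeatability
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

rows = list(range(10))  # stand-in for the loaded records
train, test = split_train_test(rows, test_fraction=0.2, seed=42)
print(len(train), len(test))  # 8 2
```

Shuffling before splitting keeps the test set from being just the last rows of the file, which matters when the data is stored in some sorted order.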

Question 3

Q

Why is some data set aside and not used in training or tuning?

Answer

A

It’s held out as a validation set: because the model never trains on it, it gives an unbiased check of performance when comparing hyperparameter settings.

Question 4

Q

When should you train the model?

Answer

A

After splitting the data and setting aside validation data.

Question 5

Q

What are the three main types of predictive models to choose from?

Answer

A

Regression, Classification, and Clustering.

Question 6

Q

What do you evaluate your model on first?

Answer

A

The training data.

Question 7

Q

What is hyperparameter tuning?

Answer

A

Adjusting settings like number of neighbors in KNN or neurons/layers in ANN to optimize performance.
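
One way to picture tuning is to try several values of a hyperparameter and keep the one that scores best on held-out validation data. The sketch below does this for the number of neighbors in a tiny one-feature KNN classifier. The data, the candidate values of k, and the helper names `knn_predict` and `accuracy` are all made up for illustration.

```python
from collections import Counter

def knn_predict(train, point, k):
    """Return the majority label among the k training rows nearest to point."""
    nearest = sorted(train, key=lambda row: abs(row[0] - point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def accuracy(train, data, k):
    """Fraction of rows in data that KNN (fit on train) labels correctly."""
    correct = sum(knn_predict(train, x, k) == y for x, y in data)
    return correct / len(data)

# Made-up (feature, label) rows; the row (30, "b") acts as a noisy label.
train = [(10, "a"), (20, "a"), (30, "b"), (40, "a"), (50, "a"),
         (100, "b"), (110, "b"), (120, "b"), (130, "b")]
validation = [(28, "a"), (33, "a"), (112, "b"), (104, "b")]

# Try each candidate k and keep the one with the best validation accuracy.
scores = {k: accuracy(train, validation, k) for k in (1, 3, 5)}
best_k = max(scores, key=scores.get)
print(scores)   # {1: 0.5, 3: 1.0, 5: 1.0}
print(best_k)   # 3
```

With k = 1 the single noisy training row decides the prediction for nearby points, so a larger k does better here. On ties, `max` keeps the first best candidate in the order tried.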

Question 8

Q

What are examples of hyperparameters in different models?

Answer

A

For example, the number of neighbors (k) in KNN, and the number of neurons or layers in an ANN.

Question 9

Q

After tuning, what data is used to evaluate the model?

Answer

A

The testing data.

Question 10

Q

Why evaluate using the test data more than once?

Answer

A

To ensure consistent and reliable performance across evaluations.

Question 11

Q

What is the next step after evaluating the model?

Answer

A

Consider whether all errors are equal (evaluate error impact and cost).
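
To illustrate why errors may not be equal, the sketch below compares two sets of predictions that have the same accuracy but make different kinds of mistakes. The labels, the predictions, and the costs (a false negative counted as 10 times worse than a false positive) are invented for the example; real costs depend on the application.

```python
def total_error_cost(actual, predicted, cost_false_negative, cost_false_positive):
    """Sum the cost of all errors, weighting each error type separately."""
    pairs = list(zip(actual, predicted))
    false_negatives = sum(a == 1 and p == 0 for a, p in pairs)  # missed positives
    false_positives = sum(a == 0 and p == 1 for a, p in pairs)  # false alarms
    return (false_negatives * cost_false_negative
            + false_positives * cost_false_positive)

actual  = [1, 1, 0, 0, 1, 0]
model_a = [1, 0, 0, 0, 1, 0]  # one false negative
model_b = [1, 1, 1, 0, 1, 0]  # one false positive

# Both models get 5 of 6 right, but the assumed costs separate them.
cost_a = total_error_cost(actual, model_a, cost_false_negative=10, cost_false_positive=1)
cost_b = total_error_cost(actual, model_b, cost_false_negative=10, cost_false_positive=1)
print(cost_a, cost_b)  # 10 1
```

Under these assumed costs, model_b would be preferred even though accuracy alone cannot tell the two apart.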

(11 cards)