7 Flashcards
If you have trained 5 different models that each achieve 95% precision, how can you improve on them?
Combine them into a voting ensemble: if the models are diverse (different algorithms or different training data), their errors tend to be uncorrelated, so the ensemble often outperforms any single model.
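A minimal sketch of a voting ensemble, assuming scikit-learn is available; the five classifiers here are just stand-ins for any five diverse, well-performing models.

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Five diverse models combined by majority vote.
voting_clf = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(random_state=42)),
        ("rf", RandomForestClassifier(random_state=42)),
        ("knn", KNeighborsClassifier()),
        ("nb", GaussianNB()),
        ("dt", DecisionTreeClassifier(random_state=42)),
    ],
    voting="hard",  # each model casts one vote per instance
)
voting_clf.fit(X_train, y_train)
print(voting_clf.score(X_test, y_test))
```

The ensemble's test accuracy is typically at least as good as its best individual member, provided the members make different kinds of errors.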
What's the difference between hard and soft voting classifiers?
Hard voting counts each classifier's predicted class and picks the most common one.
Soft voting averages the predicted class probabilities and picks the class with the highest average; it usually performs better, but it requires every classifier to estimate probabilities.
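A toy illustration (NumPy only, made-up probabilities) of how the two schemes can disagree: two classifiers barely favor class 1, one very confidently favors class 0.

```python
import numpy as np

# Each row: one classifier's [P(class 0), P(class 1)] for a single instance.
probas = np.array([
    [0.45, 0.55],   # votes for class 1 (barely)
    [0.40, 0.60],   # votes for class 1 (barely)
    [0.95, 0.05],   # votes for class 0 (very confidently)
])

# Hard voting: each classifier casts one vote for its most likely class.
votes = probas.argmax(axis=1)             # [1, 1, 0]
hard_pred = np.bincount(votes).argmax()   # class 1 wins the count 2-1

# Soft voting: average the probabilities, then take the argmax.
avg = probas.mean(axis=0)                 # [0.60, 0.40]
soft_pred = avg.argmax()                  # class 0 wins on confidence

print(hard_pred, soft_pred)  # → 1 0
```

Soft voting gives well-calibrated, confident classifiers more weight, which is why it tends to score higher in practice.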
Can you speed up bagging, pasting, boosting, random forests, or stacking by distributing them across multiple servers?
Bagging, pasting, and random forests can, because every predictor trains independently. Stacking can be partially distributed: the predictors within one layer train in parallel, but each layer must wait for the previous one. Boosting cannot, since each predictor depends on the one before it.
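Because each bagging predictor trains independently, parallelizing is trivial. A sketch with scikit-learn (assumed available): on one machine, `n_jobs` spreads training across CPU cores, and the same independence is what lets other frameworks distribute the work across servers.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, random_state=42)

bag_clf = BaggingClassifier(
    DecisionTreeClassifier(),
    n_estimators=100,
    n_jobs=-1,        # train the 100 trees in parallel on all available cores
    random_state=42,
)
bag_clf.fit(X, y)
```

A boosting ensemble offers no such parameter for training the sequence itself, since tree N+1 needs the errors of tree N before it can start.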
What's the benefit of out-of-bag evaluation?
Each predictor is evaluated on the training instances it never saw during training (the out-of-bag instances), so no separate validation set is needed and more data is available for training.
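A short sketch with scikit-learn (assumed available): with bootstrap sampling each predictor sees roughly 63% of the training set, so the leftover instances give a free validation score via `oob_score_`.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=42)

bag_clf = BaggingClassifier(
    DecisionTreeClassifier(),
    n_estimators=100,
    oob_score=True,   # evaluate each tree on its out-of-bag instances
    random_state=42,
)
bag_clf.fit(X, y)
print(bag_clf.oob_score_)  # estimate of test accuracy, no held-out set needed
```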
Why are Extra-Trees more random than random forests, how does that help, and are they faster?
In addition to random feature subsets at each split, they use random split thresholds instead of searching for the best ones.
The added randomness acts as regularization (trading a bit more bias for lower variance), and skipping the threshold search makes training faster.
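A comparison sketch, assuming scikit-learn: the two classes share the same API, and the only behavioral difference is that `ExtraTreesClassifier` draws split thresholds at random rather than optimizing them.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier

X, y = make_classification(n_samples=500, random_state=42)

rf = RandomForestClassifier(n_estimators=100, random_state=42).fit(X, y)
xt = ExtraTreesClassifier(n_estimators=100, random_state=42).fit(X, y)

# Same interface, drop-in replacement; which generalizes better is
# data-dependent, so it is worth trying both.
print(rf.score(X, y), xt.score(X, y))
```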
How can you make AdaBoost stop underfitting?
Increase the number of estimators, reduce the regularization hyperparameters of the base estimator, or increase the learning rate.
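The same three knobs, sketched with scikit-learn (assumed available); the specific values below are illustrative, not tuned.

```python
from sklearn.datasets import make_moons
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)

ada_clf = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=2),  # deeper than a stump: less regularized base
    n_estimators=200,                     # more estimators
    learning_rate=1.0,                    # raise this if still underfitting
    random_state=42,
)
ada_clf.fit(X, y)
```

All three changes increase the ensemble's capacity, which is exactly what an underfitting model needs (and what an overfitting model should avoid).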
How can you stop gradient boosting from overfitting, and what can be done with the learning rate?
Decrease the learning rate (shrinkage), so each tree contributes less.
Use early stopping to find the optimal number of trees.
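Both ideas in one sketch, assuming scikit-learn: a lower learning rate shrinks each tree's contribution (more trees needed, but better generalization), and `n_iter_no_change` triggers early stopping once a held-out validation fraction stops improving.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=500, noise=10.0, random_state=42)

gbrt = GradientBoostingRegressor(
    n_estimators=500,        # upper bound on the number of trees
    learning_rate=0.05,      # shrinkage: smaller = stronger regularization
    validation_fraction=0.1, # held-out data used to detect overfitting
    n_iter_no_change=10,     # stop after 10 rounds without improvement
    random_state=42,
)
gbrt.fit(X, y)
print(gbrt.n_estimators_)    # trees actually trained, often well under 500
```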