B10 Improving Performance Flashcards

Exam Prep

1
Q

The process of adjusting a model’s parameters to identify the best fit is called _____________.

A

Parameter tuning

2
Q

Automated parameter tuning requires you to consider:

A
1. What type of machine learning model (and specific implementation) should be trained on the data?
2. Which model parameters can be adjusted, and how extensively should they be tuned to find the optimal settings?
3. What criteria should be used to evaluate the models to find the best candidate?
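Note: the card names no specific tool, so the sketch below assumes Python with scikit-learn's GridSearchCV; the model, the parameter grid, and the scoring criterion map onto the three considerations above.

```python
# A minimal automated-tuning sketch, assuming scikit-learn (not named by the card).
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# 1. Which model (and implementation) to train on the data.
model = DecisionTreeClassifier(random_state=0)

# 2. Which parameters to adjust, and how extensively to search them.
param_grid = {"max_depth": [2, 4, 6, None],
              "min_samples_split": [2, 5, 10]}

# 3. Which criterion to use to pick the best candidate model.
search = GridSearchCV(model, param_grid, scoring="accuracy", cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```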
3
Q

The technique of combining and
managing the predictions of multiple
models is known as ____________.

A

meta-learning

4
Q

_______ and ________ are statistics that evaluate the performance of classification models, while _______ or ________ are used for numeric models.

A

Accuracy; Kappa

R-squared; RMSE
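Note: a short sketch of the four statistics, assuming Python with scikit-learn (kappa is exposed as cohen_kappa_score; the example labels and values are made up).

```python
# Classification metrics (accuracy, kappa) vs. numeric metrics (R-squared, RMSE).
from sklearn.metrics import (accuracy_score, cohen_kappa_score,
                             r2_score, mean_squared_error)

# Classification model: compare predicted class labels to the true labels.
y_true_cls = [1, 0, 1, 1, 0]
y_pred_cls = [1, 0, 0, 1, 0]
print(accuracy_score(y_true_cls, y_pred_cls))     # proportion classified correctly
print(cohen_kappa_score(y_true_cls, y_pred_cls))  # agreement beyond chance

# Numeric model: compare predicted values to the true values.
y_true_num = [3.0, 5.0, 2.5, 7.0]
y_pred_num = [2.8, 5.4, 2.9, 6.6]
print(r2_score(y_true_num, y_pred_num))                    # R-squared
print(mean_squared_error(y_true_num, y_pred_num) ** 0.5)   # RMSE
```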

5
Q

Cost-sensitive measures such as _______, _______, and ___________ can also be used to evaluate performance.

A

sensitivity, specificity, AUC
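Note: a sketch of the three measures, assuming scikit-learn for recall and AUC; specificity is derived from the confusion matrix because there is no direct helper (the probabilities below are made up).

```python
# Sensitivity (true positive rate), specificity (true negative rate), and AUC.
from sklearn.metrics import recall_score, roc_auc_score, confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.7, 0.6, 0.4, 0.1, 0.3, 0.8]   # predicted P(class = 1)
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]

sensitivity = recall_score(y_true, y_pred)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
specificity = tn / (tn + fp)
auc = roc_auc_score(y_true, y_prob)                  # area under the ROC curve
print(sensitivity, specificity, auc)
```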

6
Q

The meta-learning approach that
utilizes the principle of creating a varied
team of experts is known as an
_______.

A

ensemble

7
Q

The _________ dictates how much of the training data each model receives.

A

allocation function

8
Q

The __________ governs how disagreements among the predictions are reconciled.

A

combination function
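Note: one common combination function for classifiers is a majority vote across diverse base models; the sketch below assumes scikit-learn's VotingClassifier (the card itself names no library).

```python
# Majority voting as a concrete combination function over three different learners.
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Disagreements among the three base models are reconciled by majority vote.
ensemble = VotingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)),
                ("nb", GaussianNB()),
                ("dt", DecisionTreeClassifier(random_state=0))],
    voting="hard")
ensemble.fit(X, y)
print(ensemble.predict(X[:5]))
```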

9
Q

Some ensembles utilize another model to learn a combination function from various combinations of predictions. This is known as _______.

A

stacking
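Note: a minimal stacking sketch, assuming scikit-learn's StackingClassifier; a logistic regression "meta-model" learns the combination function from the base models' predictions.

```python
# Stacking: the final_estimator is trained on the base models' predictions.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

stack = StackingClassifier(
    estimators=[("dt", DecisionTreeClassifier(random_state=0)),
                ("nb", GaussianNB())],
    final_estimator=LogisticRegression(max_iter=1000))  # learns how to combine
stack.fit(X, y)
print(stack.score(X, y))
```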

10
Q

The two main families of ensemble methods are:

1.
2.

A

Averaging methods

Boosting methods

11
Q

Ensemble methods provide a number of performance advantages over single models:
- ___________ to future problems.
- Improved performance on _____ or _______ datasets.
- The ability to synthesize data from distinct domains.
- A more nuanced understanding of difficult learning tasks.

A

Better generalizability

massive or minuscule

12
Q

Independently built models with their predictions averaged or combined by a voting scheme. They attempt to reduce the _______ of a ________. Examples include _________ and _________.

A

variance
single base estimator
Bagging methods
Random Forest

13
Q

___________ or __________ is a
technique that generates a number of
training datasets by __________
sampling the original training data.

A

Bootstrap Aggregating
Bagging
Bootstrap
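Note: a tiny NumPy sketch (assumed, not named by the card) of bootstrap sampling: each generated dataset is drawn with replacement from the original data and has the same number of rows.

```python
# Bootstrap sampling: draw with replacement to create several training "bags".
import numpy as np

rng = np.random.default_rng(0)
X = np.arange(10)   # stand-in for 10 training examples

bags = [rng.choice(X, size=len(X), replace=True) for _ in range(3)]
for i, bag in enumerate(bags):
    print(f"bootstrap sample {i}: {bag}")
```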

14
Q
In the Bagging process:
1. The training datasets are used to generate a set of models using a __________.
2. The models' predictions are combined using _______ (for classification) or _______ (for numeric prediction).
A

single learner
voting
averaging
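Note: the two steps above, sketched with scikit-learn's BaggingClassifier (an assumed library choice): many copies of a single learner are fit on bootstrap samples, and their class predictions are combined by voting.

```python
# Bagging: one base learner, many bootstrap datasets, predictions combined by vote.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

bagged = BaggingClassifier(
    DecisionTreeClassifier(),   # the single base learner
    n_estimators=25,            # 25 bootstrap datasets -> 25 models
    random_state=0)
bagged.fit(X, y)
print(bagged.predict(X[:5]))    # class chosen by majority vote
```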

15
Q
The Random Forest (or Decision Tree Forest) learner focuses only on ensembles of decision trees. It combines the base principles of _______ with ________ to add additional diversity to decision tree models.
A

bagging

random feature selection
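Note: a random forest sketch, assuming scikit-learn: bagged decision trees plus random feature selection at each split (controlled here by max_features).

```python
# Random forest = bagging of trees + a random subset of features at each split.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)

forest = RandomForestClassifier(
    n_estimators=100,        # number of bagged trees
    max_features="sqrt",     # random feature subset considered at each split
    random_state=0)
forest.fit(X, y)
print(forest.feature_importances_[:5])   # relative importance of the first features
```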

16
Q

Strengths of Random Forest?

A

- Performs well on most problems.
- Handles noisy or missing data, as well as categorical or continuous features.
- Selects only the most important features.
- Works for data with an extremely large number of features.

17
Q

Weaknesses of Random Forest?

A
- Unlike a decision tree, the model is not easily interpretable.
- May require some work to tune the model to the data.
- Increased computational complexity.
18
Q

Sequentially built models which are combined to produce a powerful ensemble are referred to as _________.

A

Boosting Methods

19
Q

Boosting methods attempt to reduce the _____ of the
________. Examples include AdaBoost and
Gradient Tree Boosting.

A

bias

combined estimator

20
Q
Boosting is a technique that sequentially boosts the performance of weak learners in order to construct a ______ classifier as a linear combination of simple ____ classifiers.
A

strong

weak

21
Q
At each iteration of the Boosting process:
1. The resampled datasets are constructed specifically to generate _________ learners.
2. Each learner's vote is ___________ on its past performance.
A

complementary

weighted based

22
Q

The ______________ learner works by sequentially
adding weak models which are trained using weighted
training data.

Each model is assigned a stage value which corresponds to how _______ it is against the training data.

A

Adaptive Boosting

accurate
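Note: an AdaBoost sketch, assuming scikit-learn: decision stumps are added one at a time, each trained on reweighted data that emphasizes earlier mistakes, and each receives a weight reflecting its accuracy.

```python
# Adaptive Boosting with decision stumps as the weak learners.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

ada = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=1),  # a weak learner (decision stump)
    n_estimators=50,
    random_state=0)
ada.fit(X, y)
print(ada.estimator_weights_[:5])   # per-model weights based on training accuracy
```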

23
Q

AdaBoost Advantages?

A
- Boosting is a relatively simple ensemble method to implement.
- Requires less parameter tuning compared to other ensemble methods.
- Can be used with many different classifiers.
24
Q

AdaBoost Weaknesses?

A
- High tendency to overfit with many weak learners.
- Rather slow training time.
- Sensitive to noisy data and outliers.
25
Q

The _____________ learner is an
implementation of _________ decision trees
designed specifically for speed and performance.

A

Extreme Gradient Boosting

gradient boosted
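Note: a sketch using the xgboost package (assumed to be installed; scikit-learn's GradientBoostingClassifier is a slower analogue of the same idea).

```python
# Extreme Gradient Boosting via the xgboost scikit-learn-style wrapper.
from sklearn.datasets import load_breast_cancer
from xgboost import XGBClassifier

X, y = load_breast_cancer(return_X_y=True)

xgb = XGBClassifier(
    n_estimators=200,     # number of boosted trees
    max_depth=3,          # depth of each tree
    learning_rate=0.1)    # shrinkage applied to each tree's contribution
xgb.fit(X, y)
print(xgb.score(X, y))
```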

26
Q

With gradient boosting, instead of assigning weights to
models at each iteration, subsequent models attempt to
predict the _______ of prior models using a gradient
descent algorithm to __________.

A

residuals

minimize loss
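Note: a hand-rolled sketch of the residual-fitting idea for squared-error loss, where the negative gradient of the loss is exactly the residual (y minus the current prediction); the data and learner choice are illustrative assumptions.

```python
# Each new tree is fit to the residuals of the current ensemble, then added
# with a small learning rate, stepping the prediction toward lower loss.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 6, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

learning_rate = 0.1
prediction = np.full_like(y, y.mean())   # start from a constant model
for _ in range(100):
    residual = y - prediction                        # what remains unexplained
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residual)
    prediction += learning_rate * tree.predict(X)    # step toward lower loss

print(np.mean((y - prediction) ** 2))   # training MSE shrinks as trees are added
```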