B10 Improving Performance Flashcards

Exam Prep

1
Q

The process of adjusting a model’s parameters to identify the best fit is called _____________.

A

Parameter tuning

2
Q

Automated parameter tuning requires you to consider:

A
1. What type of machine learning model (and specific implementation) should be trained on the data?
2. Which model parameters can be adjusted, and how extensively should they be tuned to find the optimal settings?
3. What criteria should be used to evaluate the models to find the best candidate?
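Note: the card names no specific tool, so the sketch below assumes Python with scikit-learn's GridSearchCV; the model, the parameter grid, and the scoring criterion map onto the three considerations above.

```python
# A minimal automated-tuning sketch, assuming scikit-learn (not named by the card).
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# 1. Which model (and implementation) to train on the data.
model = DecisionTreeClassifier(random_state=0)

# 2. Which parameters to adjust, and how extensively to search them.
param_grid = {"max_depth": [2, 4, 6, None],
              "min_samples_split": [2, 5, 10]}

# 3. Which criterion to use to pick the best candidate model.
search = GridSearchCV(model, param_grid, scoring="accuracy", cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```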
3
Q

The technique of combining and
managing the predictions of multiple
models is known as ____________.

A

meta-learning

4
Q

_______ and ________ are statistics that evaluate the performance of classification models, while _______ or ________ are used for numeric models.

A

Accuracy; Kappa

R-squared; RMSE
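Note: a short sketch of the four statistics, assuming Python with scikit-learn (kappa is exposed as cohen_kappa_score; the example labels and values are made up).

```python
# Classification metrics (accuracy, kappa) vs. numeric metrics (R-squared, RMSE).
from sklearn.metrics import (accuracy_score, cohen_kappa_score,
                             r2_score, mean_squared_error)

# Classification model: compare predicted class labels to the true labels.
y_true_cls = [1, 0, 1, 1, 0]
y_pred_cls = [1, 0, 0, 1, 0]
print(accuracy_score(y_true_cls, y_pred_cls))     # proportion classified correctly
print(cohen_kappa_score(y_true_cls, y_pred_cls))  # agreement beyond chance

# Numeric model: compare predicted values to the true values.
y_true_num = [3.0, 5.0, 2.5, 7.0]
y_pred_num = [2.8, 5.4, 2.9, 6.6]
print(r2_score(y_true_num, y_pred_num))                    # R-squared
print(mean_squared_error(y_true_num, y_pred_num) ** 0.5)   # RMSE
```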

5
Q

Cost-sensitive measures such as _______, _______, and ___________ can also be used to evaluate performance.

A

sensitivity, specificity, AUC
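Note: a sketch of the three measures, assuming scikit-learn for recall and AUC; specificity is derived from the confusion matrix because there is no direct helper (the probabilities below are made up).

```python
# Sensitivity (true positive rate), specificity (true negative rate), and AUC.
from sklearn.metrics import recall_score, roc_auc_score, confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.7, 0.6, 0.4, 0.1, 0.3, 0.8]   # predicted P(class = 1)
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]

sensitivity = recall_score(y_true, y_pred)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
specificity = tn / (tn + fp)
auc = roc_auc_score(y_true, y_prob)                  # area under the ROC curve
print(sensitivity, specificity, auc)
```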

6
Q

The meta-learning approach that
utilizes the principle of creating a varied
team of experts is known as an
_______.

A

ensemble

7
Q

The _________ dictates how much of the training data each model receives.

A

allocation function

8
Q

The __________ governs how disagreements among the predictions are reconciled.

A

combination function
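Note: one common combination function for classifiers is a majority vote across diverse base models; the sketch below assumes scikit-learn's VotingClassifier (the card itself names no library).

```python
# Majority voting as a concrete combination function over three different learners.
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Disagreements among the three base models are reconciled by majority vote.
ensemble = VotingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)),
                ("nb", GaussianNB()),
                ("dt", DecisionTreeClassifier(random_state=0))],
    voting="hard")
ensemble.fit(X, y)
print(ensemble.predict(X[:5]))
```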

9
Q

Some ensembles utilize another model to learn a combination function from various combinations of predictions. This is known as _______.

A

stacking
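Note: a minimal stacking sketch, assuming scikit-learn's StackingClassifier; a logistic regression "meta-model" learns the combination function from the base models' predictions.

```python
# Stacking: the final_estimator is trained on the base models' predictions.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

stack = StackingClassifier(
    estimators=[("dt", DecisionTreeClassifier(random_state=0)),
                ("nb", GaussianNB())],
    final_estimator=LogisticRegression(max_iter=1000))  # learns how to combine
stack.fit(X, y)
print(stack.score(X, y))
```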

10
Q

The two main families of ensemble methods are:

1.
2.

A

Averaging methods

Boosting methods

11
Q

Ensemble methods provide a number of performance advantages over single models:
- ___________ to future problems.
- Improved performance on _____ or _______ datasets.
- The ability to synthesize data from distinct domains.
- A more nuanced understanding of difficult learning tasks.

A

Better generalizability

massive or minuscule

12
Q

Independently built models with their predictions averaged or combined by a voting scheme. They attempt to reduce the _______ of a ________. Examples include _________ and _________.

A

variance
single base estimator
Bagging methods
Random Forest

13
Q

___________ or __________ is a
technique that generates a number of
training datasets by __________
sampling the original training data.

A

Bootstrap Aggregating
Bagging
Bootstrap
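Note: a tiny NumPy sketch (assumed, not named by the card) of bootstrap sampling: each generated dataset is drawn with replacement from the original data and has the same number of rows.

```python
# Bootstrap sampling: draw with replacement to create several training "bags".
import numpy as np

rng = np.random.default_rng(0)
X = np.arange(10)   # stand-in for 10 training examples

bags = [rng.choice(X, size=len(X), replace=True) for _ in range(3)]
for i, bag in enumerate(bags):
    print(f"bootstrap sample {i}: {bag}")
```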

14
Q
In the Bagging process:
1. The training datasets are used to generate a set of models using a __________.
2. The models' predictions are combined using _______ (for classification) or _______ (for numeric prediction).
A

single learner
voting
averaging
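Note: the two steps above, sketched with scikit-learn's BaggingClassifier (an assumed library choice): many copies of a single learner are fit on bootstrap samples, and their class predictions are combined by voting.

```python
# Bagging: one base learner, many bootstrap datasets, predictions combined by vote.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

bagged = BaggingClassifier(
    DecisionTreeClassifier(),   # the single base learner
    n_estimators=25,            # 25 bootstrap datasets -> 25 models
    random_state=0)
bagged.fit(X, y)
print(bagged.predict(X[:5]))    # class chosen by majority vote
```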

15
Q
The Random Forest (or Decision Tree Forest) learner focuses only on ensembles of decision trees. It combines the base principles of _______ with ________ to add additional diversity to decision tree models.
A

bagging

random feature selection
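Note: a random forest sketch, assuming scikit-learn: bagged decision trees plus random feature selection at each split (controlled here by max_features).

```python
# Random forest = bagging of trees + a random subset of features at each split.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)

forest = RandomForestClassifier(
    n_estimators=100,        # number of bagged trees
    max_features="sqrt",     # random feature subset considered at each split
    random_state=0)
forest.fit(X, y)
print(forest.feature_importances_[:5])   # relative importance of the first features
```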

16
Q

Strengths of Random Forest?

A

- Performs well on most problems.
- Handles noisy or missing data, as well as categorical or continuous features.
- Selects only the most important features.
- Works for data with an extremely large number of features.

17
Q

Weaknesses of Random Forest?

A
- Unlike a decision tree, the model is not easily interpretable.
- May require some work to tune the model to the data.
- Increased computational complexity.
18
Q

Sequentially built models which are combined to produce a powerful ensemble are referred to as _________.

A

Boosting Methods

19
Q

Boosting methods attempt to reduce the _____ of the
________. Examples include AdaBoost and
Gradient Tree Boosting.

A

bias

combined estimator

20
Q
Boosting is a technique that sequentially boosts the performance of weak learners in order to construct a ______ classifier as a linear combination of simple ____ classifiers.
A

strong

weak

21
Q
At each iteration of the Boosting process:
1. The resampled datasets are constructed specifically to generate _________ learners.
2. Each learner's vote is ___________ on its past performance.
A

complementary

weighted based

22
Q

The ______________ learner works by sequentially
adding weak models which are trained using weighted
training data.

Each model is assigned a stage value which corresponds to how _______ it is against the training data.

A

Adaptive Boosting

accurate
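Note: an AdaBoost sketch, assuming scikit-learn: decision stumps are added one at a time, each trained on reweighted data that emphasizes earlier mistakes, and each receives a weight reflecting its accuracy.

```python
# Adaptive Boosting with decision stumps as the weak learners.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

ada = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=1),  # a weak learner (decision stump)
    n_estimators=50,
    random_state=0)
ada.fit(X, y)
print(ada.estimator_weights_[:5])   # per-model weights based on training accuracy
```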

23
Q

AdaBoost Advantages?

A
- Boosting is a relatively simple ensemble method to implement.
- Requires less parameter tuning compared to other ensemble methods.
- Can be used with many different classifiers.
24
Q

AdaBoost Weaknesses?

A
- High tendency to overfit with many weak learners.
- Rather slow training time.
- Sensitive to noisy data and outliers.
25
Q

The _____________ learner is an
implementation of _________ decision trees
designed specifically for speed and performance.

A

Extreme Gradient Boosting

gradient boosted
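Note: a sketch using the xgboost package (assumed to be installed; scikit-learn's GradientBoostingClassifier is a slower analogue of the same idea).

```python
# Extreme Gradient Boosting via the xgboost scikit-learn-style wrapper.
from sklearn.datasets import load_breast_cancer
from xgboost import XGBClassifier

X, y = load_breast_cancer(return_X_y=True)

xgb = XGBClassifier(
    n_estimators=200,     # number of boosted trees
    max_depth=3,          # depth of each tree
    learning_rate=0.1)    # shrinkage applied to each tree's contribution
xgb.fit(X, y)
print(xgb.score(X, y))
```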

26
Q

With gradient boosting, instead of assigning weights to
models at each iteration, subsequent models attempt to
predict the _______ of prior models using a gradient
descent algorithm to __________.

A

residuals

minimize loss
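Note: a hand-rolled sketch of the residual-fitting idea for squared-error loss, where the negative gradient of the loss is exactly the residual (y minus the current prediction); the data and learner choice are illustrative assumptions.

```python
# Each new tree is fit to the residuals of the current ensemble, then added
# with a small learning rate, stepping the prediction toward lower loss.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 6, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

learning_rate = 0.1
prediction = np.full_like(y, y.mean())   # start from a constant model
for _ in range(100):
    residual = y - prediction                        # what remains unexplained
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residual)
    prediction += learning_rate * tree.predict(X)    # step toward lower loss

print(np.mean((y - prediction) ** 2))   # training MSE shrinks as trees are added
```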