B10 Improving Performance Flashcards
Exam Prep
The process of adjusting a model’s parameters to identify the best fit is called _____________.
Parameter tuning
Automated parameter tuning requires you to consider:
- What type of machine learning model (and specific implementation) should be trained on the data?
- Which model parameters can be adjusted, and how extensively should they be tuned to find the optimal settings?
- What criteria should be used to evaluate the models to find the best candidate?
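A minimal sketch of automated parameter tuning using scikit-learn's GridSearchCV; the model choice, parameter grid, and scoring criterion below are illustrative assumptions, not part of the card:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# The three tuning decisions: model type, parameters to adjust, evaluation criterion.
param_grid = {"max_depth": [2, 4, 6], "min_samples_split": [2, 10]}
search = GridSearchCV(DecisionTreeClassifier(random_state=0),
                      param_grid, scoring="accuracy", cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```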
The technique of combining and managing the predictions of multiple models is known as ____________.
meta-learning
_______ and ________ are statistics that evaluate the performance of classification models, while _______ or ________ are used for numeric models.
Accuracy; Kappa
R-squared; RMSE
Cost-sensitive measures such as _______, _______, and ___________ can also be used to evaluate performance.
sensitivity, specificity, AUC
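A sketch of computing these statistics in Python with scikit-learn; the toy labels, predictions, and scores below are made up for illustration:

```python
import numpy as np
from sklearn.metrics import (accuracy_score, cohen_kappa_score, recall_score,
                             roc_auc_score, r2_score, mean_squared_error)

# Toy classification results (1 = positive class).
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])
y_prob = np.array([0.9, 0.2, 0.4, 0.8, 0.1, 0.6, 0.7, 0.3])

print(accuracy_score(y_true, y_pred))             # accuracy
print(cohen_kappa_score(y_true, y_pred))          # kappa
print(recall_score(y_true, y_pred))               # sensitivity (recall on positives)
print(recall_score(y_true, y_pred, pos_label=0))  # specificity (recall on negatives)
print(roc_auc_score(y_true, y_prob))              # AUC

# Toy numeric predictions.
y_num_true = np.array([3.0, 5.0, 2.5, 7.0])
y_num_pred = np.array([2.8, 5.3, 2.9, 6.4])
print(r2_score(y_num_true, y_num_pred))                     # R-squared
print(np.sqrt(mean_squared_error(y_num_true, y_num_pred)))  # RMSE
```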
The meta-learning approach that utilizes the principle of creating a varied team of experts is known as an _______.
ensemble
The _________ dictates how much of the training data each model receives.
allocation function
The __________ governs how disagreements among the predictions are reconciled.
combination function
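One common combination function is a simple vote across heterogeneous models. A sketch using scikit-learn's VotingClassifier; the three base models chosen here are assumptions for illustration:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Majority ("hard") voting reconciles disagreements among the three base models.
ensemble = VotingClassifier(estimators=[
    ("lr", LogisticRegression(max_iter=1000)),
    ("nb", GaussianNB()),
    ("dt", DecisionTreeClassifier(random_state=0)),
], voting="hard")
ensemble.fit(X, y)
print(ensemble.predict(X[:5]))
```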
Some ensembles utilize another model to learn a combination function from various combinations of predictions. This is known as _______.
stacking
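A minimal stacking sketch, assuming scikit-learn's StackingClassifier with a logistic regression as the model that learns the combination function (the base models and dataset are illustrative):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# The final_estimator is itself a model that learns how to combine
# the base models' predictions (the stacking idea on this card).
stack = StackingClassifier(
    estimators=[("dt", DecisionTreeClassifier(random_state=0)),
                ("nb", GaussianNB())],
    final_estimator=LogisticRegression(max_iter=1000))
stack.fit(X, y)
print(stack.score(X, y))
```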
The two main families of ensemble methods are:
1. ____________
2. ____________
Averaging methods
Boosting methods
Ensemble methods provide a number of performance advantages over single models:
- ___________ to future problems.
- Improved performance on _____ or _______ datasets.
- The ability to synthesize data from distinct domains.
- A more nuanced understanding of difficult learning tasks.
Better generalizability
massive or minuscule
Independently built models with their predictions averaged or combined by a voting scheme. They attempt to reduce the _______ of a ________. Examples include _________ and _________.
variance
single base estimator
Bagging methods
Random Forest
___________ or __________ is a technique that generates a number of training datasets by __________ sampling the original training data.
Bootstrap Aggregating
Bagging
Bootstrap
In the Bagging process:
1. The training datasets are used to generate a set of models using a __________.
2. The models' predictions are combined using _______ (for classification) or _______ (for numeric prediction).
single learner
voting
averaging
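A bagging sketch in scikit-learn, assuming decision trees as the single learner and 25 bootstrap samples (both choices are illustrative):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Each of the 25 trees is trained on a bootstrap sample of the training data;
# the trees' class votes are then combined to form the ensemble prediction.
bag = BaggingClassifier(DecisionTreeClassifier(),
                        n_estimators=25, bootstrap=True, random_state=0)
bag.fit(X, y)
print(bag.score(X, y))
```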
The Random Forest (or Decision Tree Forest) learner focuses only on ensembles of decision trees. It combines the base principles of _______ with ________________ to add additional diversity to decision tree models.
bagging
random feature selection
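A minimal Random Forest sketch with scikit-learn; the dataset and hyperparameter values are assumptions:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)

# Bagging of decision trees plus random feature selection at each split:
# max_features controls how many features each split may consider.
rf = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=0)
rf.fit(X, y)
print(rf.score(X, y))
print(rf.feature_importances_[:5])  # importance scores for the first five features
```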
Strengths of Random Forest?
- Performs well on most problems.
- Handles noisy or missing data as well as categorical or continuous features.
- Selects only the most important features.
- Works for data with an extremely large number of features.
Weaknesses of Random Forest?
- Unlike a decision tree, the model is not easily interpretable.
- May require some work to tune the model to the data.
- Increased computational complexity.
Sequentially built models which are combined to produce a powerful ensemble are referred to as _________.
Boosting Methods
Boosting methods attempt to reduce the _____ of the ________. Examples include AdaBoost and Gradient Tree Boosting.
bias
combined estimator
Boosting is a technique that sequentially boosts the performance of weak learners in order to construct a ______ classifier as a linear combination of simple ____ classifiers.
strong
weak
At each iteration of the Boosting process:
1. The resampled datasets are constructed specifically to generate _____________ learners.
2. Each learner's vote is ___________ on its past performance.
complementary
weighted based
The ______________ learner works by sequentially adding weak models which are trained using weighted training data.
Each model is assigned a stage value which corresponds to how _______ it is against the training data.
Adaptive Boosting
accurate
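An AdaBoost sketch with scikit-learn, assuming decision stumps (depth-1 trees) as the weak learners; the dataset and parameter values are illustrative:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# Weak learners (decision stumps) are added sequentially; misclassified examples
# receive more weight in the next round, and each stump's vote is weighted by
# how accurate it was on the training data.
ada = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1),
                         n_estimators=50, learning_rate=1.0, random_state=0)
ada.fit(X, y)
print(ada.score(X, y))
```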
AdaBoost Advantages?
- Boosting is a relatively simple ensemble method to implement.
- Requires less parameter tuning compared to other ensemble methods.
- Can be used with many different classifiers.
AdaBoost Weaknesses?
- High tendency to overfit with many weak learners.
- Rather slow training time.
- Sensitive to noisy data and outliers.
The _____________ learner is an implementation of _________ decision trees designed specifically for speed and performance.
Extreme Gradient Boosting
gradient boosted
With gradient boosting, instead of assigning weights to models at each iteration, subsequent models attempt to predict the _______ of prior models using a gradient descent algorithm to __________.
residuals
minimize loss
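A hand-rolled sketch of this residual-fitting idea under a squared-error loss (where the negative gradient is simply the residual); libraries such as XGBoost implement the same principle far more efficiently, and the data and settings below are assumptions:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

learning_rate = 0.1
prediction = np.full_like(y, y.mean())  # start from a constant prediction
trees = []

for _ in range(50):
    residuals = y - prediction           # negative gradient of squared-error loss
    tree = DecisionTreeRegressor(max_depth=3, random_state=0)
    tree.fit(X, residuals)               # each new model predicts the prior models' residuals
    prediction = prediction + learning_rate * tree.predict(X)
    trees.append(tree)

print(np.mean((y - prediction) ** 2))    # training MSE shrinks as trees are added
```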