Path3.Mod1.d - Automated Machine Learning - Prep & Run an AutoML Experiment Flashcards

1
Q

During AutoML experimentation, scaling and normalization techniques are applied automatically (T/F)

A

True. It’s AUTO-ML. Multiple scaling and normalization techniques are applied automatically to numeric data, helping to prevent larger features from dominating training.
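As a plain-Python illustration of why scaling matters (this is not AutoML's internal code), min-max scaling rescales every feature to [0, 1] so a large-magnitude feature such as income cannot dominate a small one such as age:

```python
# Hypothetical sketch: min-max scaling puts features on a common [0, 1] scale.

def min_max_scale(values):
    """Rescale a list of numbers to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:                     # constant feature: map everything to 0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

incomes = [30_000, 60_000, 90_000]
ages = [25, 40, 55]
print(min_max_scale(incomes))   # [0.0, 0.5, 1.0]
print(min_max_scale(ages))      # [0.0, 0.5, 1.0] -- same scale as incomes now
```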

2
Q

Once experimentation completes, only the Scaling methods used are available to review in AutoML results (T/F)

A

False. You can review which scaling and normalization methods were applied during experimentation.

3
Q

AutoML performs Featurization by default, which you can disable or customize further (T/F)

A

True. Featurization is applied by default, and you can disable it or customize individual transformations.

4
Q

AutoML will not notify you if there are issues with data like missing values or class imbalance since it automatically applies all the transformations necessary to remediate those issues (T/F)

A

False. AutoML notifies you through Data Guardrails if data issues like missing values or class imbalance are detected.
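A toy sketch of the kinds of checks Data Guardrails performs. The real guardrails run inside Azure AutoML; this just mimics the two checks named above (missing values, class imbalance) in plain Python with made-up data:

```python
# Hypothetical guardrail-style checks, not Azure's actual implementation.

def check_missing(rows, column):
    """Return the fraction of rows where `column` is None."""
    missing = sum(1 for r in rows if r.get(column) is None)
    return missing / len(rows)

def check_imbalance(labels):
    """Return ratio of least-common to most-common class (1.0 = balanced)."""
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return min(counts.values()) / max(counts.values())

rows = [{"age": 25}, {"age": None}, {"age": 40}, {"age": 31}]
labels = ["spam", "ham", "ham", "ham"]
print(check_missing(rows, "age"))   # 0.25 -> a guardrail would flag missing values
print(check_imbalance(labels))      # ~0.33 -> a guardrail would flag imbalance
```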

5
Q

You can set AutoML to use Ensemble Models for training (T/F)

If Ensemble Models are enabled, AutoML will try both Voting and Stacking combinations (T/F)

A

True, if Featurization is enabled.
False. Voting is tried by default; you have to manually enable Stacking.
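A minimal sketch of what a Voting ensemble does: combine the predictions of several models by majority vote per example. The three "models" here are hypothetical stand-ins, not anything AutoML actually trains:

```python
# Hypothetical majority-vote ensemble over three stand-in classifiers.
from collections import Counter

def majority_vote(predictions_per_model):
    """Given one prediction list per model, take the most common label per example."""
    voted = []
    for example_preds in zip(*predictions_per_model):
        voted.append(Counter(example_preds).most_common(1)[0][0])
    return voted

model_a = ["cat", "dog", "dog"]
model_b = ["cat", "cat", "dog"]
model_c = ["dog", "dog", "dog"]
print(majority_vote([model_a, model_b, model_c]))  # ['cat', 'dog', 'dog']
```

Stacking differs in that a second-level model is trained on the base models' outputs, rather than a fixed vote.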

6
Q

MVImp CatE DH-CF FE

Four optional Featurization transformations you can configure for preprocessing

A
  • Missing Value Imputation (replace null values in the training set)
  • Categorical Encoding (categories to numeric indicators)
  • Dropping High-Cardinality Features (ex. ID fields)
  • Feature Engineering (ex. breaking a DateTime or TimeSpan down to its parts)
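The Feature Engineering bullet above can be sketched in plain Python: splitting a DateTime into parts the model can use as separate numeric features. The derived column names are made up for illustration, not AutoML's actual output names:

```python
# Hypothetical datetime feature expansion, mimicking AutoML-style featurization.
from datetime import datetime

def expand_datetime(ts):
    """Break one timestamp into separate numeric features."""
    return {
        "year": ts.year,
        "month": ts.month,
        "day": ts.day,
        "weekday": ts.weekday(),   # 0 = Monday
        "hour": ts.hour,
    }

print(expand_datetime(datetime(2024, 3, 15, 9, 30)))
# {'year': 2024, 'month': 3, 'day': 15, 'weekday': 4, 'hour': 9}
```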
7
Q

DT ERT GB KNN LGBM LR NB RF SGD XgB

Some Supported Classification Algorithms

A
  • Decision Tree
  • Extremely Randomized Trees
  • Gradient Boosting
  • K-Nearest Neighbors
  • LightGBM
  • Logistic Regression
  • Naive Bayes
  • Random Forest
  • Stochastic Gradient Descent
  • XGBoost
8
Q

DT EN ERT GB KNN LGBM RF SGD XgB

Some Supported Regression Algorithms

A
  • Decision Tree
  • Elastic Net
  • Extremely Randomized Trees
  • Gradient Boosting
  • K-Nearest Neighbors
  • LightGBM
  • Random Forest
  • Stochastic Gradient Descent
  • XGBoost
9
Q

DT EN ES ERT GB KNN LGBM Na RF SA SNa TNCFo

Some Supported Time Series Forecasting Algorithms

A
  • Decision Tree
  • Elastic Net
  • ExponentialSmoothing
  • Extremely Randomized Trees
  • Gradient Boosting
  • K-Nearest Neighbors
  • LightGBM
  • Naive
  • Random Forest
  • SeasonalAverage
  • SeasonalNaive
  • TCNForecaster
10
Q

Two reasons to restrict Algorithm selection

A
  • Your data isn’t particularly suited for a type of algorithm
  • Compliance with company policy restrictions on types of machine learning
11
Q

AUCW Acc NMR APSW PSW

The default Primary Metric and the four options available beyond the default

A

Default: AUCWeighted
  • Accuracy
  • NormMacroRecall
  • AveragePrecisionScoreWeighted
  • PrecisionScoreWeighted
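NormMacroRecall is macro-averaged recall rescaled so random guessing scores 0 and perfect prediction scores 1. Here is a worked sketch using the formula (macro_recall - 1/C) / (1 - 1/C) for C classes; the labels are made up for illustration:

```python
# Hypothetical worked example of NormMacroRecall on a tiny two-class problem.

def norm_macro_recall(y_true, y_pred):
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        relevant = [i for i, y in enumerate(y_true) if y == c]   # true members of class c
        hits = sum(1 for i in relevant if y_pred[i] == c)        # correctly recalled
        recalls.append(hits / len(relevant))
    macro = sum(recalls) / len(classes)
    r = 1 / len(classes)               # expected recall of random guessing
    return (macro - r) / (1 - r)

y_true = ["a", "a", "b", "b"]
y_pred = ["a", "a", "b", "a"]
print(norm_macro_recall(y_true, y_pred))  # recall a=1.0, b=0.5 -> macro 0.75 -> 0.5
```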
