Path3.Mod1.e - Automated Machine Learning - Prep & Run AutoML Experiment Code Flashcards

Question 1

Q

Order of operations between a Data Asset, an MLTable data asset and AutoML
How an MLTable data asset is created

Answer

A

Create the data asset first, then create the MLTable data asset, which includes the schema AutoML uses to read that data.
You can create an MLTable data asset when your data is stored in a folder together with an MLTable file.
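
As a minimal sketch of that folder layout, the snippet below writes a small CSV and an MLTable file side by side using only the standard library. The file name diabetes.csv, its columns, and the read_delimited settings are illustrative assumptions, not values taken from these cards.

```python
import tempfile
from pathlib import Path

# Hypothetical file name and contents, used only for illustration.
CSV_NAME = "diabetes.csv"
CSV_TEXT = "PatientID,Age,Diabetic\n1,52,1\n2,34,0\n"

# The MLTable file is YAML: it points at the data and describes how to parse it.
# The transformation settings here are assumptions for a plain comma-separated file.
MLTABLE_TEXT = f"""paths:
  - file: ./{CSV_NAME}
transformations:
  - read_delimited:
      delimiter: ','
      encoding: 'ascii'
      header: all_files_same_headers
"""


def write_mltable_folder(folder: Path) -> Path:
    """Write the CSV and the MLTable file into the same folder."""
    folder.mkdir(parents=True, exist_ok=True)
    (folder / CSV_NAME).write_text(CSV_TEXT)
    mltable_path = folder / "MLTable"  # the file is named "MLTable", with no extension
    mltable_path.write_text(MLTABLE_TEXT)
    return mltable_path


with tempfile.TemporaryDirectory() as tmp:
    mltable_file = write_mltable_folder(Path(tmp) / "input-data")
    # Both files now sit in one folder, which is the layout described above.
    folder_listing = sorted(p.name for p in mltable_file.parent.iterdir())
    print(folder_listing)
```

The folder, rather than the CSV file on its own, is what the MLTable data asset would then point to.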

Question 2

Q

How to specify the data set as input using Python SDK code

The data must be in this form and must specify a certain column…

Answer

A

You need an Input instance and to initialize it with an AssetType and the path to your data asset:

from azure.ai.ml.constants import AssetTypes
from azure.ai.ml import Input

training_data_input = Input(
  type=AssetTypes.MLTABLE,
  path="azureml:input-data-automl:1")

For ML tasks, the data must be in tabular form and include a target column.
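
A quick local check of the "tabular with a target column" requirement can be sketched with the standard library. The helper name missing_target_rows and the sample rows are hypothetical; this is a pre-flight check you might run yourself, not something the SDK provides.

```python
import csv
import io


def missing_target_rows(csv_text: str, target_column: str) -> int:
    """Return how many rows have no value in the target column.

    Raises ValueError if the column is absent from the header.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    if target_column not in (reader.fieldnames or []):
        raise ValueError(f"target column {target_column!r} not found")
    # Count rows whose target value is empty or whitespace only.
    return sum(1 for row in reader if not (row[target_column] or "").strip())


# Hypothetical sample: the third row has no label.
sample = "PatientID,Age,Diabetic\n1,52,1\n2,34,0\n3,47,\n"
print(missing_target_rows(sample, "Diabetic"))
```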

Question 3

Q

Explain what this code is doing:

from azure.ai.ml import automl

classification_job = automl.classification(
    compute="aml-cluster",
    experiment_name="auto-ml-class-dev",
    training_data=my_training_data_input,
    target_column_name="Diabetic",
    primary_metric="accuracy",
    n_cross_validations=5,
    enable_model_explainability=True
)

What my_training_data_input and primary_metric are.

Answer

A

This code uses the automl module from the Python SDK v2 to create a classification job instance. Notable points:
- Uses my_training_data_input as the training data source. It should represent an MLTable data asset from your Workspace since AutoML requires one for input.
- Sets the primary_metric to “accuracy”. This is the performance metric used to decide which trained model is the best one.
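
To make the role of the primary metric concrete, here is a small sketch of "pick the best model by one metric". The algorithm names and scores are made up for illustration, and the selection logic is a simplification rather than AutoML's own implementation.

```python
# Hypothetical trial results: algorithm name -> metric scores (invented numbers).
trial_scores = {
    "LightGBM": {"accuracy": 0.91, "AUC_weighted": 0.95},
    "LogisticRegression": {"accuracy": 0.88, "AUC_weighted": 0.96},
    "RandomForest": {"accuracy": 0.90, "AUC_weighted": 0.94},
}


def best_trial(scores: dict, primary_metric: str) -> str:
    """Return the name of the trial with the highest value for primary_metric."""
    return max(scores, key=lambda name: scores[name][primary_metric])


best_by_accuracy = best_trial(trial_scores, "accuracy")
best_by_auc = best_trial(trial_scores, "AUC_weighted")
print(best_by_accuracy, best_by_auc)
```

With these invented scores, changing the primary metric changes which trial counts as best, which is why the choice matters.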

Question 4

Q

Get a list of available primary metrics for training a classification model

Answer

A

Use the ClassificationPrimaryMetrics enum to get a list of them:

from azure.ai.ml.automl import ClassificationPrimaryMetrics
list(ClassificationPrimaryMetrics)

Question 5

Q

TM TTM MT EET

Four limits you can set once you instantiate an AutoML experiment or job

Answer

A

The four limits you’d set for the job:
* timeout_minutes - int. max minutes before the whole AutoML experiment is terminated
* trial_timeout_minutes - int. max minutes a trial can take
* max_trials - int. max number of trials or models that will be trained
* enable_early_termination - bool. end experiment if score isn’t improving over the short term
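
How the four limits interact can be sketched as a toy simulation. This is not AutoML's actual scheduling logic: the Limits class, the run_trials function, the patience parameter, and the trial data are all hypothetical, and the early-termination rule is a deliberately simple stand-in for "score isn't improving".

```python
from dataclasses import dataclass


@dataclass
class Limits:
    timeout_minutes: int            # budget for the whole experiment
    trial_timeout_minutes: int      # budget for a single trial
    max_trials: int                 # maximum number of trials to run
    enable_early_termination: bool  # stop when the score stops improving


def run_trials(trials, limits, patience=2):
    """Simulate trials given as (minutes_taken, score) pairs.

    Returns (best_score, reason_stopped). `patience` is a hypothetical knob:
    the number of consecutive non-improving trials tolerated.
    """
    elapsed = 0
    completed = 0
    best = None
    since_improvement = 0
    for minutes, score in trials:
        if completed >= limits.max_trials:
            return best, "max_trials"
        # A trial is cut off once it reaches the per-trial budget.
        minutes_used = min(minutes, limits.trial_timeout_minutes)
        if elapsed + minutes_used > limits.timeout_minutes:
            return best, "timeout_minutes"
        elapsed += minutes_used
        completed += 1
        timed_out = minutes > limits.trial_timeout_minutes
        if not timed_out and (best is None or score > best):
            best = score
            since_improvement = 0
        else:
            since_improvement += 1
        if limits.enable_early_termination and since_improvement >= patience:
            return best, "early_termination"
    return best, "all_trials_done"


# Invented trial data: each trial takes 10 minutes.
trials = [(10, 0.80), (10, 0.85), (10, 0.84), (10, 0.83), (10, 0.90)]
result_early = run_trials(trials, Limits(60, 20, 5, True))
result_full = run_trials(trials, Limits(60, 20, 5, False))
result_capped = run_trials(trials, Limits(60, 20, 2, False))
result_timeout = run_trials(trials, Limits(25, 20, 5, False))
print(result_early, result_full, result_capped, result_timeout)
```

In this toy run, early termination stops before the fifth trial and so misses its higher score, which illustrates the trade-off between saving compute and finding the best model.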

Question 6

Q

The method you call when you want to set limits on your AutoML job

Answer

A

Call the job’s set_limits method:

classification_job.set_limits(
  timeout_minutes=10,
  trial_timeout_minutes=10,
  max_trials=5,
  enable_early_termination=True)

Question 7

Q

Code to submit your AutoML Job

Answer

A

# submit the new job
returned_job = ml_client.jobs.create_or_update(classification_job)

Question 8

Q

Code to get the url to monitor your job

Answer

A

# get the studio url so you can monitor your job
aml_url = returned_job.studio_url

print("Monitor job here:", aml_url)

Question 9

Q

The method you call when you want to set optional training properties on your AutoML job

Answer

A

Set the optional training properties using the set_training method:

classification_job.set_training(
    blocked_training_algorithms=["LogisticRegression"],
    enable_onnx_compatible_models=True
)

The above code blocks LogisticRegression from being used for training models and enables ONNX compatible model creation.

See set_training

(9 cards)