Path3.Mod1.e - Automated Machine Learning - Prep & Run AutoML Experiment Code Flashcards

1
Q
  • Order of operations between a Data Asset, an MLTable data asset and AutoML
  • How an MLTable data asset is created
A
  • You need to create the data asset first, then create the MLTable data asset that includes the schema used by AutoML to read that data.
  • An MLTable data asset is created when your data is stored in a folder together with an MLTable file.
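
As a minimal sketch of the second point, the snippet below lays out a local folder holding a data file plus the MLTable file that describes how to read it. The folder name, the CSV file name `diabetes.csv`, its columns, and the helper `write_mltable_folder` are all hypothetical. The MLTable keys (`paths`, `transformations`, `read_delimited`) follow the layout commonly shown for delimited files, so check them against the MLTable schema before relying on them.

```python
import tempfile
from pathlib import Path

# Hypothetical sample data; a real asset would hold your own file.
SAMPLE_CSV = "PatientID,Age,Diabetic\n1,42,0\n2,57,1\n"

# MLTable definition: points at the data file and says how to parse it.
MLTABLE_YAML = """\
paths:
  - file: ./diabetes.csv
transformations:
  - read_delimited:
      delimiter: ','
      encoding: ascii
"""


def write_mltable_folder(folder: Path) -> Path:
    """Create a folder containing the data file and its MLTable file."""
    folder.mkdir(parents=True, exist_ok=True)
    (folder / "diabetes.csv").write_text(SAMPLE_CSV, encoding="utf-8")
    # The definition file is named exactly "MLTable", with no extension.
    (folder / "MLTable").write_text(MLTABLE_YAML, encoding="utf-8")
    return folder


if __name__ == "__main__":
    out = write_mltable_folder(Path(tempfile.mkdtemp()) / "input-data-automl")
    print(sorted(p.name for p in out.iterdir()))
```

Registering that folder as a data asset of type MLTable is a separate step done through the workspace.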
2
Q

How to specify the data set as input using Python SDK code

The data must be in this form and must specify a certain column…

A

You need an Input instance and to initialize it with an AssetType and the path to your data asset:

from azure.ai.ml.constants import AssetTypes
from azure.ai.ml import Input

training_data_input = Input(
  type=AssetTypes.MLTABLE,
  path="azureml:input-data-automl:1")

For ML tasks, the data must be in tabular form and include a target column.
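
The target-column requirement can be checked locally before a job is submitted. The sketch below is one possible pre-flight check using only the standard library. The helper `has_target_column` and the sample columns are hypothetical and not part of the Azure SDK.

```python
import csv
import io


def has_target_column(csv_text: str, target_column: str) -> bool:
    """Return True if the header row of csv_text contains target_column."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # empty input yields an empty header
    return target_column in header


# Hypothetical sample data with a "Diabetic" target column.
sample = "PatientID,Age,Diabetic\n1,42,0\n2,57,1\n"
print(has_target_column(sample, "Diabetic"))  # True
print(has_target_column(sample, "Outcome"))   # False
```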

3
Q

Explain what this code is doing:
~~~
from azure.ai.ml import automl

classification_job = automl.classification(
    compute="aml-cluster",
    experiment_name="auto-ml-class-dev",
    training_data=my_training_data_input,
    target_column_name="Diabetic",
    primary_metric="accuracy",
    n_cross_validations=5,
    enable_model_explainability=True
)
~~~

What my_training_data_input and primary_metric are.

A

This code uses the automl module from the Python SDK v2 to create a classification job instance. Notable points:
- Uses my_training_data_input as the training data source. It should be an Input that references an MLTable data asset in your workspace, since AutoML requires one as input.
- Sets the primary_metric to "accuracy". This is the performance metric used to compare the trained models and select the best one.
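
To illustrate what "select the best model by primary metric" means, here is a toy sketch in plain Python. It is not how AutoML is implemented. The helper `best_trial` is hypothetical and the scores are invented for illustration; they are not real AutoML output.

```python
# Made-up scores for three trained models (illustration only).
trial_scores = {
    "LightGBM": {"accuracy": 0.91, "AUC_weighted": 0.93},
    "RandomForest": {"accuracy": 0.89, "AUC_weighted": 0.95},
    "LogisticRegression": {"accuracy": 0.85, "AUC_weighted": 0.90},
}


def best_trial(scores: dict, primary_metric: str) -> str:
    """Return the model name with the highest value of primary_metric."""
    return max(scores, key=lambda name: scores[name][primary_metric])


print(best_trial(trial_scores, "accuracy"))      # LightGBM
print(best_trial(trial_scores, "AUC_weighted"))  # RandomForest
```

The same set of trained models can yield a different winner when the primary metric changes, which is why the metric should match what matters for the use case.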

4
Q

Get a list of available primary metrics for training a classification model

A

Use the ClassificationPrimaryMetrics enum to get a list of them:

from azure.ai.ml.automl import ClassificationPrimaryMetrics
list(ClassificationPrimaryMetrics)
5
Q

TM TTM MT EET

Four limits you can set once you instantiate an AutoML experiment or job

A

The four limits you’d set for the job:
* timeout_minutes - int. max minutes before the whole AutoML experiment is terminated
* trial_timeout_minutes - int. max minutes a trial can take
* max_trials - int. max number of trials or models that will be trained
* enable_early_termination - bool. end experiment if score isn’t improving over the short term
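
How the four limits interact can be pictured with a toy simulation. This is only a sketch under stated assumptions, not Azure ML's actual scheduler: the `Limits` dataclass, the `run_trials` function, the `patience` parameter, and the rule that a cancelled trial still counts toward `max_trials` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Limits:
    timeout_minutes: int
    trial_timeout_minutes: int
    max_trials: int
    enable_early_termination: bool


def run_trials(trials, limits, patience=2):
    """Simulate trials given as (minutes, score); return (scores, stop_reason).

    patience is a made-up knob: how many non-improving trials are
    tolerated before early termination kicks in.
    """
    elapsed, best, stale, scores = 0, None, 0, []
    for attempted, (minutes, score) in enumerate(trials):
        if attempted >= limits.max_trials:
            return scores, "max_trials"
        # A trial is cut off once it reaches the per-trial cap.
        spent = min(minutes, limits.trial_timeout_minutes)
        if elapsed + spent > limits.timeout_minutes:
            return scores, "timeout_minutes"
        elapsed += spent
        if minutes > limits.trial_timeout_minutes:
            continue  # cancelled trial produces no score
        scores.append(score)
        if best is None or score > best:
            best, stale = score, 0
        else:
            stale += 1
        if limits.enable_early_termination and stale >= patience:
            return scores, "early_termination"
    return scores, "all_trials_done"


# Invented trial durations and scores, for illustration only.
trials = [(10, 0.80), (25, 0.90), (10, 0.85), (10, 0.84), (10, 0.83)]
print(run_trials(trials, Limits(60, 20, 5, True)))
# ([0.8, 0.85, 0.84, 0.83], 'early_termination')
```

In this run the second trial exceeds the 20-minute trial cap and is dropped, and the run stops after two consecutive trials fail to beat 0.85.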

6
Q

The method you call when you want to set limits on your AutoML job

A

Call the job’s set_limits method:

classification_job.set_limits(
    timeout_minutes=10,
    trial_timeout_minutes=10,
    max_trials=5,
    enable_early_termination=True
)
7
Q

Code to submit your AutoML Job

A
# submit the new job
returned_job = ml_client.jobs.create_or_update(classification_job)
8
Q

Code to get the url to monitor your job

A
# get the studio url so you can monitor your job
aml_url = returned_job.studio_url

print("Monitor job here:", aml_url)
9
Q

The method you call when you want to set optional training properties on your AutoML job

A

Set the optional training properties using the set_training method:

classification_job.set_training(
    blocked_training_algorithms=["LogisticRegression"],
    enable_onnx_compatible_models=True
)

The above code blocks LogisticRegression from being used for training models and enables ONNX compatible model creation.
