Path4.Mod2.c - Training Models with Scripts - Code to support Experiment Tracking with Jobs using MLFlow Flashcards
single comp re-re coll
Benefits of tracking Experiments
- All ML experiments organized in a single place (search and filter Experiments)
- Compare Experiments, analyze results, and debug models with little extra work
- Reproduce or re-run Experiments to validate results
- Improve collaboration (sharing results, access Experiment data programmatically)
Main benefit when using MLFlow for Tracking wrt Azure ML Workspaces
Compatibility with Azure ML Workspaces lets you track runs, metrics, params, and artifacts directly against your workspace (from your Python code, your Jupyter Notebooks, and ultimately your production scripts).
Pip MLW URI
General prerequisites for using MLFlow
- Two ways to get the URI
- Set the URI
- The mlflow-skinny package use case
- Install the MLFlow SDK and the Azure ML plugin for MLFlow (pip install mlflow azureml-mlflow)
- An Azure Machine Learning Workspace
- For remote tracking, configure MLFlow to point to your Azure ML Workspace's tracking URI:
  - To get the tracking URI:
    - Python SDK: uri = ml_client.workspaces.get(ml_client.workspace_name).mlflow_tracking_uri
    - CLI: az ml workspace show --query mlflow_tracking_uri
  - To set the tracking URI: mlflow.set_tracking_uri(uri)
- Use the mlflow-skinny package when you only need tracking and logging capabilities.
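A minimal end-to-end sketch of the remote-tracking setup, assuming an already-provisioned workspace (the subscription, resource group, and workspace names are placeholders):
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
import mlflow

# Placeholders: substitute your own Azure identifiers.
ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="<SUBSCRIPTION_ID>",
    resource_group_name="<RESOURCE_GROUP>",
    workspace_name="<WORKSPACE_NAME>",
)

# Get the workspace's tracking URI, then point MLFlow at it.
uri = ml_client.workspaces.get(ml_client.workspace_name).mlflow_tracking_uri
mlflow.set_tracking_uri(uri)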
se en
Configure Experiment name for Notebooks and for Jobs
- Notebooks: use exp = mlflow.set_experiment(name)
- Jobs through the CLI or SDK: in the job YAML, set the experiment_name property
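A minimal sketch of the notebook case; the experiment name is hypothetical. For a job, the YAML would carry the equivalent line experiment_name: mlflow-demo.
import mlflow

# Creates the experiment if it doesn't exist, and makes it the active one
# so subsequent runs are logged under it.
exp = mlflow.set_experiment("mlflow-demo")  # hypothetical experiment name
print(exp.experiment_id, exp.name)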
sr er
Configure Runs (MLFlow terminology for “tracked training jobs”) for Notebooks to start/stop explicitly, and the significance wrt when Tracking starts
To start and end explicitly:
mlflow.start_run()
# ... training code
mlflow.end_run()
Or use a context manager (like C#'s using):
with mlflow.start_run() as run:
    # ... training code
You can also name the run:
with mlflow.start_run(run_name="my_run") as run:
    # ... training code
Wrt Tracking: Tracking doesn’t start until your code tries to log something.
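A minimal sketch of a named run, assuming hypothetical parameter and metric names; note that nothing is recorded until the first log call:
import mlflow

# The run is opened explicitly, but tracking data is only
# recorded once something is logged below.
with mlflow.start_run(run_name="my_run") as run:
    mlflow.log_param("learning_rate", 0.01)  # hypothetical hyperparameter
    mlflow.log_metric("accuracy", 0.93)      # hypothetical result
    print(run.info.run_id)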
For configuring Runs (MLFlow terminology for “tracked training jobs”) via Command Job
- Three Training Code tasks
- Three MLOps tasks
Training Code:
- Give the Command Job a display_name
- Ensure the training code is not calling mlflow.start_run(run_name=...), since the job creates and names the run
- Add any tracking/logging code using the MLFlow SDK
MLOps:
- Put your training code (a .py file with a main entry point) in a src folder
- Ensure your conda.yml installs mlflow and azureml-mlflow
- Submit the job
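A minimal sketch of those MLOps steps using the Azure ML Python SDK v2; the environment name, compute target, and other identifiers are placeholders:
from azure.ai.ml import MLClient, command
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="<SUBSCRIPTION_ID>",
    resource_group_name="<RESOURCE_GROUP>",
    workspace_name="<WORKSPACE_NAME>",
)

# Command Job: training code lives in ./src, entry point is main.py.
job = command(
    code="./src",
    command="python main.py",
    environment="my-env@latest",   # hypothetical env whose conda.yml installs mlflow + azureml-mlflow
    compute="cpu-cluster",         # hypothetical compute target
    display_name="my-training-run",
    experiment_name="mlflow-demo",
)

ml_client.jobs.create_or_update(job)  # submit the job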
gr, gmh, a.da
The function to use for accessing or querying metrics through the MLFlow SDK:
- For a single run (how we access the data)
- For all values of a given metric (why this is important)
- For logged artifacts like files and models (what params it needs)
- We use mlflow.get_run(), then access its data object:
import mlflow
run = mlflow.get_run(run_id)
metrics = run.data.metrics
params = run.data.params
tags = run.data.tags
print(metrics, params, tags)
- The above only returns the last value of each metric. To get a metric's historical values, use MlflowClient().get_metric_history(run_id, metric_name)
- To get artifacts: mlflow.artifacts.download_artifacts(run_id=run_id, artifact_path=artifact_path)
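A minimal sketch combining all three queries; the run id, metric name, and artifact path are placeholders:
import mlflow
from mlflow.tracking import MlflowClient

run_id = "<RUN_ID>"  # placeholder run id

# Last value of each metric, plus params and tags.
run = mlflow.get_run(run_id)
print(run.data.metrics, run.data.params, run.data.tags)

# Full history of one metric (step/value pairs).
for m in MlflowClient().get_metric_history(run_id, "accuracy"):  # hypothetical metric name
    print(m.step, m.value)

# Download a logged artifact (e.g. a model folder) to local disk.
local_path = mlflow.artifacts.download_artifacts(run_id=run_id, artifact_path="model")
print(local_path)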