Path4.Mod1.a - Training Models with Scripts - Run a Training Script as a Command Job Flashcards
Rem Ref T
Three actions for creating a production-ready script
- Remove nonessential code: e.g., print statements you don’t want cluttering production logs
- Refactor your code into functions: improves testability and readability
- Test your script in a terminal: either open the script on the Notebooks page in the studio and choose Save and run script in terminal, OR on the Compute Instance select Terminal from the applications and run the script manually from there.
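The "refactor into functions" step can be sketched like this — a minimal, hypothetical example (all function names and data are invented for illustration): each step becomes a small, independently testable function, with a `main()` entry point guarded by `if __name__ == "__main__":`.

```python
def clean_data(rows):
    # Drop empty records and strip whitespace so downstream
    # code never sees blanks -- easy to unit-test in isolation
    return [r.strip() for r in rows if r.strip()]


def summarize(rows):
    # Return a summary string instead of printing mid-pipeline,
    # keeping side effects (printing/logging) out of the logic
    return f"{len(rows)} records"


def main():
    rows = clean_data(["  a  ", "", "b"])
    print(summarize(rows))


if __name__ == "__main__":
    main()
```

Because the logic lives in functions rather than top-level statements, the script can be imported and tested without actually running the job.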
cod com env com d_n e_n
Parameters to Configure a Command Job, then code to create it
- code: folder path containing the script
- command: specify the script to run as a command line
- environment: the packages to install on the compute target before running the command. Can be a registered/curated environment (by name) or one built from an image or file
- compute: the compute target (Compute Instance or cluster) to run the script/job on, referenced by name
- display_name: the display name of the job in the studio
- experiment_name: the experiment the job is grouped under
You need to create the Command instance via command(), then create the job via ml_client.create_or_update()
:
```python
from azure.ai.ml import command

# Configure the command job
job = command(
    code="./src",
    command="python my_python_script.py --var1 'helloworld'",
    environment="AzureML-sklearn-0.24-somesystem-cpu@latest",
    compute="aml-cluster",
    display_name="train_my_model_job",
    experiment_name="error_code_model_training",
)

# Submit the job
returned_job = ml_client.create_or_update(job)
# or alternatively
returned_job = ml_client.jobs.create_or_update(job)
```
Using parameters in your script when running a Command Job
Since your command is a script invocation, just make sure your script can take parameters, then pass those parameters in your Command Job’s command parameter (it’s just a command line, after all!):
command="python my_python_script.py --training_data my-training-data.csv --skip-unknown --output-to-log",
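How boolean switches like the `--skip-unknown` and `--output-to-log` flags in the command above could be parsed is sketched below, assuming argparse (covered next); the flag names are taken from the example, and `action="store_true"` means a flag's mere presence sets its value to True.

```python
import argparse

# Hypothetical parser matching the flags in the command above.
# argparse maps "--skip-unknown" to the attribute "skip_unknown";
# dest is spelled out here for clarity.
parser = argparse.ArgumentParser()
parser.add_argument("--skip-unknown", dest="skip_unknown", action="store_true")
parser.add_argument("--output-to-log", dest="output_to_log", action="store_true")

# Parse an explicit argument list (normally taken from sys.argv)
args = parser.parse_args(["--skip-unknown"])
# args.skip_unknown -> True, args.output_to_log -> False
```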
How to process parameters into your script in the first place
To actually process params in your script, use a library like argparse
:
```python
import argparse

import pandas as pd


def get_dataframe(args):
    # Read data as a pandas DataFrame
    # from the argument "training_data"
    df = pd.read_csv(args.training_data)
    return df


def parse_args():
    # Create an instance of an ArgumentParser,
    # add arguments, then parse them
    parser = argparse.ArgumentParser()
    parser.add_argument("--training_data", dest="training_data", type=str)
    args = parser.parse_args()
    return args


if __name__ == "__main__":
    args = parse_args()
    get_dataframe(args)
```
Converting a Jupyter Notebook into a script in ML Studio
Authoring > Notebooks > Open your Notebook > Hamburger menu at the top left > Export > select Python (.py)
Jupyter Notebooks are ideal for experimentation, scripts are better for production workloads (T/F)
TRUE
When converting Jupyter Notebook to a script, ML Studio gives you an interface to provide params after it parses your script for inputs (T/F)
FALSE. When exporting, your code is simply dumped into the .py file. It’s up to you to refactor it and add parameter parsing
All jobs given the same experiment_name will be grouped under that experiment (T/F)
TRUE. See image…
Only outputs of a Command Job are tracked, and can be reviewed after the job has completed (T/F)
FALSE. Azure ML will track both inputs AND outputs