Path5.Mod1.c - Run Pipelines - Creating and Running a Pipeline Job Flashcards
Pipelines run as … while each Component runs as …
Pipelines run as a pipeline job, while each Component runs as a child job as part of the overall pipeline job.
Child Jobs are the executions of individual Pipeline Components.
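For reference, the child jobs of a submitted pipeline can be listed from the SDK v2 client; a small sketch (assumes ml_client is already connected to the workspace, and the pipeline job name is illustrative):

    # Each executed Component shows up as one child job under the pipeline job
    for child in ml_client.jobs.list(parent_job_name="my-pipeline-job-name"):
        print(child.name, child.status)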
- SDK 1 module where `pipeline` lives
- SDK 2 module where `pipeline` lives
- Module where `Workspace` lives
- Module where `MLClient` lives
- SDK1 - `azureml.pipeline.core`
- SDK2 - `azure.ai.ml.dsl`
- `Workspace` lives in `azureml.core`
- `MLClient` lives in `azure.ai.ml`
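A quick import sketch to anchor those module paths (SDK 1 and SDK 2 are separate packages, so in practice you would rarely import all four together):

    # SDK v1 (azureml-core / azureml-pipeline-core packages)
    from azureml.core import Workspace
    from azureml.pipeline.core import Pipeline

    # SDK v2 (azure-ai-ml package)
    from azure.ai.ml import MLClient
    from azure.ai.ml.dsl import pipeline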
Pipeline YAML files are created in two ways
Manually create the YAML file, or use the `@pipeline()` function to create it
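For the manual route, a hand-authored pipeline YAML can be loaded and submitted with the SDK v2 helpers; a minimal sketch, assuming a local file named pipeline_job.yml and an existing ml_client:

    from azure.ai.ml import load_job

    # Load the hand-written pipeline YAML into a job object
    pipeline_job = load_job(source="pipeline_job.yml")

    # Submit it the same way as a pipeline built with @pipeline()
    ml_client.jobs.create_or_update(pipeline_job, experiment_name="pipeline_job")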
(T/F)
- Components in a Pipeline are always sequential
- Components in a Pipeline can target a specific Compute resource
- To configure a pipeline job, you must pass in your configuration and settings values through the `pipeline()` annotation.
- When you don’t specify the `default_compute` value, your job will go into a pending state until you set one.
- False. Can be Sequential or in Parallel
- True. Allows for different types of processing per task
- False. Once you have a pipeline job instance, you have property access to the instance, like `outputs`, `settings`, etc. (see the sketch below)
- False. If you don’t specify one, it’ll simply use whatever your default compute actually is…
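A short sketch of that property access (assumes my_pipeline_function from the code in the next card, and that the compute and datastore names exist in the workspace):

    from azure.ai.ml import Input
    from azure.ai.ml.constants import AssetTypes

    pipeline_job = my_pipeline_function(
        Input(type=AssetTypes.URI_FILE, path="azureml:data:1")
    )

    # Configure via properties on the job instance, not decorator arguments
    pipeline_job.settings.default_compute = "cpu-cluster"
    pipeline_job.settings.default_datastore = "workspaceblobstore"

    # A single step can still target its own compute
    # (node name matches the variable name inside the pipeline function)
    pipeline_job.jobs["train_model"].compute = "gpu-cluster"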
Explain in detail what this code is doing:
    from azure.ai.ml.dsl import pipeline

    # 1. What's happening here?
    @pipeline()
    def my_pipeline_function(pipeline_job_input):
        prep_data = loaded_component_prep(input_data=pipeline_job_input)
        train_model = loaded_component_train(training_data=prep_data.outputs.output_data)
        return {
            "pipeline_job_transformed_data": prep_data.outputs.output_data,
            "pipeline_job_trained_model": train_model.outputs.model_output,
        }

    ...

    from azure.ai.ml import Input
    from azure.ai.ml.constants import AssetTypes

    # 2. What's happening here?
    pipeline_job = my_pipeline_function(
        Input(type=AssetTypes.URI_FILE, path="azureml:data:1")
    )

    # 3. Why does this work?
    print(pipeline_job)

    # 4. What's happening here?
    pipeline_job_result = ml_client.jobs.create_or_update(
        pipeline_job, experiment_name="pipeline_job"
    )
- The top section of code creates a pipeline job by annotating a function with `@pipeline()`. This one preps the data, trains the model, and returns both the transformed data and the trained model. It uses two loaded Components to accomplish this.
- The bottom section of code calls the pipeline function to create a pipeline job. It uses an instance of `Input` to provide the data.
- Since the function call returns a pipeline job object that renders as formatted YAML, you can print it out to see the results
- Finally, submit the pipeline job
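After submitting, you can also follow the run from code; a small sketch (assumes ml_client is already connected to the workspace):

    # Submit the pipeline job under an experiment name
    pipeline_job_result = ml_client.jobs.create_or_update(
        pipeline_job, experiment_name="pipeline_job"
    )

    # Print the Studio URL and stream logs until the job finishes
    print(pipeline_job_result.studio_url)
    ml_client.jobs.stream(pipeline_job_result.name)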
Describe each parameter of the `pipeline()` annotation and any default values:
    pipeline(
        func=None,
        *,
        name: str | None = None,
        version: str | None = None,
        display_name: str | None = None,
        description: str | None = None,
        experiment_name: str | None = None,
        tags: Dict[str, str] | None = None,
        **kwargs,
    )
- func: the function to be annotated
- name: name of the pipeline component; defaults to the function name
- version: defaults to 1
- display_name: defaults to the function name
- description: free-text description of the pipeline
- experiment_name: name of the experiment the job is created under; defaults to the current directory name
- tags: dictionary of tags attached to the pipeline
- kwargs: dictionary of additional config params
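A small usage sketch of those parameters, reusing the loaded components from the earlier card (the names and tag values are illustrative):

    from azure.ai.ml.dsl import pipeline

    @pipeline(
        name="taxi_prep_and_train",        # defaults to the function name if omitted
        display_name="Taxi prep + train",
        description="Prepares the data and trains a model",
        experiment_name="taxi-experiments",
        tags={"team": "data-science"},
    )
    def taxi_pipeline(pipeline_job_input):
        prep_data = loaded_component_prep(input_data=pipeline_job_input)
        train_model = loaded_component_train(training_data=prep_data.outputs.output_data)
        return {"pipeline_job_trained_model": train_model.outputs.model_output}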
Where to view things:
- Both the pipeline job and its child jobs
- Config issues with the pipeline itself
- Config issues with a Component
- Submit the job to the workspace; you can view the submitted job and its child jobs under Job overview (open the job details to see a designer view of the jobs, then click the Job overview button in the top right; see pic)
- Outputs and logs of the pipeline job
- Outputs and logs of the individual Child Jobs of the failed Component
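Outputs and logs can also be pulled down from code rather than browsed in the Studio; a minimal sketch, assuming the job name from the earlier submission:

    # Download all outputs and logs of the pipeline job to a local folder
    ml_client.jobs.download(
        name=pipeline_job_result.name,
        download_path="./pipeline_artifacts",
        all=True,
    )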