Path5.Mod1.c - Run Pipelines - Creating and Running a Pipeline Job Flashcards

1
Q

Pipelines run as … while each Component runs as …

A

A pipeline runs as a pipeline job, while each Component runs as a child job as part of the overall pipeline job.

Child Jobs are the executions of the individual Pipeline Components.
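
As a minimal sketch of that relationship (the workspace details and the job name "shy_owl_123" are hypothetical), you can list a pipeline job's child jobs from the SDK:

from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# connect to the workspace (placeholders, not real values)
ml_client = MLClient(
    DefaultAzureCredential(), "<subscription_id>", "<resource_group>", "<workspace_name>"
)

# "shy_owl_123" is a hypothetical pipeline job name
parent = ml_client.jobs.get("shy_owl_123")               # the pipeline job itself
for child in ml_client.jobs.list(parent_job_name=parent.name):
    print(child.display_name, child.status)              # one child job per component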

2
Q
  • SDK 1 module where pipeline lives
  • SDK 2 module where pipeline lives
  • Module where Workspace lives
  • Module where MLClient lives
A
  • SDK 1 - azureml.pipeline.core
  • SDK 2 - azure.ai.ml.dsl
  • Workspace lives in azureml.core
  • MLClient lives in azure.ai.ml
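As a quick sketch, the corresponding imports look like this:

# SDK v1
from azureml.core import Workspace
from azureml.pipeline.core import Pipeline

# SDK v2
from azure.ai.ml import MLClient
from azure.ai.ml.dsl import pipeline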
3
Q

Pipeline YAML files are created in two ways

A

Write the YAML file by hand, or build the pipeline with the @pipeline() decorator and let the SDK generate the YAML for you
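
A minimal sketch of the two routes (the file name pipeline_job.yml is hypothetical):

from azure.ai.ml import load_job

# Way 1: load a pipeline YAML file you wrote by hand
manual_job = load_job(source="pipeline_job.yml")

# Way 2: build the pipeline with the @pipeline() decorator and let the SDK
# generate the YAML; printing the pipeline job shows it (see the code card below)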

4
Q

(T/F)
- Components in a Pipeline are always sequential
- Components in a Pipeline can target a specific Compute resource
- To configure a pipeline job, you must pass in your configuration and settings values through the pipeline() annotation.
- When you don’t specify the default_compute value, your job will go into a pending state until you set one.

A
  • False. Components can run sequentially or in parallel.
  • True. Each Component can target its own compute, which allows different types of processing per task.
  • False. Once you have a pipeline job instance, you configure it through properties on the instance, such as outputs and settings (see the sketch after this list).
  • False. If you don’t specify one, the job simply runs on whatever your default compute is.
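
A minimal sketch of configuring through the instance (my_pipeline_function comes from the next card; the compute and datastore names are hypothetical):

from azure.ai.ml import Input
from azure.ai.ml.constants import AssetTypes

pipeline_job = my_pipeline_function(
    Input(type=AssetTypes.URI_FILE, path="azureml:data:1")
)

# settings live on the instance, not in the @pipeline() decorator
pipeline_job.settings.default_compute = "aml-cluster"
pipeline_job.settings.default_datastore = "workspaceblobstore"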
5
Q

Explain in detail what this code is doing:

from azure.ai.ml.dsl import pipeline

# 1. What's happening here?
@pipeline()
def my_pipeline_function(pipeline_job_input):
    prep_data = loaded_component_prep(input_data=pipeline_job_input)
    train_model = loaded_component_train(training_data=prep_data.outputs.output_data)

    return {
        "pipeline_job_transformed_data": prep_data.outputs.output_data,
        "pipeline_job_trained_model": train_model.outputs.model_output,
    }

...

from azure.ai.ml import Input
from azure.ai.ml.constants import AssetTypes

# 2. What's happening here?
pipeline_job = my_pipeline_function(
    Input(type=AssetTypes.URI_FILE, path="azureml:data:1")
)

# 3. Why does this work?
print(pipeline_job)

# 4. What's happening here?
# (ml_client is an authenticated MLClient, assumed to exist already)
pipeline_job_result = ml_client.jobs.create_or_update(
    pipeline_job, experiment_name="pipeline_job"
)
A
  1. The top section defines a pipeline by decorating a function with @pipeline(). This one preps data, trains the model, and returns both the transformed data and the model, using two loaded Components to accomplish this.
  2. The bottom section calls the decorated function to create a pipeline job, passing an Input instance to supply the data.
  3. The call returns a PipelineJob object whose string representation is the pipeline's YAML, so printing it shows the full configuration.
  4. Finally, the pipeline job is submitted to the workspace with ml_client.jobs.create_or_update().
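
Once submitted, you can follow the run from the SDK; a hedged sketch (ml_client is the authenticated MLClient from above):

print(pipeline_job_result.studio_url)            # direct link to the run in the studio
ml_client.jobs.stream(pipeline_job_result.name)  # block and stream logs until it finishes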
6
Q

Describe each parameter of the pipeline() annotation and any default values:

pipeline(
    func=None,
    *,
    name: str | None = None,
    version: str | None = None,
    display_name: str | None = None,
    description: str | None = None,
    experiment_name: str | None = None,
    tags: Dict[str, str] | None = None,
    **kwargs,
)
A
  • func: the function to be decorated
  • name: name of the pipeline component; defaults to the function name
  • version: version of the pipeline component; defaults to "1"
  • display_name: display name of the pipeline component; defaults to the function name
  • description: description of the built pipeline
  • experiment_name: name of the experiment the job is created under; defaults to the current directory name
  • tags: dictionary of tags attached to the pipeline component
  • kwargs: dictionary of additional configuration parameters
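
A minimal sketch of the parameters in use (all values are made up for illustration):

from azure.ai.ml.dsl import pipeline

@pipeline(
    name="prep_and_train",
    display_name="Prep data and train model",
    description="Prepares the input data, then trains a model",
    experiment_name="pipeline_job",
    tags={"owner": "ml-team"},
)
def my_pipeline_function(pipeline_job_input):
    ...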
7
Q

Where to view things:
- Both the pipeline job and its child jobs
- Config issues with the pipeline itself
- Config issues with a Component

A

Submit the job to the workspace. Then, in Azure Machine Learning studio:

  • Pipeline job and child jobs: open the job details to see a designer view of the jobs, then click the Job overview button in the top right.
  • Pipeline config issues: check the outputs and logs of the pipeline job itself.
  • Component config issues: check the outputs and logs of the individual child job of the failed Component.
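
For the failed-Component case, a hedged sketch of pulling those child-job logs down locally (the job name is a placeholder; ml_client is an authenticated MLClient):

# download all outputs and logs of the failed Component's child job
ml_client.jobs.download(
    name="<child_job_name>", download_path="./logs", all=True
)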