Path1.Mod1.f - Explore ML Workspace - MLModel Format Flashcards
Difference between Artifacts and Models in MLflow
MLModel Format usage wrt the Model
What’s this code doing?
import mlflow
mlflow.sklearn.log_model(sklearn_estimator, "classifier")
Any file generated and captured from an experiment’s run or job is an Artifact.
Models are a certain type of Artifact; we use the MLModel Format to load them and to communicate the Model's intended use.
Example code for logging a model in MLflow, using a specific Flavor (sklearn)
con/met, man Fl Si
MLflow’s MLModel Format: what it is and where it stores assets
The MLmodel File: what it is and the two sections it uses to describe the model’s usage
MLModel Format
* It’s a contract defining Artifacts and what they represent (like metadata)
* The format stores assets in a folder; one of those assets is named MLmodel
MLmodel File
* It’s the model manifest describing how the model is loaded and used
* Specifies two sections: Flavors and Signatures
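A minimal sketch of reading the MLmodel manifest back through the MLflow SDK to see those two sections (the run id and artifact path below are placeholders):

from mlflow.models import get_model_info

# "runs:/<run_id>/classifier" is a hypothetical model URI; substitute your own.
info = get_model_info("runs:/<run_id>/classifier")
print(info.flavors)    # the Flavors section of the MLmodel file
print(info.signature)  # the Signature section (inputs/outputs schema)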
unicont
Model Flavors:
- what they are
- how they handle serialization for persisting and loading models
A Flavor is the unique contract in MLflow, designed to work across all ML frameworks, that indicates what to expect of a model created with a specific framework (how to persist and load it), i.e. a specific “flavor” of ML framework.
There is no enforcement of a single serialization mechanism that all models must support; that decision is left to each flavor, based on each framework’s best practices.
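A sketch of flavor-specific persistence, using sklearn purely as an example (the directory name is made up); the sklearn flavor pickles the estimator and loads it back as a native scikit-learn object:

import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=200).fit(X, y)

# The sklearn flavor chooses the serialization (pickle/cloudpickle);
# other flavors (pytorch, tensorflow, ...) use their own mechanisms.
mlflow.sklearn.save_model(model, "local_model_dir")
reloaded = mlflow.sklearn.load_model("local_model_dir")
print(type(reloaded))  # the native scikit-learn estimator type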
Pact meth inf
Model Signatures:
- what they are
- what two subsections they specify
- how MLflow enforces their types
- Signatures (i.e. the API) are the data contract between the model and the server running your models
- Signatures, aka method signatures, specify two subsections: inputs and outputs
- MLflow enforces a Signature during the model inference process if one is available. Signatures are inferred using a best-effort approach; you can still log models manually if the inferred Signature is not desired
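A minimal sketch of logging a model with an explicitly inferred Signature instead of relying on best-effort inference:

import mlflow.sklearn
from mlflow.models import infer_signature
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
model = DecisionTreeClassifier().fit(X, y)

# Build the inputs/outputs contract from example data, then attach it.
signature = infer_signature(X, model.predict(X))
mlflow.sklearn.log_model(model, "classifier", signature=signature)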
c-b + FRAME, t-b + nd/dict
The two Signature Types and what objects/types are provided to support them
- Column-based - Signatures that operate on tabular data (data organized in a table), using pandas.DataFrame objects as input
- Tensor-based - Signatures that operate on n-dimensional arrays (aka tensors); MLflow supplies numpy.ndarray as inputs, or a dict[string, numpy.ndarray] for named tensors
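A small sketch showing how the input object drives which Signature type gets inferred (the column names and shapes are made up):

import numpy as np
import pandas as pd
from mlflow.models import infer_signature

# Column-based: a pandas.DataFrame yields a named-column (tabular) schema.
table = pd.DataFrame({"age": [25, 32], "income": [40000.0, 52000.0]})
print(infer_signature(table))

# Tensor-based: a numpy.ndarray yields an n-dimensional tensor schema;
# a dict[string, numpy.ndarray] would describe named tensors.
tensor = np.zeros((2, 28, 28))
print(infer_signature(tensor))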
con log dep
Model Environment:
- where they are defined
- two ways they are consumed
- how they are different from Azure ML Environments
The Model Environment
- defined in the conda.yaml file stored in the model's folder, alongside the MLmodel file
- consumed when auto-detected by MLflow, or manually indicated when calling mlflow.<flavor>.log_model()
- Azure ML Environments apply to Workspaces (for registered Environments) or to Jobs/Deployments (for anonymous Environments); MLflow model Environments are built and used for Model deployment.
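A sketch of indicating the Environment manually at logging time (the pinned packages are illustrative); MLflow writes the dependencies into the environment files inside the model folder instead of auto-detecting them:

import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
model = DecisionTreeClassifier().fit(X, y)

# Explicit dependencies override MLflow's auto-detection.
mlflow.sklearn.log_model(model, "classifier", pip_requirements=["scikit-learn", "pandas"])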
NCDE!
Model Prediction (predict()) Functions:
- when they are called
- what they return
- All MLflow Models have a predict function, called when a Model is deployed using the no-code-deployment experience
- What they return depends on the flavor
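A minimal sketch of the predict call that a no-code deployment invokes behind the scenes (the run id and input columns are placeholders):

import mlflow.pyfunc
import pandas as pd

model = mlflow.pyfunc.load_model("runs:/<run_id>/classifier")  # hypothetical URI
batch = pd.DataFrame({"feature_1": [0.5], "feature_2": [1.2]})  # shaped to the Signature
print(model.predict(batch))  # the return type depends on the flavor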
CP/P CP ME CL EH BL V MMD
When to customize Model Prediction (predict()) Functions
- Custom Pre/Postprocessing: For models requiring extra data manipulation steps.
- Complex Pipelines: To encapsulate multi-step data transformations and models.
- Model Ensembling: For managing multiple models used in tandem.
- Custom Logging: To capture additional metrics or features during prediction.
- Error Handling: For custom responses to prediction errors or data issues.
- Business Logic: To apply specific rules or adjustments to predictions.
- Versioning: To manage different versions of a model dynamically.
- Multi-Model Deployment: For routing requests to different models based on input.
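One common way to implement these customizations is a custom pyfunc model. A hedged sketch of the Business Logic case (the class name, threshold, and artifact path are made up):

import mlflow.pyfunc

class ThresholdedModel(mlflow.pyfunc.PythonModel):
    # Business-logic example: only predict the positive class above 0.75.
    def load_context(self, context):
        import joblib
        self.inner = joblib.load(context.artifacts["inner_model"])

    def predict(self, context, model_input):
        proba = self.inner.predict_proba(model_input)[:, 1]
        return (proba > 0.75).astype(int)

mlflow.pyfunc.log_model(
    "classifier",
    python_model=ThresholdedModel(),
    artifacts={"inner_model": "model.joblib"},  # hypothetical pre-trained estimator on disk
)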
R FS R
Models created as MLflow Models can be loaded back into code from these three different locations
- From the run where they were logged
- From the file system they were saved on
- From the Model registry where they are registered
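A sketch of the three corresponding model URIs (the run id, path, and registry name are placeholders):

import mlflow.pyfunc

from_run      = mlflow.pyfunc.load_model("runs:/<run_id>/classifier")  # from the run
from_disk     = mlflow.pyfunc.load_model("/path/to/model/folder")      # from the file system
from_registry = mlflow.pyfunc.load_model("models:/my-classifier/1")    # from the registry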
same inf
Two Workflows available for loading Models back:
- diff between flavor.load_model & pyfunc.load_model
- the workflow that guarantees a predict function will take all Signature types
- Loading back the same object and types that were logged - using the MLflow SDK (mlflow.<flavor>.load_model()) you obtain an instance of the model, with types specific to the training library
- Loading back a model for running inference - using the MLflow SDK to obtain a wrapper that MLflow guarantees will have a predict function, callable with pandas.DataFrame, numpy.ndarray or dict[string, numpy.ndarray] inputs. Use mlflow.pyfunc.load_model() to handle the conversion to the input type the model expects
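A sketch contrasting the two workflows side by side (the run id is a placeholder, and the input assumes an illustrative 4-feature classifier):

import numpy as np
import mlflow.pyfunc
import mlflow.sklearn

X = np.random.rand(3, 4)  # dummy input shaped for the assumed model

# Workflow 1: native object, with training-library types and the full sklearn API.
sk_model = mlflow.sklearn.load_model("runs:/<run_id>/classifier")
sk_model.predict_proba(X)

# Workflow 2: generic wrapper; predict() accepts pandas.DataFrame,
# numpy.ndarray or dict[string, numpy.ndarray] regardless of flavor.
py_model = mlflow.pyfunc.load_model("runs:/<run_id>/classifier")
py_model.predict(X)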
B R-T, Sw TF, PIn, RAI
Four advantages of logging MLflow Models
Advantages:
* Deploy on batch endpoints or real-time endpoints without a scoring script or Environment
* Auto-generated Swagger and Test features post-deployment
* Models can immediately be used as pipeline inputs
* Access to the Responsible AI Dashboard
MLF C Tr
The three types of Models that can be registered in Azure ML
MLflow - Models trained and logged with MLflow
Custom - Model types not supported by Azure ML
Triton - Models for deep learning workloads, such as those built with TensorFlow and PyTorch