MLOps Flashcards
In MLflow, what's the easiest way to track experiments?
mlflow.autolog()
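A minimal sketch with scikit-learn (the dataset and model choice are only illustrative):

```python
# Minimal sketch: autologging a scikit-learn training run.
import mlflow
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

mlflow.autolog()  # enable automatic logging of params, metrics, and the model artifact

X, y = load_diabetes(return_X_y=True)

with mlflow.start_run():
    model = RandomForestRegressor(n_estimators=50, max_depth=5)
    model.fit(X, y)  # params, training metrics, and the fitted model are logged automatically
```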
What is a feature store?
A feature store is a centralized repository that enables data scientists to find and share features and also ensures that the same code used to compute the feature values is used for model training and inference.
What are the four data mesh principles?
- federated computational governance
- self-serve data platform
- treating data as a product
- domain-oriented data ownership
What is MLOps?
MLOps is a set of processes and automation to manage models, data and code to meet the two goals of stable performance and long-term efficiency in ML systems.
What is the ‘deploy code’ approach vs the ‘deploy model’ approach?
These are two approaches to CI/CD for ML models:
Deploy model: the model artifact is trained in the development environment, tested in staging, and then deployed into production.
Deploy code: the code to train models is developed in the dev environment, moved to staging and then production, and models are retrained in each environment.
Deploy code is useful when access to production data is not possible in lower environments but data scientists still need visibility into training results from the production environment.
It is Databricks' recommended approach, though the choice is use-case specific.
What is Data Lakehouse architecture?
A data lakehouse unifies the best elements of data lakes and data warehouses, delivering the data management and performance typically found in data warehouses on the low-cost, flexible object stores offered by data lakes.
Data in the lakehouse is typically organized using a "medallion" architecture of Bronze, Silver and Gold tables of increasing refinement and quality.
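A hedged sketch of the Bronze to Silver refinement step with Delta tables in PySpark (table names and the cleaning logic are placeholders):

```python
# Illustrative Bronze -> Silver refinement step; table names and filters are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# Bronze: raw events as ingested, with minimal processing.
bronze = spark.read.table("bronze.events_raw")

# Silver: cleaned and deduplicated records ready for feature computation.
silver = (
    bronze
    .filter(F.col("event_ts").isNotNull())
    .dropDuplicates(["event_id"])
)

silver.write.format("delta").mode("overwrite").saveAsTable("silver.events_clean")
```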
What are three key components of MLflow?
MLflow is an open source project for managing the end-to-end machine learning lifecycle.
- Tracking: track experiments to record and compare parameters, metrics and model artifacts.
- Models: store and deploy models from any ML library to a variety of model serving and inference platforms.
- Model Registry: a centralized model store for managing models' full lifecycle stage transitions, from staging to production, with capabilities for versioning and annotating. The registry also provides webhooks for automation and continuous deployment.
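A hedged sketch tying Tracking and the Model Registry together (the registered model name is a placeholder):

```python
# Illustrative: log a model during a tracked run and register it in the Model Registry.
import mlflow
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

with mlflow.start_run():
    model = LogisticRegression(max_iter=200).fit(X, y)
    mlflow.log_metric("train_accuracy", model.score(X, y))
    # registered_model_name creates (or versions) an entry in the Model Registry.
    mlflow.sklearn.log_model(model, "model", registered_model_name="churn_classifier")
```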
What is the Databricks feature store?
The Databricks Feature Store is a centralized repository of features. It enables feature sharing and discovery across an organization and also ensures that the same feature computation code is used for model training and inference.
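A hedged sketch using the Databricks Feature Store client (table, catalog, and column names are placeholders; on recent runtimes the newer databricks.feature_engineering client plays the same role):

```python
# Illustrative only: publish a feature table so training and inference share one computation.
from databricks.feature_store import FeatureStoreClient
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
fs = FeatureStoreClient()

# Toy feature DataFrame; in practice this comes from a feature computation pipeline.
features_df = spark.createDataFrame(
    [(1, 12, 340.5), (2, 3, 55.0)],
    ["customer_id", "num_orders", "total_spend"],
)

fs.create_table(
    name="ml.features.customer_features",  # placeholder catalog/schema/table name
    primary_keys=["customer_id"],
    df=features_df,
    description="Aggregated customer behavior features",
)
```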
What is MLflow Model Serving?
MLflow Model Serving allows you to host machine learning models from the Model Registry as REST endpoints that are updated automatically based on the availability of model versions and their stages.
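A hedged sketch of querying a served model over REST (the workspace URL, endpoint name, and token are placeholders, and the exact URL path and payload schema differ between legacy registry-backed serving and newer serving endpoints):

```python
# Illustrative only: send a small scoring request to a model serving endpoint.
import requests

WORKSPACE_URL = "https://<your-workspace>.cloud.databricks.com"  # placeholder
ENDPOINT = "churn_classifier"                                     # placeholder endpoint name
TOKEN = "<personal-access-token>"                                 # placeholder

response = requests.post(
    f"{WORKSPACE_URL}/serving-endpoints/{ENDPOINT}/invocations",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"dataframe_records": [{"num_orders": 12, "total_spend": 340.5}]},
)
print(response.json())
```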
What are Databricks workflows and jobs?
Databricks Workflows (Jobs and Delta Live Tables) can execute pipelines in automated, non-interactive ways.
For ML, Jobs can be used to define pipelines for computing features, training models, or other ML steps.
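A hedged sketch of creating a simple scheduled training job via the Jobs 2.1 REST API (the workspace URL, token, notebook path, cluster id, and cron schedule are all placeholders):

```python
# Illustrative only: define a notebook-based training job via the Jobs 2.1 REST API.
import requests

WORKSPACE_URL = "https://<your-workspace>.cloud.databricks.com"  # placeholder
TOKEN = "<personal-access-token>"                                 # placeholder

job_spec = {
    "name": "nightly-model-training",  # placeholder job name
    "tasks": [
        {
            "task_key": "train",
            "notebook_task": {"notebook_path": "/Repos/mlops/train_model"},  # placeholder path
            "existing_cluster_id": "<cluster-id>",  # placeholder; a job cluster is more typical
        }
    ],
    "schedule": {"quartz_cron_expression": "0 0 2 * * ?", "timezone_id": "UTC"},
}

resp = requests.post(
    f"{WORKSPACE_URL}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=job_spec,
)
print(resp.json())  # returns the new job_id on success
```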
What should ML integration tests cover?
Integration tests should run all pipelines to confirm that they function correctly together (a minimal pytest sketch follows the list):
- Feature store tests
- Model training tests
- Model deployment tests
- Model inference tests
- Model monitoring tests
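A minimal pytest sketch; the pipeline entry points (run_feature_pipeline, train_model, score_batch) are hypothetical names standing in for the project's real pipeline code:

```python
# Hypothetical integration tests over a small, cheap sample of the pipelines.
import pytest

from pipelines import run_feature_pipeline, train_model, score_batch  # hypothetical module


@pytest.fixture(scope="module")
def small_dataset():
    # A small, fast sample keeps the integration run cheap.
    return run_feature_pipeline(sample_fraction=0.01)


def test_training_produces_model(small_dataset):
    model = train_model(small_dataset, max_iterations=5)
    assert model is not None


def test_inference_matches_schema(small_dataset):
    model = train_model(small_dataset, max_iterations=5)
    predictions = score_batch(model, small_dataset)
    assert "prediction" in predictions.columns
```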
What is the balance for integration testing?
The fidelity of testing must be balanced against speed and cost.
For example, when models are expensive to train, it is common to test model training on small data sets or for fewer iterations to reduce cost.
When models are deployed behind REST APIs, some high-SLA models may need full-scale load testing, whereas others may be tested with small batch jobs or a few queries to temporary REST endpoints.
When should ML models be retrained?
When code or data changes affect upstream featurization or training logic, or when automated retraining is scheduled or triggered.
In the ‘deploy code’ approach to MLOps, what are the three key stages in the CD pipeline?
- Compliance checks: these tests load the model from the Model Registry, perform compliance checks (for tags, documentation, etc.), and approve or reject the request based on test results. If compliance checks require human expertise, this automated step can compute statistics or visualizations for people to review in a manual approval step at the end of the CD pipeline. If the checks pass, the model is promoted to staging.
- Compare staging vs. production: all comparison results are saved to metrics tables in the lakehouse.
- Request model transition to production (a hedged sketch follows the list).
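A hedged sketch of the final promotion step using the MLflow client (the model name and version are placeholders; newer MLflow versions favor model aliases over stages):

```python
# Illustrative only: promote a registered model version after checks pass.
from mlflow.tracking import MlflowClient

client = MlflowClient()
client.transition_model_version_stage(
    name="churn_classifier",   # placeholder registered model name
    version="3",               # placeholder version that passed compliance and comparison checks
    stage="Production",
    archive_existing_versions=True,  # archive whatever was serving before
)
```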
What is a canary deployment?
The goal of a canary deployment is to minimize risk and ensure the stability of a new software version by gradually rolling it out to a subset of users or systems before making it available to the entire user base.
The "canary" is initially deployed to a small, representative group of users or a specific subset of infrastructure. This group is typically selected based on certain criteria, such as a specific region, a particular user segment, or a designated set of servers.
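A hedged sketch of routing a small share of traffic to a canary version on a Databricks serving endpoint (the endpoint name, model versions, percentages, URL, and token are placeholders, and the exact config schema is an assumption based on the serving endpoints API):

```python
# Illustrative only: route 10% of traffic to the new "canary" model version.
import requests

WORKSPACE_URL = "https://<your-workspace>.cloud.databricks.com"  # placeholder
TOKEN = "<personal-access-token>"                                 # placeholder

endpoint_config = {
    "served_models": [
        {"model_name": "churn_classifier", "model_version": "2",
         "workload_size": "Small", "scale_to_zero_enabled": True},
        {"model_name": "churn_classifier", "model_version": "3",
         "workload_size": "Small", "scale_to_zero_enabled": True},
    ],
    "traffic_config": {
        "routes": [
            {"served_model_name": "churn_classifier-2", "traffic_percentage": 90},
            {"served_model_name": "churn_classifier-3", "traffic_percentage": 10},  # canary
        ]
    },
}

resp = requests.put(
    f"{WORKSPACE_URL}/api/2.0/serving-endpoints/churn_classifier/config",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=endpoint_config,
)
print(resp.status_code)
```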