MLOps Flashcards
What is MLOps?
A set of practices for deploying and maintaining ML models in production reliably and efficiently
What is continuous integration?
The practice of frequently merging small changes into a main branch. Each commit or merge automatically triggers a build and a test run.
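As a rough illustration of the automated testing step, a CI job might run a small test suite like the one below on every commit. The `normalize` helper, its tests, and `run_checks` are hypothetical names made up for this card, not part of any particular CI product.

```python
import unittest


def normalize(values):
    """Scale a list of numbers to the range [0, 1] (hypothetical helper)."""
    low, high = min(values), max(values)
    if high == low:
        # Avoid division by zero when all values are identical.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


class NormalizeTest(unittest.TestCase):
    def test_range(self):
        self.assertEqual(normalize([2, 4, 6]), [0.0, 0.5, 1.0])

    def test_constant_input(self):
        self.assertEqual(normalize([3, 3]), [0.0, 0.0])


def run_checks():
    """Run the suite the way a CI job might and report pass/fail."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(NormalizeTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

A CI server would typically fail the build whenever `run_checks()` returns False, which keeps a broken change from reaching the main branch.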
What is continuous delivery?
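A practice that builds on continuous integration by automating the preparation of code changes for release, so that every change that passes the pipeline is ready to deploy to production. The final release step is typically triggered manually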
What is continuous deployment?
A practice in which every code change that passes the automated pipeline is released to production automatically
What is the difference between continuous delivery and continuous deployment?
In continuous delivery, the team decides when to deploy a new release to end users, so the final step is a manual decision. Continuous deployment goes one step further: every build that passes all automated tests is deployed to production without human intervention.
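The difference comes down to whether a person sits between a passing build and production. A minimal sketch of that release gate, where the `Build` class, the `should_release` function, and the mode labels are all assumptions made for illustration:

```python
from dataclasses import dataclass


@dataclass
class Build:
    version: str
    tests_passed: bool


def should_release(build, mode, approved_by_human=False):
    """Decide whether a build goes to production.

    mode is "delivery" or "deployment" (labels chosen for this sketch).
    """
    if not build.tests_passed:
        return False  # a failing build is never released in either mode
    if mode == "deployment":
        return True  # continuous deployment: no manual gate
    if mode == "delivery":
        return approved_by_human  # continuous delivery: a person decides
    raise ValueError(f"unknown mode: {mode}")
```

For example, `should_release(Build("1.0", True), "delivery")` stays False until someone approves the release, while the same build in "deployment" mode is released immediately.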
What is MLflow?
A platform for managing the end-to-end machine learning lifecycle. It is designed to make it easier for data scientists and machine learning engineers to track, compare, and reproduce machine learning experiments, and to deploy models into production
What are the main concepts in MLflow?
- Experiments
- Runs
- Parameters
- Code
- Artifacts
- Model Registry
What are experiments in MLflow?
A named collection of runs, each of which is a single execution of model code. Grouping runs under an experiment lets users compare them side by side
What is a run in MLflow?
A single execution of model code, such as one training job. MLflow records the run's parameters, metrics, and code version, as well as any artifacts that were generated during the run
What are parameters in MLflow?
The settings and configurations that are used to control the behavior of a machine learning model. Parameters are stored as key-value pairs and are associated with a particular run
How is the code treated in MLflow?
MLflow records the version of the code used for a run (for example, the Git commit hash), allowing users to reproduce the run by executing the same code that was used originally
What is an artifact in MLflow?
Files or directories that are generated during a machine learning run, such as model files or output files. Artifacts are stored with the run that produced them, allowing users to retrieve and compare the artifacts from different runs
What is the model registry in MLflow?
A centralized repository for storing and managing machine learning models. It allows users to track the history of a model, including the runs that were used to train and evaluate it, and to deploy the model to production
What is a feature store?
A system that pulls data from various sources, transforms it into the features required by the model, and stores those features so they can be served consistently for training and inference
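The idea can be sketched as a toy in-memory class. It is not modeled on any real feature store product, and the class name, method names, and sample data are all hypothetical.

```python
class InMemoryFeatureStore:
    """Toy feature store: registers transformations and serves feature rows."""

    def __init__(self):
        self._transforms = {}  # feature name -> function(raw_record) -> value
        self._rows = {}        # entity id -> {feature name: value}

    def register(self, name, fn):
        """Register a transformation that derives one feature from raw data."""
        self._transforms[name] = fn

    def ingest(self, entity_id, raw_record):
        """Apply every registered transform to a raw record and store the result."""
        self._rows[entity_id] = {
            name: fn(raw_record) for name, fn in self._transforms.items()
        }

    def get_features(self, entity_id, names):
        """Return the requested features in the order given."""
        row = self._rows[entity_id]
        return [row[name] for name in names]


feature_store = InMemoryFeatureStore()
feature_store.register("total_spend", lambda r: sum(r["purchases"]))
feature_store.register("purchase_count", lambda r: len(r["purchases"]))
feature_store.ingest("user_1", {"purchases": [10.0, 5.5]})
features = feature_store.get_features("user_1", ["total_spend", "purchase_count"])
```

Because training and serving code would both call `get_features`, they read identical feature values, which is the consistency a feature store is meant to provide.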
What is a metadata store?
A store for metadata about machine learning models, such as the training data that was used to build the model, the algorithms that were used, and the performance of the model. It can also hold metadata about other machine learning assets, such as feature data, pipelines, and experiments. In MLflow, this role is played by the Tracking Server