MLOps Flashcards

1
Q

What is MLOps?

A

A set of practices that aims to deploy and maintain machine learning models in production reliably and efficiently

2
Q

What is continuous integration?

A

The practice of frequently merging small changes into a main branch, automatically testing each change when it is committed or merged, and automatically kicking off a build
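For illustration, a CI pipeline matching this description could look like the following hypothetical GitHub Actions workflow (the repository layout, requirements file, and test command are assumptions):

```yaml
name: ci
on:
  push:
    branches: [main]
  pull_request:

jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt
      - run: pytest   # every commit/merge automatically runs the test suite
```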

3
Q

What is continuous delivery?

A

A practice that works in conjunction with continuous integration to automate the process of preparing and releasing code changes to production

4
Q

What is continuous deployment?

A

A practice that ensures code changes are continuously released into production

5
Q

What is the difference between continuous delivery and continuous deployment?

A

In continuous delivery, every change is kept in a deployable state, but the team decides when to release to end-users. Continuous deployment is a special case of continuous delivery that goes one step further: every build that passes the automated tests is deployed to production without human intervention.

6
Q

What is MLflow?

A

A platform for managing the end-to-end machine learning lifecycle. It is designed to make it easier for data scientists and machine learning engineers to track, compare, and reproduce machine learning experiments, and to deploy models into production

7
Q

What are the main components of MLflow?

A
  • Experiments
  • Runs
  • Parameters
  • Code
  • Artifacts
  • Model Registry
8
Q

What are experiments in MLflow?

A

A collection of machine learning runs, each of which represents a single execution of a machine learning model. Experiments can be organized into groups, allowing users to compare and contrast different runs

9
Q

What is a run in MLflow?

A

A single execution of a machine learning model. It includes a record of the parameters, code, and results of the model, as well as any artifacts that were generated during the run

10
Q

What are parameters in MLflow?

A

The settings and configurations that are used to control the behavior of a machine learning model. Parameters are stored as key-value pairs and are associated with a particular run

11
Q

How is the code treated in MLflow?

A

It is stored as a versioned artifact, allowing users to reproduce a particular run by executing the same code that was used in the original run

12
Q

What is an artifact in MLflow?

A

Files or directories generated during a machine learning run, such as model files or output files. Artifacts are versioned, allowing users to retrieve and compare the artifacts from different runs

13
Q

What is the model registry in MLflow?

A

A centralized repository for storing and managing machine learning models. It allows users to track the history of a model, including the runs that were used to train and evaluate it, and to deploy the model to production

14
Q

What is a feature store?

A

A centralized system that pulls data from various sources and transforms it into the features required by the model, serving those features consistently for both training and inference
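An illustrative sketch of the transformation step (the record shapes and feature names are invented): raw events from a source system are aggregated into per-entity features a model could consume.

```python
# Hypothetical raw event records pulled from a source system.
raw_events = [
    {"user_id": 1, "amount": 20.0},
    {"user_id": 1, "amount": 35.0},
    {"user_id": 2, "amount": 5.0},
]

def build_features(events):
    """Aggregate raw events into the per-user features the model expects."""
    features = {}
    for e in events:
        f = features.setdefault(e["user_id"], {"txn_count": 0, "total_spend": 0.0})
        f["txn_count"] += 1
        f["total_spend"] += e["amount"]
    for f in features.values():
        f["avg_spend"] = f["total_spend"] / f["txn_count"]
    return features

features = build_features(raw_events)
# features[1] -> {"txn_count": 2, "total_spend": 55.0, "avg_spend": 27.5}
```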

15
Q

What is a metadata store?

A

It is used to store and manage metadata about machine learning models, such as the training data that was used to build the model, the algorithms that were used, and the performance of the model. It can also be used to store metadata about other machine learning assets, such as feature data, pipelines, and experiments. In MLflow this is called the Tracking Server

16
Q

What is model monitoring?

A

The practice of monitoring the performance and behavior of machine learning models in production. Model monitoring typically involves collecting and analyzing data about the performance and behavior of machine learning models, such as their accuracy, precision, and recall
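The accuracy, precision, and recall mentioned above can be computed directly from live predictions; a self-contained sketch with made-up labels:

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, and recall for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }

# Made-up ground-truth labels vs. a model's predictions in production.
m = classification_metrics([1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0])
```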

17
Q

Why is model monitoring important in machine learning?

A
  • Model drift
  • Model bias
  • Model accountability
18
Q

What is model drift in model monitoring?

A

Degradation or change in performance when models are exposed to different data or environments

19
Q

What is model bias in model monitoring?

A

Models can be biased if they are trained on data that is not representative of the population they are intended to serve

20
Q

What is model accountability in model monitoring?

A

Organizations may be held accountable for the performance and behavior of their machine learning models. Model monitoring can help organizations to demonstrate the performance and behavior of their models, ensuring that they are transparent and accountable

21
Q

What is the difference between data drift and concept drift in model monitoring?

A

Data drift happens when the input data distribution or feature space changes (e.g. a service launched in a new country, expected features becoming NaNs). Concept drift happens when the relationship between inputs and outputs changes, i.e. the same inputs now expect different outputs (e.g. when people searched for Wuhan pre-COVID, they expected very different things from what they do now)
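One simple, illustrative (not production-grade) way to flag data drift on a single numeric feature is to compare the live mean against the reference distribution; the numbers and the 3-standard-deviation threshold below are arbitrary assumptions:

```python
import statistics

def mean_shift_score(reference, live):
    """Distance of the live mean from the reference mean,
    measured in reference standard deviations."""
    ref_mean = statistics.mean(reference)
    ref_std = statistics.stdev(reference)
    return abs(statistics.mean(live) - ref_mean) / ref_std

reference = [10.0, 11.0, 9.5, 10.5, 10.2, 9.8]   # feature values at training time
live_ok = [10.1, 9.9, 10.4, 10.0]                # similar distribution in production
live_drifted = [15.2, 16.0, 14.8, 15.5]          # distribution has shifted

THRESHOLD = 3.0  # arbitrary: flag drift beyond 3 reference std devs
drifted = mean_shift_score(reference, live_drifted) > THRESHOLD
stable = mean_shift_score(reference, live_ok) > THRESHOLD
```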