MLOps basics | FSDL | Priority Flashcards
Write out a diagram of ML product engineering with a data flywheel.
mlops data-flywheel
(See source material.)
Source: Lecture 1: Course Vision and When to Use ML > But ML-powered products require an outer loop
What are some of the items on a checklist to assess the feasibility of an ML project?
mlops ml-powered-products
(See source material.)
Source: Lecture 1: Course Vision and When to Use ML > ML Feasibility Assessment
Draw a diagram of the model-as-service pattern.
mlops ml-powered-products deployment
What are a few of the pros and cons of the model-as-service pattern?
mlops ml-powered-products deployment
Explain the various forms of parallelism in distributed training.
mlops training
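- Trivial parallelism: the model and data both fit on a single GPU, so independent experiments can each run on their own GPU.
- Data parallelism: the model fits on a single GPU, but each batch is split across GPUs; every GPU holds a full copy of the model, and the gradients are averaged across GPUs.
- Model parallelism: needed when the model does not fit on a single GPU.
a. Sharded data parallelism
i. Implemented in DeepSpeed, FairScale, and PyTorch's Fully Sharded Data Parallel (FSDP).
ii. Shards the optimizer states, the gradients, and the model parameters across GPUs.
b. Pipelined model parallelism
i. Put each layer (or group of layers) of the model on a different GPU.
c. Tensor parallelism
i. Distribute a single weight matrix, and its matrix multiplication, over multiple GPUs.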
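The data-parallel and tensor-parallel ideas above can be illustrated without any GPU. The following is a minimal pure-Python sketch, not a real distributed implementation: the "workers" are loop iterations, the model is a hypothetical one-parameter regression `y_hat = w * x`, and all function names are invented for illustration. It shows that averaging per-shard gradients over equal-sized shards reproduces the full-batch gradient, and that splitting a weight matrix by columns and concatenating the partial outputs reproduces the full product.

```python
def grad_mse(w, xs, ys):
    """Gradient of mean squared error for the 1-D model y_hat = w * x."""
    n = len(xs)
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def data_parallel_grad(w, xs, ys, num_workers):
    """Simulate data parallelism: every worker has a full copy of w and an
    equal-sized shard of the batch; the per-worker gradients are averaged."""
    shard = len(xs) // num_workers
    assert shard * num_workers == len(xs), "batch must split evenly"
    grads = []
    for k in range(num_workers):
        lo, hi = k * shard, (k + 1) * shard
        grads.append(grad_mse(w, xs[lo:hi], ys[lo:hi]))
    return sum(grads) / num_workers  # stands in for the all-reduce (mean) step


def matvec(x, W):
    """Compute x @ W for a row vector x and a matrix W given as a list of rows."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(len(W[0]))]


def tensor_parallel_matvec(x, W, num_workers):
    """Simulate tensor parallelism: each worker holds a block of W's columns,
    computes its slice of the output, and the slices are concatenated."""
    cols = len(W[0])
    step = cols // num_workers
    assert step * num_workers == cols, "columns must split evenly"
    out = []
    for k in range(num_workers):
        W_shard = [row[k * step:(k + 1) * step] for row in W]
        out.extend(matvec(x, W_shard))
    return out


# Toy data (hypothetical values chosen only for the demonstration).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w = 0.5
full_grad = grad_mse(w, xs, ys)
parallel_grad = data_parallel_grad(w, xs, ys, num_workers=2)

x = [1.0, 2.0]
W = [[1.0, 2.0, 3.0, 4.0],
     [5.0, 6.0, 7.0, 8.0]]
full_out = matvec(x, W)
parallel_out = tensor_parallel_matvec(x, W, num_workers=2)
```

In real training, frameworks handle the gradient averaging and weight sharding with collective communication between devices; the sketch only checks that the arithmetic is equivalent.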
What takes up GPU memory?
mlops training
Model parameters
Gradients
Optimizer states (statistics about gradients)
Batch of data
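A rough back-of-the-envelope estimate for the first three items can be sketched as follows. This is an illustration under stated assumptions, not a measurement: it assumes every value is stored in 32-bit floats (4 bytes) and that the optimizer keeps two statistics per parameter, as Adam does. The function name and defaults are hypothetical, and the batch of data and any intermediate values are left out because they depend on the model and batch size.

```python
def training_memory_gib(num_params, bytes_per_value=4, optimizer_states_per_param=2):
    """Estimate memory (in GiB) for parameters, gradients, and optimizer states.

    Assumptions (not from the source): fp32 storage and an Adam-style
    optimizer with two statistics per parameter. The data batch is excluded.
    """
    gib = 2 ** 30
    params = num_params * bytes_per_value
    grads = num_params * bytes_per_value  # one gradient value per parameter
    opt_states = num_params * bytes_per_value * optimizer_states_per_param
    return {
        "parameters": params / gib,
        "gradients": grads / gib,
        "optimizer_states": opt_states / gib,
        "total": (params + grads + opt_states) / gib,
    }


# Hypothetical model with 2**30 (about 1.07 billion) parameters.
estimate = training_memory_gib(2 ** 30)
```

Under these assumptions the cost is 16 bytes per parameter, which is why the optimizer states are often the largest of the three items.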
Using a cloud provider, how can you minimize costs?
mlops training
Use the most expensive per-hour GPU in the least expensive cloud; faster GPUs usually finish a run sooner, so they are often the cheapest per experiment.
Startups (e.g., Paperspace) tend to be cheaper than major cloud providers.
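The reasoning behind the first heuristic is simple arithmetic: what matters is price per hour times hours per run, not price per hour alone. A small sketch, with entirely hypothetical prices and run times chosen only to show the comparison:

```python
def experiment_cost(price_per_hour, hours):
    """Total cost of one training run: hourly price times wall-clock hours."""
    return price_per_hour * hours


# Hypothetical numbers, not real quotes: the faster GPU costs 3x more per
# hour but finishes the same run 4x sooner.
slow_gpu_cost = experiment_cost(price_per_hour=1.0, hours=10.0)
fast_gpu_cost = experiment_cost(price_per_hour=3.0, hours=2.5)
cheaper = "fast" if fast_gpu_cost < slow_gpu_cost else "slow"
```

The faster GPU wins whenever its speedup exceeds its price ratio, so it is worth checking both numbers for the actual workload before choosing.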