Path8.Mod1.a - Intro to DevOps Principles for ML Flashcards

Additional module on MLOps: https://learn.microsoft.com/en-us/training/paths/introduction-machine-learn-operations/

1
Q

Three Processes we combine when discussing MLOps

A
  • ML Workloads: Exploring data analysis, Feature Engineering and Model Training/Tuning
  • Dev: Software development; planning (model reqs and metrics), create (model training and scoring script), verify (check code and model quality) and package (staging the solution)
  • Ops: Release (deploy to prod), Configure (IaC - Infrastructure as Code) and Monitor (Metrics tracking and performance)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Describe CI/CD w.r.t. MLOps

A
  • Continuous Integration: Create and verify activites; Refactoring notebooks to Python code, Linting, unit testing
  • Continuous Delivery: Steps to deploy to staging then prod, automating as much as possible.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

It is preferrable that Training Data is not stored in Source Control. Store it in a database or data lake and retrieve it directly from the source via a Data Asset (T/F)

A

FALSE. Retrieval should be through a Datastore since the Datastore will store all credentials for connecting to the storage source.

Otherwise TRUE you should NOT store your training data in Source Control.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

BRP IRA

DevOps tools in Azure DevOps and GitHub and what we get out of them

A

Azure DevOps:
* Azure Boards - sprint planning and tracking boards
* Azure Repos - code repositories
* Azure Pipelines - CI/CD

GitHub:
* Issues - sprint planning and trackgin boards
* Repos
* Actions - automated workflows

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

c ad/gh s d p t n

Best Practices for MLOps Repository Structure (what to put in which folders)

A

.cloud: cloud-specific code (ARM templates, etc)
.ad/.github: Azure DevOps or GitHub artifacts like YAML pipelines to automate workflows.
src: contains any code (Python) used for ML workloads
docs: Markdown files or other documentation describing the project.
pipelines: Azure Machine Learning pipelines definitions.
tests: contains unit and integration tests
notebooks: contains notebooks, mostly used for experimentation.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What to use:
- for building Azure ML Pipeline templates
- for Local Development

A

Python SDK or YAML + CLI, and VS Code respectively

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

non func mod unit wkfl

Steps for refactoring Notebooks to production Python code

A
  • Removing all nonessential code
  • Refactoring code into functions
  • Combining related functions in Python scripts to modules (.py files)
  • Creating unit tests for each Python script
  • Create a pipeline to group scripts into workflows that can be automated
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Three ways to create an Azure ML Pipeline (aka Pipeline Job)

What YAML would go into pipeline-job.yml

A
  • Define steps in a Python script (Job-based classes)
  • With Components in Designer
  • YAML using CLI v2 commands

For the pipeline-job.yml approach you’d define:
- type: pipeline
- jobs for transform-job and train-job

How well did you know this?
1
Not at all
2
3
4
5
Perfectly