Path8.Mod1.c - Intro to DevOps Principles for ML - Compute Targets Flashcards
Augmented learning: https://learn.microsoft.com/en-us/azure/machine-learning/concept-compute-target?view=azureml-api-2
These are all the possible Compute Targets for Training:
- Two On-Prem
- Four AML-Dedicated
- Four Azure Data Services
- One Open Source
On-Prem:
* Local Personal Computer
* Remote VMs
AML-Dedicated:
* AML Compute Clusters
* AML Serverless Compute
* AML Compute Instance
* AML Kubernetes
Azure Data Services:
* Azure Databricks
* Azure Data Lake Analytics
* Azure HDInsight
* Azure Batch
Open Source:
* Apache Spark pools (preview)
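For self-quizzing, the list above can be captured as a small Python lookup. This is only a sketch of the flashcard content: the dictionary name, the category keys, and the helper function are illustrative assumptions, not Azure ML SDK identifiers.

```python
# Training compute targets grouped as on the flashcard above.
# The names are plain strings for study purposes, not SDK identifiers.
TRAINING_TARGETS = {
    "On-Prem": ["Local Personal Computer", "Remote VMs"],
    "AML-Dedicated": [
        "AML Compute Clusters",
        "AML Serverless Compute",
        "AML Compute Instance",
        "AML Kubernetes",
    ],
    "Azure Data Services": [
        "Azure Databricks",
        "Azure Data Lake Analytics",
        "Azure HDInsight",
        "Azure Batch",
    ],
    "Open Source": ["Apache Spark pools (preview)"],
}


def count_by_category(targets):
    """Return {category: number of targets} for a quick recall check."""
    return {category: len(names) for category, names in targets.items()}


if __name__ == "__main__":
    for category, count in count_by_category(TRAINING_TARGETS).items():
        print(f"{category}: {count}")
```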
Machine Learning Pipelines can use any Compute Target except for this one
Your Local Computer LOL~
Automated Machine Learning (AutoML) can use every Compute Target except these four:
- One AML-Dedicated
- Three Azure Data Services
- AML Kubernetes
- Azure Data Lake Analytics
- Azure HDInsight
- Azure Batch
Surprisingly, AutoML can use Azure Databricks for a Compute Target…
CI ApSpP AzDb
Automated Machine Learning (AutoML) can use these three Compute Targets with certain limitations; know the limitation
- Compute Instances, only through the SDK
- Apache Spark Pools (preview), only through SDK local mode
- Azure Databricks, only through SDK local mode
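The AutoML rules on the two cards above can be sketched as a lookup for drilling. The set, dictionary, and function names are hypothetical study aids, and the target names are plain strings rather than anything the Azure ML SDK defines. Any target not listed is treated as supported, following the "every Compute Target except these four" wording.

```python
# AutoML support per training compute target, as stated on the cards above.
NOT_SUPPORTED = {
    "AML Kubernetes",
    "Azure Data Lake Analytics",
    "Azure HDInsight",
    "Azure Batch",
}

# Targets that work with AutoML, but only under the stated limitation.
LIMITED = {
    "AML Compute Instance": "only through the SDK",
    "Apache Spark pools (preview)": "only through SDK local mode",
    "Azure Databricks": "only through SDK local mode",
}


def automl_support(target):
    """Return 'no', 'yes', or 'yes, <limitation>' for a target name.

    Assumption: names that appear in neither table count as supported.
    """
    if target in NOT_SUPPORTED:
        return "no"
    if target in LIMITED:
        return f"yes, {LIMITED[target]}"
    return "yes"


if __name__ == "__main__":
    for name in ("Azure Batch", "Azure Databricks", "AML Compute Clusters"):
        print(f"{name}: {automl_support(name)}")
```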
These four are the only Targets that support ML Designer
Only the AML-Dedicated targets!!!
- Compute Clusters
- Serverless Compute
- Compute Instances
- AML Kubernetes
Compute Instances:
- Require this (minimum)
- Are suitable for this kind of Model
- When they run out of disk space (120GB), do this … before you do this…
- Azure ML SDK or CLI v1 minimum
- Small Models less than 1GB in size
- Use the terminal to clear out 1-2 GB of space … Stop/Restart the Compute Instance
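Before clearing space, it can help to see how much is actually free. The sketch below uses only the Python standard library (`shutil.disk_usage`) and could be run from the Compute Instance terminal or a notebook. The path `/` and the 2 GB threshold are assumptions loosely based on the "1-2 GB" on the card, not documented Azure ML values.

```python
import shutil


def disk_report(path="/"):
    """Return (total, used, free) in GiB for the filesystem holding path."""
    usage = shutil.disk_usage(path)
    gib = 1024 ** 3
    return usage.total / gib, usage.used / gib, usage.free / gib


def needs_cleanup(free_gb, threshold_gb=2.0):
    """True when free space is below the assumed threshold."""
    return free_gb < threshold_gb


if __name__ == "__main__":
    total, used, free = disk_report("/")
    print(f"total={total:.1f} GiB used={used:.1f} GiB free={free:.1f} GiB")
    if needs_cleanup(free):
        print("Low on space: clear files before stopping or restarting.")
```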
Both Compute Clusters and Compute Instances have these capabilities
Only Compute Clusters have these capabilities
Both:
* Single Node Cluster (Compute Clusters are 1 or more nodes)
* Automatic Cluster Mgmt and Job Scheduling
* Supports CPU and GPU
Compute Clusters Only:
- Multi-Node Clusters
- Autoscaling on each job submission
Three Cost Management options for Compute Targets
- Offload Compute Cycle Management to Serverless Compute (preview)
- Make sure Compute Cluster minimum nodes go back to 0
- Enable Idle Shutdown for Compute Instances
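The last two options above can be turned into a small checklist function. This is a sketch over plain dictionaries with hypothetical keys (`kind`, `min_nodes`, `idle_shutdown_enabled`). It does not call the Azure ML SDK, and the key names are not SDK parameter names.

```python
def cost_warnings(compute):
    """Return a list of cost warnings for one compute description.

    `compute` is a plain dict with hypothetical keys:
      kind: "cluster", "instance", or "serverless"
      min_nodes: int (clusters only)
      idle_shutdown_enabled: bool (instances only)
    """
    warnings = []
    kind = compute.get("kind")
    # A cluster that never scales back to 0 nodes keeps costing money while idle.
    if kind == "cluster" and compute.get("min_nodes", 0) > 0:
        warnings.append("cluster minimum nodes is not 0")
    # An instance without idle shutdown keeps running until someone stops it.
    if kind == "instance" and not compute.get("idle_shutdown_enabled", False):
        warnings.append("idle shutdown is not enabled")
    return warnings


if __name__ == "__main__":
    print(cost_warnings({"kind": "cluster", "min_nodes": 2}))
    print(cost_warnings({"kind": "instance", "idle_shutdown_enabled": True}))
```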
The five Unmanaged Compute Targets
- Remote VMs
- Azure HDInsight
- Azure Databricks
- Azure Data Lake Analytics
- Kubernetes
Surprisingly, Local Compute, Apache Spark Pools and Azure Batch are considered managed…