Renewal1 - Design an ML Training Solution Flashcards

1
Q

Use this service when one of the customizable prebuilt models suits your requirements, to save time and effort

A

Azure AI Services

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Use either of these services if you want to keep all data-related (data engineering and data science) projects within the same service.

A

Azure Synapse Analytics or Azure Databricks

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

You’ll need to work with PySpark to use the distributed compute…

Use either of these services if you need distributed compute for working with large datasets (datasets are large when you experience capacity constraints with standard compute).

A

Azure Synapse Analytics or Azure Databricks

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Use either of these services when you want full control over model training and management

A

Azure Machine Learning or Azure Databricks

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

The one you like…

Use this service when Python is your preferred programming language, OR when you want an intuitive user interface to manage your machine learning lifecycle

A

Azure Machine Learning

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Use this Compute option
with smaller datasets and cost efficiency

A

CPUs

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Nvidia…

Use this Compute option with large tabular data, or unstructured data (images, text), where processing may take a lot of time

A

GPUs

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Use this Compute option when using Azure Synapse Analytics or Azure Databricks and you want to distribute your workloads

A

Spark

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Use this VM type when you want a balanced CPU-to-memory ratio (Ideal for testing and development with smaller datasets)

A

General Purpose

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Use this VM type when you want a high memory-to-CPU ratio (Great for in-memory analytics, which is ideal when you have larger datasets or when you’re working in notebooks)

A

Memory Optimized

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

nodes…

Spark requires Spark-Friendly languages to make optimal use of Spark’s distributed workload capabilities (Scala, SQL, RSpark or PySpark). If you just use Python, what happens?

A

Spark distributes through its worker nodes and communicates between them and you with its driver node. Using Python restricts you to just using the driver node…

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Monitoring your training time and compute utilization lets you decide on these things

A
  • compute utilization lets you know to scale up or down
  • long training times indicate a need for GPUs over CPUs, or going distributed

Going distributed make force a rewrite to Spark-Friendly languages

How well did you know this?
1
Not at all
2
3
4
5
Perfectly