Questions (subset 2) Flashcards

Question 1

Q

You create a batch inference pipeline by using the Azure ML SDK. You run the pipeline by using the following code:

from azureml.pipeline.core import Pipeline
from azureml.core.experiment import Experiment

pipeline = Pipeline(workspace=ws, steps=[parallelrun_step])
pipeline_run = Experiment(ws, 'batch_pipeline').submit(pipeline)

You need to monitor the progress of the pipeline execution.
What are two possible ways to achieve this goal? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point.

A. Run the following code in a notebook:
B. Use the Inference Clusters tab in Machine Learning Studio.
C. Use the Activity log in the Azure portal for the Machine Learning workspace.
D. Run the following code in a notebook:
E. Run the following code and monitor the console output from the PipelineRun object:

Answer

A

Correct Answer: DE

A batch inference job can take a long time to finish. The code below monitors progress by using a Jupyter widget. You can also track the job's progress by using:

✑ Azure Machine Learning Studio.
✑ Console output from the PipelineRun object.

from azureml.widgets import RunDetails
RunDetails(pipeline_run).show()
pipeline_run.wait_for_completion(show_output=True)

Reference:
https://docs.microsoft.com/en-us/azure/machine-learning/how-to-use-parallel-run-step#monitor-the-parallel-run-job

Question 2

Q

DRAG DROP -
You have a model with a large difference between the training and validation error values.

You must create a new model and perform cross-validation.

You need to identify a parameter set for the new model using Azure Machine Learning Studio.
Which module should you use for each step? To answer, drag the appropriate modules to the correct steps.

Each module may be used once or more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.
Select and Place:

Answer

A

Box 1: Split Data
Box 2: Partition and Sample
Box 3: Two-Class Boosted Decision Tree
Box 4: Tune Model Hyperparameters

Question 3

Q

HOTSPOT -
You are evaluating a Python NumPy array that contains six data points defined as follows:
data = [10, 20, 30, 40, 50, 60]
You must generate the following output by using the k-fold algorithm implementation in the Python Scikit-learn machine learning library:
train: [10 40 50 60], test: [20 30]
train: [20 30 40 60], test: [10 50]
train: [10 20 30 50], test: [40 60]
You need to implement a cross-validation to generate the output.
How should you complete the code segment? To answer, select the appropriate code segment in the dialog box in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:

Answer

A

Box 1: k-fold
Box 2: 3
Box 3: data
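
The completed segment can be sketched roughly as follows. This is a minimal sketch, assuming scikit-learn and NumPy are installed. The shuffle=True and random_state values are assumptions, since the question does not show them, so the exact fold membership may differ from the output listed in the question.

```python
import numpy as np
from sklearn.model_selection import KFold

data = np.array([10, 20, 30, 40, 50, 60])

# Three folds over six points: each fold holds out two points for testing.
# shuffle and random_state are assumptions; the question does not state them.
kfold = KFold(n_splits=3, shuffle=True, random_state=1)

folds = []
for train_idx, test_idx in kfold.split(data):
    # split() yields index arrays, which are used to select the values.
    train, test = data[train_idx], data[test_idx]
    folds.append((train, test))
    print("train: %s, test: %s" % (train, test))
```

Each of the three boxes maps to one piece of the sketch: the k-fold splitter, the number of splits (3), and the array passed to split() (data).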

Question 4

Q

DRAG DROP -
You have a model with a large difference between the training and validation error values.
You must create a new model and perform cross-validation.

You need to identify a parameter set for the new model using Azure Machine Learning Studio.

Which module should you use for each step? To answer, drag the appropriate modules to the correct steps.

Each module may be used once or more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.

NOTE: Each correct selection is worth one point.

Select and Place:

Answer

A

Box 1: Split Data
Box 2: Partition and Sample
Box 3: Two-Class Boosted Decision Tree
Box 4: Tune Model Hyperparameters

Question 5

Q

HOTSPOT -
You are evaluating a Python NumPy array that contains six data points defined as follows:
data = [10, 20, 30, 40, 50, 60]
You must generate the following output by using the k-fold algorithm implementation in the Python Scikit-learn machine learning library:
train: [10 40 50 60], test: [20 30]
train: [20 30 40 60], test: [10 50]
train: [10 20 30 50], test: [40 60]
You need to implement a cross-validation to generate the output.
How should you complete the code segment? To answer, select the appropriate code segment in the dialog box in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:

Answer

A

Box 1: k-fold
Box 2: 3
Box 3: data

Question 6

Q

HOTSPOT -
You plan to preprocess text from CSV files. You load the Azure Machine Learning Studio default stop words list.
You need to configure the Preprocess Text module to meet the following requirements:

✑ Ensure that multiple related words are reduced to a single canonical form.
✑ Remove pipe characters from text.
✑ Remove words to optimize information retrieval.

Which three options should you select? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Hot Area:

Answer

A

Box 1: Remove stop words
Box 2: Lemmatization
Box 3: Remove special characters
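
The effect of the three selected options can be illustrated in plain Python. This is only a sketch of the ideas, not the Preprocess Text module itself: the stop-word set and the lemma table are tiny hypothetical stand-ins, and preprocess is an illustrative name.

```python
import re

# Hypothetical stand-ins for the module's stop-word list and lemmatizer.
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "and", "to"}
LEMMAS = {"running": "run", "ran": "run", "runs": "run", "files": "file"}


def preprocess(text):
    # Remove special characters (including the pipe) by replacing them with spaces.
    cleaned = re.sub(r"[^A-Za-z0-9\s]", " ", text)
    tokens = cleaned.lower().split()
    # Remove stop words.
    tokens = [t for t in tokens if t not in STOP_WORDS]
    # Lemmatization: map related word forms to one canonical form.
    return [LEMMAS.get(t, t) for t in tokens]


print(preprocess("The runs|ran and running of the files"))
# ['run', 'run', 'run', 'file']
```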

Question 7

Q

HOTSPOT -
You are using a decision tree algorithm. You have trained a model that generalizes well at a tree depth equal to 10.
You need to select the bias and variance properties of the model with varying tree depth values.
Which properties should you select for each tree depth? To answer, select the appropriate options in the answer area.
Hot Area:

Answer

A

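Bias=Low/High

Variance=High/Low

A tree deeper than 10 tends toward low bias and high variance (overfitting). A tree shallower than 10 tends toward high bias and low variance (underfitting).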

Question 8

Q

HOTSPOT -
You create an experiment in Azure Machine Learning Studio. You add a training dataset that contains 10,000 rows. The first 9,000 rows represent class 0 (90 percent).
The remaining 1,000 rows represent class 1 (10 percent).
The training set is imbalanced between the two classes. You must increase the number of training examples for class 1 to 4,000 by using 5 data rows. You add the Synthetic Minority Oversampling Technique (SMOTE) module to the experiment.
You need to configure the module.
Which values should you use? To answer, select the appropriate options in the dialog box in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:

Answer

A

Box 1 (SMOTE percentage): 300

Box 2 (Number of nearest neighbors): 5
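
The value 300 follows from the arithmetic: a SMOTE percentage of p adds p percent of the existing minority rows as synthetic rows. A short sketch of the calculation, where smote_percentage is a hypothetical helper and not part of any SDK:

```python
def smote_percentage(current_rows, target_rows):
    # Percentage of synthetic rows to add, relative to the existing minority rows.
    return (target_rows - current_rows) * 100 // current_rows


minority_rows = 1_000
target_rows = 4_000
percentage = smote_percentage(minority_rows, target_rows)
synthetic_rows = minority_rows * percentage // 100

print(percentage)                      # 300
print(minority_rows + synthetic_rows)  # 4000
```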

Question 9

Q

DRAG DROP -
You are creating an experiment by using Azure Machine Learning Studio.
You must divide the data into four subsets for evaluation. There is a high degree of missing values in the data. You must prepare the data for analysis.
You need to select appropriate methods for producing the experiment.
Which three modules should you run in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.
NOTE: More than one order of answer choices is correct. You will receive credit for any of the correct orders you select.
Select and Place:

Answer

A

Import data
Clean Missing Data
Partition and Sample

Question 10

Q

HOTSPOT -
You are retrieving data from a large datastore by using Azure Machine Learning Studio.
You must create a subset of the data for testing purposes using a random sampling seed based on the system clock.
You add the Partition and Sample module to your experiment.
You need to select the properties for the module.
Which values should you select? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Hot Area:

Answer

A

Hot Area 1 - Partition or sample mode: Sampling

Hot Area 2 - Random seed for sampling: 0

Question 11

Q

site:https://www.examtopics.com/exams/microsoft/dp-100/ “DRAG DROP” OR “HOTSPOT “

Answer

A

https://www.google.com/search?q=site:https://www.examtopics.com/exams/microsoft/dp-100/+%22DRAG+DROP%22+OR+%22HOTSPOT+%22&rlz=1C1GCEA_enSE827SE827&sxsrf=ALeKk00gDVjgGgVxPJEBHJKn6PIaVmshnA:1592228561133&ei=0XrnXrfWB4utrgTIuYGABg&start=0&sa=N&ved=2ahUKEwj3gqbO-YPqAhWLlosKHchcAGA4ChDx0wN6BAgEECo&biw=1920&bih=937


(11 cards)