Path4.Mod3.a - Perform Hyperparameter Tuning (Continuous vs Discontinuous) Flashcards

Question 1

Q

The difference between Parameters and Hyperparameters

Answer

A

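Parameters are values learned from the training data/features during training (e.g. the weights or coefficients of a model); ML discovers them as it finds relationships in that data.

Hyperparameters are values not derived from the training features; they are set before training to configure training behavior (e.g. learning rate, batch size).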
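
A minimal sketch of the distinction, in plain Python rather than Azure ML. The weight w is a Parameter (learned from the data), while learning_rate and epochs are Hyperparameters (chosen before training). The function name and the toy data are made up for illustration.

```python
def fit_slope(xs, ys, learning_rate=0.01, epochs=500):
    """Fit y = w * x by gradient descent on mean squared error."""
    w = 0.0  # Parameter: learned from the training data
    n = len(xs)
    for _ in range(epochs):
        # Gradient of mean((w*x - y)^2) with respect to w
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad  # learning_rate is a Hyperparameter
    return w


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # toy data generated by y = 2x
w = fit_slope(xs, ys, learning_rate=0.01, epochs=500)
print(round(w, 3))
```

Changing learning_rate or epochs changes how training behaves; the value of w is whatever training ends up with.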

Question 2

Q

Define Hyperparameter Tuning
Azure ML uses this kind of job to tune Hyperparameters

Answer

A

Train multiple models using the same algorithm and training data, but different hyperparameter values. Then evaluate each training run against the performance metric you want to optimize (e.g. accuracy) and keep the best-performing model.
A Sweep Job runs one trial for each hyperparameter combination to be tested. Each trial uses a training script with parameterized hyperparameter values to train a model, then logs the target metric achieved by that model.
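
The trial loop described above can be simulated locally. This is only a sketch of the idea, not how Azure ML implements a Sweep Job: train_and_score is a hypothetical stand-in for a training run, and its "accuracy" formula is invented.

```python
import itertools


def train_and_score(learning_rate, batch_size):
    """Hypothetical stand-in for one training run; returns a made-up
    'accuracy' that peaks at learning_rate=0.1, batch_size=32."""
    return 1.0 - abs(learning_rate - 0.1) - abs(batch_size - 32) / 100


learning_rates = [0.01, 0.1, 1.0]
batch_sizes = [16, 32, 64]

# One trial per hyperparameter combination; log the target metric for each.
trials = []
for lr, bs in itertools.product(learning_rates, batch_sizes):
    trials.append({"learning_rate": lr, "batch_size": bs,
                   "accuracy": train_and_score(lr, bs)})

# The best trial is the one with the highest logged metric.
best = max(trials, key=lambda t: t["accuracy"])
print(best)
```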

Question 3

Q

Four general steps in a Sweep Job workflow

Answer

A

Create a training script for hyperparameter tuning (Jupyter)
Configure the Sweep Job by first creating a regular Command Job
Call the Command Job’s sweep(...) function, then run the resulting Sweep Job (don’t forget to call set_limits(...) on it to control how long the sweep goes on…)
Monitor and review the Sweep Job
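
The first step can be sketched as a script that takes its hyperparameters as command-line arguments, which is what lets a sweep pass different values to each trial. The argument names and the placeholder metric are assumptions for illustration; in Azure ML the metric would typically be logged with MLflow rather than printed.

```python
import argparse


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Training script sketch")
    # Hyperparameters are exposed as arguments so each trial can set them.
    parser.add_argument("--learning_rate", type=float, default=0.01)
    parser.add_argument("--batch_size", type=int, default=32)
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    # Placeholder for real training; the metric below is made up.
    accuracy = 1.0 - abs(args.learning_rate - 0.1)
    print(f"accuracy={accuracy:.3f}")  # a real script would log this metric
    return accuracy


# Simulate one trial's command line.
main(["--learning_rate", "0.1", "--batch_size", "64"])
```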

Question 4

Q

SS SM ET

Three things required for Hyperparameter Tuning

Answer

A

Define a Search Space
Configure a Sampling Method
Configure Early Termination

Question 5

Q

one analogous to Classification, the other to Regression…

Search Space:
- What they are
- The two types of values a hyperparameter could be

Answer

A

A Search Space is a set of values tried during the tuning process.

Types of Values:
- Discrete - the value is selected from a finite set of possibilities. Analogous to Classification (picking from a fixed set of labels or a range)
- Continuous - the value can be any point along a scale, so there are infinitely many possibilities. Analogous to Regression (finding a numeric value)

Question 6

Q

G R B

Configure a Sampling Method:
- What a Sweep Job needs one for
- The three types of Sampling

Answer

A

The specific values tried in a Sweep Job depend on the Sampling Method, which decides how values are picked from the Search Space for each trial.

The three options for Sampling Method:
- Grid Sampling
- Random Sampling
- Bayesian Sampling
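
A rough local sketch of the first two methods, using only the standard library (not the Azure ML samplers). Grid Sampling tries every combination, so it only works with discrete values; Random Sampling draws values independently and also handles continuous ranges. Bayesian Sampling, which chooses new values based on how earlier ones performed, is not sketched here. The search space below is hypothetical.

```python
import itertools
import random

# Hypothetical search space for illustration.
batch_sizes = [16, 32, 64]      # discrete
optimizers = ["sgd", "adam"]    # discrete


def grid_sample():
    """Grid sampling: every combination of the discrete values."""
    return list(itertools.product(batch_sizes, optimizers))


def random_sample(n_trials, seed=0):
    """Random sampling: independent draws; works for continuous values too."""
    rng = random.Random(seed)
    return [
        {
            "batch_size": rng.choice(batch_sizes),
            "learning_rate": rng.uniform(0.001, 0.1),  # continuous
        }
        for _ in range(n_trials)
    ]


print(len(grid_sample()))  # 3 batch sizes x 2 optimizers
print(random_sample(3))
```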

Question 7

Q

m_t and eet from autoML

Configure Early Termination means to stop a Sweep Job based on one of these two conditions.

When (and when NOT) to use an Early Termination Policy

Answer

A

Configure a Sweep Job to stop:
- After a maximum number of trials
- When new Models don’t produce significantly better results

When: Depending on your Search Space and Sampling Method, Early Termination can be especially beneficial when working with Continuous Hyperparameters, typically with Random or Bayesian Sampling (infinitely many possible values…you don’t want the sweep to go on forever).

When NOT: Conversely, Early Termination may be unnecessary with Discrete Hyperparameters and Grid Sampling (limited dimensions == finite set of combinations, so the number of trials is already capped).
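
The two stopping conditions can be sketched as a plain loop. This is a simplified illustration, not one of Azure ML's built-in early termination policies, which differ in detail; the names min_improvement and patience are assumptions introduced here.

```python
def run_sweep(trial_scores, max_trials, min_improvement=0.01, patience=3):
    """Consume trial scores in order; stop at max_trials, or when `patience`
    consecutive trials fail to beat the best score by `min_improvement`."""
    best = float("-inf")
    stale = 0       # consecutive trials without a significant improvement
    completed = 0
    for score in trial_scores:
        if completed >= max_trials:
            break   # condition 1: maximum number of trials reached
        completed += 1
        if score > best + min_improvement:
            best = score
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                break   # condition 2: no significantly better results
    return completed, best


scores = [0.70, 0.80, 0.85, 0.851, 0.849, 0.852, 0.90]
print(run_sweep(scores, max_trials=20))  # stops after 6 trials, best 0.85
print(run_sweep(scores, max_trials=2))   # stops after 2 trials, best 0.8
```

Note the trade-off: the last score (0.90) is never reached in the first call, which is the risk of stopping early.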

Question 8

Q

Discrete Hyperparameters:
- How to use the Choice function
- What values types it can take
- Example code for using it in a Sweep Job

Answer

A

Choice() is a search-space expression from the Azure ML Python SDK (azure.ai.ml.sweep) that defines a discrete set of allowed values; the Sampling Method then picks from that set for each trial.

It can take:
- comma-separated values (a tuple): batch_size=Choice(values=(16, 32, 64)),
- a range object: batch_size=Choice(values=range(10, 20)),
- an arbitrary list object: batch_size=Choice(values=[16, 32, 64]),

Remember that a Sweep Job is just a Command Job configured to “sweep”, so we still need a Job instance first:

from azure.ai.ml.sweep import Choice, Normal

# `job` is an existing Command Job; calling it with search-space
# expressions overrides the matching inputs for the sweep.
command_job_for_sweep = job(
    batch_size=Choice(values=[16, 32, 64]),  # Discrete Hyperparameter
    learning_rate=Normal(mu=10, sigma=3),    # Continuous Hyperparameter
)

Question 9

Q

Discrete Hyperparameters: Hyperparameters can be set to one of four other Discrete Distribution functions

Math: explain what the q parameter is

Answer

A

Four other Discrete Distro functions you can use:
- QUniform(min_value, max_value, q) - Returns a value like round(Uniform(min_value, max_value) / q) * q
- QLogUniform(min_value, max_value, q) - Returns a value like round(exp(Uniform(min_value, max_value)) / q) * q
- QNormal(mu, sigma, q) - Returns a value like round(Normal(mu, sigma) / q) * q
- QLogNormal(mu, sigma, q) - Returns a value like round(exp(Normal(mu, sigma)) / q) * q

q is the quantization (“step size”) parameter and what makes each of the above Discrete. The sampled value is divided by q, rounded, and multiplied by q again, so every result is a multiple of q (e.g. with q=10, a draw of 37.2 becomes 40).
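
A worked sketch of the rounding formula for QUniform, written with the standard library rather than the Azure ML SDK. The helper name quniform and the bounds are illustrative.

```python
import random


def quniform(min_value, max_value, q, rng=random):
    """Draw from Uniform(min_value, max_value), then snap to a multiple of q."""
    return round(rng.uniform(min_value, max_value) / q) * q


rng = random.Random(42)
samples = [quniform(10, 100, 10, rng) for _ in range(5)]
print(samples)  # every value is a multiple of 10

# The rounding step on a fixed draw: 37.2 / 10 = 3.72 -> round -> 4 -> 40
print(round(37.2 / 10) * 10)
```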

Question 10

Q

Continuous Hyperparameters require one of these four methods for defining a Search Space

Answer

A

Four Continuous Distro functions you can use:
- Uniform(min_value, max_value) - a value uniformly distributed between min_value and max_value
- LogUniform(min_value, max_value) - a value drawn from exp(Uniform(min_value, max_value)) so that the log of the return value is uniformly distributed
- Normal(mu, sigma) - a real value normally distributed with mean mu and standard deviation sigma
- LogNormal(mu, sigma) - a value drawn from exp(Normal(mu, sigma)) so that the log of the return value is normally distributed
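
The four definitions can be mirrored with the standard library to see what each one returns. These helpers are local stand-ins, not the Azure ML SDK classes. One consequence of the exp(...) definitions: for the log variants, the arguments describe the log of the value, not the value itself.

```python
import math
import random

rng = random.Random(0)


def uniform(min_value, max_value):
    return rng.uniform(min_value, max_value)


def log_uniform(min_value, max_value):
    # exp(Uniform): the log of the result is uniform on [min_value, max_value]
    return math.exp(rng.uniform(min_value, max_value))


def normal(mu, sigma):
    return rng.gauss(mu, sigma)


def log_normal(mu, sigma):
    # exp(Normal): the log of the result is normal with mean mu, std sigma
    return math.exp(rng.gauss(mu, sigma))


# With bounds -6 and -1, log_uniform returns values between e^-6 and e^-1.
x = log_uniform(-6, -1)
print(math.exp(-6) <= x <= math.exp(-1))
```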

(10 cards)