Path4.Mod3.a - Perform Hyperparameter Tuning (Continuous vs Discrete) Flashcards

1
Q

The difference between Parameters and Hyperparameters

A

Parameters are values the model learns from the training data/features during training (for example, weights or coefficients); ML is how we discover those relationships in the data.

Hyperparameters are values not derived from the training features; they are set before training to configure training behavior.
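A minimal sketch of the distinction, using a toy one-feature linear model fitted by gradient descent. The function name fit_line and the data are hypothetical; the point is only that w and b are learned from the data, while learning_rate and epochs are chosen before training starts.

```python
def fit_line(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w*x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0                      # parameters: learned from the data
    n = len(xs)
    for _ in range(epochs):              # epochs: hyperparameter
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w      # learning_rate: hyperparameter
        b -= learning_rate * grad_b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]                # generated from y = 2x + 1
w, b = fit_line(xs, ys)
print(round(w, 2), round(b, 2))
```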

2
Q
  • Define Hyperparameter Tuning
  • Azure ML uses this kind of job to tune Hyperparameters
A
  • Train multiple models using the same algorithm and training data…but different hyperparameter values. Then evaluate each training run on the performance metric you want to optimize and keep the best-performing model
  • A Sweep Job runs trials for each hyperparam combo to be tested, using a training script with parameterized hyperparam values to train a model, then logs the target metric achieved by that model.
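The idea can be sketched locally without any Azure services. This is only an illustration under assumed names: train_and_score, the tiny datasets, and the candidate reg_rate values are all hypothetical, and a real Sweep Job would run each trial as a separate job rather than in a loop.

```python
def train_and_score(reg_rate, train, valid):
    """Fit y = w*x with an L2 penalty (closed form) and return validation MSE."""
    sxy = sum(x * y for x, y in train)
    sxx = sum(x * x for x, _ in train)
    w = sxy / (sxx + reg_rate)           # parameter learned for this trial
    return sum((w * x - y) ** 2 for x, y in valid) / len(valid)


train = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]
valid = [(4.0, 8.1), (5.0, 9.8)]

# Same algorithm, same data, a different hyperparameter value per trial.
results = {r: train_and_score(r, train, valid) for r in (0.0, 0.1, 1.0, 10.0)}

# The "logged" target metric is compared across trials; the goal here is to minimize MSE.
best_reg_rate = min(results, key=results.get)
print(best_reg_rate, results[best_reg_rate])
```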
3
Q

Four general steps in a Sweep Job workflow

A
  • Create a training script for hyperparam tuning (Jupyter)
  • Configure the Sweep Job by first creating a regular Command Job that runs the training script
  • Call the Job’s sweep(...) function to turn it into a Sweep Job, then run it (don’t forget to call set_limits(...) to control how long the sweeps go for…)
  • Monitor and Review Sweep Jobs
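For the first step, a training script with parameterized hyperparameter values might look like the sketch below. The argument names (--reg_rate, --batch_size) and the stand-in metric are assumptions for illustration; a real script would train a model and log the target metric with the tracking tool the Sweep Job reads from (typically MLflow in Azure ML) instead of printing it.

```python
import argparse


def parse_args(argv=None):
    """Read the hyperparameter values that each trial passes on the command line."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--reg_rate", type=float, default=0.01)
    parser.add_argument("--batch_size", type=int, default=32)
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    # Placeholder for real training: a real script would fit a model here.
    accuracy = 1.0 - args.reg_rate       # hypothetical stand-in metric
    # Stand-in for logging the target metric so the sweep can compare trials.
    print(f"batch_size={args.batch_size} Accuracy={accuracy}")
    return accuracy


# Simulate one trial's command line.
trial_metric = main(["--reg_rate", "0.25", "--batch_size", "64"])
```

Each trial in the sweep would call the same script with a different combination of argument values.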
4
Q

SS SM ET

Three things required for Hyperparameter Tuning

A
  • Define a Search Space
  • Configure a Sampling Method
  • Configure Early Termination
5
Q

one synonymous with Classification, the other Regression…

Search Space:
- What they are
- The two types of values a hyperparameter could be

A

A Search Space is the set of hyperparameter values tried during the tuning process.

Types of Values:
- Discrete - the value comes from a finite set of options. Analogous to Classification (picking a specific label)
- Continuous - the value can fall anywhere along a scale (infinitely many options). Analogous to Regression (finding a numeric value)

6
Q

G R B

Configure a Sampling Method:
- What a Sweep Job needs one for
- The three types of Sampling

A

The sampling method determines which values from the Search Space are used in each trial of a Sweep Job.

The three options for Sampling Method:
- Grid Sampling
- Random Sampling
- Bayesian Sampling
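A rough local illustration of the first two options, using only the standard library. The search_space dictionary and helper names are hypothetical and this is not how the Azure ML SDK implements sampling. Bayesian sampling is not shown: it chooses the next values based on how earlier trials performed, which needs a surrogate model.

```python
import itertools
import random

search_space = {
    "batch_size": [16, 32, 64],
    "learning_rate": [0.01, 0.1, 1.0],
}


def grid_samples(space):
    """Grid sampling: try every combination of the discrete values."""
    names = list(space)
    for combo in itertools.product(*(space[name] for name in names)):
        yield dict(zip(names, combo))


def random_samples(space, n_trials, seed=0):
    """Random sampling: independently pick one value per hyperparameter."""
    rng = random.Random(seed)
    for _ in range(n_trials):
        yield {name: rng.choice(values) for name, values in space.items()}


grid = list(grid_samples(search_space))
rand = list(random_samples(search_space, n_trials=4))
print(len(grid), len(rand))   # 3 x 3 = 9 grid trials, 4 random trials
```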

7
Q

m_t and eet from autoML

Configure Early Termination means to stop a Sweep Job based on one of these two conditions.

When (and when NOT) to use an Early Termination Policy

A

Configure a Sweep Job to stop:
- After a maximum number of trials
- When new Models don’t produce significantly better results

When: Depending on your Search Space and Sampling Method, Early Termination may be beneficial when working with Continuous Hyperparameters (meaning infinite possible combinations…you don’t want it to go on forever).

When NOT: Conversely, it may be unnecessary to use Early Termination when using Discrete Hyperparameters (limited dimensions == finite set of combinations).
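The two stop conditions can be sketched as a simple loop. This is a hypothetical illustration, not how Azure ML's built-in early termination policies are implemented; the names run_sweep, min_improvement, and patience are assumptions made for the example.

```python
def run_sweep(scores, max_trials, min_improvement=0.01, patience=3):
    """Consume trial scores (higher is better) until a stop condition fires.

    Returns (trials_run, best_score, reason).
    """
    best = float("-inf")
    since_improvement = 0
    trial = 0
    for trial, score in enumerate(scores, start=1):
        if score > best + min_improvement:
            best = score                 # significantly better result
            since_improvement = 0
        else:
            since_improvement += 1       # no significant gain this trial
        if trial >= max_trials:
            return trial, best, "max_trials"
        if since_improvement >= patience:
            return trial, best, "no_improvement"
    return trial, best, "exhausted"


scores = [0.70, 0.80, 0.805, 0.80, 0.79, 0.81, 0.90]
print(run_sweep(scores, max_trials=20))  # stops once results plateau
print(run_sweep(scores, max_trials=2))   # stops at the trial limit
```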

8
Q

Discrete Hyperparameters:
- How to use the Choice function
- What values types it can take
- Example code for using it in a Sweep Job

A

Choice() comes from the Azure ML Python SDK (azure.ai.ml.sweep) and defines a discrete set of values; the sampling method selects one of them for each trial.

It can take:
- one or more comma-separated values in a list: batch_size=Choice(values=[16, 32, 64]),
- a range object: batch_size=Choice(values=range(10, 20)),
- an arbitrary list object: batch_size=Choice(values=[16, 32, 64]),

Remember that a Sweep Job is just a Job configured to “sweep”, so we still need to create the Job instance:

from azure.ai.ml.sweep import Choice, Normal

# `job` is the Command Job created earlier; calling it overrides its inputs with Search Space expressions
command_job_for_sweep = job(
    batch_size=Choice(values=[16, 32, 64]),    # Discrete Hyperparameter
    learning_rate=Normal(mu=10, sigma=3),  # Continuous Hyperparameter
)
9
Q

Discrete Hyperparameters: Hyperparameters can be set to one of four other Discrete Distribution functions

Math: explain what the q parameter is

A

Four other Discrete Distro functions you can use:
- QUniform(min_value, max_value, q) - Returns a value like round(Uniform(min_value, max_value) / q) * q
- QLogUniform(min_value, max_value, q) - Returns a value like round(exp(Uniform(min_value, max_value)) / q) * q
- QNormal(mu, sigma, q) - Returns a value like round(Normal(mu, sigma) / q) * q
- QLogNormal(mu, sigma, q) - Returns a value like round(exp(Normal(mu, sigma)) / q) * q

q is the quantization step and what makes each of the above Discrete. Each sampled value is divided by q, rounded to the nearest integer, and multiplied by q again, so the result is always a multiple of q (for example, q=5 can only yield …, 10, 15, 20, …).
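The formulas above can be reproduced with the standard library to see the effect of q. These helper functions are local stand-ins written for illustration, not the SDK classes themselves.

```python
import math
import random

rng = random.Random(0)


def quniform(min_value, max_value, q):
    return round(rng.uniform(min_value, max_value) / q) * q


def qloguniform(min_value, max_value, q):
    return round(math.exp(rng.uniform(min_value, max_value)) / q) * q


def qnormal(mu, sigma, q):
    return round(rng.gauss(mu, sigma) / q) * q


def qlognormal(mu, sigma, q):
    return round(math.exp(rng.gauss(mu, sigma)) / q) * q


samples = [quniform(0, 100, q=5) for _ in range(5)]
print(samples)   # every value is a multiple of 5
```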

10
Q

Continuous Hyperparameters require one of these four methods for defining a Search Space

A

Four Continuous Distro functions you can use:
- Uniform(min_value, max_value) - uniform distro between min and max
- LogUniform(min_value, max_value) - a value drawn from exp(Uniform) so that the log of the return value is uniformly distributed
- Normal(mu, sigma) - a real value normally distributed with mean mu and standard deviation sigma
- LogNormal(mu, sigma) - a value drawn from exp(Normal) so that the log of the return value is normally distributed
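As a rough illustration, the four distributions can be mimicked with the standard library. The function names below are local stand-ins, not the SDK classes. Note that for the log variants the bounds apply to the log of the value, so a learning rate between 1e-4 and 1e-1 is expressed with log(1e-4) and log(1e-1).

```python
import math
import random

rng = random.Random(0)


def uniform(min_value, max_value):
    return rng.uniform(min_value, max_value)


def loguniform(min_value, max_value):
    # log(result) is uniformly distributed on [min_value, max_value]
    return math.exp(rng.uniform(min_value, max_value))


def normal(mu, sigma):
    return rng.gauss(mu, sigma)


def lognormal(mu, sigma):
    # log(result) is normally distributed with mean mu and std dev sigma
    return math.exp(rng.gauss(mu, sigma))


learning_rate = loguniform(math.log(1e-4), math.log(1e-1))
dropout = uniform(0.0, 0.5)
print(learning_rate, dropout)
```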
