Lecture 13 - Single Global Part 1 Flashcards

1
Q

What is Hill Climbing with Restarts?

A

This is a global optimisation strategy that extends basic hill climbing by:
* Running multiple hill climbs, each from a new random starting point
* Each hill climb runs for a random amount of time, drawn from a distribution T
Instead of committing to one trajectory, you probe different areas of the search space, trying to avoid getting stuck in bad local optima.

2
Q

Algorithm 10 - Hill Climbing with Restarts

A

REFER TO NOTES
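
As a rough illustration only (not the exact Algorithm 10 from the notes), a minimal Python sketch of the restart pattern described above might look like this, assuming hypothetical problem-specific helpers `random_solution`, `tweak`, and `quality`, plus a run-length distribution `time_dist`:

```python
import random

def hill_climb_with_restarts(random_solution, tweak, quality, total_budget, time_dist):
    """Run repeated hill climbs, each from a fresh random start and each
    lasting a random number of steps drawn from time_dist."""
    best = random_solution()
    spent = 0
    while spent < total_budget:
        s = random_solution()                     # new random starting point
        steps = time_dist()                       # random run length for this climb
        for _ in range(min(steps, total_budget - spent)):
            candidate = tweak(s)
            if quality(candidate) >= quality(s):  # accept non-worsening moves
                s = candidate
            spent += 1
        if quality(s) > quality(best):            # keep the best solution seen overall
            best = s
    return best

# Hypothetical usage: maximise the (negated) sum of squares over a 5-dimensional vector.
# best = hill_climb_with_restarts(
#     random_solution=lambda: [random.uniform(-1, 1) for _ in range(5)],
#     tweak=lambda s: [x + random.gauss(0, 0.1) for x in s],
#     quality=lambda s: -sum(x * x for x in s),
#     total_budget=10_000,
#     time_dist=lambda: random.randint(50, 500),
# )
```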

3
Q

What are the advantages/disadvantages of Hill Climbing with Restarts?

A

Advantages:
Escapes local minima
Balances exploration and exploitation

Disadvantages:
Requires time budgeting
Wasteful if restart logic is poor
No learning from past runs

4
Q

What are the limitations of Uniform Tweaks?

A

REFER TO NOTES -> Uses Algorithm 8
No chance of bigger jumps - tweaks are bounded by r, so the search can never jump further than r
Exploration is capped
Locality is compromised
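
For context, a minimal sketch of a bounded uniform tweak in the spirit of Algorithm 8 (the notes' version may differ; `r`, `p`, `lo`, and `hi` are assumed hyper-parameters and bounds) shows where the cap on exploration comes from:

```python
import random

def bounded_uniform_tweak(solution, r, p=1.0, lo=0.0, hi=1.0):
    """Add uniform noise from [-r, r] to each gene with probability p.
    Because the noise is capped at r, no single move can jump further than r."""
    tweaked = list(solution)
    for i in range(len(tweaked)):
        if random.random() < p:
            tweaked[i] += random.uniform(-r, r)
            tweaked[i] = min(max(tweaked[i], lo), hi)  # clamp to the valid range
    return tweaked
```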

5
Q

What are Gaussian Tweaks?

A

Gaussian tweaks are a type of probabilistic modification applied to candidate solutions in stochastic optimisation, where the change (or “tweak”) is sampled from a Gaussian (Normal) distribution centred at the current value.

6
Q

What is the purpose of Gaussian Tweaks?

A
  • Models realistic, gradual improvement in solution space.
  • Captures the idea that small changes are more likely to be useful, but large changes should still occasionally happen to escape local optima.
  • Respects locality (nearby solutions are more similar).
7
Q

Why do we use Gaussian Tweaks?

A

Mimics natural behaviour of search:
Allows for many small changes (local exploitation) and occasional large jumps (global exploration)

8
Q

How do Gaussian Tweaks work?

A

Replace uniform noise with normal noise.

σ becomes a hyper-parameter controlling the rate of exploration.

9
Q

What are the advantages/disadvantages of Gaussian Tweaks?

A

Advantages:
- Respects locality.
- Enables scalable exploration.
Disadvantages:
- Choosing the right σ is hard — too small = stuck; too large = too random.

10
Q

Algorithm 11 - Gaussian Tweaks

A

REFER TO NOTES - same as bounded uniform, but r is replaced by σ²
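
A minimal sketch under the same assumptions as the uniform version above, with the uniform noise swapped for Gaussian noise of standard deviation σ (variance σ²); this is an illustration, not the exact Algorithm 11 from the notes:

```python
import random

def gaussian_tweak(solution, sigma, p=1.0, lo=0.0, hi=1.0):
    """Add N(0, sigma^2) noise to each gene with probability p.
    Small changes are most likely, but arbitrarily large jumps stay possible."""
    tweaked = list(solution)
    for i in range(len(tweaked)):
        if random.random() < p:
            tweaked[i] += random.gauss(0, sigma)       # unbounded, locality-respecting noise
            tweaked[i] = min(max(tweaked[i], lo), hi)  # clamp back to the valid range
    return tweaked
```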

11
Q

What is the difference between Gaussian Tweaks and Uniform Tweaks?

A

Uniform Tweaks:
Fixed bounds
Locality - all tweak sizes within the bound are equally likely
Exploration is capped by r
Needs artificial bounds

Gaussian Tweaks:
Infinite range
Locality - small changes are more likely
Large exploratory jumps remain possible
Tuned naturally by σ

12
Q

What is the Interdependence of Hyper-Parameters?

A

Hyper-parameters should not be treated as if they act independently. In practice, they interact, and this interaction can be nonlinear, non-obvious, and highly problem-specific.

13
Q

Why does the Interdependence of Hyper-Parameters matter?

A

Why this matters:
* In your optimiser, you might have:
○ σ: tweak size
○ n: number of candidates sampled per iteration
○ p: probability of tweaking each dimension
○ T: restart time distribution
* Tuning these one at a time may not work because:
A change in one hyper-parameter might amplify or cancel out the effect of another.

14
Q

What are the core ideas of tweak noise (σ) interacting with samples per iteration (n)?

A

σ (large) - Makes tweaks more extreme → better for exploration
n (large) - Increases selective pressure → reducing exploration
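
A hedged sketch of why large n raises selective pressure, reusing the hypothetical `gaussian_tweak` and `quality` helpers from the earlier sketches: the best of n candidates replaces the current solution only if it is immediately better, so a big exploratory jump enabled by a large σ is usually culled.

```python
def one_iteration(current, quality, sigma, n):
    """Sample n Gaussian-tweaked candidates and keep the best one.
    Larger n means stronger selective pressure: an exploratory jump made
    possible by a large sigma survives only if it is immediately better."""
    candidates = [gaussian_tweak(current, sigma) for _ in range(n)]
    best_candidate = max(candidates, key=quality)
    return best_candidate if quality(best_candidate) > quality(current) else current
```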

15
Q

Why is the interaction between tweak noise (σ) and samples per iteration (n) not always predictable?

A

Suppose you sample a large jump with σ = 1.0:
- It might explore a new peak (good!)
- But unless it’s immediately better, it could be culled if n is high
That means:
- Exploration is attempted, but not preserved
