Lecture 10 - Direct Methods Part 2 Flashcards
What is the Hooke-Jeeves Method?
- Hooke-Jeeves (H-J) is a pattern search method:
○ It’s similar to CCS, but instead of just moving along each axis once per cycle, it explores both positive and negative directions per coordinate.
○ It doesn’t use derivatives — it’s fully direct.
How does the Hooke-Jeeves method work?
A pattern search method that minimizes a function without using derivatives.
How it works (step-by-step):
- Start at a point x⃗ in n-dimensional space
- Pick a step size α
- Try moving in each direction:
- For each coordinate axis e⃗ᵢ, evaluate the function at x⃗ + αe⃗ᵢ and x⃗ - αe⃗ᵢ
- This gives you 2n evaluations in total
- Choose the best direction:
- If any of the moves improves the objective function, move to the best of the 2n points
- If no move improves the function → reduce α and try again
- Repeat until you’re close enough to the minimum (α is small enough, or no more improvements are found)
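The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `hooke_jeeves`, the halving factor `shrink=0.5`, the tolerance `tol` and the test function `shifted_bowl` are all hypothetical choices. Some presentations of Hooke-Jeeves add a separate pattern move after a successful exploration; that step is not on this card, so it is left out.

```python
def hooke_jeeves(f, x0, alpha=1.0, shrink=0.5, tol=1e-6, max_iter=10_000):
    """Minimise f using the exploratory search described on the card."""
    x = list(x0)
    fx = f(x)
    n = len(x)
    for _ in range(max_iter):
        if alpha < tol:              # step size small enough: stop
            break
        best_x, best_f = x, fx
        # Evaluate x + alpha*e_i and x - alpha*e_i for every axis (2n points).
        for i in range(n):
            for sign in (1.0, -1.0):
                trial = list(x)
                trial[i] += sign * alpha
                f_trial = f(trial)
                if f_trial < best_f:
                    best_x, best_f = trial, f_trial
        if best_f < fx:              # some move improved: go to the best one
            x, fx = best_x, best_f
        else:                        # no move improved: reduce the step size
            alpha *= shrink
    return x, fx


def shifted_bowl(x):
    # Hypothetical test function with its minimum at (1, -2).
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


x_min, f_min = hooke_jeeves(shifted_bowl, [5.0, 5.0])
```

On this simple bowl the search walks to the minimum in unit steps and then only shrinks α until the tolerance is reached.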
How does Hooke-Jeeves compare to CCS?
H-J explores more directions per step (both positive and negative along each axis), so it can potentially converge faster, but each iteration is more expensive because of its broader search.
What is Generalised Pattern Search?
GPS is a family of direct search methods used to solve optimization problems without derivatives.
- It extends Hooke-Jeeves (H-J) by allowing more flexible direction sets and customizable exploration rules.
Like H-J, GPS works by sampling around the current point, evaluating the function, and moving in the direction of improvement.
How does Generalised Pattern Search work?
Start with a point x⃗ and a step size α
Define a pattern set of directions (could be standard unit vectors or something more complex)
Evaluate the function at each point:
- x⃗ + αd⃗ for each direction d⃗ in the pattern
Check for improvement:
- If any evaluated point gives a better value → move to that point
- If no improvement, reduce step size α and try again
Repeat until convergence (step size is small enough or no improvement after many tries)
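As a rough sketch of the loop above, the only change from a coordinate-based search is that the directions are passed in as a list. The names `generalised_pattern_search`, `pattern` and `bowl`, the halving factor and the tolerance are assumptions made for illustration. The example pattern uses three directions in 2D, which is the smallest set that can still positively span the plane.

```python
def generalised_pattern_search(f, x0, directions, alpha=1.0, shrink=0.5,
                               tol=1e-6, max_iter=10_000):
    """Minimise f by polling x + alpha*d for every direction d in the pattern."""
    x = list(x0)
    fx = f(x)
    for _ in range(max_iter):
        if alpha < tol:                      # step size small enough: stop
            break
        best_x, best_f = x, fx
        for d in directions:
            trial = [xi + alpha * di for xi, di in zip(x, d)]
            f_trial = f(trial)
            if f_trial < best_f:
                best_x, best_f = trial, f_trial
        if best_f < fx:                      # improvement: move there
            x, fx = best_x, best_f
        else:                                # no improvement: shrink the step
            alpha *= shrink
    return x, fx


def bowl(x):
    # Hypothetical test function with its minimum at (1, -2).
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


# Assumed pattern: n + 1 = 3 directions that positively span the plane.
pattern = [(1.0, 0.0), (0.0, 1.0), (-1.0, -1.0)]
x_min, f_min = generalised_pattern_search(bowl, [5.0, 5.0], pattern)
```

With the coordinate directions and their negatives as the pattern, the same function reduces to the Hooke-Jeeves steps described earlier.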
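What is the key requirement for Generalised Pattern Search?
The set of directions must be a positive spanning set: every vector in the space can be written as a non-negative linear combination of the directions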
REFER TO NOTES
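To make the requirement concrete, here is a small check that works only in two dimensions. It relies on a fact specific to the plane: nonzero vectors positively span it exactly when every angular gap between neighbouring directions is strictly smaller than π, so no half-plane is left uncovered. The function name `positively_spans_plane` and the tolerance `eps` are hypothetical; a general n-dimensional check would need a different method.

```python
import math


def positively_spans_plane(directions, eps=1e-12):
    """Return True if the given 2-D vectors positively span the plane."""
    angles = sorted(math.atan2(dy, dx) for dx, dy in directions
                    if (dx, dy) != (0.0, 0.0))
    if len(angles) < 3:              # at least n + 1 = 3 vectors are needed
        return False
    # Gaps between neighbouring directions, plus the wrap-around gap.
    gaps = [b - a for a, b in zip(angles, angles[1:])]
    gaps.append(2.0 * math.pi - (angles[-1] - angles[0]))
    return max(gaps) < math.pi - eps


axes_only = [(1, 0), (0, 1)]                       # cannot reach (-1, -1)
plus_minus_axes = [(1, 0), (-1, 0), (0, 1), (0, -1)]
minimal_set = [(1, 0), (0, 1), (-1, -1)]

print(positively_spans_plane(axes_only))           # False
print(positively_spans_plane(plus_minus_axes))     # True
print(positively_spans_plane(minimal_set))         # True
```

The first set fails because no non-negative combination of (1, 0) and (0, 1) points into the lower-left quadrant, so a search using only those directions could never move that way.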
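What are the trade-offs in Generalised Pattern Search?
Opportunistic search is faster per iteration, but it might miss a better direction
What is Opportunistic Search?
The first improving point found is accepted immediately, without evaluating the remaining directions. This is a trade-off for GPS: fewer evaluations per iteration, but possibly not the best available move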
What is Dynamic Ordering?
After each iteration, directions that led to improvement are moved earlier in the list
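A single polling step with both options might look like the sketch below. The function `poll`, its return tuple and the evaluation-recording wrapper are hypothetical names introduced for illustration; the reordering rule shown (move the successful direction to the front) is one simple way to realise dynamic ordering.

```python
def poll(f, x, fx, alpha, directions, opportunistic=True):
    """Poll the pattern once; return (new_x, new_f, directions, improved)."""
    best = None                                  # (f value, point, index)
    for i, d in enumerate(directions):
        trial = [xi + alpha * di for xi, di in zip(x, d)]
        f_trial = f(trial)
        if f_trial < fx and (best is None or f_trial < best[0]):
            best = (f_trial, trial, i)
            if opportunistic:
                break                            # accept the first improvement
    if best is None:
        return x, fx, directions, False
    f_new, x_new, i = best
    # Dynamic ordering: the direction that worked is tried first next time.
    reordered = [directions[i]] + directions[:i] + directions[i + 1:]
    return x_new, f_new, reordered, True


def bowl(x):
    # Hypothetical test function with its minimum at (1, -2).
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


evaluated = []                                   # records every polled point


def recording_bowl(x):
    evaluated.append(list(x))
    return bowl(x)


start = [5.0, 5.0]
pattern = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
x_new, f_new, pattern_new, improved = poll(
    recording_bowl, start, bowl(start), 1.0, pattern)
```

Here the opportunistic poll stops after two evaluations and moves to (4, 5), while a full poll (`opportunistic=False`) would evaluate all four points and move to (5, 4), which has a lower function value. That is the trade-off in miniature.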
What are Flexibility and Shrinking in GPS?
GPS is flexible in its choice of direction set, and it uses a shrinkage strategy: if no improvement is found, it reduces the step size and tries again
What is Generalised Pattern Search, and how does it differ from Hooke-Jeeves? Why is a positive spanning set important?
A flexible direct search method (no derivatives) that explores multiple directions to find better solutions.
It extends Hooke-Jeeves by allowing any valid set of direction vectors, not just coordinate directions.
A positive spanning set matters because the direction set D must be able to reach any direction using non-negative combinations of its vectors, which ensures full coverage of the space. GPS also has optional features (opportunistic search and dynamic ordering).
What is the Nelder-Mead Simplex Method?
- Designed for unconstrained minimization of scalar functions
- Requires no derivatives — suitable for noisy, non-smooth, or black-box functions
What are the key features of the Nelder-Mead Simplex Method?
Unlike CCS, Powell, or Hooke-Jeeves (which update one point), Nelder-Mead maintains multiple candidate solutions simultaneously — the vertices of the simplex.
This gives it the ability to:
* Explore more directions at once
* Shape-shift to match the function landscape
* Make more informed decisions based on comparing the behaviour across multiple points
What are the operations used in Nelder-Mead?
Reflection
Expansion
Contraction
Shrinkage
Nelder-Mead Operation: Reflection
Reflects the worst vertex across the centroid of the remaining points
Moves away from bad solutions
Nelder-Mead Operation: Expansion
If the reflection gives a great result, try going further in that direction
Nelder-Mead Operation: Contraction
If the reflection does not improve things, step halfway towards the centroid instead
Nelder-Mead Operation: Shrinkage
If nothing improves, shrink the whole simplex towards the best point
Why do the steps (operations) in Nelder-Mead matter?
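The four operations can be combined into a compact sketch. Everything here is an assumption made for illustration: the function name `nelder_mead`, the commonly used coefficients (reflection 1, expansion 2, contraction 0.5, shrinkage 0.5), the acceptance rules, and the simple stopping test on the spread of function values. The code re-evaluates f freely to stay short, which a practical implementation would avoid.

```python
def nelder_mead(f, simplex, max_iter=500, tol=1e-10,
                reflect=1.0, expand=2.0, contract=0.5, shrink=0.5):
    """Minimise f from a list of n + 1 vertices (each a list of n floats)."""
    pts = [list(p) for p in simplex]
    n = len(pts) - 1
    for _ in range(max_iter):
        pts.sort(key=f)                          # best first, worst last
        best, worst = pts[0], pts[-1]
        if abs(f(worst) - f(best)) < tol:        # values nearly equal: stop
            break
        # Centroid of every vertex except the worst one.
        centroid = [sum(p[i] for p in pts[:-1]) / n for i in range(n)]

        def along(t):
            # centroid + t * (centroid - worst); negative t moves toward worst.
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        xr = along(reflect)                      # reflection
        if f(xr) < f(best):
            xe = along(expand)                   # expansion: go further
            pts[-1] = xe if f(xe) < f(xr) else xr
        elif f(xr) < f(pts[-2]):
            pts[-1] = xr                         # beats second worst: accept
        else:
            # Contraction: halfway between the centroid and the better of
            # the reflected point and the worst vertex.
            xc = along(contract) if f(xr) < f(worst) else along(-contract)
            if f(xc) < min(f(xr), f(worst)):
                pts[-1] = xc
            else:
                # Shrinkage: pull every vertex toward the best one.
                pts = [best] + [[b + shrink * (v - b) for b, v in zip(best, p)]
                                for p in pts[1:]]
    pts.sort(key=f)
    return pts[0], f(pts[0])


def bowl(x):
    # Hypothetical test function with its minimum at (1, -2).
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


x_best, f_best = nelder_mead(bowl, [[5.0, 5.0], [6.0, 5.0], [5.0, 6.0]])
```

In the first iteration of this example the worst vertex (5, 6) is reflected to (6, 4), which beats the best vertex, so the expansion to (6.5, 3) is tried and accepted.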
Each operation:
- Allows the simplex to move, reshape, or contract
- Helps it adapt to curvature, ridges, or flat regions in the function
How does Nelder-Mead differ from Generalised Pattern Search?
Unlike pattern search, which polls a fixed set of directions, it adapts the simplex to the shape of the current region
Stopping Criteria (Convergence Criteria) for Nelder-Mead
Nelder-Mead can use several ways to determine when to stop:
Step size
- If the simplex becomes very small, we’re likely near an optimum
Function improvement
- If the best and worst function values are almost the same, we’re in a flat region
Variance at vertices
- If function values at the simplex vertices have low variance, we’re likely done
- Acts like a curvature proxy: if curvature is high, variance is high — meaning more gains might still be found
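The three criteria can each be computed from the current simplex, as in the sketch below. The function names and all three thresholds are hypothetical, and stopping as soon as any one test passes is just one possible way to combine them.

```python
import math
from statistics import pvariance


def stopping_measures(f, simplex):
    """Return (simplex size, best-to-worst spread, variance of f values)."""
    values = [f(p) for p in simplex]
    best = simplex[values.index(min(values))]
    size = max(math.dist(best, p) for p in simplex)   # largest distance to best
    spread = max(values) - min(values)                # worst minus best value
    variance = pvariance(values)                      # variance at the vertices
    return size, spread, variance


def should_stop(f, simplex, size_tol=1e-6, spread_tol=1e-9, var_tol=1e-12):
    # Assumed rule: stop when any one of the three measures is small enough.
    size, spread, variance = stopping_measures(f, simplex)
    return size < size_tol or spread < spread_tol or variance < var_tol


def bowl(x):
    # Hypothetical test function with its minimum at (1, -2).
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


wide = [[5.0, 5.0], [6.0, 5.0], [5.0, 6.0]]
tiny = [[1.0, -2.0], [1.0 + 1e-8, -2.0], [1.0, -2.0 + 1e-8]]
print(should_stop(bowl, wide))   # False
print(should_stop(bowl, tiny))   # True
```

For the wide simplex the function values are 65, 74 and 80, so both the spread (15) and the variance (38) are far from zero and the search should continue.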
What is Collective Intelligence in Optimization?
Instead of relying on a single best solution, some methods (like Nelder-Mead Simplex) operate on a group or population of solutions at once.
How does Collective Intelligence in Optimization relate to Nelder-Mead?
The simplex is a group of points (e.g., 3 points in 2D).
The method does not just follow the best point.
Instead, decisions (e.g., reflection, contraction, expansion) are made using:
- The centroid of all points except the worst one
- Comparisons across multiple points
- Geometric relationships like position, spread, and direction
Why is Collective Intelligence used in Nelder-Mead powerful?
No single point dominates:
- The algorithm doesn’t blindly follow the current “best” like greedy search.
- Instead, it considers the relative performance and geometry of the group.
Shared decision-making:
- Actions like reflection or shrinkage are based on collective performance.
- This helps avoid overfitting to one local region and allows adaptation to shape or curvature of the landscape.
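What are Deterministic Algorithms?
Algorithms with no randomness: given the same starting point and settings, they follow exactly the same sequence of steps and return the same result every time. Examples: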
* Cyclic Coordinate Search (CCS)
* Powell’s Method
* Hooke-Jeeves
* Generalised Pattern Search (GPS)
* Nelder-Mead Simplex
What are some of the problems with Deterministic Algorithms?
Once the starting point is chosen, the whole search path is fixed, so regions of the search space away from that path never get explored and the method can settle in a local minimum
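A one-dimensional sketch shows the problem. The search routine `pattern_search_1d` and the test function `two_wells` are hypothetical, chosen so that the function has a global minimum near x = -1.04 and a worse local minimum near x = 0.96.

```python
def pattern_search_1d(f, x0, alpha=0.5, tol=1e-8):
    """Deterministic 1-D pattern search: try x + alpha and x - alpha."""
    x, fx = x0, f(x0)
    while alpha >= tol:
        candidates = [(f(x + alpha), x + alpha), (f(x - alpha), x - alpha)]
        f_best, x_best = min(candidates)
        if f_best < fx:                  # improvement: move there
            x, fx = x_best, f_best
        else:                            # no improvement: halve the step
            alpha *= 0.5
    return x


def two_wells(x):
    # Hypothetical test function: global minimum near x = -1.04,
    # a worse local minimum near x = 0.96.
    return (x * x - 1.0) ** 2 + 0.3 * x


from_right = pattern_search_1d(two_wells, 2.0)
from_right_again = pattern_search_1d(two_wells, 2.0)
from_left = pattern_search_1d(two_wells, -2.0)
```

Both runs from x = 2 return exactly the same point in the right-hand well, and no amount of repetition will find the better minimum on the left; only a different starting point (here x = -2) reaches it.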