Lecture 8 - Gradient Methods Flashcards
What qualities of the Hypothesis Space affect optimisation?
- Continuous: No sudden jumps; gradient methods work well.
- Discontinuous: May break gradient methods.
- Differentiable: Needed for gradient descent.
- Non-differentiable: Use direct methods instead.
- Low modality: Few minima (easier to optimize).
- High modality: Many local minima (harder).
- Pathological: Strange landscapes, e.g., spiky or flat plateaus.
Why is it important to know the effects of these spaces?
Because gradient descent assumes a nice, smooth bowl. If the space is bumpy or weird, the method may fail or converge slowly.
What do derivatives tell us?
- First derivative f′(x): tells you the slope — use it to go up or down.
- Second derivative f′′(x): tells you the curvature — use it to judge how fast slope is changing.
What questions do derivatives answer?
- Where is the slope 0? ⇒ potential extrema (stationary points)
- Is the point a min, max, or saddle?
- Which way should I move for fastest descent?
What is Gradient Descent and its rule?
Gradient descent is an iterative optimization algorithm used to find the minimum of a function.
It works by moving in the direction of the negative gradient, which is the direction of steepest decrease in the function.
x ← x − α f′(x)
What is Gradient Ascent and its rule?
Gradient ascent is the same idea, but instead of minimizing, it maximizes the function.
You move in the direction of the gradient — where the function increases most quickly.
x ← x + α f′(x)
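A minimal sketch of one step of each rule in Python (illustrative only; the quadratic example, `f_prime`, and `alpha` are assumptions, not from the slides):

```python
# Example: f(x) = x**2, so f'(x) = 2*x.
def f_prime(x):
    return 2 * x

alpha = 0.1  # step size

x = 5.0
x = x - alpha * f_prime(x)  # descent step: move downhill, x becomes 4.0

y = 5.0
y = y + alpha * f_prime(y)  # ascent step: move uphill, y becomes 6.0
```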
What is the Gradient Ascent/Descent Algorithm process?
REFER TO SLIDES FOR BREAKDOWN
What is the Stopping Criteria of Gradient Ascent/Descent?
You stop when:
* f′(x)=0: The slope is flat — potential min or max.
* ∣f′(x)∣<ϵ: Close enough to flat.
* Time or iteration limit reached.
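A minimal sketch of the full descent loop with these stopping criteria (a hedged reconstruction, not the slide breakdown; `eps`, `max_iters`, and the example function are assumptions):

```python
def gradient_descent(f_prime, x0, alpha=0.1, eps=1e-6, max_iters=10_000):
    """Repeat x <- x - alpha * f'(x) until |f'(x)| < eps or the
    iteration limit is hit."""
    x = x0
    for _ in range(max_iters):
        g = f_prime(x)
        if abs(g) < eps:      # close enough to flat: stop
            break
        x = x - alpha * g
    return x

# Minimise f(x) = (x - 3)**2, with f'(x) = 2*(x - 3).
print(gradient_descent(lambda x: 2 * (x - 3), x0=0.0))  # ~3.0
```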
What do you need to consider about the Stopping Criteria of Gradient Ascent/Descent?
- A zero gradient (f′(x)=0) doesn’t always mean you’ve found a minimum
- Could be:
○ Maximum (gradient ascent)
○ Minimum (gradient descent)
○ Saddle point (neither: flat but unstable)
- How do we tell the difference?
○ Use the second derivative, f′′(x)
How do you determine if it's a local min or local max?
f′(x) = 0, f′′(x) > 0 (minimum)
f′(x) = 0, f′′(x) < 0 (maximum)
f′(x) = 0, f′′(x) = 0 (inconclusive)
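A tiny sketch of this test in code (the tolerance `tol` is an assumption to handle floating-point values near zero):

```python
def classify_stationary_point(f_second, x, tol=1e-8):
    """Second derivative test at a point where f'(x) = 0."""
    curvature = f_second(x)
    if curvature > tol:
        return "minimum"
    if curvature < -tol:
        return "maximum"
    return "inconclusive"  # f''(x) ~ 0: needs further analysis

# f(x) = x**2 has f''(x) = 2 everywhere, so x = 0 is a minimum.
print(classify_stationary_point(lambda x: 2.0, 0.0))  # "minimum"
```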
What is the Ideal Case of Gradient Ascent/Descent?
○ The function is smooth and differentiable
§ No jumps, kinks, or flat regions
§ Gradient exists everywhere
○ It has a single global minimum or maximum
§ No local minima or maxima to get stuck in
§ Gradient descent will always find the right answer
○ The gradient behaves predictably
§ Far from the minimum → big slope → big step
§ Close to the minimum → small slope → small step
This means you naturally slow down as you approach the minimum → smooth convergence.
Why does the Ideal Case for Gradient Ascent/Descent matter?
- No overshooting
- No weird local optima
- No need to adjust α much
- Convergence is fast and stable
What is Derivative Step Size?
- When you’re far from the minimum, the slope (derivative) is steep → the gradient is large → you take bigger steps
- When you’re close to the minimum, the slope flattens out → the gradient is small → you take smaller steps
Why is Derivative Step Size helpful?
Because in many functions (especially quadratics), the gradient acts like a natural brake:
- Far away? → move quickly
- Close to the target? → slow down automatically
- This helps avoid overshooting the minimum
Hence, the derivative acts like a natural step-size scaler when α=1.
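A quick sketch of this self-scaling on a quadratic (the function and α are assumptions):

```python
# f(x) = x**2, f'(x) = 2*x: the step alpha * f'(x) shrinks
# automatically as x approaches the minimum at 0.
alpha = 0.25
x = 8.0
for _ in range(5):
    step = alpha * 2 * x
    print(f"x = {x:6.3f}, step = {step:6.3f}")
    x -= step  # steps halve each iteration: 4, 2, 1, 0.5, 0.25
```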
What is the Rayleigh Distribution Case?
- A Rayleigh distribution is asymmetric:
○ Steep on one side
○ Flat on the other
- Gradient descent might:
○ Take small steps where the slope is flat
○ Overshoot where the slope is steep
Why is it a problem if gradient descent takes small steps or overshoots on a Rayleigh distribution?
- Poor convergence and unpredictable behaviour; choosing α becomes difficult.
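A sketch of gradient ascent on the Rayleigh pdf with σ = 1 (mode at x = 1), illustrating both behaviours; the starting points and α = 1 are assumptions chosen to exaggerate the effect:

```python
import math

def pdf_grad(x):
    """Derivative of the Rayleigh pdf f(x) = x * exp(-x**2 / 2), sigma = 1."""
    return math.exp(-x**2 / 2) * (1 - x**2)

alpha = 1.0
for x0 in (0.2, 3.0):  # steep side, then flat tail
    x = x0
    for _ in range(5):
        x += alpha * pdf_grad(x)
    print(f"start {x0}: x = {x:.3f} after 5 steps (mode is 1.0)")
# From 0.2 the first step overshoots past the mode; from 3.0 the
# flat tail gives tiny gradients, so progress is very slow.
```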
What are the trade-offs when choosing step size for the Rayleigh distribution?
Trade-offs:
- Too small:
○ Converges slowly
○ Wastes computation
- Too large:
○ Overshoots the minimum
○ Can oscillate or diverge
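A sketch of both failure modes on f(x) = x² (the α values are assumptions):

```python
def run(alpha, x=1.0, steps=10):
    for _ in range(steps):
        x = x - alpha * 2 * x  # f(x) = x**2, f'(x) = 2*x
    return x

print(run(alpha=0.01))  # too small: ~0.82 after 10 steps (slow)
print(run(alpha=1.10))  # too large: ~6.19, sign flips each step (diverging)
```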
What is the Newton-Raphson Method?
The Newton-Raphson Method is a fast, iterative algorithm used to find:
1. The roots (zeros) of a function (where f(x)=0), or
2. The optima of a function (where f′(x)=0)
How do you find roots with the Newton-Raphson Method?
xₙ₊₁ = xₙ − f(xₙ) / f′(xₙ)
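A minimal root-finding sketch (the tolerance and example are assumptions):

```python
def newton_root(f, f_prime, x, eps=1e-10, max_iters=100):
    """Find x with f(x) = 0 via x <- x - f(x) / f'(x)."""
    for _ in range(max_iters):
        fx = f(x)
        if abs(fx) < eps:
            break
        x = x - fx / f_prime(x)
    return x

# Root of f(x) = x**2 - 2 is sqrt(2) ~ 1.41421356.
print(newton_root(lambda x: x**2 - 2, lambda x: 2 * x, x=1.0))
```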
How is Newton-Raphson used for Optimisation?
You apply the root-finding formula to the derivative, since optima occur where f′(x) = 0:
xₙ₊₁ = xₙ − f′(xₙ) / f′′(xₙ)
Newton-Raphson Algorithm
REFER TO SLIDES
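A hedged sketch of the optimisation variant (the slides' breakdown may differ; names and the example are assumptions):

```python
def newton_optimise(f_prime, f_second, x, eps=1e-10, max_iters=100):
    """Find a stationary point via x <- x - f'(x) / f''(x)."""
    for _ in range(max_iters):
        g = f_prime(x)
        if abs(g) < eps:
            break
        x = x - g / f_second(x)
    return x

# Minimum of f(x) = (x - 3)**2: f'(x) = 2*(x - 3), f''(x) = 2.
print(newton_optimise(lambda x: 2 * (x - 3), lambda x: 2.0, x=0.0))  # 3.0
```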
What is Smoothness?
Smoothness refers to how “nice” or well-behaved a function is, especially in terms of:
* Continuity (no jumps)
* Differentiability (has a slope)
* Second-order differentiability (has curvature)
Smoothness is classified by smoothness classes (C⁰, C¹, C², …) - REFER TO NOTES
Why is Smoothness important in Newton-Raphson?
It requires a function of class C²: the function is continuous, its first and second derivatives exist, and those derivatives are also continuous.
What are the limitations of Newton-Raphson?
Requires the second derivative - not all functions are twice differentiable
Can diverge - if the starting point is too far from the root, or the second derivative is too small or zero
Doesn't guarantee a global minimum - only finds a local optimum, which could be a min, max, or saddle point
What are the limitations of Gradient Ascent/Descent?
Needs careful choice of step size - too big and it overshoots, too small and convergence is slow
May get stuck in local minima
Slows down near the minimum - as the gradient becomes very small
What is a Local Optimum?
A local optimum is a point where the function is better than its immediate neighbours.
What is a Global Optimum?
A global optimum is the absolute best value across the entire domain.
What is the general understanding of Global Optimum?
- There is no general-purpose algorithm that can guarantee finding the global optimum in all cases.
- Although we can still find good solutions, even if we can’t guarantee the best one.
○ Why is this important?
* Because in the real world, we:
□ Rarely need perfect solutions
□ Often just want “good enough”, quickly
□ Work in messy domains with weird curves, noise, or constraints
What are the approaches used to find the good solutions?
- We use:
□ Gradient descent with restarts
□ Stochastic methods (e.g., simulated annealing, genetic algorithms)
□ Heuristics or metaheuristics (e.g., greedy strategies, random sampling)
- These don't guarantee global optima, but they explore the space better and often find excellent local optima.
Q: Why can’t gradient methods guarantee a global optimum?
Because they follow the gradient, which only leads to a local minimum or maximum. In non-convex functions, there may be multiple optima. Without exploring the entire space (which is often infinite or too large), there’s no way to know if a better one exists elsewhere.
What are the fundamental limitations in optimisation (especially when trying to find the global optimum)?
Non-Enumerable Domains - the domain is infinite; this is an issue because you can't test every possible value
Huge Search Spaces - so many possibilities that you cannot search them all in reasonable time; the search may take too long to be useful
What is Gradient Ascent with Restarts?
This method is an enhancement of basic gradient ascent. It helps address the problem of getting stuck in a local maximum.
- Instead of trusting just one run of gradient ascent, you try multiple random starts and keep track of the best solution you find.
- This increases your chances of getting closer to the global maximum.
Why use Gradient Ascent with Restarts?
- Basic gradient ascent is deterministic: it follows the same path from the same start point.
- If it starts near a local maximum, it may stop there and miss better peaks.
- Restarting from different points gives a better chance to explore more of the space.
Gradient Ascent with Restarts Algorithm
REFER TO NOTES
What is the Stopping Criteria for Gradient Ascent with Restarts?
- The inner loop stops when ∣∇f(x)∣ < ϵ → the slope is nearly zero (flat)
- The outer loop stops when:
○ You’ve run a fixed number of iterations
○ Or a time/resource limit is reached
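A minimal sketch of the restart loop with these stopping criteria (a hedged reconstruction; the notes' version may differ, and the search range, restart count, and example function are assumptions):

```python
import math
import random

def gradient_ascent_restarts(f, f_prime, n_restarts=10, alpha=0.1,
                             eps=1e-6, max_iters=1000, lo=-10.0, hi=10.0):
    best_x, best_val = None, float("-inf")
    for _ in range(n_restarts):          # outer loop: fixed budget
        x = random.uniform(lo, hi)       # random start
        for _ in range(max_iters):       # inner loop: plain gradient ascent
            g = f_prime(x)
            if abs(g) < eps:             # |f'(x)| < eps: nearly flat
                break
            x = x + alpha * g
        if f(x) > best_val:              # keep the best peak found so far
            best_x, best_val = x, f(x)
    return best_x, best_val

# Multimodal example: f(x) = sin(x) + 0.1*x has several local maxima.
print(gradient_ascent_restarts(lambda x: math.sin(x) + 0.1 * x,
                               lambda x: math.cos(x) + 0.1))
```

Note that the loop evaluates f(x) itself, not just f′(x), to compare runs; this is the extra requirement mentioned below.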
Q: Why does adding restarts improve Gradient Ascent?
Gradient ascent can get stuck in local maxima. By restarting from different random positions, the algorithm explores multiple regions of the search space. Comparing function values across all runs helps identify the best peak found, even if it’s not the global optimum.
Q: What extra requirement does gradient ascent with restarts introduce?
You must be able to compute the actual function value f(x) (not just its gradient) to compare and retain the best result.
How does this method (Gradient Ascent with Restarts) try to overcome the challenge of local optima?
Through multiple random initializations and tracking the best result, it reduces the chance of getting stuck at a poor-quality local maximum.
What are the Challenges in Optimisation?
Large Search Spaces
Local Optima Traps
Plateaus
Non-differentiable points
Valleys in High Dimensions
Large Search Spaces - Challenges in Optimisation
The space of possible solutions is too big.
- This is a problem because you may waste time, gradient methods may converge too slowly, and you can't check every point
Local Optima Traps - Challenges in Optimisation
A point looks like a minimum or maximum, but it is not the global best.
- This is a problem because GD and NR methods stop where the slope = 0, so they can get stuck in a local dip
Plateaus - Challenges in Optimisation
A flat region where the slope is zero or nearly zero everywhere.
- This is a problem because GD relies on the slope to know where to move, so it makes very slow or no progress
Non-differentiable points - Challenges in Optimisation
No derivative exists at the point.
- This is a problem because gradient-based methods need the function to be differentiable; the slope is undefined, which breaks the method
Valleys in High Dimensions - Challenges in Optimisation
Narrow, curved valleys where the gradient points away from the valley's direction, or changes quickly along one axis and slowly along another.
- This is a problem because each GD step follows only the local gradient, causing zig-zagging and slow, inefficient convergence