Lecture 15 - Population Based Algorithms Flashcards
What are Population Based Methods?
The goal is to handle complex, high-dimensional, black-box optimisation problems where traditional methods (e.g., brute-force, grid search, hill climbing) fail.
What challenges make these problems hard to handle?
The challenges that are making this hard are:
- Expensive evaluations (e.g., simulating a weather model)
- Multiple local optima (multimodal landscapes)
- No clear analytical form of the function
What is the Key Question?
Can a collection of points tell us more together than each one alone?
- Core idea of population-based methods: Maintain and evolve a set of solutions rather than a single one to better explore and exploit the search space.
○ Want to make every evaluation count
§ Every evaluation gives you information about the search space (black/grey box)
§ Want to utilise / learn from that information as much as possible.
□ What does the space look like?
□ What heuristics are appropriate?
□ Where should we look for the best solutions?
How is the Key Question answered?
- To do this, inspiration is taken from the natural environment
○ Source: Evolutionary processes in nature.
○ Features:
* Adaptation to environment
* Many trials (organisms)
* Sharing of information (genetically or socially)
How can Evolution also be used?
Evolution can also be used:
○ Works in vastly different environments
○ Maintains lots of candidate solutions
○ Seeks to optimise performance
○ Iteratively improves solutions
○ Collectively “learns” about environment
* “meta-learning”?
* capabilities stored in genome
○ Information/capability sharing
* within generations (social beings)
* between generations
What are some types of Evolution?
○ Darwinian: Natural selection
○ Lamarckian: Acquired traits passed on (debunked but revived via epigenetics—some traits can bypass full reprogramming)
What is Epigenetic Inheritance?
Signals from the outside world can work through the epigenome to change a cell’s gene expression.
Epigenetic tags act as a kind of cellular memory. A cell’s epigenetic profile — a collection of tags that tell genes whether to be on or off — is the sum of the signals it has received during its lifetime.
What is Evolutionary Computation?
- Algorithms inspired by biological evolution.
○ Typically called nature-inspired computing or evolutionary algorithms
- Includes:
○ Evolutionary Strategies (ES)
○ Genetic Algorithms (GA)
○ Genetic Programming (GP)
○ Evolutionary Programming (EP)
Common Definitions
REFER TO SLIDES
What are the Common Steps in Population-Based Algorithms?
- Create (random) initial population
- Assess/evaluate fitness (quality)
- “Breed” new population of offspring
- “Join” parents and children to form the next generation
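The four common steps above can be sketched as a generic loop. This is a minimal Python sketch, not taken from the slides; the callables `build`, `assess`, `breed`, and `join` are hypothetical placeholders for the problem-specific pieces described in the following cards.

```python
def evolve(build, assess, breed, join, generations=50):
    """Generic population-based loop: the four common steps.

    build   -- creates the (random) initial population
    assess  -- fitness function for one individual (higher is better)
    breed   -- produces offspring from a population and its fitnesses
    join    -- combines parents and children into the next generation
    """
    population = build()                                  # 1. initial population
    for _ in range(generations):
        fitnesses = [assess(x) for x in population]       # 2. assess fitness
        children = breed(population, fitnesses)           # 3. breed offspring
        population = join(population, children, assess)   # 4. join into next gen
    return max(population, key=assess)                    # best of final population
```

Each of the later flashcards (BuildInitialPopulation, AssessFitness, Breed, Join) fills in one of these callables.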
Evolutionary Algorithm and How it Works?
REFER TO SLIDES
What are the Key functions of Evolutionary Algorithm?
BuildInitialPopulation()
AssessFitness(P)
Breed(P)
Join(P, Breed(P))
What is BuildInitialPopulation()?
- What it does: Creates the initial population of candidate solutions.
- Heuristics:
○ Can be random (uniform, Gaussian, etc.)
○ Can be biased (if prior knowledge is available to guide the initial search)
○ Risk of bias: might miss important regions of the space.
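A minimal sketch of uniform random initialisation over a bounded continuous space (the function name and bounds are illustrative, not from the slides):

```python
import random

def build_initial_population(pop_size, dim, low=-5.0, high=5.0):
    """Uniform random initialisation: pop_size individuals, each a
    dim-length real-valued vector sampled from [low, high].
    Gaussian or domain-biased sampling are drop-in alternatives."""
    return [[random.uniform(low, high) for _ in range(dim)]
            for _ in range(pop_size)]
```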
What is AssessFitness(P)?
- What it does: Evaluates how good each candidate solution is.
- Heuristics:
○ Defines the objective function being optimized.
○ Can be simple (e.g., error rate) or complex (e.g., simulation outcomes).
○ May include penalties for invalid solutions (constraint handling).
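As a toy illustration (my own example, not from the slides): a fitness function that maximises the negated sphere function and adds a penalty term for constraint violations.

```python
def assess_fitness(candidate):
    """Toy fitness: maximise the negated sum of squares (optimum at the
    origin), with a fixed penalty per gene outside the feasible region
    [-5, 5] -- a simple form of constraint handling."""
    fitness = -sum(x * x for x in candidate)
    penalty = sum(100.0 for x in candidate if abs(x) > 5.0)
    return fitness - penalty
```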
What is Breed(P)?
- What it does: Generates offspring from the current population.
- Heuristics:
○ Selection: Choose which individuals become parents (e.g., tournament, roulette, rank-based).
○ Variation:
§ Mutation: Small random changes (e.g., Gaussian noise).
§ Crossover: Combine parts of two parents.
○ Mutation Rate: Often adaptively tuned (e.g., Rechenberg’s 1/5th rule).
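One possible Breed() combining tournament selection, one-point crossover, and Gaussian mutation; this is an illustrative sketch (parameter names and defaults are my own), not the slides' implementation.

```python
import random

def tournament_select(population, fitnesses, k=2):
    """Tournament selection: pick the fittest of k random individuals."""
    contestants = random.sample(range(len(population)), k)
    winner = max(contestants, key=lambda i: fitnesses[i])
    return population[winner]

def breed(population, fitnesses, n_children, sigma=0.1):
    """Selection + variation: cross two tournament winners at one point,
    then apply Gaussian mutation to every gene (vectors need length >= 2)."""
    children = []
    for _ in range(n_children):
        a = tournament_select(population, fitnesses)
        b = tournament_select(population, fitnesses)
        cut = random.randrange(1, len(a))                    # one-point crossover
        child = a[:cut] + b[cut:]
        child = [g + random.gauss(0.0, sigma) for g in child]  # Gaussian mutation
        children.append(child)
    return children
```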
What is Join(P, Breed(P))?
- What it does: Forms the next generation.
- Strategies:
○ (μ, λ): Keep only offspring — more exploratory.
○ (μ + λ): Combine parents and offspring — more exploitative.
○ May use elitism (keep best solutions always).
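The two Join() strategies differ only in whether parents enter the survivor pool; a minimal sketch (function names are mine):

```python
def join_comma(parents, children, assess, mu):
    """(mu, lambda): next generation drawn from children only -- exploratory."""
    return sorted(children, key=assess, reverse=True)[:mu]

def join_plus(parents, children, assess, mu):
    """(mu + lambda): parents compete with children -- exploitative/elitist."""
    return sorted(parents + children, key=assess, reverse=True)[:mu]
```

Note how `join_plus` can never lose the best solution found so far, while `join_comma` discards even a very fit parent.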
What are the Advantages of Evolutionary Algorithm?
- Explores broadly: Maintains a population, reducing the chance of getting stuck in local optima.
- General-purpose: Works on black-box, noisy, or non-differentiable problems.
- Parallel-friendly: Fitness evaluations can be run in parallel.
- Robust: Handles noise and uncertainty well.
- Flexible: Can incorporate domain knowledge or be hybridised with other methods.
What are the Disadvantages of Evolutionary Algorithm?
- Computationally costly: Evaluating many individuals every generation is expensive.
- Requires tuning: Parameters like population size and mutation rates need careful adjustment.
- Can converge prematurely: Risk of losing diversity and getting stuck in local optima.
- Slower precision: May take longer to refine to an optimal solution.
- Stochastic: Results may vary between runs due to randomness.
What are Evolutionary Strategies?
They are intuitive, biologically-inspired algorithms for optimisation, particularly useful in continuous, high-dimensional, and black-box problems.
What are the Key Concepts in Evolutionary Strategies (ES)?
Truncation Selection
What is Truncation Selection in ES?
- After evaluating all individuals in the population, we select the top-performing (fittest) ones — specifically, the best μ individuals out of λ.
- This is called truncation because we “cut off” the rest — only the top survive.
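Truncation selection is just a sort-and-cut; a minimal sketch:

```python
def truncation_select(population, fitnesses, mu):
    """Keep only the mu fittest individuals; everyone else is cut off."""
    ranked = sorted(zip(fitnesses, population),
                    key=lambda pair: pair[0], reverse=True)
    return [individual for _, individual in ranked[:mu]]
```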
What is Mutation as Tweak in ES?
- Instead of complex recombination (like crossover in genetic algorithms), ES often uses mutation as the main way to generate new solutions.
- Mutation means slightly altering a parent to create variation in the offspring.
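The standard "tweak" for real-valued vectors is Gaussian mutation; a minimal sketch (the default `sigma` is illustrative):

```python
import random

def mutate(parent, sigma=0.1):
    """Gaussian mutation: copy the parent and nudge each gene by
    noise drawn from N(0, sigma^2); the parent is left unchanged."""
    return [gene + random.gauss(0.0, sigma) for gene in parent]
```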
What is the Simplest Form: (μ, λ) Strategy
This is the core ES loop:
Step-by-Step:
1. Start with λ randomly generated individuals
→ These are your initial candidates.
2. Evaluate them using the fitness function
→ Apply AssessFitness() to each individual.
3. Select the top μ individuals
→ These become the parents (via truncation selection).
4. Mutate each parent to create λ offspring
→ Each parent produces λ⁄μ children (even distribution)
→ This is your Breed() step.
5. Replace the parents with the new children
→ All μ parents are discarded. Only children move forward.
→ This is the Join() step.
6. Repeat the process over generations
→ With the hope that each new generation gets closer to an optimal solution.
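The six steps above can be put together in a short Python sketch. This is my own minimal version (maximisation, bounds and defaults are illustrative; tracking the best-ever solution is a common practical addition, since a plain (μ, λ) strategy can otherwise lose it):

```python
import random

def es_mu_comma_lambda(assess, dim, mu=5, lam=20, sigma=0.3, generations=100):
    """Minimal (mu, lambda) evolution strategy; lam must be a multiple
    of mu so each parent gets lam/mu children."""
    # 1. start with lambda random individuals
    population = [[random.uniform(-5, 5) for _ in range(dim)]
                  for _ in range(lam)]
    best = max(population, key=assess)
    for _ in range(generations):
        # 2-3. evaluate, then truncation-select the top mu as parents
        parents = sorted(population, key=assess, reverse=True)[:mu]
        if assess(parents[0]) > assess(best):
            best = parents[0]
        # 4-5. replace all parents with lam/mu mutated children each
        population = [[g + random.gauss(0.0, sigma) for g in p]
                      for p in parents for _ in range(lam // mu)]
    return best
```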
(μ, λ) Strategy Algorithm
REFER TO SLIDES
Why does the (μ, λ) Strategy work?
- Encourages exploration: Parents are not reused, reducing risk of local optima trapping.
- Truncation ensures only high-quality parents breed.
- Mutation introduces new variations in every generation.
What are the Tuning Parameters for (μ, λ) Strategy?
λ – Population (Sampling) Size
μ – Selectivity (Number of Parents)
Mutate() – Variation Operator
What is λ – Population (Sampling) Size
“This is how many candidate solutions (children) we generate per generation.”
Bigger λ = more coverage of the search space (more exploration).
It’s similar to n in steepest ascent hill climbing (number of directions sampled).
But: Bigger λ means higher computational cost.
If λ → ∞, the strategy becomes just random search, since you’re covering the space blindly.
What is μ – Selectivity (Number of Parents)
“This controls how picky we are about who gets to be a parent.”
Smaller μ = more exploitation: focusing only on top performers.
Larger μ = more diversity, allowing weaker candidates to contribute (more exploration).
Too small a μ may cause premature convergence.
What is Mutate() – Variation Operator
“This controls how much randomness we inject into the children.”
Mutation probability and mutation strength determine:
How far the children can move from the parents.
Whether we explore new areas or fine-tune current solutions.
Mutation plays a critical role in avoiding local optima and encouraging discovery.
What are the Advantages of (μ, λ) Strategy?
- Encourages diversity by not reusing parents.
- Useful in early stages of search when you want to explore the space.
- Reduces risk of getting stuck in local optima.
- Simpler and sometimes more parallelisable, since only offspring are evaluated.
What are the Disadvantages of (μ, λ) Strategy?
- Can lose good solutions because parents are thrown away.
- Slower to refine near good solutions.
- May need larger λ to maintain enough diversity for progress.
What is the (μ + λ) Strategy?
“In the (μ + λ) strategy, we start with μ parents, generate λ children, then pick the best μ from both parents and children combined to be the next parents.”
REFER TO SLIDES FOR CODE
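Alongside the slides, here is my own minimal sketch of the (μ + λ) loop; the only change from a (μ, λ) version is that parents enter the selection pool:

```python
import random

def es_mu_plus_lambda(assess, dim, mu=5, lam=20, sigma=0.3, generations=100):
    """Minimal (mu + lambda) evolution strategy (maximisation):
    parents survive into selection, so the best fitness never degrades."""
    # start with mu random parents
    population = [[random.uniform(-5, 5) for _ in range(dim)]
                  for _ in range(mu)]
    for _ in range(generations):
        # each parent produces lam/mu mutated children
        children = [[g + random.gauss(0.0, sigma) for g in p]
                    for p in population for _ in range(lam // mu)]
        # parents compete with children for the mu survivor slots (elitism)
        population = sorted(population + children,
                            key=assess, reverse=True)[:mu]
    return max(population, key=assess)
```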
When to Use (μ + λ) Strategy?
- When you want to preserve good solutions over time (elitism)
- Good if you’re close to convergence and want refinement
- But beware of premature convergence if diversity is lost too early
What are the Advantages of (μ + λ) Strategy?
- Ensures good solutions are not lost (elitism).
- More exploitative — good for fine-tuning and convergence.
- Fitness usually improves or stays stable over generations.
- Often more efficient in late-stage optimisation.
What are the Disadvantages of (μ + λ) Strategy?
- Risk of premature convergence — strong parents can dominate and reduce diversity.
- Less exploratory — may not escape local optima once stuck.
- Needs mechanisms (like mutation or diversity control) to maintain variation.
Comparison Between Evolutionary Strategies
REFER TO SLIDES
What are the main differences between Evolutionary Strategies?
- “In the (μ + λ) strategy, the offspring compete with the parents to survive into the next generation. In (μ, λ), the parents are thrown out and only the children are considered.”
○ This key change makes (μ + λ) more conservative and focused on exploitation, while (μ, λ) is more exploratory.
Exploitation vs Exploration
- “Because (μ + λ) keeps the parents in the running, it’s better at preserving good solutions — but that also means it’s more likely to get stuck in a local optimum.”
○ (μ + λ): Less chance of losing good solutions → More exploitation
○ But: Also more likely to converge prematurely if diversity is lost
====
NOTE:
Think of population size (μ) and offspring count (λ). When these are small, population algorithms reduce to single-state strategies.
- Similar to the three hill climb examples
What are some things to Remember in population based approaches?
- Maintaining Diversity
○ “In population-based algorithms, it’s important to maintain diversity, especially early on. This helps the algorithm explore the space and avoid getting trapped.”
- Gradual Convergence
○ “Over time, it’s normal to reduce diversity to encourage convergence — that is, to focus the search around the best regions we’ve found.”
○ This is called exploitation — zooming in on the best.
- Premature Convergence
○ “But if diversity drops too quickly, we get premature convergence — the population becomes too similar, and we stop exploring. This can cause us to miss better solutions elsewhere in the space.”
○ This is dangerous because we can’t be sure we’ve found a globally good solution — only a local one.
What is Adaptive Mutation?
Typical use
- fixed-length vector of real-valued numbers (“chromosome”)
- mutation performed using “Gaussian Convolution” (Alg. 11)
- recall Gaussian mutation is controlled by σ (or σ²)
- σ is called the mutation rate of the ES
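A sketch of Gaussian Convolution in the bounded resampling form (I am assuming "Alg. 11" follows the version in Luke's Essentials of Metaheuristics; parameter names are mine):

```python
import random

def gaussian_convolution(vector, p, sigma, low, high):
    """Gaussian Convolution: with probability p, add Gaussian noise
    N(0, sigma^2) to each gene, resampling the noise until the mutated
    gene stays inside [low, high]."""
    result = []
    for gene in vector:
        if p >= random.random():
            while True:
                noise = random.gauss(0.0, sigma)
                if low <= gene + noise <= high:
                    break
            gene += noise
        result.append(gene)
    return result
```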
How do you choose the mutation rate in Adaptive Mutation?
- guess?
- run experiments to find a good value for the problem at hand
- run a meta-optimisation!
- decrease σ over time (cf. simulated annealing)
- adaptively change based on some statistic(s) of the system…
Example of Adaptive Mutation - Rechenberg
REFER TO SLIDES
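The slides give the details; a common textbook form of Rechenberg's 1/5th success rule is sketched below (the adjustment factor 0.82 is a conventional choice, not taken from these notes):

```python
def rechenberg_update(sigma, success_rate, factor=0.82):
    """Rechenberg's 1/5th success rule: if more than 1/5 of recent
    mutations improved fitness, steps are too timid -> grow sigma;
    if fewer than 1/5 succeeded, steps overshoot -> shrink sigma."""
    if success_rate > 1 / 5:
        return sigma / factor   # expand the search
    if success_rate < 1 / 5:
        return sigma * factor   # contract the search
    return sigma                # exactly 1/5: leave sigma unchanged
```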
What is Self-Adaptive Mutation?
- “It’s the idea that the mutation parameters themselves (like how much to mutate or how likely) can evolve along with the individuals.”
○ In other words, instead of manually setting or adapting the mutation rate globally, we let each individual carry its own mutation settings — and those settings can mutate and evolve too.
How does Self-Adaptive Mutation work?
- Each individual:
○ Has a solution (e.g., vector of variables).
○ Has a mutation strategy or parameter (e.g., σ — mutation step size).
- When the individual reproduces:
○ It copies and mutates both:
§ Its solution.
§ Its mutation settings.
- Over generations:
○ Good mutation strategies survive and propagate.
○ Bad ones are discarded with poor solutions.
This is what we mean by “mutation operators themselves might mutate.”
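A minimal sketch of one reproduction step, assuming the common log-normal update of the step size (the learning-rate default τ = 1/√n is a standard convention, not from these notes):

```python
import math
import random

def self_adaptive_mutate(solution, sigma, tau=None):
    """Self-adaptive mutation: sigma is part of the genome.
    sigma is mutated first (log-normally, so it stays positive),
    then the solution is mutated with the *new* sigma."""
    if tau is None:
        tau = 1.0 / math.sqrt(len(solution))        # learning rate
    new_sigma = sigma * math.exp(tau * random.gauss(0.0, 1.0))
    new_solution = [g + random.gauss(0.0, new_sigma) for g in solution]
    return new_solution, new_sigma
```

Because offspring inherit the mutated `sigma`, step sizes that produce fit children propagate with them.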
What are the benefits of Self-Adaptive Mutation?
- Encourages dynamic, local adaptation.
- Allows the algorithm to adjust automatically to different parts of the search space.
- Particularly useful for complex or rugged landscapes, such as Rosenbrock’s function (shown on the slides).
What are some important things to consider with Self-Adaptive Mutation?
- “Imagination is the limit” — but don’t go overboard.
- You must justify added complexity:
- Use Occam’s Razor: keep things simple unless complexity clearly helps.
- Only keep the feature if empirical results prove it’s beneficial.