Evolutionary Algorithms Flashcards
Evolutionary Computation
a way to solve an optimization problem using a population of candidate solutions
and the dynamics of genetic evolution (selection, reproduction and mutation).
- It can be stopped at any time and give the current best solution
- It is stochastic so running the same algorithm twice will not generate the same solution twice
Three parts to algorithm:
- Initial population construction
- In iteration: fill mating pool with selected individuals (for reproduction) according to a selection
criterion (you can have repetitive individuals).
- Generate new population from mating pool (evolve and the average normally improves, higher quality solutions in pool)
Randomly generated set of possible solutions is the start of the algorithm (will be poor quality) but must be sufficiently diverse and arge and will act as the basis for better solutions. It evolves through each generation.
Phenotype/Genotype
Genotype – the representation of an individual. Most simple form is string of bits but can also be a tree or
graph. (set of genes)
Phenotype – genotype encoding lead to a solution (translation might be needed from the genotype to the phenotype). the thing encoded by the genotype (physical expression of genes). The performance can be tested on the phenotype.
Fitness
the criterion to be optimized. Its complexity will limit the number of individuals that make up the
population and limit the number of generations that can be computed. It’s a way of selecting candidate
individuals for both survival and reproduction (if not in the mating pool, you die off)
Functions:
Roulette wheel selection
Tournament selection
Roulette weel selection
Each individual take a more less big proportion on the weel, "turn" the wheel and select individual you end up on => Higher chance of being selected if you’re fitter. Can be tuned with Boltzmann or Gibbs: 𝑝𝑖 = 𝑒𝑓𝑖/𝑇 ---------- ∑ 𝑒𝑓𝑖 𝑁 /𝑇
where higher T (temperature) means more diverse (fit is less important)
Tournament selection
from a selection of size n choose the top x fittest candidates and add them to
the mating pool. Smaller tournaments add more diversity in the population → less likely to have very
fit candidates in the tournament. Tuning happens with tournament size k.
Crossover
For reproduction
two parents share their genetic code and generate offspring.
· Single point: cut at one point, take first part of parent 1, take second part of
parent 2 for child 1 and the opposite for child 2
· Double point: do the same as for single point, but then twice, generating
two ‘switch’ points.
· Uniform: for each gene in the genotype, flip a coin for which parent it gets
the gene from.
· If you don’t have string-based format, there is a high probability of creating
infeasible solutions (solution that is not diverse).
Mutation
random changes in the genotype. It helps to create more diverse population, by searching regions
uncovered by the initial population.
- Bit-flipping mutation: randomly selected bit gets flipped OR the genome gets flipped completely.
- Real value mutation:
· Randomly change values to boundary extreme values
· Uniformly generate a random number between the boundaries
· Non-uniformly generate a number between a certain range that is deemed the best
· Gaussian noise addition: add a random number to the outcome/gene.
If you’re not using strings but trees, an option is headless chicken crossover:
generate random subtree without it really having a parent.
Schemata theory
offers insight to why good solutions survive and get combined
to better solutions.
- Schema: way of defining the genes in a general form, where binary
numbers indicate fixed value positions and * indicates the value does not matter.
- Order: the number of significant bits (fixed positions) in the schema
- Defining length: the longest distance between fixed positions in the schema.
Schemas can be used to reason about the disruptive properties of mutation and crossover.
- Surviving mutation: p = (1 – pm)
o(s)
, where pm is the mutation probability of a single bit
· Schemas with higher order have a lower probability of surviving mutation.
- Surviving crossover: p = (1 – pc d(s)/n-1) where pc is the crossover probability, d(s) the defining length and n the number of bits.
· Schemas with longer defining length have a lower probability of surviving crossover.
Thus most likely to survive if: low order and short defining length.
The building blocks theorem
– if you want to have a well performing evolutionary algorithm, you have to identify
and recombine building blocks of high fitness to build entire solutions, where the blocks take the form of
schemas. Guidelines for designing your genotype:
1. Put related genes close together so that they can form schemas with low defining length (i.e. make
them one gene ideally).
2. Make each gene as independent from others as possible
building block have low order and short defining length
Genotype Design (Schema)
Related genes close together to form schema
each gene should be as independent as possible
Crossover Design (Schema)
For designing the crossover: give the crossover operator a low probability of destroying schemas.
Crossover and mutation are meant to search the whole space and combine genes into new solutions.
However, they can destroy the best solutions. Solution → elitism: copy the top x best solutions directly into
the new generation.
Cooperative coevolution
if by design the building blocks are independent, then they can be searched
independently. This is much faster and easier.
If all genes are independent in the population, we can just look at single gene and see how it evolved over the generations. And we can do this for each gene.
SNES
Separable Natural Evolution Strategy (SNES)
one Gaussian
distribution for each gene, where the genotype sample from the population is generated by sampling each
gene separately.
SNES: the means and standard deviations are updated according to a weighted sum of the genomes in the
generated population scaled by utility (the gradient), where fit individuals have a stronger vote in the
gradient. Mutation is covered by sampling each generation, its randomness ensuring diversity and preventing
sub-optimal convergence.
Premature convergence (niche solution)
if a fitness function is not convex but has multiple local optima in
which the algorithm can get stuck in. This can happen if too many individuals become similar. Avoiding
overproduction of similar individuals can be achieved with a niche penalty: adding a penalty to the fitness
evaluation of any group with sufficient similarity (niche radius).
Complex problems
Problem with evolutionary algorithms is that they do not scale well with problem complexity. However, SNES
and building block theorem assume that genes are independent → the problem can be divided into smaller
subproblems. Not the case for intrinsically complex problems.