Lecture 3 Flashcards by Sven Dukker

Why is (standard) crossover not a good idea for real-valued problems?

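Standard crossover only recombines gene values that already exist in the population. For real-valued problems the range of possible values for a single gene is far too large to keep selecting genes from the initialised genotypes. Hence we need to create new gene values based on the genotypes in each generation.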

Explain how Classic Differential Evolution creates new solutions

1. Initialise every gene of every individual with a value inside the range of that gene.
2. To generate new values, perform the following steps for each individual:
3. Pick that individual as the base.
4. Randomly select three other, mutually distinct individuals x0, x1 and x2 from the same population.
5. Compute v = x0 + F(x1 - x2) to create a new temporary solution.
6. Perform uniform crossover between the genes of the base individual and the genes of the temporary solution.
7. Finally, apply selection: the offspring replaces the base individual only if it is at least as fit.
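
The steps above can be sketched in Python as follows. This is a minimal illustration, not the lecture's reference implementation: it assumes minimisation, a population of at least four individuals, and it forces one randomly chosen gene to come from the temporary solution (a common convention that the card does not mention). The names `de_generation`, `fitness` and `forced` are hypothetical.

```python
import random

def de_generation(pop, fitness, F=0.5, Cr=0.9, rng=random):
    """One generation of classic DE (minimisation); returns the new population."""
    n, dim = len(pop), len(pop[0])
    new_pop = []
    for i, base in enumerate(pop):
        # Three mutually distinct individuals, none of them the base individual.
        r0, r1, r2 = rng.sample([j for j in range(n) if j != i], 3)
        # Differential mutation: v = x0 + F * (x1 - x2).
        v = [pop[r0][k] + F * (pop[r1][k] - pop[r2][k]) for k in range(dim)]
        # Uniform crossover: each gene comes from v with probability Cr.
        # Assumption: one random gene is always taken from v.
        forced = rng.randrange(dim)
        trial = [v[k] if (rng.random() < Cr or k == forced) else base[k]
                 for k in range(dim)]
        # Selection: the trial replaces the base only if it is not worse.
        new_pop.append(trial if fitness(trial) <= fitness(base) else base)
    return new_pop
```

Because selection compares each trial only with its own base individual, no slot in the population can get worse from one generation to the next.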

In Classic Differential Evolution, what is the variable ‘Cr’ and how does it operate?

The crossover probability.
For each gene, it is the chance that the value of the newly generated solution is picked over the value of the base solution.

What type of structure is the genotype of a real-valued solution?

Vectors

In Classic Differential Evolution, does the increase of parameter F mean an increase in diversity?

Generally yes.

What are the most influential parameters in Classic Differential Evolution?

Population size
How the individuals are sampled
Scale factor F

What does topology mean for CPSO?

The way that the neighbourhoods are set up; which individual can see which other individual.

What does CPSO mean?

Classic Particle Swarm Optimisation

For what type of problem is the CPSO very efficient?

Many local optima

What is one of the biggest problems with CPSO?

Depending on its parameters, the swarm can either oscillate very heavily or be too rigid. Both result in more generations and more computation time.

What are the three components of CPSO?

inertia-, cognitive-, and social component

Which component(s) of CPSO are influenced by randomness and why?

The cognitive and social components. The randomness is introduced in how heavily each of these components is weighted, which leads to more diversity in the next generation.

In CPSO, what does the variable z_{i,g} mean and in which component does it appear?

It is the best solution ever found in the current population. It appears in the social component.

In CPSO, how are new solutions generated?

The offspring takes the inertia from its parent and adds the (weighted) cognitive and social components. The resulting new velocity is then added to the genotype of its parent.
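
As a rough illustration, the update for a single particle could look like the Python sketch below. The parameter names (`w` for the inertia weight, `c1` and `c2` for the acceleration constants, `p_best` for the particle's own best position, `n_best` for the best position it can see in its neighbourhood) and their default values are assumptions chosen for this example, not values taken from the lecture.

```python
import random

def pso_step(x, v, p_best, n_best, w=0.7, c1=1.5, c2=1.5, rng=random):
    """Return (new_position, new_velocity) for one particle."""
    new_v = []
    for k in range(len(x)):
        # Fresh random weights per gene for the cognitive and social parts.
        r1, r2 = rng.random(), rng.random()
        inertia = w * v[k]                        # keep part of the old velocity
        cognitive = c1 * r1 * (p_best[k] - x[k])  # pull towards own best
        social = c2 * r2 * (n_best[k] - x[k])     # pull towards neighbourhood best
        new_v.append(inertia + cognitive + social)
    # The new velocity is added to the parent's position (genotype).
    new_x = [x[k] + new_v[k] for k in range(len(x))]
    return new_x, new_v
```

Only the cognitive and social terms are multiplied by random numbers; the inertia term is deterministic.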

How many parents does an offspring have in CPSO?

Which parameters for CPSO have the most influence?

swarm size
neighborhood topology
Coefficients (i.e. inertia weight and acceleration constants)

What does a convex combination mean?

If you imagine a triangle, then every convex combination of its three corners lies within that triangle.

Or, in the case of EAs:
A new solution does not wander outside of the solution space (it stays within the region spanned by the solutions it is built from).
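
A small Python sketch of the triangle picture, assuming the usual definition of a convex combination: a weighted sum whose weights are non-negative and sum to 1. The function name `convex_combination` is hypothetical.

```python
def convex_combination(points, weights):
    """Weighted sum of points; weights must be non-negative and sum to 1."""
    assert all(w >= 0 for w in weights)
    assert abs(sum(weights) - 1.0) < 1e-9
    dim = len(points[0])
    # Combine the points coordinate by coordinate.
    return [sum(w * p[k] for w, p in zip(weights, points)) for k in range(dim)]

# Corners of a triangle and one choice of weights.
triangle = [[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]]
inside = convex_combination(triangle, [0.5, 0.25, 0.25])  # [1.0, 1.0]
```

Any other valid choice of weights also yields a point inside (or on the edge of) the triangle.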

Does CPSO consist of convex combinations, and why?

Yes, the new solution is created from the three components of its parent.
These components are weighted with values in [0,1) and thus stay within the search space of the problem.

What is the problem with CPSO?

The algorithm really aims to converge its population fast. Because **this reduces diversity**, it is very likely that the algorithm converges on a local optimum, as the real solution ends up out of reach of the region that the population still searches.

How can you tell that variation considers covariance when looking at the Gaussian plot of that variation?

When the Gaussian shows an angled ellipse.

What does a (standing) ellipse mean in Gaussian variation?

It represents how much each gene of an individual is allowed to vary when the individual is sampled with it. It basically shows the bounds of variation for each gene.

In what EA branch/variant is normal distribution variation central?

Evolution Strategies (ES)

What is the similarity and difference between mutation and crossover?

They are both forms of variation. Crossover uses multiple ‘parents’ to create new solutions.
Mutation uses a probability model on a single parent’s genotype in order to create a new sample.

What are the properties of ES genotypes?

They contain not only a solution but, more importantly, also the parameters that encode its normal distribution (for mutation).

What are the three variants in ES?

1. One variance, no covariances: (𝑥0, 𝑥1, ... , 𝑥ℓ−1, 𝜎)
2. ℓ variances, no covariances: (𝑥0, 𝑥1, ... , 𝑥ℓ−1, 𝜎0, 𝜎1, ... , 𝜎ℓ−1)
3. Full covariance matrix (general ellipsoidal density contours): (𝑥0, 𝑥1, ... , 𝑥ℓ−1, 𝜎0, 𝜎1, ... , 𝜎ℓ−1, 𝛼1, 𝛼2, ... , 𝛼ℓ(ℓ−1)/2) (note: 𝛼𝑖 are rotation angles)

What variant of ES would create ellipsoidal mutation distributions?

Variant 2: ℓ variances, no covariances. (𝑥0, 𝑥1, ... , 𝑥ℓ−1, 𝜎0, 𝜎1, ... , 𝜎ℓ−1) Its ellipsoids are axis-aligned; variant 3 can additionally rotate them.

For the third ES variant, why do we sample angles and not covariances?

Because the covariance matrix must remain positive definite. The variances encode the stretching and the rotation angles encode the orientation, which ensures the covariance matrix is always 'proper': we can sample from it.

For the third ES variant, how many rotation angles do we expect in each genotype?

ℓ(ℓ−1)/2, because each gene has a covariance with every gene apart from itself, and we don't need double definitions (hence the division by 2).
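
A quick way to check the count is to enumerate the unordered pairs of distinct genes, one rotation angle per pair. The helper name `rotation_angle_pairs` is hypothetical; `itertools.combinations` is from the Python standard library.

```python
from itertools import combinations

def rotation_angle_pairs(l):
    """All unordered pairs (i, j) with i < j: one rotation angle per pair."""
    return list(combinations(range(l), 2))

# For l = 4 genes there are 4 * 3 / 2 = 6 pairs, hence 6 rotation angles.
pairs = rotation_angle_pairs(4)
```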

How do we represent the genotype of an individual in ES?

(𝒙, 𝒔), where
* 𝒙 is the vector of **encoded** problem variables and
* 𝒔 is the **encoded** vector of (normal) strategy parameters

In ES, why do the strategy parameters also need to undergo mutation?

This is to better control the search: the distribution can have a high variance in early generations, but we want a low variance in later generations. **"meaning mutation"**

In ES, what does 𝜓(𝒔) represent?

It is the **decoded** covariance matrix that is generated from the strategy parameters (𝒔), such that (𝒙, 𝒔) → (𝒙 + ∆𝒙, 𝒔), where ∆𝒙 ~ 𝒩(0, 𝜓(𝒔)).
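
For the variant with ℓ variances and no covariances, this mapping and the resulting mutation of 𝒙 could be sketched in Python as below. It is an illustration under the assumption that the strategy parameters are standard deviations, so that 𝜓(𝒔) is a diagonal matrix holding their squares. The names `psi_diagonal` and `mutate_solution` are hypothetical.

```python
import random

def psi_diagonal(sigmas):
    """Turn per-gene standard deviations into a diagonal covariance matrix."""
    n = len(sigmas)
    return [[sigmas[i] ** 2 if i == j else 0.0 for j in range(n)]
            for i in range(n)]

def mutate_solution(x, sigmas, rng=random):
    """(x, s) -> (x + dx, s) with dx ~ N(0, psi(s)) for a diagonal psi(s)."""
    # With a diagonal covariance matrix each gene is sampled independently.
    return [xi + rng.gauss(0.0, si) for xi, si in zip(x, sigmas)]
```

With a full covariance matrix the genes can no longer be sampled independently, which is what the rotation angles account for.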

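What is self-adaptation (in ES)?

The mechanism by which an individual controls its own mutability: the strategy parameters are part of the genotype and are mutated along with the solution.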

In ES what does selection influence?

fitness directly, distribution quality indirectly

The mathematical definition of log-normal multiplicatively mutated variance in **ES variant 1**.

𝜎 → max{𝜎𝑒^𝑧, 𝜀_𝜎}, where 𝑧 ~ 𝒩(0, 𝜏0^2)

The mathematical definition of log-normal multiplicatively mutated variance in **ES variant 2 & 3**.

𝜎i → max{𝜎i𝑒^{𝑧+𝑧i}, 𝜀_𝜎}, where 𝑧 ~ 𝒩(0, 𝜏0^2) is drawn once per individual and 𝑧i ~ 𝒩(0, 𝜏"0^2) is drawn separately for each 𝜎i
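
A minimal Python sketch of this log-normal update, assuming one shared draw 𝑧 per individual and one extra draw 𝑧i per strategy parameter. The names `tau_global`, `tau_local` and `eps` stand for the two standard deviations and the lower bound 𝜀_𝜎; they are hypothetical names, and no particular values for them are implied by the cards.

```python
import math
import random

def mutate_sigmas(sigmas, tau_global, tau_local, eps=1e-8, rng=random):
    """Log-normal, multiplicative mutation of the strategy parameters."""
    z = rng.gauss(0.0, tau_global)       # one draw shared by all sigmas
    new_sigmas = []
    for s in sigmas:
        z_i = rng.gauss(0.0, tau_local)  # separate draw for this sigma
        # Multiplying by exp(...) keeps sigma positive; eps is a lower bound.
        new_sigmas.append(max(s * math.exp(z + z_i), eps))
    return new_sigmas
```

Calling it with a single-element list and `tau_local=0.0` reduces it to the update for the variant with one variance.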

What are the reasons to prefer lognormal mutation?

- Standard deviations remain positive
- Median equals one (for 𝜇 = 0)
- Smaller modifications are more likely than larger ones

In ES variant 3, how are the variances and rotation angles mutated?

The variances are mutated log-normally and multiplicatively; the rotation angles are mutated normally and additively (𝛼𝑗 ← 𝛼𝑗 + 𝑧𝑗, where 𝑧𝑗 ~ 𝒩(0, 𝛽^2)).

What is recombination in ES?

The act of creating a **single** offspring by selecting components from the genotypes of the parents.

What is Global intermediary recombination?

Average genotype over all parents

What is Local intermediary recombination?

A convex combination of two parents

What is discrete recombination?

Pick one of the parents.
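
The three recombination operators can be sketched in Python as follows. Two details are assumptions that the cards leave open: local intermediary recombination uses one random weight shared by all genes, and discrete recombination picks a parent separately for each gene. All function names are hypothetical.

```python
import random

def global_intermediary(parents):
    """Average each gene over all parents."""
    n = len(parents)
    return [sum(p[k] for p in parents) / n for k in range(len(parents[0]))]

def local_intermediary(p1, p2, rng=random):
    """Convex combination of two parents (assumed: one shared weight)."""
    a = rng.random()
    return [a * g1 + (1 - a) * g2 for g1, g2 in zip(p1, p2)]

def discrete(parents, rng=random):
    """Copy each gene from one randomly picked parent (assumed: per gene)."""
    return [rng.choice(parents)[k] for k in range(len(parents[0]))]
```

Each call produces a single offspring, in line with the definition of recombination given earlier.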

In ES, what symbol do they use for population size and offspring size?

𝜇 and 𝜆

Does ES have high selection pressure?

Yes. Selection pressure can be set using the parameters (𝜇 and 𝜆), but usually it is set very high. The selection pressure occurs because you take multiple samples for a single variable, of which only the fittest survives.

In what cases do ES variant 1 and 2 excel?

Real-valued problems where there is no strong correlation between variables.

Where does variant 3 of ES excel?

It can be more thorough, but often it does not hold up because of the **slow self-adaptation**.

Why does ES variant 3 have slow self adaptation?

Imagine you have a gene whose variance is a million times larger than that of another gene. When you rotate the covariance matrix (by mutating a rotation angle), the sampling space of that variable becomes incredibly skewed. Hence it takes a very long time for the ES to adapt all these parameters.

(46 cards)