ES-2 Flashcards
Covariance Matrix Adaptation
- continuous optimization of non-linear, non-convex functions
- handles ill-conditioned problems and rugged landscapes, e.g. discontinuities, (sharp) ridges, or local optima
- on convex-quadratic objective functions, setting the covariance matrix of the search distribution to the inverse Hessian matrix is equivalent to rescaling the ellipsoid function into a spherical one
- increase the likelihood of repeating successful steps
What is the objective of CMA?
The final objective of covariance matrix adaptation is to closely approximate the contour lines of the objective function f
Increase the likelihood to repeat successful steps.
What method is similar to CMA?
Learning the covariance matrix in the CMA-ES is similar to learning the inverse Hessian matrix in a quasi-Newton method.
How are offspring generated in CMA-ES?
Offspring are generated by sampling from a multivariate normal distribution.
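The sampling step can be sketched as follows (an illustrative sketch, not a full CMA-ES implementation; the function name and signature are my own):

```python
import numpy as np

def sample_offspring(m, sigma, C, lam, rng=np.random.default_rng(0)):
    """Sample lam offspring x_k ~ m + sigma * N(0, C).

    m: distribution mean, sigma: step size, C: covariance matrix.
    """
    # Factor C = A A^T via eigendecomposition, so A z ~ N(0, C) for z ~ N(0, I)
    eigvals, B = np.linalg.eigh(C)
    A = B @ np.diag(np.sqrt(eigvals))
    z = rng.standard_normal((lam, len(m)))  # z_k ~ N(0, I)
    return m + sigma * z @ A.T              # x_k = m + sigma * A z_k
```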
CMA-ES with convex-quadratic (ellipsoid) objective function.
- any convex-quadratic (ellipsoid) objective function is transformed into the spherical function
What is the purpose of combining the rank-μ update and cumulation?
It handles a wide range of population sizes.
Influence of population size in CMA-ES ?
- Small population sizes usually lead to faster convergence,
- large population sizes help to avoid local optima.
Invariance properties of CMA-ES
- translation, rotation, and scaling of the coordinates
- strictly monotonic transformations of function values
- Invariance to order preserving (i.e. strictly monotonic) transformations of the objective function value. The algorithm only depends on the ranking of function values.
- Invariance to angle preserving (rigid) transformations of the search space (rotation, reflection, and translation) if the initial search point is transformed accordingly.
- Scale invariance if the initial scaling, e.g. σ(0), and the initial search point, m(0), are chosen accordingly.
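The first invariance can be checked in a toy example (not from the flashcards): under a strictly increasing transformation, the ranking of function values, and hence rank-based selection, is unchanged.

```python
import numpy as np

# CMA-ES depends only on the ranking of f-values, so a strictly monotonic
# transformation such as g(f) = exp(f) + 10 changes nothing in the algorithm.
f_values = np.array([3.2, 0.5, 7.1, 1.4])
g_values = np.exp(f_values) + 10.0

assert np.array_equal(np.argsort(f_values), np.argsort(g_values))
```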
(1+1)-CMA-ES
- uses the rank-one update
- maintains C through a factor A with C = AA^T
- on convex-quadratic functions, O(n^2) iterations are required for the covariance matrix to approach the inverse Hessian
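The rank-one update itself is a single line; a minimal sketch (learning rate `c1` is a placeholder value, and the evolution-path cumulation is omitted):

```python
import numpy as np

def rank_one_update(C, p, c1=0.1):
    """Rank-one covariance update: C <- (1 - c1) * C + c1 * p p^T.

    p is the direction of a successful step (or the evolution path);
    c1 is a learning rate. Sketch only -- path update and constants omitted.
    """
    return (1.0 - c1) * C + c1 * np.outer(p, p)
```

The update increases the variance of the search distribution along `p`, which is exactly "increasing the likelihood of repeating successful steps".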
(μ/μ, λ)-CMA-ES
- uses the rank-μ update
- uses the eigendecomposition C = BD^2 B^T to obtain a factor A = BDB^T with C = AA^T (B orthogonal, D diagonal with the square roots of the eigenvalues)
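The factorization step can be verified directly with numpy (a sketch; the function name is my own):

```python
import numpy as np

def factor(C):
    """Return A = B D B^T with C = A A^T.

    C = B D^2 B^T is the eigendecomposition: B orthogonal (eigenvectors),
    D diagonal with D_ii = sqrt(i-th eigenvalue).
    """
    eigvals, B = np.linalg.eigh(C)
    D = np.diag(np.sqrt(eigvals))
    return B @ D @ B.T

C = np.array([[4.0, 1.0], [1.0, 3.0]])
A = factor(C)
assert np.allclose(A @ A.T, C)
```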
What are the two restart variants of CMA-ES?
- IPOP-CMA-ES:
  detects stagnation
  restarts with a doubled population size
- BIPOP-CMA-ES interlaces two restart strategies:
  one strategy uses an increasing population size, the other a small population size
  both strategies receive the same computational budget
What is Differential Evolution?
Differential evolution is a more recent heuristic that uses differences between points in the population as mutation vectors.
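The classic DE/rand/1 mutation can be sketched as follows (a sketch, not a full DE implementation; `F` is the usual scaling factor, the value 0.8 is a common default, not from the flashcards):

```python
import numpy as np

def de_mutation(pop, F=0.8, rng=np.random.default_rng(1)):
    """DE/rand/1 mutation: v_i = x_r1 + F * (x_r2 - x_r3).

    The difference of two randomly chosen population members serves as
    the mutation vector; r1, r2, r3 are distinct and differ from i.
    """
    n = len(pop)
    mutants = []
    for i in range(n):
        r1, r2, r3 = rng.choice([j for j in range(n) if j != i],
                                size=3, replace=False)
        mutants.append(pop[r1] + F * (pop[r2] - pop[r3]))
    return np.array(mutants)
```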
What is Particle Swarm Optimization?
candidate solutions are referred to as particles, the population as a swarm
each particle has a velocity; the previous best position b_i is stored for each particle i
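A single-particle PSO update can be sketched as follows (the coefficients w, c1, c2 are common default values, not from the flashcards):

```python
import numpy as np

def pso_step(x, v, b, g, w=0.7, c1=1.5, c2=1.5, rng=np.random.default_rng(2)):
    """One PSO update for one particle.

    x: position, v: velocity, b: this particle's best position,
    g: the swarm's best position.
    v' = w*v + c1*r1*(b - x) + c2*r2*(g - x);  x' = x + v'
    """
    r1, r2 = rng.random(len(x)), rng.random(len(x))
    v_new = w * v + c1 * r1 * (b - x) + c2 * r2 * (g - x)
    return x + v_new, v_new
```

Note that r1 and r2 scale each coordinate independently; this coordinate-wise randomness is what makes the update depend on the chosen coordinate system.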
Disadvantage of DE and PSO?
Neither DE nor PSO is rotationally invariant.
What is Information Geometric Optimization?
- optimization in a space of probability distributions
- choice of a statistical model
- stochastic relaxation of the objective
- stochastic natural gradient descent, where the gradient estimate is obtained through Monte Carlo estimation
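The Monte Carlo gradient estimation step can be sketched for a Gaussian model via the log-likelihood trick (a sketch of the plain estimate only; IGO additionally applies rank-based utilities and the natural gradient, i.e. Fisher-information preconditioning, which are omitted here):

```python
import numpy as np

def mc_grad_mean(f, m, sigma, n_samples=1000, rng=np.random.default_rng(3)):
    """Monte Carlo estimate of d/dm E_{x ~ N(m, sigma^2 I)}[f(x)].

    Log-likelihood trick: grad = E[ f(x) * (x - m) / sigma^2 ],
    estimated from n_samples draws of the search distribution.
    """
    x = m + sigma * rng.standard_normal((n_samples, len(m)))
    fx = f(x).reshape(-1, 1)
    return (fx * (x - m)).mean(axis=0) / sigma**2
```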