Lecture 7 Flashcards
What kind of distribution do real-valued EDAs require?
A practically useful one, such as the normal distribution
What is ML? (in EDA)
Maximum Likelihood
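As a small illustration of what the ML estimate looks like for a normal distribution, the sketch below fits a one-dimensional normal to a list of numbers. The function name ml_normal is made up for this card, and the one-dimensional setting is an assumption chosen for brevity; a real EDA would usually estimate a mean vector and covariance matrix.

```python
def ml_normal(samples):
    """Maximum-likelihood estimate (mean, variance) of a 1-D normal.

    Hypothetical helper for illustration only.
    """
    n = len(samples)
    mean = sum(samples) / n
    # The ML variance estimate divides by n, not by n - 1.
    var = sum((x - mean) ** 2 for x in samples) / n
    return mean, var


mean, var = ml_normal([1.0, 2.0, 3.0, 4.0])
print(mean, var)
```

Note that the estimate only describes the data it is given; nothing in it points toward better solutions.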
What is the limitation of ML?
Can only model linear-like dependencies.
What is the difference between ES and normal-based EDA?
ES uses normal distributions for self-adaptation of the model. The model is updated implicitly through selection and random mutation.
A normal-based EDA explicitly couples the population to the model-update rules by performing estimation on the direction of improvement.
When can the direct use of ML-normal in an EDA have a positive result?
- The function is unimodal (one peak)
- The function is centered at the origin
- It is easy to converge towards the minimum
What is the big downside of using direct ML-normal in an EDA?
- The structure of the solutions is very complicated and rarely matches a normal distribution
- Improving directions are ignored in MLE
- The EDA does not observe the direction in which the population is moving
- Hence, there is no exploration outside of the data range, so the real optimum can easily be missed.
What would we expect to observe over multiple generations of direct ML-normal in an EDA?
The algorithm tries to find the distribution that best fits the observed data, rather than the best solution in the solution space (skewed initial data stays skewed).
Explain the premature convergence problem
In direct ML-normal on a real-valued EDA, the variance of the estimated normal distribution converges to 0 very quickly, before the search space has been explored.
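Premature convergence can be seen in a toy experiment. The sketch below is not taken from the lecture: it assumes a one-dimensional slope f(x) = x to be minimized (so the optimum lies infinitely far to the left), truncation selection of the best 30%, and a population of 100. The function name and all parameter values are illustrative assumptions.

```python
import math
import random


def direct_ml_normal_eda(generations=40, pop_size=100, tau=0.3, seed=0):
    """Toy EDA: sample, select the best fraction tau, refit by ML, repeat."""
    rng = random.Random(seed)
    mean, var = 0.0, 1.0  # initial model N(0, 1)
    history = []
    for _ in range(generations):
        pop = [rng.gauss(mean, math.sqrt(var)) for _ in range(pop_size)]
        pop.sort()  # minimizing f(x) = x, so smaller values are better
        selected = pop[: int(tau * pop_size)]
        # Direct ML estimate from the selected solutions only.
        mean = sum(selected) / len(selected)
        var = sum((x - mean) ** 2 for x in selected) / len(selected)
        history.append((mean, var))
    return history


history = direct_ml_normal_eda()
final_mean, final_var = history[-1]
print(final_mean, final_var)
```

Under these assumptions the variance shrinks every generation, so the mean stalls after moving only a short distance down the slope, even though improvement is always available further left.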
Why is Gradient Hybridization not a great solution for real-valued EDAs?
It requires gradient information, which is not always available in complex problems and not always reliable.
What are the three ingredients for Adaptive ML estimation?
- Adaptive Variance Scaling (AVS)
- Standard Deviation Ratio (SDR)
- Anticipated Mean Shift (AMS)
What is AVS?
Adaptive Variance Scaling
What is SDR?
Standard Deviation Ratio
What is AMS?
Anticipated Mean Shift
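One way to picture AMS: part of the newly sampled solutions is moved further in the direction the mean travelled over the last generation. The sketch below is an assumption-laden illustration, not the lecture's definition; the function name, the step factor delta = 2.0 and the shifted fraction alpha = 0.5 are hypothetical values chosen for the example.

```python
def anticipated_mean_shift(samples, mean_now, mean_prev, delta=2.0, alpha=0.5):
    """Shift a fraction alpha of the samples by delta * (mean shift).

    1-D illustration; delta and alpha are assumed values.
    """
    shift = mean_now - mean_prev  # how far the mean moved last generation
    n_shift = int(alpha * len(samples))
    moved = [x + delta * shift for x in samples[:n_shift]]
    return moved + list(samples[n_shift:])


shifted = anticipated_mean_shift([0.0, 0.0, 0.0, 0.0], mean_now=1.0, mean_prev=0.5)
print(shifted)
```

Only part of the population is shifted so that the remaining samples still represent the estimated distribution if the anticipated direction turns out to be wrong.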
In SDR-AVS, what is the NIC counter?
It limits the adaptation of the variance in the estimated distribution. When the problem has multiple local optima, SDR-AVS would otherwise take too long to converge to one of them.
Explain what the distribution multiplier does for SDR-AVS.
It enlarges the variance of the distribution from which new samples are drawn.
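A rough sketch of how a distribution multiplier could be updated and used is given below. It is an illustration under stated assumptions rather than the lecture's exact rule: the names update_multiplier and sampling_variance, the decrease factor eta_dec = 0.9, the increase factor 1 / eta_dec and the SDR threshold of 1.0 are all assumed, the setting is one-dimensional, and the NIC counter is left out.

```python
def update_multiplier(c_mult, improved, improving_mean, mean, std,
                      eta_dec=0.9, theta_sdr=1.0):
    """Return the new distribution multiplier (assumed update rule)."""
    eta_inc = 1.0 / eta_dec
    if not improved:
        # No better solution found: shrink the multiplier.
        return c_mult * eta_dec
    c_mult = max(c_mult, 1.0)
    # SDR: distance of the improvements from the mean, in standard deviations.
    sdr = abs(improving_mean - mean) / std
    if sdr > theta_sdr:
        # Improvements lie far from the mean: enlarge the variance.
        c_mult *= eta_inc
    return c_mult


def sampling_variance(ml_var, c_mult):
    """Variance actually used for sampling: the ML variance times the multiplier."""
    return c_mult * ml_var


far = update_multiplier(1.0, True, improving_mean=3.0, mean=0.0, std=1.0)
near = update_multiplier(1.0, True, improving_mean=0.5, mean=0.0, std=1.0)
none = update_multiplier(1.0, False, improving_mean=0.0, mean=0.0, std=1.0)
print(far, near, none, sampling_variance(2.0, far))
```

The multiplier grows only when improvements are found far from the mean, which is the situation in which the ML variance alone would shrink too early.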