Lecture 3 Flashcards
What is the motivation for AutoML?
Democratization
Save human time
Better performance
Does not remove the human, but helps them
CASH (Combined Algorithm Selection and Hyperparameter optimization)
$(A^*, \lambda^*) \in \operatorname*{argmin}_{A \in \mathcal{A},\, \lambda \in \Lambda_A} \mathcal{L}(A_\lambda, D_{\text{train}}, D_{\text{val}})$
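A minimal sketch of solving the CASH objective with random search over a toy joint space of algorithms and hyperparameters; the scikit-learn models, the search space, and the iris split are illustrative assumptions, not the lecture's setup.

```python
import random
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Toy joint space: each entry pairs an algorithm A with its hyperparameter space Lambda_A.
SEARCH_SPACE = {
    KNeighborsClassifier: {"n_neighbors": [1, 3, 5, 7, 9]},
    DecisionTreeClassifier: {"max_depth": [2, 4, 8, None]},
}

def cash_random_search(X_train, y_train, X_val, y_val, n_iter=20, seed=0):
    """Jointly sample (algorithm, hyperparameters) and keep the pair with the lowest validation loss."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_iter):
        algo = rng.choice(list(SEARCH_SPACE))
        lam = {name: rng.choice(values) for name, values in SEARCH_SPACE[algo].items()}
        model = algo(**lam).fit(X_train, y_train)
        loss = 1.0 - model.score(X_val, y_val)   # validation loss L(A_lambda, D_train, D_val)
        if best is None or loss < best[0]:
            best = (loss, algo.__name__, lam)
    return best

X, y = load_iris(return_X_y=True)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)
print(cash_random_search(X_tr, y_tr, X_val, y_val))
```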
Components of Algorithm Selection Framework
Problem Space
Feature Space
Algorithm Space
Evaluation Space
Algorithm Selection Problem
Given a problem instance from the problem space, represented by its features in the feature space, find a selection mapping into the algorithm space such that the chosen algorithm scores best in the evaluation space (see Slide 24).
Average Ranking Method
$r_i = \frac{1}{|D|} \sum_{d \in D} \operatorname{rank}_d(i)$: the average performance rank of algorithm $i$ over all datasets $d \in D$ (lower is better)
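A small sketch of computing average ranks from a performance matrix; the accuracy values are made up for illustration.

```python
import numpy as np
from scipy.stats import rankdata

# Toy accuracy matrix: rows = datasets in D, columns = algorithms (values are illustrative).
accuracy = np.array([
    [0.80, 0.85, 0.78],
    [0.91, 0.88, 0.90],
    [0.70, 0.75, 0.72],
])

# Rank algorithms per dataset (rank 1 = best, so rank the negated accuracies).
ranks_per_dataset = rankdata(-accuracy, axis=1)

# r_i: average rank of algorithm i over all datasets; lower is better.
avg_rank = ranks_per_dataset.mean(axis=0)
print(avg_rank)
print(np.argsort(avg_rank))   # algorithm order suggested by ascending average rank
```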
Greedy defaults
Searching for a set of configurations:
Start with the configuration that performs best aggregated over all tasks (sum, average, or median)
Add the configuration for which the per-task maximum performance of the set, aggregated over tasks, becomes highest
Repeat until the set has the desired size (see the sketch below)
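A sketch of this greedy portfolio construction, assuming a precomputed configurations-by-tasks performance matrix where higher values are better; the matrix and function name are illustrative.

```python
import numpy as np

def greedy_defaults(perf, k):
    """Greedily build a set of k default configurations from a (n_configs, n_tasks) performance matrix.

    Step 1: start with the config that is best on average over all tasks.
    Step 2+: add the config that maximizes the average (over tasks) of the per-task
             maximum performance achieved by the current set plus that config.
    """
    n_configs, _ = perf.shape
    selected = [int(np.argmax(perf.mean(axis=1)))]            # best single performer (average aggregation)
    while len(selected) < k:
        current_best = perf[selected].max(axis=0)             # per-task best within the current set
        gains = [np.maximum(current_best, perf[c]).mean()     # portfolio score if config c were added
                 for c in range(n_configs)]
        for c in selected:                                     # never re-add an already selected config
            gains[c] = -np.inf
        selected.append(int(np.argmax(gains)))
    return selected

# Toy example: 4 candidate configurations evaluated on 3 tasks.
perf = np.array([
    [0.9, 0.2, 0.5],
    [0.1, 0.9, 0.4],
    [0.6, 0.6, 0.6],
    [0.5, 0.5, 0.9],
])
print(greedy_defaults(perf, k=3))
```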
No Free Lunch Theorem
Averaged over all possible learning tasks, all learning algorithms perform equally well.
(Holds in general; not specific to AutoML or meta-learning)
Evaluation on Few Datasets
Pros:
Clear which data was used
Allows detailed study
Good overview
Cons:
Risk of cherry-picking datasets
Unclear whether results generalize
Even then, not all results are revealed
Evaluation on Many Datasets: Challenges
How to select datasets?
The results table becomes too large to inspect
Not all datasets are comparable
Learning curves: why?
Show how different algorithms and configurations learn over time or with more data
Learning curves for Early Stopping
Stop training when the learner is good enough (its curve has converged)
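A minimal early-stopping sketch on a validation learning curve; the patience and tolerance values are illustrative assumptions.

```python
def early_stopping(val_scores, patience=5, min_delta=1e-3):
    """Return the epoch at which to stop: when the validation score has not
    improved by at least min_delta for `patience` consecutive epochs."""
    best, best_epoch = float("-inf"), 0
    for epoch, score in enumerate(val_scores):
        if score > best + min_delta:
            best, best_epoch = score, epoch
        elif epoch - best_epoch >= patience:
            return epoch          # curve has (practically) converged
    return len(val_scores) - 1    # never triggered: train to the end

# Toy curve that plateaus after epoch 5.
curve = [0.60, 0.70, 0.76, 0.80, 0.82, 0.83, 0.83, 0.83, 0.83, 0.83, 0.83, 0.83]
print(early_stopping(curve))
```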
Learning curves for Early Discarding
Discard a learner when its curve indicates it will not reach a good enough performance within the budget
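One simple (assumed) way to implement early discarding: compare a candidate's partial curve against the best curve seen so far at the same budget and discard it if it trails by more than a margin.

```python
def should_discard(partial_curve, best_curve, margin=0.05):
    """Early discarding: stop training a candidate if, at the current budget,
    its validation score trails the best configuration so far by more than `margin`.
    This is a simple stand-in for 'the learner will not reach a good enough score in time'."""
    budget = len(partial_curve) - 1               # last evaluated step of the candidate
    return partial_curve[budget] < best_curve[budget] - margin

best_so_far = [0.60, 0.72, 0.80, 0.84, 0.86]
candidate = [0.55, 0.58, 0.60]                    # evaluated up to budget 2 only
print(should_discard(candidate, best_so_far))     # True: 0.60 < 0.80 - 0.05
```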
Learning curves for Data Acquisition
Stop acquiring more data when the capacity curve has plateaued
Extrapolation on learning curves
Gives an idea of how well a model will perform after more training or with more data
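A sketch of learning-curve extrapolation by fitting a power-law model to the observed part of the curve; the parametric form and the toy numbers are assumptions, not the lecture's method.

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, a, b, c):
    """Common parametric learning-curve model: error decays as a power law in the budget n."""
    return a * np.power(n, -b) + c

# Observed validation errors at small training-set sizes (toy numbers).
sizes = np.array([100, 200, 400, 800, 1600], dtype=float)
errors = np.array([0.40, 0.31, 0.25, 0.21, 0.19])

# Fit the model to the observed part of the curve ...
params, _ = curve_fit(power_law, sizes, errors, p0=[1.0, 0.5, 0.1], maxfev=10000)

# ... and extrapolate to a budget we have not trained on yet.
print(power_law(12800.0, *params))   # predicted error with far more data
```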