C9 Flashcards
why does pre-training work especially well on deeply layered architectures?
the knowledge in the layers goes from generic to specific: the lower layers contain more generic features (such as edges and textures in images) that are well suited for transfer to other tasks, while the higher layers become more task-specific
what are foundation models?
they are large models in a certain field (e.g. image recognition or NLP) that are trained extensively on large datasets; they contain general knowledge that can be specialized for a particular purpose
What is the reason for the interest in meta-learning and transfer learning?
we want to speed up learning a new task by using previous knowledge instead of learning from scratch. In transfer learning, we pretrain the network's parameters on a single task; in meta-learning, we use multiple related tasks.
what is transfer-learning?
networks trained on one dataset are used to speed up training for a different task, possibly with a much smaller dataset
what is meta-learning?
learning how to learn
How is meta-learning different from multi task learning?
In multi-task learning, more than one task is learned from one dataset. The tasks are often related, such as classification tasks over different, but related, classes of images (a minimal example is sketched below).
In meta-learning, both datasets and tasks are different, but not too different. A sequence of datasets and learning tasks is generalized to learn a new (related) task quickly. The aim is learning to learn
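A minimal multi-task sketch in PyTorch, assuming the common setup of a shared trunk with one output head per task (the layer sizes and task contents are made up for illustration):

```python
import torch.nn as nn

class MultiTaskNet(nn.Module):
    """Shared feature trunk with one classification head per task."""
    def __init__(self):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(64, 128), nn.ReLU())
        self.head_a = nn.Linear(128, 10)  # e.g. 10 dog breeds
        self.head_b = nn.Linear(128, 5)   # e.g. 5 cat breeds
    def forward(self, x):
        h = self.trunk(x)           # features shared across tasks
        return self.head_a(h), self.head_b(h)
```

The shared trunk is what ties the tasks together: its weights receive gradients from both heads' losses.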
what is domain adaptation?
needed when there is a change in the data distribution between the training and test datasets, e.g. when items must be recognized against different backgrounds
goal: compensate for the variation between the two data distributions, so that information from a source domain can be reused on a different target domain
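One common way to compensate is to add a penalty on the discrepancy between source and target feature statistics; a minimal sketch of a linear-kernel MMD (maximum mean discrepancy) term, with hypothetical feature arrays (MMD is a standard technique, not one named in this card):

```python
import numpy as np

def mmd_linear(source_feats, target_feats):
    """Squared distance between the mean feature vectors of the two
    domains; adding this term to the task loss pushes the network
    toward domain-invariant features."""
    return np.sum((source_feats.mean(axis=0) - target_feats.mean(axis=0)) ** 2)
```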
what is the difference between meta-learning and machine learning?
machine learning learns the parameters of a function that approximates the data; meta-learning learns hyperparameters of the learning algorithm itself, across tasks
what is few-shot learning?
test whether a learning algorithm can recognize examples from classes of which it has seen only a few examples during training. Prior knowledge from other classes is already present in the network.
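A minimal sketch of how N-way K-shot episodes are typically sampled for few-shot training and evaluation (function and parameter names are illustrative):

```python
import random
from collections import defaultdict

def make_episode(dataset, n_way=5, k_shot=1, n_queries=5):
    """Sample one N-way K-shot episode from a list of (example, label) pairs.
    Returns a support set (for adaptation) and a query set (for evaluation),
    with labels relabeled to 0..n_way-1 within the episode."""
    by_class = defaultdict(list)
    for x, y in dataset:
        by_class[y].append(x)
    classes = random.sample(list(by_class), n_way)
    support, query = [], []
    for new_label, c in enumerate(classes):
        examples = random.sample(by_class[c], k_shot + n_queries)
        support += [(x, new_label) for x in examples[:k_shot]]
        query += [(x, new_label) for x in examples[k_shot:]]
    return support, query
```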
what is the connection between meta-learning and curriculum learning?
Both approaches aim to improve the speed and accuracy of learning by learning from a set of subtasks.
Curriculum learning is thus a form of meta-learning in which the subtasks are ordered from easy to hard or, equivalently, meta-learning is unordered curriculum learning
Zero-shot learning aims to identify classes that it has not seen before. How is that possible?
Attribute-based zero-shot learning uses separate high-level attribute descriptions of the new categories, built from attributes learned on the categories in the training set.
E.g. recognize a red beak because we have learned the concepts “red” and “beak”
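A minimal sketch of attribute-based zero-shot classification, with made-up attribute signatures: a model trained on seen classes predicts per-attribute scores, and the unseen class whose signature best matches the scores is chosen:

```python
import numpy as np

# Hypothetical binary attribute signatures for classes never seen in training.
# Attribute order: [red, beak, stripes]
class_attributes = {
    "cardinal": np.array([1, 1, 0]),
    "zebra":    np.array([0, 0, 1]),
}

def zero_shot_classify(attribute_scores):
    """attribute_scores: per-attribute probabilities from a model trained on
    seen classes. Score each unseen class by rewarding matched attributes
    and penalizing attributes the class should not have."""
    return max(class_attributes,
               key=lambda c: attribute_scores @ class_attributes[c]
                             - attribute_scores @ (1 - class_attributes[c]))

print(zero_shot_classify(np.array([0.9, 0.8, 0.1])))  # -> "cardinal"
```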
is pre-training a form of transfer learning?
yes: some network layers are copied to initialize a network for the new task, followed by fine-tuning to improve performance on the new task, but with a smaller dataset
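A minimal fine-tuning sketch with PyTorch/torchvision (assuming torchvision ≥ 0.13; the 10-class target task is hypothetical): the pre-trained layers are copied and frozen, and only a new classification head is trained:

```python
import torch
import torch.nn as nn
from torchvision import models

# Load a network pre-trained on ImageNet.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the copied layers: the lower layers hold generic features.
for p in model.parameters():
    p.requires_grad = False

# Replace the classification head for the (hypothetical) new task
# and train only the new head's parameters.
num_classes = 10
model.fc = nn.Linear(model.fc.in_features, num_classes)
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```

For more aggressive fine-tuning, the upper (more task-specific) layers can also be unfrozen, usually with a small learning rate.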
what is MAML?
Model-Agnostic Meta-Learning (Finn et al., 2017), a well-known deep meta-learning approach for few-shot learning: it learns a parameter initialization from which a new task can be learned in only a few gradient-descent steps
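A minimal second-order MAML sketch in PyTorch (assuming torch ≥ 2.0 for torch.func.functional_call), on a made-up sine-regression task family with one inner gradient step per task:

```python
import torch
import torch.nn as nn
from torch.func import functional_call

def sample_sine_task():
    # Hypothetical task family: regress y = a*sin(x + b) with random a, b.
    a = torch.rand(1) * 4.9 + 0.1
    b = torch.rand(1) * torch.pi
    def sample(n):
        x = torch.rand(n, 1) * 10 - 5
        return x, a * torch.sin(x + b)
    return sample

net = nn.Sequential(nn.Linear(1, 40), nn.ReLU(), nn.Linear(40, 1))
meta_opt = torch.optim.Adam(net.parameters(), lr=1e-3)
inner_lr, loss_fn = 0.01, nn.MSELoss()

for step in range(1000):           # outer (meta) loop
    meta_opt.zero_grad()
    meta_loss = 0.0
    for _ in range(4):             # meta-batch of tasks
        task = sample_sine_task()
        xs, ys = task(10)          # support set: adapt to the task
        xq, yq = task(10)          # query set: evaluate the adaptation
        params = dict(net.named_parameters())
        inner_loss = loss_fn(net(xs), ys)
        grads = torch.autograd.grad(inner_loss, params.values(),
                                    create_graph=True)  # keep graph: 2nd order
        adapted = {k: p - inner_lr * g
                   for (k, p), g in zip(params.items(), grads)}
        meta_loss = meta_loss + loss_fn(functional_call(net, adapted, (xq,)), yq)
    meta_loss.backward()           # gradient w.r.t. the shared initialization
    meta_opt.step()
```

The outer update improves the initialization so that one inner gradient step already fits a new task well; that is what makes the learned initialization few-shot capable.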
As the diversity of tasks increases, does meta-learning achieve good results?
For tasks that are related, good results are reported; when tasks are less related (such as images of animals from very different species), weaker results are reported.