lesson_11_flashcards
What is structured representation in deep learning?
Explicitly representing relationships between elements such as words, pixels, or graph nodes, in order to model compositional structure across domains like language and vision.
What is a scene graph?
A graph-based representation where nodes are objects or object parts, and edges represent relationships like spatial arrangements or actions.
What are recurrent neural networks (RNNs)?
Neural networks designed for sequential data, maintaining a state vector that represents past inputs while processing sequences of arbitrary length.
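A minimal sketch of the state update behind this card, using toy weights and sizes (all values here are illustrative assumptions, not from any specific model): the state h_t = tanh(W_x x_t + W_h h_{t-1} + b) has a fixed size, yet the loop can consume a sequence of any length.

```python
import math

def rnn_step(x, h, W_x, W_h, b):
    """One recurrent update: combine current input x with previous state h."""
    size = len(h)
    return [
        math.tanh(
            sum(W_x[i][j] * x[j] for j in range(len(x)))
            + sum(W_h[i][j] * h[j] for j in range(size))
            + b[i]
        )
        for i in range(size)
    ]

# Toy parameters: 1-dim input, 2-dim state (hypothetical values).
W_x = [[0.5], [-0.3]]           # input -> state weights
W_h = [[0.1, 0.2], [0.0, 0.4]]  # state -> state weights
b = [0.0, 0.1]

h = [0.0, 0.0]                  # initial state
for x_t in ([1.0], [0.5], [-1.0]):
    h = rnn_step(x_t, h, W_x, W_h, b)
print(len(h))  # state stays 2-dimensional regardless of sequence length
```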
What is the vanishing gradient problem in RNNs?
Gradients become too small during backpropagation through time, making it difficult to learn long-term dependencies.
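A toy illustration of the mechanism on this card: backpropagation through time multiplies the gradient by a recurrent Jacobian factor at every step, so a per-step magnitude below 1 (the 0.5 here is an illustrative assumption) shrinks the long-range signal geometrically.

```python
# Backprop through time repeatedly multiplies by the recurrent Jacobian.
jacobian_factor = 0.5   # illustrative per-step derivative magnitude (< 1)

grad = 1.0
grads = []
for step in range(20):
    grad *= jacobian_factor
    grads.append(grad)

print(grads[0])   # 0.5 after one step
print(grads[-1])  # ~9.5e-07 after 20 steps: the long-range gradient has vanished
```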
What is attention in deep learning?
A mechanism that dynamically focuses on the relevant parts of the input, weighting elements by similarity scores to build better feature representations.
What is the softmax function’s role in attention?
It converts similarity scores into probabilities, enabling weighted summations for attention mechanisms.
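A small sketch of this card with made-up scores and values: softmax turns raw similarity scores into a probability distribution, which then weights a sum over value vectors.

```python
import math

def softmax(scores):
    m = max(scores)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

scores = [2.0, 1.0, 0.1]                  # toy query-key similarity scores
weights = softmax(scores)                 # probabilities: sum to 1
values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Attention output: weighted sum of the value vectors.
output = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(2)]
print(round(sum(weights), 6))  # 1.0 — the weights form a probability distribution
```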
What are transformer architectures?
Models that use attention-based mechanisms, including multi-head attention, to process sequences or unordered sets efficiently.
What is a non-local neural network?
A network that dynamically learns connectivity patterns between data points using attention mechanisms, generalizing beyond local receptive fields.
How are graph neural networks (GNNs) structured?
Nodes represent entities with feature vectors, and edges represent relationships, enabling propagation of information across the graph.
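A sketch of one round of information propagation on a three-node toy graph, using mean aggregation (one common GNN update rule; the graph, features, and `self_weight` mixing factor are illustrative assumptions): each node averages its neighbors' feature vectors and mixes the result with its own.

```python
features = {             # node -> feature vector (toy values)
    "a": [1.0, 0.0],
    "b": [0.0, 1.0],
    "c": [1.0, 1.0],
}
edges = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}  # adjacency lists

def propagate(features, edges, self_weight=0.5):
    """One message-passing round: mean-aggregate neighbors, mix with self."""
    updated = {}
    for node, feats in features.items():
        nbrs = edges[node]
        agg = [sum(features[n][d] for n in nbrs) / len(nbrs)
               for d in range(len(feats))]
        updated[node] = [self_weight * f + (1 - self_weight) * a
                         for f, a in zip(feats, agg)]
    return updated

features = propagate(features, edges)
print(features["a"])  # [0.75, 0.5] — "a" now carries information from "b" and "c"
```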
What is the role of embeddings in GNNs?
They represent nodes or elements as vectors, incorporating local and neighborhood features through attention mechanisms.
What is a sequence-to-sequence (seq2seq) task?
A task where a sequence of inputs is mapped to a sequence of outputs, such as machine translation or speech recognition.
What is the benefit of multi-head attention in transformers?
It allows the model to focus on different aspects of the data simultaneously, improving representation learning.
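A sketch of the idea on this card with toy 4-dim token vectors (all values assumed for illustration): the input is split into two 2-dim heads, each head runs its own scaled dot-product attention so it can attend to a different aspect, and the head outputs are concatenated back to the input width.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def head_attention(queries, keys, values):
    """Scaled dot-product attention for one head."""
    outs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(len(q))
                  for k in keys]
        w = softmax(scores)
        outs.append([sum(wi * v[d] for wi, v in zip(w, values))
                     for d in range(len(values[0]))])
    return outs

# Toy 4-dim tokens; head 1 sees dims 0-1, head 2 sees dims 2-3.
tokens = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
h1 = [t[:2] for t in tokens]
h2 = [t[2:] for t in tokens]

out1 = head_attention(h1, h1, h1)            # each head attends independently...
out2 = head_attention(h2, h2, h2)
multi = [a + b for a, b in zip(out1, out2)]  # ...then outputs are concatenated
print(len(multi[0]))  # 4 — same width as the input
```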
What is an example of a many-to-many task in sequential modeling?
Speech recognition, where an input sequence of sound waves is mapped to an output sequence of words.
What is the application of scene graphs in computer vision?
Scene graphs can describe spatial relationships in images, aiding tasks like object detection, relationship modeling, and image captioning.
How does attention enhance graph representations?
By weighting neighbors dynamically, attention refines node features, enabling context-aware embeddings.