Week 8 High-Level Vision Continued Flashcards
What is the role of attention mechanisms in vision tasks?
Attention mechanisms improve performance on high-level vision tasks by weighting the most relevant image regions or features more heavily than the rest.
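As an illustration of "focusing on important features", here is a minimal sketch of scaled dot-product attention, one common form of attention. The function names (`softmax`, `attend`) and the toy vectors are assumptions made for this example and do not come from the course material.

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    # Score each key by its scaled dot product with the query.
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    # Turn scores into weights that sum to 1.
    weights = softmax(scores)
    # Output is the weighted average of the value vectors.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

# Toy example: three image-region features (hypothetical numbers).
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
weights, output = attend(query, keys, values)
```

The region whose key is most similar to the query receives the largest weight, so it contributes most to the output.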
What does scene graph generation identify?
Scene graph generation identifies objects, attributes, and their relationships in images.
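A scene graph can be pictured as a small data structure. The sketch below is one possible representation, assuming objects are stored by ID and relationships as (subject, predicate, object) triples. The class names and the example scene are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class SceneObject:
    name: str
    attributes: list = field(default_factory=list)

@dataclass
class SceneGraph:
    objects: dict = field(default_factory=dict)    # object ID -> SceneObject
    relations: list = field(default_factory=list)  # (subject ID, predicate, object ID)

    def add_object(self, obj_id, name, attributes=()):
        self.objects[obj_id] = SceneObject(name, list(attributes))

    def add_relation(self, subj, predicate, obj):
        # Only allow relationships between objects already in the graph.
        if subj not in self.objects or obj not in self.objects:
            raise KeyError("both endpoints must be added before relating them")
        self.relations.append((subj, predicate, obj))

    def triples(self):
        # Human-readable (subject, predicate, object) triples.
        return [(self.objects[s].name, p, self.objects[o].name)
                for s, p, o in self.relations]

# Hypothetical scene: a tall man riding a brown horse on grass.
graph = SceneGraph()
graph.add_object(0, "man", ["tall"])
graph.add_object(1, "horse", ["brown"])
graph.add_object(2, "grass", ["green"])
graph.add_relation(0, "riding", 1)
graph.add_relation(1, "on", 2)
```

Calling `graph.triples()` lists the relationships by object name, which is the kind of output a scene graph generator is expected to produce from an image.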
What are graph neural networks (GNNs) used for?
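GNNs operate on graph-structured data (nodes connected by edges) and are commonly used for scene graph tasks, where objects are nodes and relationships are edges.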
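The core idea behind most GNNs is message passing: each node updates its features using the features of its neighbors. The sketch below shows a single round with a fixed 50/50 mix of a node's own features and the mean of its incoming messages. Real GNNs use learned weights instead. The function name, the mixing rule, and the toy graph are assumptions for illustration.

```python
def message_passing_step(features, edges):
    # features: dict mapping node name -> list of floats
    # edges: list of (source, destination) pairs
    incoming = {node: [] for node in features}
    for src, dst in edges:
        incoming[dst].append(features[src])

    updated = {}
    for node, feat in features.items():
        msgs = incoming[node]
        if msgs:
            # Average the incoming messages dimension by dimension.
            mean = [sum(col) / len(msgs) for col in zip(*msgs)]
        else:
            # Nodes with no incoming edges receive a zero message.
            mean = [0.0] * len(feat)
        # Fixed mix of own features and aggregated message (not learned).
        updated[node] = [0.5 * f + 0.5 * m for f, m in zip(feat, mean)]
    return updated

# Hypothetical graph: man -> horse -> grass.
features = {"man": [1.0, 0.0], "horse": [0.0, 1.0], "grass": [0.0, 0.0]}
edges = [("man", "horse"), ("horse", "grass")]
updated = message_passing_step(features, edges)
```

After one round, the "horse" node carries information from "man", and "grass" carries information from "horse". Stacking more rounds lets information travel further across the graph.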
What is human pose estimation?
Human pose estimation predicts the locations of key body points (joints such as elbows, knees, and shoulders); the resulting poses are often used for activity recognition.
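One common way to predict key body points is to have a model output a heatmap per joint and take the highest-scoring location as the keypoint. The sketch below covers only that final decoding step, assuming the heatmaps already exist. The function name, joint names, and heatmap values are hypothetical.

```python
def keypoint_from_heatmap(heatmap):
    # heatmap: 2D list of scores, indexed as heatmap[y][x].
    best_x, best_y, best_score = 0, 0, heatmap[0][0]
    for y, row in enumerate(heatmap):
        for x, score in enumerate(row):
            if score > best_score:
                best_x, best_y, best_score = x, y, score
    return best_x, best_y, best_score

# Hypothetical per-joint heatmaps from a pose model (3 rows x 4 columns).
heatmaps = {
    "left_elbow": [
        [0.0, 0.1, 0.2, 0.0],
        [0.1, 0.3, 0.9, 0.2],
        [0.0, 0.1, 0.2, 0.0],
    ],
    "left_knee": [
        [0.0, 0.0, 0.0, 0.0],
        [0.0, 0.1, 0.0, 0.0],
        [0.7, 0.2, 0.0, 0.0],
    ],
}
pose = {joint: keypoint_from_heatmap(h) for joint, h in heatmaps.items()}
```

Each entry in `pose` holds an (x, y, score) tuple. A downstream activity recognizer would typically consume these coordinates over a sequence of frames.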
What are GANs used for in vision?
GANs (generative adversarial networks) train a generator against a discriminator to produce realistic images from noise, with applications such as style transfer and image inpainting.
What is video captioning?
Video captioning generates natural-language descriptions of video content that reflect how events unfold over time.
What is emotion recognition?
Emotion recognition identifies facial expressions and emotions from images or videos.
How does video scene understanding extend high-level vision?
Video scene understanding extends high-level vision from still images to dynamic content, where objects and their relationships change across frames.
What are multimodal approaches in high-level vision?
Multimodal approaches combine images, text, and audio for richer understanding.
What is the importance of self-supervised learning in high-level vision?
Self-supervised learning methods improve segmentation and representation learning by training on unlabeled data, which reduces the need for manual annotation.