Week 8 High-Level Vision Continued Flashcards

Question 1

Q

What is the role of attention mechanisms in vision tasks?

Answer

A

Attention mechanisms improve performance on high-level vision tasks by weighting the most task-relevant features or image regions more heavily than the rest.
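
To make "focusing" concrete, here is a minimal sketch of scaled dot-product attention for a single query, one common form of attention. The region features, the query, and the function names are hypothetical toy values chosen for illustration; real models learn these vectors.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    # Similarity between the query and each key, scaled by sqrt(dimension).
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    # Output is the weighted average of the value vectors.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

# Hypothetical features for three image regions (assumed, not from the course).
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
query = [4.0, 0.0]

weights, output = attend(query, keys, values)
```

Regions whose keys align with the query (the first and third here) receive most of the weight, so they dominate the output.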

Question 2

Q

What does scene graph generation identify?

Answer

A

Scene graph generation identifies objects, attributes, and their relationships in images.
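
A scene graph can be stored as labeled objects plus (subject, predicate, object) relations. The sketch below shows one possible representation; the class name, field layout, and the example scene are assumptions made for illustration, not a standard API.

```python
from dataclasses import dataclass, field

@dataclass
class SceneGraph:
    # object id -> {"label": str, "attributes": list of str}
    objects: dict = field(default_factory=dict)
    # list of (subject id, predicate, object id)
    relations: list = field(default_factory=list)

    def add_object(self, obj_id, label, attributes=()):
        self.objects[obj_id] = {"label": label, "attributes": list(attributes)}

    def add_relation(self, subject_id, predicate, object_id):
        # Relations may only connect objects that are already in the graph.
        if subject_id not in self.objects or object_id not in self.objects:
            raise KeyError("both endpoints must be added as objects first")
        self.relations.append((subject_id, predicate, object_id))

    def triples(self):
        """Return relations as human-readable (subject, predicate, object) labels."""
        return [
            (self.objects[s]["label"], p, self.objects[o]["label"])
            for s, p, o in self.relations
        ]

# Hypothetical scene: a tall man riding a brown horse.
graph = SceneGraph()
graph.add_object(0, "man", ["tall"])
graph.add_object(1, "horse", ["brown"])
graph.add_relation(0, "riding", 1)
```

Calling `graph.triples()` on this example yields the single triple ("man", "riding", "horse").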

Question 3

Q

What are graph neural networks (GNNs) used for?

Answer

A

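GNNs operate on graph-structured data by passing information between connected nodes; in high-level vision they are commonly used for scene graph tasks.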
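
The core operation in many GNNs is message passing, where each node updates its features using those of its neighbours. The sketch below runs one round of plain mean aggregation on a toy graph. It deliberately omits the learned weights and nonlinearity that a trained GNN layer would apply, and the node names and feature values are hypothetical.

```python
def message_passing_step(features, edges):
    """One round of mean aggregation over directed edges (source, target).

    Each node's new feature vector is the average of its own vector and
    the vectors of all nodes with an edge pointing to it.
    """
    incoming = {node: [vec] for node, vec in features.items()}
    for source, target in edges:
        incoming[target].append(features[source])

    updated = {}
    for node, vectors in incoming.items():
        dim = len(vectors[0])
        updated[node] = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return updated

# Hypothetical node features for objects detected in a scene.
features = {"man": [1.0, 0.0], "horse": [0.0, 1.0], "hat": [0.0, 0.0]}
edges = [("man", "horse"), ("hat", "man")]

updated = message_passing_step(features, edges)
```

After one step, "horse" has absorbed information from "man", which is how relational context spreads through the graph.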

Question 4

Q

What is human pose estimation?

Answer

A

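Human pose estimation predicts the locations of key body points (joints such as shoulders, elbows, and knees); the resulting poses support tasks such as activity recognition.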
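
One common design has the model output a heatmap per body point, with the keypoint read off at the highest-scoring pixel. The sketch below shows only that decoding step, assuming heatmaps are given as nested lists; the joint names and scores are made up for illustration.

```python
def decode_keypoints(heatmaps):
    """Return {joint name: (x, y, score)} from the peak of each heatmap."""
    keypoints = {}
    for name, heatmap in heatmaps.items():
        best = (0, 0, heatmap[0][0])
        for y, row in enumerate(heatmap):
            for x, score in enumerate(row):
                if score > best[2]:
                    best = (x, y, score)
        keypoints[name] = best
    return keypoints

# Hypothetical 2x3 heatmaps (rows are y, columns are x).
heatmaps = {
    "left_wrist": [[0.1, 0.2, 0.1],
                   [0.0, 0.9, 0.3]],
    "nose":       [[0.8, 0.1, 0.0],
                   [0.0, 0.0, 0.1]],
}

keypoints = decode_keypoints(heatmaps)
```

The decoded (x, y) positions can then be fed to a downstream model, for example an activity classifier.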

Question 5

Q

What are GANs used for in vision?

Answer

A

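GANs train a generator to produce realistic images from noise by pitting it against a discriminator that tries to tell real images from generated ones; applications include style transfer and image inpainting.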
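
For reference, the original GAN formulation is usually written as a minimax game between a generator G and a discriminator D. The notation below is the standard textbook form, not taken from this deck: p_data is the distribution of real images and p_z is the noise distribution.

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
```

D is trained to score real images high and generated images low, while G is trained to produce images that D scores as real.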

Question 6

Q

What is video captioning?

Answer

A

Video captioning generates natural-language descriptions of dynamic content, capturing how events unfold over time.

Question 7

Q

What is emotion recognition?

Answer

A

Emotion recognition identifies emotions from images or videos, typically by analyzing facial expressions.

Question 8

Q

How does video scene understanding extend high-level vision?

Answer

A

Video scene understanding expands high-level vision from static images to dynamic content, which requires modeling how a scene changes over time.

Question 9

Q

What are multimodal approaches in high-level vision?

Answer

A

Multimodal approaches combine images, text, and audio for richer understanding.

Question 10

Q

What is the importance of self-supervised learning in high-level vision?

Answer

A

Self-supervised learning methods improve segmentation and representation learning by deriving training signals from unlabeled data.
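
One way to see how unlabeled data can supply its own labels is a rotation-prediction pretext task: rotate each image by a known amount and train a model to predict the rotation. The sketch below only builds the training pairs, using a tiny nested-list "image"; the function names and the example image are hypothetical, and this is one pretext task among many.

```python
def rotate90(image):
    """Rotate a 2D grid (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def make_rotation_examples(image):
    """Return four (rotated image, label) pairs, where label k means k * 90 degrees."""
    examples = []
    current = [list(row) for row in image]
    for label in range(4):
        examples.append((current, label))
        current = rotate90(current)
    return examples

# Hypothetical unlabeled 2x2 "image".
image = [[1, 2],
         [3, 4]]

examples = make_rotation_examples(image)
```

No human annotation is needed: the labels 0 to 3 come for free from the transformation itself, and features learned this way can later be reused for tasks such as segmentation.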

(10 cards)