lesson_9_flashcards
What is image segmentation?
A computer vision task where each pixel of an image is classified into categories, providing a detailed understanding of the scene.
What is the difference between semantic and instance segmentation?
Semantic segmentation assigns a class label to each pixel without distinguishing between objects of the same class, while instance segmentation also assigns a unique ID to each individual object.
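A minimal sketch of the distinction, assuming a tiny image stored as nested lists where 0 is background. The function name `to_semantic`, the instance IDs, and the class labels are all hypothetical and chosen only for illustration.

```python
def to_semantic(instance_ids, instance_classes):
    """Replace each pixel's instance ID with its class label (0 = background)."""
    return [[instance_classes.get(i, 0) for i in row] for row in instance_ids]


# Instance map: two separate objects with IDs 1 and 2 (illustrative values).
instance_ids = [
    [0, 1, 1, 0, 2],
    [0, 1, 1, 0, 2],
]
# Both objects belong to the same (hypothetical) class label 7.
instance_classes = {1: 7, 2: 7}

semantic = to_semantic(instance_ids, instance_classes)
print(semantic)  # both objects now carry label 7 and can no longer be told apart
```

The instance map keeps the two objects separate, while the semantic map only records which class each pixel belongs to.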
What is the purpose of encoder-decoder architectures in image segmentation?
Encoders extract abstract features through downsampling, while decoders upsample to restore spatial resolution for pixel-level classification.
What are single-stage object detectors?
Models like YOLO and SSD that directly predict bounding boxes and class labels for objects in one pass through the network.
What are two-stage object detectors?
Models like Faster R-CNN that first propose regions of interest (ROIs) and then classify each region and refine its bounding box through regression.
What is non-maximum suppression (NMS)?
A post-processing step that removes redundant detections: it keeps the box with the highest confidence score, discards the remaining boxes whose IoU with it exceeds a threshold, and repeats on what is left.
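A minimal sketch of greedy NMS, assuming boxes are given as (x1, y1, x2, y2) tuples and that a box is suppressed when its IoU with a kept box exceeds the threshold. The function names and the default threshold of 0.5 are assumptions for illustration, not a specific library's API.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Return the indices of the boxes that survive greedy suppression."""
    # Visit boxes from highest to lowest confidence.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Drop every remaining box that overlaps the kept box too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2]
```

The second box overlaps the first with an IoU of about 0.68, so it is suppressed; the third box does not overlap and is kept.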
What is ROI pooling in object detection?
A method that max-pools the feature-map region of each region of interest (ROI) into a fixed-size grid, so that regions of any size can be fed into fully connected layers for classification and regression.
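A minimal sketch of ROI max pooling on a single-channel feature map, assuming the ROI is already expressed in integer feature-map coordinates with exclusive end indices. The function name `roi_max_pool` and the bin-boundary rounding are assumptions for illustration; real implementations differ in how they quantize coordinates.

```python
import math


def roi_max_pool(feature_map, roi, output_size=(2, 2)):
    """Max-pool the region roi = (x1, y1, x2, y2) into an out_h x out_w grid."""
    x1, y1, x2, y2 = roi  # x2 and y2 are exclusive
    out_h, out_w = output_size
    roi_h, roi_w = y2 - y1, x2 - x1
    pooled = []
    for by in range(out_h):
        # Row range covered by this bin (floor/ceil keeps every bin non-empty).
        ys = y1 + math.floor(by * roi_h / out_h)
        ye = y1 + math.ceil((by + 1) * roi_h / out_h)
        row = []
        for bx in range(out_w):
            xs = x1 + math.floor(bx * roi_w / out_w)
            xe = x1 + math.ceil((bx + 1) * roi_w / out_w)
            row.append(max(feature_map[y][x]
                           for y in range(ys, ye) for x in range(xs, xe)))
        pooled.append(row)
    return pooled


feature_map = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
print(roi_max_pool(feature_map, (0, 0, 4, 4)))  # [[5, 7], [13, 15]]
print(roi_max_pool(feature_map, (1, 1, 4, 4)))  # [[10, 11], [14, 15]]
```

Both ROIs produce a 2 x 2 output even though one covers a 4 x 4 region and the other a 3 x 3 region.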
What are anchor boxes in object detection?
Predefined bounding boxes of different scales and aspect ratios used to detect objects at various sizes and positions.
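A minimal sketch of generating anchor boxes at one location, assuming each anchor has area scale squared and that the aspect ratio means width divided by height. The function name `make_anchors` and the default scales and ratios are hypothetical values for illustration.

```python
import math


def make_anchors(cx, cy, scales=(32, 64), ratios=(0.5, 1.0, 2.0)):
    """Return (x1, y1, x2, y2) anchors centred at (cx, cy)."""
    anchors = []
    for scale in scales:
        for ratio in ratios:
            # Keep the area at scale**2 while setting width / height = ratio.
            w = scale * math.sqrt(ratio)
            h = scale / math.sqrt(ratio)
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


for anchor in make_anchors(50, 50):
    print(tuple(round(v, 1) for v in anchor))
```

A detector would typically repeat this at every feature-map location and predict offsets relative to each anchor.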
What is mean average precision (mAP)?
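A performance metric for object detection: average precision (AP), the area under the precision-recall curve, is computed for each category and then averaged across categories (and, in some benchmarks, across several IoU thresholds).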
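A minimal sketch of the computation, assuming detections have already been matched to ground truth so that each one is marked as a true or false positive. It uses all-point interpolation of the precision-recall curve, which is one common convention among several; the function names and input layout are assumptions for illustration.

```python
def average_precision(detections, num_gt):
    """detections: list of (score, is_true_positive); num_gt: ground-truth count."""
    if num_gt == 0:
        return 0.0
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_gt)
        precisions.append(tp / (tp + fp))
    # Precision envelope: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the area under the stepwise precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class):
    """per_class: dict mapping class name -> (detections, num_gt)."""
    aps = [average_precision(d, n) for d, n in per_class.values()]
    return sum(aps) / len(aps) if aps else 0.0


per_class = {
    "cat": ([(0.9, True), (0.8, False), (0.7, True)], 2),
    "dog": ([(0.9, True), (0.8, True)], 2),
}
print(mean_average_precision(per_class))
```

Here the "cat" class has one false positive ranked between two true positives, which lowers its AP below that of the "dog" class.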
What are transposed convolutions?
Learnable upsampling layers, sometimes loosely called deconvolutions, that increase the spatial resolution of feature maps. They apply the transpose of a convolution's linear operation rather than its mathematical inverse.
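A minimal one-dimensional sketch, assuming no padding and a single channel: each input value scatters a scaled copy of the kernel into the output, and the stride controls how far apart those copies land. The function name `conv_transpose_1d` is hypothetical.

```python
def conv_transpose_1d(x, kernel, stride=2):
    """Upsample x by scattering scaled copies of kernel, stride positions apart."""
    out_len = (len(x) - 1) * stride + len(kernel)
    out = [0.0] * out_len
    for i, value in enumerate(x):
        for k, weight in enumerate(kernel):
            # Overlapping contributions are summed.
            out[i * stride + k] += value * weight
    return out


print(conv_transpose_1d([1, 2, 3], [1, 1], stride=2))
# [1.0, 1.0, 2.0, 2.0, 3.0, 3.0]
```

With stride 2 and a kernel of ones, the three-element input becomes a six-element output; in a trained network the kernel weights are learned rather than fixed.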
How do single-stage and two-stage detectors compare?
Single-stage detectors are typically faster but less accurate, while two-stage detectors are typically slower but more accurate.
What is Mask R-CNN?
An extension of Faster R-CNN that adds a branch for pixel-level segmentation, enabling instance segmentation.
What is the role of skip connections in UNet architectures?
They transfer high-resolution feature maps from the encoder to the decoder, preserving spatial information for better segmentation.
What is intersection over union (IoU)?
A metric used in object detection to measure the overlap between predicted and ground-truth bounding boxes: the area of their intersection divided by the area of their union, ranging from 0 (no overlap) to 1 (identical boxes).
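A minimal sketch of the calculation, assuming axis-aligned boxes given as (x1, y1, x2, y2) tuples with x1 < x2 and y1 < y2. The function name `iou` and the box format are assumptions for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    # Corners of the overlapping rectangle, if there is one.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes have no intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1/7, about 0.143
print(iou((0, 0, 2, 2), (0, 0, 2, 2)))  # 1.0
print(iou((0, 0, 1, 1), (5, 5, 6, 6)))  # 0.0
```

In the first case the boxes share a 1 x 1 square and their union covers 7 units of area, so the IoU is 1/7.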
What is the role of feature pyramids in object detection?
They improve multi-scale detection by combining features from different layers, enabling detection of small and large objects.