Lecture 12 - Detection Flashcards
What is object detection?
Object detection is the process of identifying and locating objects within an image, typically providing a bounding box and a class label for each detected object.
Explain the sliding window approach in object detection.
The sliding window approach slides a fixed-size window across every position in the image and repeats the scan at several scales (by resizing either the image or the window), applying a classifier to each window to decide whether it contains an object.
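A minimal sketch of the idea, assuming NumPy arrays as images and a hypothetical `score_fn` that stands in for the classifier. The window size, stride, threshold, and the crude 2x downsampling are illustrative choices, not values from the lecture.

```python
import numpy as np

def sliding_windows(image, window=(64, 64), stride=32):
    """Yield (x, y, patch) for every window position that fits in the image."""
    win_h, win_w = window
    height, width = image.shape[:2]
    for y in range(0, height - win_h + 1, stride):
        for x in range(0, width - win_w + 1, stride):
            yield x, y, image[y:y + win_h, x:x + win_w]

def downscale_by_two(image):
    """Crude 2x downsampling: keep every other row and column."""
    return image[::2, ::2]

def detect(image, score_fn, window=(64, 64), stride=32, threshold=0.5):
    """Score every window at every pyramid level.

    score_fn is a hypothetical classifier: patch -> score in [0, 1].
    Returns (x, y, w, h, score) tuples in original-image coordinates.
    """
    win_h, win_w = window
    detections = []
    level, scale = image, 1
    while level.shape[0] >= win_h and level.shape[1] >= win_w:
        for x, y, patch in sliding_windows(level, window, stride):
            score = score_fn(patch)
            if score >= threshold:
                # Map the window back to the original image resolution.
                detections.append((x * scale, y * scale,
                                   win_w * scale, win_h * scale, score))
        level = downscale_by_two(level)
        scale *= 2
    return detections
```

Shrinking the image while keeping the window fixed lets one classifier find larger objects. The many overlapping detections this produces are usually pruned afterwards with non-maximum suppression.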
What is the Viola-Jones face detector?
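The Viola-Jones face detector is a real-time object detection framework that uses Haar-like features computed from integral images for fast feature evaluation, boosting (AdaBoost) for feature selection, and a cascade of classifiers that rejects most non-face windows early, keeping the detection rate high at low computational cost.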
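The integral-image trick can be sketched as follows: once the summed-area table is built, the sum over any rectangle takes four lookups, so a Haar-like feature costs the same at every size. The function names here are illustrative, and the two-rectangle feature is only one of several feature types.

```python
import numpy as np

def integral_image(image):
    """Summed-area table with a zero row and column prepended.

    ii[r, c] holds the sum of image[:r, :c].
    """
    ii = np.zeros((image.shape[0] + 1, image.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = image.astype(np.int64).cumsum(axis=0).cumsum(axis=1)
    return ii

def box_sum(ii, top, left, height, width):
    """Sum of image[top:top+height, left:left+width] using four lookups."""
    bottom, right = top + height, left + width
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

def two_rect_haar(ii, top, left, height, width):
    """Two-rectangle Haar-like feature: left half minus right half.

    Assumes width is even so the halves have equal size.
    """
    half = width // 2
    left_sum = box_sum(ii, top, left, height, half)
    right_sum = box_sum(ii, top, left + half, height, half)
    return left_sum - right_sum
```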
Describe the generalized Hough transform.
The generalized Hough transform detects arbitrary shapes by letting each edge point in the image vote in a parameter space (for example, the position of the shape's reference point) for every shape instance it could belong to; peaks in the accumulator array correspond to detected shape instances.
What are region-based methods in deep learning for object detection?
Region-based methods, such as R-CNN, Fast R-CNN, and Faster R-CNN, involve generating region proposals, extracting features from these regions, and classifying them to detect objects.
Explain the YOLO (You Only Look Once) detection method.
YOLO is a real-time object detection method that divides the input image into a grid, with each grid cell predicting bounding boxes, confidence scores, and class probabilities for objects whose centers fall within the cell.
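A small sketch of how a ground-truth box is assigned to its responsible grid cell. The function name is hypothetical, and the default `grid_size=7` is an assumption taken from the original YOLO paper rather than from these notes.

```python
def responsible_cell(cx, cy, image_w, image_h, grid_size=7):
    """Find the grid cell that contains a box centre (cx, cy) in pixels.

    Returns (row, col, x_offset, y_offset), where the offsets give the
    centre position inside the cell as a fraction of the cell size.
    """
    gx = cx / image_w * grid_size  # centre in grid units
    gy = cy / image_h * grid_size
    # Clamp so a centre on the right or bottom edge stays in the last cell.
    col = min(int(gx), grid_size - 1)
    row = min(int(gy), grid_size - 1)
    return row, col, gx - col, gy - row
```

For a 448x448 image and a box centred at (224, 100), this gives row 1, column 3, so that cell's predictors are the ones trained to output the box.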
What is a Feature Pyramid Network (FPN)?
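An FPN is a deep learning architecture that builds a feature pyramid from a single-scale input image by combining coarse, semantically strong features from deep layers of a convolutional network with high-resolution features from shallower layers (through a top-down pathway and lateral connections), enabling object detection at multiple scales.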
Describe instance segmentation.
Instance segmentation is a task that combines object detection and semantic segmentation, providing a pixel-wise mask for each detected object, distinguishing between different instances of the same class.
What is pose estimation?
Pose estimation involves detecting and predicting the spatial configuration of an object’s key points, such as joints in a human body, often used for applications like action recognition and augmented reality.
Explain 3D object detection.
3D object detection involves identifying and localizing objects in three-dimensional space, often providing 3D bounding boxes or poses, and is used in applications like autonomous driving and robotics.
What is the purpose of non-maximum suppression in object detection?
Non-maximum suppression removes redundant bounding boxes for the same object: it keeps the box with the highest confidence score, discards every other box whose overlap (IoU) with it exceeds a threshold, and repeats the process on the remaining boxes until none are left.
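A minimal sketch of greedy non-maximum suppression, assuming boxes are given as `[x1, y1, x2, y2]` with positive area and that all boxes belong to one class. The 0.5 threshold is an illustrative default.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an (N, 4) array of boxes in [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def nms(boxes, scores, iou_threshold=0.5):
    """Return the indices of the boxes kept, highest score first."""
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores)[::-1]  # indices sorted by descending score
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        # Drop every remaining box that overlaps the kept box too much.
        overlaps = iou(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_threshold]
    return keep
```

In multi-class detectors this is typically run separately per class, so that overlapping objects of different classes do not suppress each other.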
Describe the concept of anchor boxes in object detection.
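Anchor boxes are predefined reference boxes of different scales and aspect ratios placed at every location of a feature map. Detectors such as the region proposal network (RPN) in Faster R-CNN predict, for each anchor, an objectness or class score and offsets that refine the anchor into a bounding box, which lets them detect objects at multiple scales and locations within an image.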
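Anchor generation can be sketched as below. The scales, aspect ratios, and stride are illustrative defaults in the spirit of Faster R-CNN, and the function names are hypothetical; here a ratio means height divided by width, and each anchor has area `scale**2`.

```python
import numpy as np

def make_anchors(scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Base anchors centred on the origin, as [x1, y1, x2, y2]."""
    anchors = []
    for scale in scales:
        for ratio in ratios:
            w = scale / np.sqrt(ratio)  # ratio = h / w, area = scale**2
            h = scale * np.sqrt(ratio)
            anchors.append([-w / 2, -h / 2, w / 2, h / 2])
    return np.array(anchors)

def tile_anchors(base_anchors, feat_h, feat_w, stride=16):
    """Copy the base anchors to the centre of every feature-map cell."""
    xs = (np.arange(feat_w) + 0.5) * stride
    ys = (np.arange(feat_h) + 0.5) * stride
    cx, cy = np.meshgrid(xs, ys)
    # One shift per cell, applied to both corners of every base anchor.
    shifts = np.stack([cx, cy, cx, cy], axis=-1).reshape(-1, 1, 4)
    return (shifts + base_anchors[None, :, :]).reshape(-1, 4)
```

With three scales and three ratios, a feature map of size H x W yields 9 * H * W anchors, each of which receives its own score and box offsets.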
Write the formula for the confidence score in YOLO.
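No answer is recorded on this card. As defined in the original YOLO paper (the lecture may use slightly different notation), the confidence of a predicted box is the probability that the cell contains an object multiplied by the IoU between the predicted box and the ground truth. It should be zero when no object is present.

```latex
\[
C = \Pr(\text{Object}) \cdot \mathrm{IoU}^{\text{truth}}_{\text{pred}}
\]
% At test time the class-specific score of a box is
\[
\Pr(\text{Class}_i \mid \text{Object}) \cdot \Pr(\text{Object}) \cdot \mathrm{IoU}^{\text{truth}}_{\text{pred}}
= \Pr(\text{Class}_i) \cdot \mathrm{IoU}^{\text{truth}}_{\text{pred}}
\]
```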
Provide the formula for the weighted error in AdaBoost.
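No answer is recorded on this card. In the standard formulation of AdaBoost (the notation is an assumption and the lecture may use a variant), the weighted error of the weak classifier $h_t$ chosen in round $t$ is the weighted fraction of the $N$ training examples $(x_i, y_i)$ it misclassifies, where $w_i^{(t)}$ is the current weight of example $i$.

```latex
\[
\varepsilon_t
= \frac{\sum_{i=1}^{N} w_i^{(t)} \, \mathbb{1}\!\left[ h_t(x_i) \neq y_i \right]}
       {\sum_{i=1}^{N} w_i^{(t)}}
\]
```

If the weights are kept normalized so that they sum to 1, the denominator disappears.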
What is the formula for updating weights in AdaBoost?
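No answer is recorded on this card. In the standard formulation of AdaBoost with labels $y_i \in \{-1, +1\}$ (the notation is an assumption and the lecture may use an equivalent variant), the weak classifier $h_t$ with weighted error $\varepsilon_t$ receives a vote $\alpha_t$, and the example weights are then rescaled so that misclassified examples gain weight and correctly classified ones lose it.

```latex
\[
\alpha_t = \frac{1}{2} \ln \frac{1 - \varepsilon_t}{\varepsilon_t},
\qquad
w_i^{(t+1)} = \frac{w_i^{(t)} \exp\!\left( -\alpha_t \, y_i \, h_t(x_i) \right)}{Z_t},
\qquad
Z_t = \sum_{j=1}^{N} w_j^{(t)} \exp\!\left( -\alpha_t \, y_j \, h_t(x_j) \right)
\]
```

Here $Z_t$ is the normalizer that makes the new weights sum to 1.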