Detection Flashcards

Question 1

Q

Name two region proposal approaches used in object detection and explain both.

Answer

A

Selective Search: Generates candidate object regions by first over-segmenting the image and then iteratively merging neighboring segments based on color, texture, size, and fill similarity. It is not learned, and as used in R-CNN it yields roughly 2,000 region proposals per image.

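Region Proposal Network (RPN): A deep learning-based method introduced with Faster R-CNN (the original R-CNN relies on Selective Search instead). A small convolutional network slides over the shared feature map and predicts an objectness score and box offsets for each anchor. Because it is integrated into the detection pipeline, proposals and detection can be trained end-to-end.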

Question 2

Q

Region Proposal Networks (RPN) are an essential part of detection algorithms such as Faster R-CNN and Mask R-CNN. Explain the RPN architecture and outline how it is trained.

Answer

A

The RPN is a fully convolutional network. Its input is the convolutional feature map produced by a backbone network. A 3x3 convolutional layer slides over this feature map (the sliding-window step), and its output feeds two sibling 1x1 convolutional layers: a classification layer that predicts the objectness of each anchor at that location (the probability that it contains an object), and a regression layer that predicts the bounding box coordinates as offsets relative to each anchor.

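The RPN is trained end-to-end with a multi-task loss that combines a classification loss on the objectness scores and a regression loss on the box coordinates. Anchors are labeled positive or negative according to their IoU with the ground-truth boxes, and the regression loss is applied only to positive anchors.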
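The anchor labeling step used to build the RPN training targets can be sketched roughly as follows. This is a minimal illustration, not a full implementation: the thresholds 0.7 and 0.3 are the values reported in the Faster R-CNN paper, the function names `iou` and `label_anchors` are hypothetical, and the paper's additional rule (the anchor with the highest IoU for each ground-truth box is also positive) is left out for brevity.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def label_anchors(anchors, gt_boxes, pos_thresh=0.7, neg_thresh=0.3):
    """Return 1 (object), 0 (background) or -1 (ignored) for each anchor."""
    labels = []
    for anchor in anchors:
        # Best overlap of this anchor with any ground-truth box.
        best = max((iou(anchor, gt) for gt in gt_boxes), default=0.0)
        if best >= pos_thresh:
            labels.append(1)    # contributes to classification and regression loss
        elif best < neg_thresh:
            labels.append(0)    # contributes to classification loss only
        else:
            labels.append(-1)   # ambiguous overlap, excluded from the loss
    return labels


gt = [(10, 10, 50, 50)]
anchors = [(10, 10, 50, 50), (30, 30, 70, 70), (100, 100, 140, 140), (10, 10, 50, 30)]
print(label_anchors(anchors, gt))  # [1, 0, 0, -1]
```

The last anchor overlaps the ground-truth box with IoU 0.5, so it is neither clearly positive nor clearly negative and is ignored during training.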

Question 3

Q

Apart from supervised proposal mechanisms such as the RPN, there also exist unsupervised methods. Name two unsupervised region proposal mechanisms and briefly explain the idea behind them.

Answer

A

Sliding Window: A fixed-size window is moved across the image, usually at several scales and aspect ratios, and each position is treated as a candidate region that is evaluated for objectness.

Selective Search: Segments the image into many small regions and iteratively merges them based on their similarity to generate object proposals.
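The sliding-window idea is simple enough to sketch in a few lines. The function name `sliding_windows` and its parameters are hypothetical and chosen only for illustration; scoring each window for objectness would require a separate classifier, which is not shown here.

```python
def sliding_windows(img_w, img_h, win_w, win_h, stride):
    """Yield (x1, y1, x2, y2) windows that fit fully inside the image."""
    for y in range(0, img_h - win_h + 1, stride):
        for x in range(0, img_w - win_w + 1, stride):
            yield (x, y, x + win_w, y + win_h)


# A tiny 8x6 "image" with a 4x4 window moved in steps of 2 pixels.
windows = list(sliding_windows(img_w=8, img_h=6, win_w=4, win_h=4, stride=2))
print(len(windows))  # 6
print(windows[0], windows[-1])  # (0, 0, 4, 4) (4, 2, 8, 6)
```

Even this toy case shows why the approach is expensive: the number of candidate windows grows quickly with image size, and the loop has to be repeated for every window scale and aspect ratio.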

Question 4

Q

Many recent detection algorithms (YOLOv3, SSD, RetinaNet) do not utilize region proposals. What are these methods called, and what basic idea is used to make bounding box predictions?

Answer

A

Single-stage (single-shot) detectors. The basic idea is to use convolutional layers to predict object classes and bounding box offsets for a dense set of predefined anchors in a single forward pass, without a separate region proposal step.
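To make "bounding box offsets for the anchors" concrete, the sketch below decodes a predicted offset vector into a box. It assumes the R-CNN-style parameterization (center shifts scaled by the anchor size, log-space width and height); YOLOv3 uses a different encoding, and real implementations often add scaling factors. The function name `decode` is hypothetical.

```python
import math


def decode(anchor, offsets):
    """Apply (tx, ty, tw, th) offsets to an anchor given as (cx, cy, w, h)."""
    cx_a, cy_a, w_a, h_a = anchor
    tx, ty, tw, th = offsets
    cx = cx_a + tx * w_a        # shift the center by a fraction of the anchor width
    cy = cy_a + ty * h_a        # shift the center by a fraction of the anchor height
    w = w_a * math.exp(tw)      # scale the width in log space
    h = h_a * math.exp(th)      # scale the height in log space
    return (cx, cy, w, h)


anchor = (50.0, 50.0, 20.0, 40.0)
# Zero offsets leave the anchor unchanged.
print(decode(anchor, (0.0, 0.0, 0.0, 0.0)))
# Move right by half a width, up by a quarter height, and double the width.
print(decode(anchor, (0.5, -0.25, math.log(2), 0.0)))
```

Predicting offsets relative to anchors, rather than absolute coordinates, keeps the regression targets small and comparable across box sizes.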

Question 5

Q

The Region Proposal Network is used to extract object proposals in the Faster R-CNN architecture. Given that the number of predefined anchors is k and the input feature map size is CxWxH (C: channels, W: width, H: height), how many proposals can be obtained?

Answer

A

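Number of proposals = k * W * H (k anchors at each of the W x H feature map locations; the number of channels C does not affect the count).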
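The count can be checked by enumerating the anchors directly. This is a small sketch under stated assumptions: the function name `generate_anchors`, the stride of 16, and the nine anchor shapes are illustrative choices, not values taken from the card.

```python
def generate_anchors(width, height, anchor_shapes, stride=16):
    """Place one anchor (cx, cy, w, h) per shape at every feature-map location."""
    anchors = []
    for y in range(height):
        for x in range(width):
            # Center of this feature-map cell in input-image coordinates.
            cx, cy = (x + 0.5) * stride, (y + 0.5) * stride
            for w, h in anchor_shapes:
                anchors.append((cx, cy, w, h))
    return anchors


# k = 9 illustrative anchor shapes (3 sizes x 3 width/height variants).
shapes = [(s * fw, s * fh) for s in (128, 256, 512)
          for fw, fh in ((1.0, 1.0), (2.0, 0.5), (0.5, 2.0))]

W, H = 4, 3
anchors = generate_anchors(W, H, shapes)
print(len(shapes), len(anchors))  # 9 108
```

This is the number of raw proposals before any filtering; in practice the list is reduced afterwards, for example by score thresholding and non-maximum suppression.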

Question 6

Q

Explain the improvements of Faster R-CNN over R-CNN and Fast R-CNN.

Answer

A

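Faster R-CNN uses an RPN instead of Selective Search. R-CNN runs the CNN separately on every Selective Search proposal, and Fast R-CNN shares the convolutional computation across proposals through RoI pooling but still depends on Selective Search, which becomes the bottleneck. Because the RPN shares convolutional features with the detection head, proposals become cheap to compute and the whole detector can be trained end-to-end, making the process faster and more efficient.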

Question 7

Q

Name two differences in the object detection pipelines of R-CNN and SSD.

Answer

A

Region Proposal:
R-CNN uses Selective Search for region proposals. SSD has no proposal step; it places fixed anchor (default) boxes of various aspect ratios and scales at every location of several feature maps, covering the entire image.

Training Approach:
R-CNN trains its learned stages separately (CNN fine-tuning, the per-class SVM classifiers, and the bounding box regressors); the Selective Search proposals themselves are not learned.
SSD trains the entire network end-to-end in a single stage, optimizing object class scores and bounding box regressions simultaneously.

Question 8

Q

Write the formulas for precision, recall, and F1, and name an application.

Answer

A

P = TP / (TP + FP)
R = TP / (TP + FN)
F1 = 2 * P * R / (P + R)

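These metrics are used to evaluate classification tasks in computer vision, especially binary classification. In object detection, a prediction counts as a true positive when its IoU with a ground-truth box exceeds a chosen threshold.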
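A short worked example of the three formulas, written as a minimal sketch. The function name `precision_recall_f1` is hypothetical, and returning 0.0 when a denominator is zero is a convention assumed here, not something the formulas prescribe.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from raw counts."""
    # Assumed convention: return 0.0 when a denominator would be zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# 8 correct detections, 2 false alarms, 8 missed objects.
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=8)
print(f"P={p:.3f} R={r:.3f} F1={f1:.3f}")  # P=0.800 R=0.500 F1=0.615
```

The example shows why F1 is useful: precision alone looks good at 0.8, but the harmonic mean is pulled down by the low recall.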
