Object detection Flashcards

Question 1

Q

What is object detection?

Answer

A

Object detection locates the objects in an image with bounding boxes and classifies each one.

Question 2

Q

How can we do object detection using sliding windows?

Answer

A

Run a classifier on every sliding window, labelling each window as an object class or as background.
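A minimal sketch of the sliding-window idea, assuming a grayscale image stored as a NumPy array and a hypothetical `classify` callable that returns a label string for a crop. The window size and stride are illustrative defaults, not values from the card.

```python
import numpy as np

def sliding_windows(image_h, image_w, win_h, win_w, stride):
    """Yield (top, left, bottom, right) for every window that fits inside the image."""
    for top in range(0, image_h - win_h + 1, stride):
        for left in range(0, image_w - win_w + 1, stride):
            yield (top, left, top + win_h, left + win_w)

def detect(image, classify, win_h=64, win_w=64, stride=32):
    """Classify every window and keep the ones not labelled as background."""
    detections = []
    h, w = image.shape[:2]
    for top, left, bottom, right in sliding_windows(h, w, win_h, win_w, stride):
        crop = image[top:bottom, left:right]
        label = classify(crop)  # hypothetical classifier: crop -> label string
        if label != "background":
            detections.append(((top, left, bottom, right), label))
    return detections
```

Note that the classifier runs once per window, so the cost grows with the number of positions (and, in practice, scales and aspect ratios) that are scanned.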

Question 3

Q

Why don’t we usually use the sliding window approach for object detection?

Question 4

Q

How does R-CNN (Region-based CNN) work?

Answer

A

Get RoIs (regions of interest) from a separate proposal algorithm.
Forward each region through a ConvNet.
Use SVMs for classification and bounding-box regression to refine the boxes.

Question 5

Q

What is the problem with R-CNN?

Answer

A

Training and inference are slow, and training is ad hoc (several separately trained stages rather than one end-to-end model).

Question 6

Q

What is the SPP method for object detection?

Answer

A

Send the whole image through a ConvNet.
Extract the RoIs from the resulting feature map.
Use pooling to reduce each RoI to a fixed size.
Feed fully connected layers into an SVM for classification and into bounding-box regression.

Question 7

Q

What is the main advantage of SPP (spatial pyramid pooling) over R-CNN?

Answer

A

It makes the test phase faster because the ConvNet runs only once per image. Training is still slow and ad hoc, though faster than for R-CNN.

Question 8

Q

What is the Fast R-CNN algorithm?

Answer

A

Run a ConvNet on the whole image.
Get RoI proposals.
Do RoI pooling.
Use FC + linear for regression and FC + linear + softmax for classification.
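The two output heads in the last step can be sketched as follows. This is a simplified illustration, assuming a single shared fully connected layer with a ReLU and one set of four box offsets; the weight names `w_fc`, `w_cls` and `w_box` are hypothetical and biases are omitted.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - z.max())
    return e / e.sum()

def detection_heads(roi_features, w_fc, w_cls, w_box):
    """Map one pooled RoI feature vector to class probabilities and box offsets."""
    hidden = np.maximum(0.0, w_fc @ roi_features)  # shared FC layer (ReLU is an assumption)
    class_probs = softmax(w_cls @ hidden)          # FC + linear + softmax: classification
    box_offsets = w_box @ hidden                   # FC + linear: box regression
    return class_probs, box_offsets
```

Both heads read the same hidden features, which is what lets the classifier and the regressor be trained together.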

Question 9

Q

What kind of loss does Fast R-CNN use?

Answer

A

Cross-entropy for classification and smooth L1 for bounding-box regression.
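A small sketch of how the two terms might be combined for one RoI. It assumes the smooth (Huber-like) form of the L1 loss, that class 0 means background, and a hypothetical weighting factor `lam`; none of these details are spelled out on the card.

```python
import numpy as np

def cross_entropy(logits, target_class):
    """Negative log-probability of the target class under a softmax."""
    shifted = logits - np.max(logits)
    log_probs = shifted - np.log(np.sum(np.exp(shifted)))
    return -log_probs[target_class]

def smooth_l1(pred, target):
    """Quadratic for small errors (|d| < 1), linear for large ones."""
    diff = np.abs(pred - target)
    per_coord = np.where(diff < 1.0, 0.5 * diff ** 2, diff - 0.5)
    return per_coord.sum()

def detection_loss(logits, target_class, box_pred, box_target, lam=1.0):
    """Classification loss plus box loss; background RoIs (class 0) have no box term."""
    cls = cross_entropy(logits, target_class)
    reg = smooth_l1(box_pred, box_target) if target_class > 0 else 0.0
    return cls + lam * reg
```

The linear tail makes the regression term less sensitive to outlier boxes than a squared error would be.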

Question 10

Q

How does the Fast R-CNN RoI pooling work?

Answer

A

Divide the region proposal, projected onto the feature map, into a 7x7 grid and max-pool within each grid cell.
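As a rough illustration, RoI max pooling on a single-channel feature map could look like this. The sketch assumes the RoI is already given in feature-map coordinates as (top, left, bottom, right) with exclusive ends; the bin-edge rounding is one simple choice, not necessarily the one used in the original implementation.

```python
import numpy as np

def roi_max_pool(feature_map, roi, out_size=7):
    """Max-pool the RoI of a 2-D feature map into an out_size x out_size grid."""
    top, left, bottom, right = roi
    region = feature_map[top:bottom, left:right]
    h, w = region.shape
    # Fractional bin edges; each bin spans floor(start) to ceil(end).
    row_edges = np.linspace(0, h, out_size + 1)
    col_edges = np.linspace(0, w, out_size + 1)
    out = np.empty((out_size, out_size), dtype=feature_map.dtype)
    for i in range(out_size):
        r0 = int(np.floor(row_edges[i]))
        r1 = max(int(np.ceil(row_edges[i + 1])), r0 + 1)
        for j in range(out_size):
            c0 = int(np.floor(col_edges[j]))
            c1 = max(int(np.ceil(col_edges[j + 1])), c0 + 1)
            out[i, j] = region[r0:r1, c0:c1].max()
    return out
```

Whatever the RoI size, the output is always out_size x out_size, which is what lets the fully connected layers that follow accept proposals of any shape.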

Question 11

Q

What is the main advantage of Fast R-CNN over SPP and R-CNN?

Answer

A

It’s fully trainable, with “fast” training and inference time.

Question 12

Q

What is the main difference between Fast R-CNN and Faster R-CNN?

Answer

A

Faster R-CNN trains a ConvNet (the region proposal network) to generate the proposals. The proposal network is trained with a classification loss (object / not object) and a bounding-box regression loss.

Question 13

Q

What is Mask R-CNN?

Answer

A

It adds an FCN branch to Faster R-CNN that predicts a segmentation mask for each detected object (instance segmentation).

Question 14

Q

Describe the YOLO algorithm

Answer

A

Split the image into a grid (e.g. 7x7).
For each cell, predict two (or more) bounding boxes, each with a confidence p(object).
Each cell also predicts class probabilities p(class | object).
Combine the bounding boxes and the class predictions.
Perform NMS (non-maximum suppression).
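The last two steps can be sketched as below: each box's class score is taken as p(object) * p(class | object), and greedy NMS then drops boxes that overlap a higher-scoring box too much. Boxes are assumed to be (x1, y1, x2, y2) tuples, and the 0.5 IoU threshold is an illustrative choice.

```python
import numpy as np

def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def class_scores(p_object, p_class_given_object):
    """Per-class score of one box: p(object) * p(class | object)."""
    return p_object * np.asarray(p_class_given_object)

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```

In practice NMS is usually run separately for each class, so that overlapping objects of different classes do not suppress each other.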

Question 15

Q

How are the bounding boxes trained in YOLO?

Answer

A

For each cell that contains an object, find the best-matching predicted bounding box, adjust it towards the ground truth, and increase its confidence. For cells without objects, and for the other boxes, reduce the confidence.
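One way to picture this rule is as a function that builds the confidence targets for a single cell. The sketch assumes "best" means highest IoU with the ground-truth box and uses that IoU as the target confidence; the function name and the (x1, y1, x2, y2) box format are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def confidence_targets(pred_boxes, gt_box):
    """Return (index of the responsible box or None, confidence target per box)."""
    targets = [0.0] * len(pred_boxes)  # default: push confidence down
    if gt_box is None:                 # cell contains no object
        return None, targets
    overlaps = [iou(box, gt_box) for box in pred_boxes]
    best = max(range(len(pred_boxes)), key=lambda i: overlaps[i])
    targets[best] = overlaps[best]     # only the best box is pulled up
    return best, targets
```

Only the responsible box would also receive a coordinate loss; the other boxes in the cell contribute only through their confidence.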

(15 cards)