Object detection Flashcards
What is object detection?
Object detection locates the objects in an image with bounding boxes and classifies each of them.
How can we do object detection using sliding windows?
Run a classifier on every sliding window, labelling each window as an object class or as background.
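A minimal sketch of the sliding-window idea, assuming square windows of one fixed size. The `classify` callable and the toy classifier are hypothetical stand-ins for a real image classifier, and the window size and stride are arbitrary choices.

```python
def sliding_windows(image_w, image_h, win_w, win_h, stride):
    """Yield (x, y, w, h) for every window that fits fully inside the image."""
    for y in range(0, image_h - win_h + 1, stride):
        for x in range(0, image_w - win_w + 1, stride):
            yield (x, y, win_w, win_h)


def detect(image_w, image_h, classify, win=64, stride=32):
    """Classify every window and keep the ones not labelled background.

    `classify` is a hypothetical callable that maps a window to a label.
    """
    detections = []
    for box in sliding_windows(image_w, image_h, win, win, stride):
        label = classify(box)
        if label != "background":
            detections.append((box, label))
    return detections


def toy_classify(box):
    """Toy stand-in (assumption): 'object' if the window covers pixel (100, 100)."""
    x, y, w, h = box
    inside = x <= 100 < x + w and y <= 100 < y + h
    return "object" if inside else "background"


hits = detect(256, 256, toy_classify)
```

Note that overlapping windows can all fire on the same object, so a real pipeline would still need to merge duplicate detections.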
Why don’t we usually use the sliding window approach for object detection?
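Too slow: the classifier has to run on a very large number of windows (every position, and usually several scales and aspect ratios).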
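A rough back-of-the-envelope count shows why. The image size, stride, window sizes and aspect ratios below are assumptions picked only for illustration.

```python
def count_windows(image_w, image_h, win_w, win_h, stride):
    """Number of window positions that fit fully inside the image."""
    if win_w > image_w or win_h > image_h:
        return 0
    nx = (image_w - win_w) // stride + 1
    ny = (image_h - win_h) // stride + 1
    return nx * ny


# Assumed setup: 640x480 image, stride 8, four base sizes, three aspect ratios.
sizes = [32, 64, 128, 256]
aspect_ratios = [(1, 1), (1, 2), (2, 1)]

total = sum(
    count_windows(640, 480, s * aw, s * ah, 8)
    for s in sizes
    for aw, ah in aspect_ratios
)
print(total)  # one classifier forward pass would be needed per window
```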
How does the R-CNN (Region-based CNN) work?
- Get RoIs (regions of interest) from a separate algorithm
- Forward each region through a convnet
- Use SVMs for classification and bounding-box regression to refine the boxes
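The bounding-box regression step can be sketched as follows. This assumes the usual center/size parameterization, where the regressor predicts offsets relative to the proposal; the function names are illustrative only.

```python
import math


def bbox_targets(proposal, ground_truth):
    """Regression targets that move a proposal onto its ground-truth box.

    Both boxes are (center_x, center_y, width, height).
    """
    px, py, pw, ph = proposal
    gx, gy, gw, gh = ground_truth
    tx = (gx - px) / pw          # shift in x, relative to proposal width
    ty = (gy - py) / ph          # shift in y, relative to proposal height
    tw = math.log(gw / pw)       # log-scale change in width
    th = math.log(gh / ph)       # log-scale change in height
    return (tx, ty, tw, th)


def apply_deltas(proposal, deltas):
    """Inverse of bbox_targets: apply predicted offsets to a proposal."""
    px, py, pw, ph = proposal
    tx, ty, tw, th = deltas
    return (px + tx * pw, py + ty * ph, pw * math.exp(tw), ph * math.exp(th))


proposal = (50.0, 50.0, 20.0, 40.0)
ground_truth = (55.0, 48.0, 30.0, 40.0)
deltas = bbox_targets(proposal, ground_truth)
refined = apply_deltas(proposal, deltas)  # recovers ground_truth up to rounding
```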
What is the problem with R-CNN?
Training and inference are slow, and training is ad hoc (the convnet, the SVMs and the box regressors are trained in separate stages).
What is the SPP method for object detection?
- Send the whole image through a convnet.
- Extract the RoIs from the resulting feature map.
- Use spatial pyramid pooling to reduce each RoI to a fixed size.
- Feed the pooled features through fully connected layers, then to SVMs for classification and to a bounding-box regressor.
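The pooling step can be sketched in plain Python. This assumes a single-channel feature map stored as a list of lists and pyramid levels of 1x1, 2x2 and 4x4; both are illustrative choices, not values taken from the cards.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def max_pool_grid(fmap, bins):
    """Max-pool a 2D list into a bins x bins grid, returned as a flat list."""
    h, w = len(fmap), len(fmap[0])
    out = []
    for i in range(bins):
        # Floor/ceil bin edges cover the whole map and are never empty.
        y0, y1 = (i * h) // bins, ceil_div((i + 1) * h, bins)
        for j in range(bins):
            x0, x1 = (j * w) // bins, ceil_div((j + 1) * w, bins)
            out.append(max(fmap[y][x] for y in range(y0, y1) for x in range(x0, x1)))
    return out


def spatial_pyramid_pool(fmap, levels=(1, 2, 4)):
    """Concatenate max-pooled grids at several resolutions.

    The output length is sum(n * n for n in levels), whatever the input size.
    """
    features = []
    for bins in levels:
        features.extend(max_pool_grid(fmap, bins))
    return features


small = [[y * 7 + x for x in range(7)] for y in range(5)]    # 5 x 7 region
large = [[y * 9 + x for x in range(9)] for y in range(13)]   # 13 x 9 region
assert len(spatial_pyramid_pool(small)) == len(spatial_pyramid_pool(large)) == 21
```

The fixed output length is the point: regions of any size can be fed to the same fully connected layers.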
What is the main advantage of SPP (spatial pyramid pooling) over R-CNN?
It makes the testing phase faster, because the convolutional layers only need to run once per image. Training is still slow and ad hoc, but faster than for R-CNN.
What is the Fast R-CNN algorithm?
- Run a convnet on the whole image.
- Do RoI proposals
- Do RoI pooling
- Use FC + linear for regression and FC + linear + softmax for classification
What kind of loss does Fast R-CNN use?
Cross-entropy for classification and smooth L1 for regression.
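A minimal sketch of such a two-part loss for a single RoI. Fast R-CNN uses the smooth variant of L1 for the box term; the weighting factor `lam` and the convention that class index 0 means background are assumptions of this sketch.

```python
import math


def cross_entropy(logits, target_index):
    """Softmax cross-entropy for one example, computed stably."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target_index]


def smooth_l1(pred, target):
    """Quadratic near zero, linear for errors of 1 or more, summed over coords."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total


def detection_loss(logits, target_index, box_pred, box_target, lam=1.0):
    """Classification loss plus box loss; no box loss for background RoIs.

    Class 0 is assumed to be background in this sketch.
    """
    loss = cross_entropy(logits, target_index)
    if target_index != 0:
        loss += lam * smooth_l1(box_pred, box_target)
    return loss
```

Compared with plain L1, the quadratic region gives smaller gradients for small errors; compared with L2, the linear region is less sensitive to outliers.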
How does the Fast R-CNN RoI pooling work?
Divide the projected proposal (the RoI mapped onto the feature map) into a 7x7 grid and max-pool within each cell.
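RoI pooling can be sketched in plain Python as below. It assumes a single-channel feature map stored as a list of lists and an RoI already given in integer feature-map coordinates; the bin-edge rounding is one reasonable choice, not necessarily the one any particular implementation uses.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def roi_max_pool(fmap, roi, out_size=7):
    """Max-pool the region roi = (x0, y0, x1, y1) into an out_size x out_size grid.

    Coordinates are end-exclusive feature-map indices.
    """
    x0, y0, x1, y1 = roi
    h, w = y1 - y0, x1 - x0
    pooled = []
    for i in range(out_size):
        # Floor/ceil edges keep every bin non-empty, even for small RoIs.
        ys = y0 + (i * h) // out_size
        ye = y0 + ceil_div((i + 1) * h, out_size)
        row = []
        for j in range(out_size):
            xs = x0 + (j * w) // out_size
            xe = x0 + ceil_div((j + 1) * w, out_size)
            row.append(max(fmap[y][x] for y in range(ys, ye) for x in range(xs, xe)))
        pooled.append(row)
    return pooled


fmap = [[y * 14 + x for x in range(14)] for y in range(14)]
full = roi_max_pool(fmap, (0, 0, 14, 14))   # each bin covers a 2x2 patch
small = roi_max_pool(fmap, (2, 3, 5, 8))    # a 3x5 RoI still yields 7x7
```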
What is the main advantage of Fast R-CNN over SPP and R-CNN?
It’s fully trainable, with “fast” training and inference time.
What is the main difference between Fast R-CNN and Faster R-CNN?
Faster R-CNN trains a convnet (the region proposal network) to generate the proposals. The proposal network is trained with a classification loss (object / not object) and a bounding-box regression loss.
What is Mask R-CNN?
It adds an FCN branch to Faster R-CNN that predicts a segmentation mask for each detected object (instance segmentation).
Describe the YOLO algorithm
- Split image into grid (e.g. 7x7)
- For each cell, predict two (or more) bounding boxes, each with a confidence p(object)
- Each cell also predicts a class probability p(class | object)
- Combine the bounding boxes and class predictions
- Perform NMS (non-maximum suppression)
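The last two steps can be sketched as follows. Boxes are assumed to be (x0, y0, x1, y1) corner tuples, and the IoU threshold of 0.5 is an arbitrary choice; in practice NMS is usually run separately per class.

```python
def iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def class_scores(p_object, p_class_given_object):
    """Per-class score of a box: p(object) * p(class | object)."""
    return [p_object * p for p in p_class_given_object]


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the best box, drop boxes that overlap a kept one too much.

    Returns the indices of the kept boxes, highest score first.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)  # the second box overlaps the first and is dropped
```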
How are the bounding boxes trained in YOLO?
For each cell that contains an object, find the best-matching bounding box, adjust it toward the ground truth and increase its confidence. For the other boxes, and for cells without objects, reduce the confidence.
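A minimal sketch of how the confidence targets for one cell could be assigned. It assumes "best" means highest IoU with the ground-truth box and that the target confidence of that box is its IoU, as in the original YOLO formulation; the function names are illustrative.

```python
def iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def confidence_targets(pred_boxes, gt_box):
    """Return (index of the responsible box, confidence target per box).

    pred_boxes: the boxes predicted by one grid cell.
    gt_box: the ground-truth box whose center falls in this cell, or None.
    """
    if gt_box is None:
        # No object in this cell: every box is pushed toward confidence 0.
        return None, [0.0] * len(pred_boxes)
    ious = [iou(box, gt_box) for box in pred_boxes]
    best = max(range(len(pred_boxes)), key=lambda i: ious[i])
    targets = [0.0] * len(pred_boxes)
    targets[best] = ious[best]  # only the best box is pulled toward the object
    return best, targets


best, targets = confidence_targets([(0, 0, 10, 10), (0, 0, 20, 20)], (0, 0, 18, 18))
```

Only the responsible box would also receive a coordinate loss toward `gt_box`; the other boxes contribute to the confidence term alone.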