Lecture 5 - Object Detection Flashcards
Intersection over Union
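Measures how much a predicted box overlaps a ground truth box: area of intersection divided by area of union. Ranges from 0 (no overlap) to 1 (identical boxes); higher is better.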
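A minimal sketch of the IoU calculation, assuming boxes are given as (x1, y1, x2, y2) corner coordinates. The function name `iou` and the box format are illustrative choices, not from the lecture.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2), with x1 < x2 and y1 < y2."""
    # Corners of the intersection rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp at 0 so non-overlapping boxes get zero intersection area.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # overlap 1, union 7
```

Two 2x2 boxes offset by one unit share an area of 1 out of a union of 7, so the IoU is 1/7.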
Bounding Box Regression
Learns to refine a proposed box so it fits the object more tightly; trained with a loss function comparing the predicted and ground truth boxes
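A rough sketch of how regression targets are commonly encoded in the R-CNN family, assuming boxes are given as (centre x, centre y, width, height). The names `encode_box` and `decode_box` are hypothetical helpers, and the lecture may use a different parameterisation.

```python
import math


def encode_box(proposal, gt):
    """Regression targets that move `proposal` onto `gt`; both are (cx, cy, w, h)."""
    px, py, pw, ph = proposal
    gx, gy, gw, gh = gt
    # Centre shifts are scaled by proposal size; size changes are log ratios.
    return ((gx - px) / pw, (gy - py) / ph, math.log(gw / pw), math.log(gh / ph))


def decode_box(proposal, t):
    """Apply predicted offsets `t` to `proposal` to recover a box (inverse of encode_box)."""
    px, py, pw, ph = proposal
    tx, ty, tw, th = t
    return (px + tx * pw, py + ty * ph, pw * math.exp(tw), ph * math.exp(th))


proposal = (50.0, 50.0, 20.0, 40.0)
gt = (55.0, 48.0, 30.0, 40.0)
targets = encode_box(proposal, gt)
print(targets)
print(decode_box(proposal, targets))
```

With this encoding the network predicts small, scale-independent offsets instead of raw pixel coordinates.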
Non-Max Suppression
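Avoids repeated detections of the same instance: keeps the highest-scoring box and discards lower-scoring boxes that overlap it heavily (high IoU)
R-CNN is time-consuming because it involves 3 separate models (CNN, SVM and regression)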
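A minimal sketch of greedy non-max suppression, assuming (x1, y1, x2, y2) boxes, one confidence score per box, and an IoU threshold of 0.5. The threshold and the function names are assumptions for illustration only.

```python
def iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Return indices of boxes kept by greedy NMS, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap an already-kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2]
```

The second box overlaps the first with IoU of about 0.68, so it is suppressed; the third box is far away and survives.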
Fast R-CNN
Generates a single feature map for the whole image, shared by all region candidates
Replaces the separate CNN + SVM + regression models with one multi-task CNN
RoI max pooling
Converts each RoI candidate into a fixed-size feature map by splitting it into a grid of bins and taking the max in each bin.
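A simplified single-channel sketch of RoI max pooling, assuming the feature map is a 2D list and the RoI is already expressed in feature-map cells as (row0, col0, row1, col1) with exclusive end indices. The bin boundaries use floor/ceil rounding, which is one common choice rather than necessarily the one on the slides.

```python
def roi_max_pool(feature_map, roi, out_h, out_w):
    """Max-pool the region `roi` of a 2D feature map into an out_h x out_w grid."""
    r0, c0, r1, c1 = roi
    h, w = r1 - r0, c1 - c0
    pooled = []
    for i in range(out_h):
        # Bin i covers rows [floor(i*h/out_h), ceil((i+1)*h/out_h)) of the RoI.
        rs = r0 + (i * h) // out_h
        re = r0 + -(-((i + 1) * h) // out_h)
        row = []
        for j in range(out_w):
            cs = c0 + (j * w) // out_w
            ce = c0 + -(-((j + 1) * w) // out_w)
            row.append(max(feature_map[r][c]
                           for r in range(rs, re)
                           for c in range(cs, ce)))
        pooled.append(row)
    return pooled


fmap = [[0, 1, 2, 3],
        [4, 5, 6, 7],
        [8, 9, 10, 11],
        [12, 13, 14, 15]]
print(roi_max_pool(fmap, (0, 0, 4, 4), 2, 2))  # [[5, 7], [13, 15]]
print(roi_max_pool(fmap, (1, 0, 4, 3), 2, 2))  # [[9, 10], [13, 14]]
```

Both RoIs come out as 2x2 even though one is 4x4 and the other 3x3, which is what lets the later fully connected layers take a fixed-size input.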
Fast R-CNN: Location Loss
$L_{loc}(t^u, v) = \sum_{i \in \{x, y, w, h\}} \mathrm{smooth}_{L_1}(t^u_i - v_i)$
Equations are clearer on the slide. v is the ground truth box and t^u is the predicted box for the true class u
The loss value is 0 when predicted equals actual
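A small sketch of the location loss, assuming the standard smooth L1 definition (0.5x^2 when |x| < 1, otherwise |x| - 0.5) and boxes passed as 4-tuples in (x, y, w, h) order. The function names are illustrative.

```python
def smooth_l1(x):
    """Smooth L1: quadratic near zero, linear for large errors."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5


def loc_loss(t_u, v):
    """Sum of smooth L1 over the four box components (x, y, w, h)."""
    return sum(smooth_l1(t - g) for t, g in zip(t_u, v))


print(loc_loss((0.0, 0.0, 0.0, 0.0), (0.0, 0.0, 0.0, 0.0)))  # 0.0
print(loc_loss((0.5, 0.0, 0.0, 2.0), (0.0, 0.0, 0.0, 0.0)))  # 0.125 + 1.5 = 1.625
```

The linear tail makes the loss less sensitive to outlier boxes than a plain squared error.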
Key Fast R-CNN Steps
- Run selective search to propose 2000 region candidates per image
- Apply the CNN once to generate a feature map for the whole image
- Extract the relevant region for each candidate on the feature map
- Apply RoI pooling (to get a fixed-size map)
- Map the fixed-size feature map to an object class and box location
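The "extract region on the map" step above can be sketched as a coordinate projection from image pixels to feature-map cells. This assumes a total CNN stride of 16 (an illustrative value that depends on the backbone) and a hypothetical helper name `project_roi`; the rounding convention is also an assumption.

```python
import math


def project_roi(box, stride=16):
    """Map an image-space box (x1, y1, x2, y2) to feature-map cell indices.

    Returns (fx1, fy1, fx2, fy2) with exclusive end indices, at least one cell wide.
    """
    x1, y1, x2, y2 = box
    fx1 = math.floor(x1 / stride)
    fy1 = math.floor(y1 / stride)
    # Round the far edge up so the projected region still covers the whole box.
    fx2 = max(math.ceil(x2 / stride), fx1 + 1)
    fy2 = max(math.ceil(y2 / stride), fy1 + 1)
    return (fx1, fy1, fx2, fy2)


print(project_roi((32, 48, 160, 200)))  # (2, 3, 10, 13)
```

The projected region is what gets passed to RoI pooling, so the CNN runs once per image rather than once per candidate.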
Faster R-CNN
Integrates the region proposal algorithm into the CNN (as a small additional model, the Region Proposal Network)
Training Faster R-CNN
- Train RPN (Region Proposal Network)
- Train Fast R-CNN
- Fix the shared convolutional layers and train RPN
- Fix … and train Fast R-CNN