Object Detection Flashcards
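When there are multiple bounding boxes for an object in an image, what is the Intersection over Union (IoU) ratio?
It is the area where two bounding boxes overlap divided by the area of their union: IoU = Intersection Area / Union Area. It ranges from 0 (no overlap) to 1 (identical boxes).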
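A minimal sketch of the IoU computation, assuming boxes are given as (x_min, y_min, x_max, y_max) corner coordinates. The function name `iou` and the corner format are assumptions for illustration; the flashcards do not specify a box format.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Assumed format: (x_min, y_min, x_max, y_max).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle (zero if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection

    # Guard against degenerate boxes with zero area.
    return intersection / union if union > 0 else 0.0


# Two 2x2 boxes overlapping in a 1x1 square: 1 / (4 + 4 - 1) = 1/7
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```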
YOLO - In an NxN grid over an image, if each grid cell predicts Output = [Pc, Bx, By, Bh, Bw, C1, C2, C3], what does each value stand for?
Pc -> Probability that the grid cell contains an object
Bx, By -> Midpoint of the object's bounding box
Bh, Bw -> Height and width of the bounding box
C1, C2, C3 -> Class of the object (one value per class)
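As an illustration, one grid cell's output vector can be unpacked as below. This is only a sketch: the function name `decode_cell_output` and the three class names are hypothetical, chosen to match the three class slots C1, C2, C3.

```python
def decode_cell_output(output, class_names):
    """Split [Pc, Bx, By, Bh, Bw, C1, ..., Ck] into named fields."""
    pc, bx, by, bh, bw = output[:5]
    class_scores = output[5:]

    # The predicted class is the one with the highest class score.
    best_index = max(range(len(class_scores)), key=lambda i: class_scores[i])

    return {
        "pc": pc,                    # probability that the cell contains an object
        "midpoint": (bx, by),        # Bx, By
        "size": (bh, bw),            # Bh, Bw
        "class": class_names[best_index],
    }


# Hypothetical class names, one per class slot.
names = ["pedestrian", "car", "motorcycle"]
cell = decode_cell_output([0.9, 0.5, 0.4, 0.3, 0.2, 0.1, 0.7, 0.2], names)
print(cell["class"])
```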
If multiple grid elements output bounding boxes for the same object, how do we identify the most appropriate bounding box using Non-Max suppression?
- Discard all boxes with Pc < 0.6
- Pick the remaining box with the largest Pc
- Discard any other box whose IoU with the picked box is >= 0.5
- Repeat with the boxes that are left until none remain
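The steps above can be sketched in plain Python. The function names, the (Pc, box) tuple layout, and the corner-coordinate box format are assumptions for illustration; the thresholds 0.6 and 0.5 are the ones from the card.

```python
def iou(box_a, box_b):
    """IoU of two boxes in assumed (x_min, y_min, x_max, y_max) format."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def non_max_suppression(detections, score_threshold=0.6, iou_threshold=0.5):
    """detections: list of (pc, box) tuples. Returns the boxes that survive."""
    # Step 1: discard low-probability boxes.
    candidates = [d for d in detections if d[0] >= score_threshold]
    # Sort so the box with the largest Pc comes first.
    candidates.sort(key=lambda d: d[0], reverse=True)

    kept = []
    while candidates:
        # Step 2: pick the box with the largest Pc.
        best = candidates.pop(0)
        kept.append(best)
        # Step 3: discard remaining boxes that overlap the picked box too much.
        candidates = [d for d in candidates if iou(best[1], d[1]) < iou_threshold]
    return kept


detections = [
    (0.9, (0, 0, 10, 10)),
    (0.8, (1, 1, 11, 11)),    # heavy overlap with the first box
    (0.7, (20, 20, 30, 30)),  # a separate object
    (0.3, (0, 0, 10, 10)),    # below the probability threshold
]
print(non_max_suppression(detections))
```

In this example the second box is suppressed because its IoU with the first is 81/119, which is above 0.5, so two boxes remain.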
What is object localization?
It is determining the location and class of a single object in an image.
It is training and predicting [P, bx, by, bh, bw, c1, c2, c3], where P is the probability that an object is present, bx, by, bh, bw describe the bounding box (midpoint, height and width), and c1, c2, c3 give the class of the object.
What is the sliding windows approach to object detection?
Sliding windows of multiple sizes across the image and passing the crop under each window through a CNN to classify it.
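A small sketch of how the window positions could be enumerated. The function name `sliding_windows`, the square windows, and the fixed stride are assumptions for illustration; the classifier itself is left out, and each yielded crop would be handed to the CNN.

```python
def sliding_windows(image_width, image_height, window_sizes, stride):
    """Yield (x, y, size) for every square window that fits inside the image."""
    for size in window_sizes:
        # The last valid top-left corner is at (dimension - size).
        for y in range(0, image_height - size + 1, stride):
            for x in range(0, image_width - size + 1, stride):
                yield (x, y, size)


# An 8x8 image scanned with 4x4 and 8x8 windows, moving 2 pixels at a time.
windows = list(sliding_windows(8, 8, window_sizes=[4, 8], stride=2))
for x, y, size in windows:
    # Each crop image[y:y+size, x:x+size] would be classified by the CNN here.
    pass
print(len(windows))
```

The number of windows grows quickly with image size and with the number of window sizes, and every window costs one CNN evaluation.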
Explain YOLO
The image is divided into a grid of cells (for example 3x3 = 9 cells) and object localisation is applied to each cell independently.
In YOLO, what if an object is spread across multiple grid cells?
We assign the object to the grid cell where its midpoint falls.
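The midpoint rule can be sketched as below, assuming the midpoint is expressed in image coordinates normalised to the range 0 to 1. The function name `assign_to_cell` and the normalised coordinates are assumptions for illustration.

```python
def assign_to_cell(mid_x, mid_y, grid_size):
    """Return (row, col) of the grid cell containing the object's midpoint.

    mid_x and mid_y are assumed to be normalised to [0, 1] over the image.
    """
    # Clamp so a midpoint exactly on the right or bottom edge stays in the grid.
    col = min(int(mid_x * grid_size), grid_size - 1)
    row = min(int(mid_y * grid_size), grid_size - 1)
    return row, col


# In a 3x3 grid, an object centred in the image belongs to the middle cell.
print(assign_to_cell(0.5, 0.5, 3))  # (1, 1)
```

Only the cell returned here is responsible for the object, even if the bounding box extends into neighbouring cells.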