L11: Tracking Flashcards
Meanshift: What is the initial track that needs to be loaded?
An ROI (region of interest). It can be chosen by hand or produced by a detection algorithm.
Meanshift: Does it handle scale and orientation changes?
No, use Camshift instead
Meanshift: What is Histogram backprojection?
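You take the histogram of the original ROI/image and backproject it onto a new frame: each pixel is replaced by the histogram value of its color bin. The bright and dark areas of the resulting image indicate how similar each region is to the ROI.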
High values = regions that are similar
Low values = regions that are dissimilar
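A minimal sketch of backprojection on a tiny single-channel image, assuming integer "hue" values in [0, 180). The helper names `histogram` and `backproject` and the toy data are hypothetical and only illustrate the idea; a real pipeline would normally use an image library.

```python
def histogram(values, bins, vmax):
    """Histogram of integer values in [0, vmax), scaled so the peak bin is 1.0."""
    hist = [0] * bins
    for v in values:
        hist[v * bins // vmax] += 1
    peak = max(hist)
    return [h / peak for h in hist]


def backproject(frame, hist, vmax):
    """Replace every pixel by the histogram value of the bin it falls into."""
    bins = len(hist)
    return [[hist[v * bins // vmax] for v in row] for row in frame]


# Hypothetical ROI (the object) and a new frame containing similar values.
roi = [[100, 102],
       [101, 103]]
frame = [[10, 100, 101],
         [12, 102, 20],
         [11, 15, 13]]

hist = histogram([v for row in roi for v in row], bins=18, vmax=180)
prob = backproject(frame, hist, vmax=180)
# High values in `prob` mark pixels that resemble the ROI, low values the rest.
```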
Meanshift: When is it applied and what does it do?
After the histogram backprojection. It iteratively shifts the ROI to the new location that encapsulates most of the points, i.e. the mean of the points inside the window.
That is where the name comes from.
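The shifting step can be sketched as follows, assuming the backprojection is given as a 2D list of weights and the window is a tuple (x, y, w, h). The function name `mean_shift` and the rounding and clamping choices are assumptions made for this illustration.

```python
import math


def mean_shift(prob, window, max_iter=10):
    """Move a fixed-size window to the weighted mean of the values inside it."""
    x, y, w, h = window
    rows, cols = len(prob), len(prob[0])
    for _ in range(max_iter):
        total = sx = sy = 0.0
        for j in range(y, y + h):
            for i in range(x, x + w):
                p = prob[j][i]
                total += p
                sx += p * i
                sy += p * j
        if total == 0:
            break  # nothing similar inside the window, so stay put
        cx, cy = sx / total, sy / total  # weighted mean of the points
        # Re-center the window on the mean and keep it inside the image.
        nx = min(max(math.floor(cx - (w - 1) / 2 + 0.5), 0), cols - w)
        ny = min(max(math.floor(cy - (h - 1) / 2 + 0.5), 0), rows - h)
        if (nx, ny) == (x, y):
            break  # converged
        x, y = nx, ny
    return (x, y, w, h)


# Hypothetical 7x7 backprojection with a 3x3 blob at rows/cols 3..5.
prob_map = [[1.0 if 3 <= i <= 5 and 3 <= j <= 5 else 0.0 for i in range(7)]
            for j in range(7)]
tracked = mean_shift(prob_map, (1, 1, 3, 3))
```

Starting from a window that only touches one corner of the blob, the window drifts until it sits on top of it.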
Camshift: How does it work compared to Meanshift?
Camshift (Continuously Adaptive Meanshift) essentially follows the same procedure. It first runs Meanshift. Afterwards it tries to rotate and fit an ellipse to the distribution of the points. If the size of the ellipse changes significantly, it updates the ROI.
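One common way to estimate the rotation of such an ellipse is through second-order image moments. The sketch below assumes the backprojection is a 2D list of weights; the function name `orientation` is hypothetical and this is not necessarily how any particular library implements Camshift.

```python
import math


def orientation(prob):
    """Angle (radians, image coordinates) of the main axis of the weights."""
    m00 = m10 = m01 = 0.0
    for y, row in enumerate(prob):
        for x, p in enumerate(row):
            m00 += p
            m10 += p * x
            m01 += p * y
    if m00 == 0:
        return None  # empty distribution, no orientation defined
    cx, cy = m10 / m00, m01 / m00  # centroid, the same quantity Meanshift uses
    mu20 = mu02 = mu11 = 0.0
    for y, row in enumerate(prob):
        for x, p in enumerate(row):
            mu20 += p * (x - cx) ** 2
            mu02 += p * (y - cy) ** 2
            mu11 += p * (x - cx) * (y - cy)
    return 0.5 * math.atan2(2 * mu11, mu20 - mu02)


# Hypothetical distributions: a diagonal streak and a horizontal streak.
diagonal = [[1.0 if i == j else 0.0 for i in range(4)] for j in range(4)]
horizontal = [[1.0, 1.0, 1.0, 1.0]]
```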
SORT: What are the 4 stages of it?
- Localization of track identities using an external detector.
- Kalman filtering for prediction
- Bounding box assignment
- Dynamic creation/deletion of track identities
SORT: What detector can be used?
Essentially any detector that generates bounding boxes, since these provide the measurements the Kalman filter needs.
The detector used in the original work is Faster R-CNN (a deep neural network) trained to locate pedestrians / people.
SORT: What is the parameterization of the bounding box?
It is: $x = [u, v, s, r, \dot{u}, \dot{v}, \dot{s}]^T$, where the dotted terms are the velocities of u, v and s
(u,v) = center of bounding box
r = the aspect ratio
s = the area
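As an illustration, a corner-format box (x1, y1, x2, y2) can be converted to the (u, v, s, r) part of the state and back. The helper names `box_to_z` and `z_to_box` are hypothetical, and taking the aspect ratio as width divided by height is an assumption of this sketch.

```python
import math


def box_to_z(box):
    """(x1, y1, x2, y2) -> (u, v, s, r): center, area, aspect ratio."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    u, v = x1 + w / 2.0, y1 + h / 2.0
    s = w * h          # area
    r = w / float(h)   # aspect ratio, assumed here to be width / height
    return (u, v, s, r)


def z_to_box(z):
    """(u, v, s, r) -> (x1, y1, x2, y2), the inverse of box_to_z."""
    u, v, s, r = z
    w = math.sqrt(s * r)  # since s = w * h and r = w / h, s * r = w^2
    h = s / w
    return (u - w / 2.0, v - h / 2.0, u + w / 2.0, v + h / 2.0)


z = box_to_z((10, 20, 50, 100))
box = z_to_box(z)
```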
SORT: Why do we need the aspect ratio term r?
Because the method relies on a constant aspect ratio between two consecutive frames, which is why r has no velocity term. This also makes the prediction task for the filter slightly easier.
SORT: What is the assignment problem?
At each frame our detector generates N bounding boxes, and the Kalman filter provides M estimated bounding boxes.
The assignment problem is then, how do we assign an object to the right bounding box?
We use an N x M cost matrix whose entries are the IoU (intersection over union) scores.
Essentially, each entry says how well a detected bounding box overlaps with an estimated bounding box, for every pair of the two.
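A small sketch of how such a matrix could be filled, assuming boxes in (x1, y1, x2, y2) corner format. The names `iou` and `iou_matrix` and the example boxes are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_matrix(detections, predictions):
    """N x M matrix: rows are detections, columns are Kalman predictions."""
    return [[iou(d, p) for p in predictions] for d in detections]


# Hypothetical boxes: N = 2 detections, M = 3 predicted track boxes.
detections = [(0, 0, 10, 10), (20, 20, 30, 30)]
predictions = [(0, 0, 10, 10), (5, 0, 15, 10), (100, 100, 110, 110)]
scores = iou_matrix(detections, predictions)
```

A higher score means a better match, so when the matrix is fed to a solver that minimizes cost, the scores are typically negated first.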
SORT: What if the cost matrix (N x M) is not square?
Simply add dummy rows/columns with zero entries until it is square.
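The padding can be sketched like this, assuming the cost matrix is a list of rows. The function name `pad_to_square` is hypothetical.

```python
def pad_to_square(cost, fill=0.0):
    """Append dummy columns/rows filled with `fill` until the matrix is square."""
    n = len(cost)
    m = len(cost[0]) if n else 0
    size = max(n, m)
    # Extend each existing row with dummy columns, then add dummy rows.
    padded = [list(row) + [fill] * (size - m) for row in cost]
    padded += [[fill] * size for _ in range(size - n)]
    return padded


wide = pad_to_square([[1, 2, 3],
                      [4, 5, 6]])   # 2 x 3 becomes 3 x 3
tall = pad_to_square([[1], [2]])    # 2 x 1 becomes 2 x 2
```

Anything assigned to a dummy row or column is then treated as unmatched.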
SORT: Which algorithm performs optimal assignment, and how?
The Hungarian algorithm. It finds the permutation of the columns/rows of the cost matrix for which the trace, i.e. the total cost of the assignment, is minimal.
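To make the "minimal trace" idea concrete, the sketch below simply tries every column permutation of a small square cost matrix. This is a brute-force search with factorial runtime, not the Hungarian algorithm itself, but it reaches the same optimum on small inputs. The function name `best_assignment` and the example costs are hypothetical; in practice a library routine such as SciPy's `linear_sum_assignment` is typically used instead.

```python
from itertools import permutations


def best_assignment(cost):
    """Brute force: pick the column permutation with the smallest total cost."""
    n = len(cost)
    best_perm, best_total = None, float("inf")
    for perm in permutations(range(n)):
        # Sum of the diagonal after permuting the columns, i.e. the trace.
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_total:
            best_perm, best_total = perm, total
    return list(enumerate(best_perm)), best_total


# Hypothetical 3 x 3 cost matrix (rows: detections, columns: tracks).
cost = [[4, 1, 3],
        [2, 0, 5],
        [3, 2, 2]]
pairs, total = best_assignment(cost)
```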
SORT: What are the drawbacks of the Hungarian algorithm?
- Complexity
- The runtime is cubic in the number of rows
SORT: When does a new track get created or an old one deleted?
If a new object has been seen for a couple of consecutive frames, it is given a new track.
Vice versa for deletion: if an object has been missing from the scene for a couple of frames, its track is removed.
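The bookkeeping behind this can be sketched with two counters per track. The class `Track`, the helpers `is_confirmed` and `is_dead`, and the thresholds `min_hits=3` and `max_age=1` are illustrative assumptions, not values taken from these notes.

```python
class Track:
    """Minimal per-track counters for creation and deletion decisions."""

    def __init__(self, track_id):
        self.id = track_id
        self.hits = 1     # consecutive frames with a matched detection
        self.misses = 0   # consecutive frames without a matched detection

    def update(self, matched):
        if matched:
            self.hits += 1
            self.misses = 0
        else:
            self.hits = 0
            self.misses += 1


def is_confirmed(track, min_hits=3):
    """A candidate becomes a reported track after enough consecutive hits."""
    return track.hits >= min_hits


def is_dead(track, max_age=1):
    """A track is removed once it has been unmatched for too many frames."""
    return track.misses > max_age


t = Track(track_id=1)
history = []
for matched in (True, True, False, False):
    t.update(matched)
    history.append((is_confirmed(t), is_dead(t)))
```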
SORT: What is the Kalman filter step used for?
The Kalman filter is used to predict the expected position and velocity of each existing track in the current frame based on its previous state.
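A rough sketch of the prediction itself, assuming a constant-velocity motion model and a state laid out as (u, v, s, r) followed by the velocities of u, v and s. Only the state mean is propagated here; a full Kalman filter also propagates the covariance, which is omitted. The function name `predict` and the time step `dt` are hypothetical.

```python
def predict(state, dt=1.0):
    """Constant-velocity prediction of the state mean for the next frame."""
    u, v, s, r, du, dv, ds = state
    return (
        u + dt * du,   # center x moves with its velocity
        v + dt * dv,   # center y moves with its velocity
        s + dt * ds,   # area grows or shrinks with its velocity
        r,             # aspect ratio is assumed constant
        du, dv, ds,    # velocities are carried over unchanged
    )


# Hypothetical track: moving right and slightly up, growing in area.
previous = (30.0, 60.0, 3200.0, 0.5, 2.0, -1.0, 10.0)
expected = predict(previous)
```

The predicted box is what gets compared against the new detections in the assignment step.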