Object tracking, re-identification and registration Flashcards

1
Q

How does MDNet work?

A

Trains shared network layers together with a final layer that is specific to each training sequence. Through this process the shared layers learn a general representation of tracked objects. At test time, the sequence-specific last layers are discarded, and a new last layer is initialized and trained online on the specific tracking sequence.

2
Q

How does ADNet work?

A

Predicts which action to take to move the bounding box, repeating until the box completely encloses the tracked object.
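
As a rough illustration of the action-driven idea, the sketch below applies discrete actions to a box until a stop action is chosen. The action names, the step size, and `toy_policy` are hypothetical stand-ins for a trained network's prediction; this is not ADNet's actual action set or code.

```python
# Hypothetical discrete actions: (dx, dy, scale) relative to the box size.
ACTIONS = {
    "left": (-1, 0, 1.0),
    "right": (1, 0, 1.0),
    "up": (0, -1, 1.0),
    "down": (0, 1, 1.0),
    "grow": (0, 0, 1.05),
    "shrink": (0, 0, 0.95),
    "stop": (0, 0, 1.0),
}


def apply_action(box, action, step=0.05):
    """Move or rescale a (cx, cy, w, h) box by one discrete action."""
    cx, cy, w, h = box
    dx, dy, scale = ACTIONS[action]
    return (cx + dx * step * w, cy + dy * step * h, w * scale, h * scale)


def toy_policy(box, target, step=0.05):
    """Stand-in for the network: pick the move that reduces the centre error.

    A real tracker predicts the action from the image crop; here the true
    target box is used directly, purely for illustration.
    """
    cx, cy, w, h = box
    ex, ey = target[0] - cx, target[1] - cy
    if abs(ex) < step * w and abs(ey) < step * h:
        return "stop"
    if abs(ex) >= abs(ey):
        return "right" if ex > 0 else "left"
    return "down" if ey > 0 else "up"


def track_one_frame(box, target, max_steps=50):
    """Apply actions repeatedly until the policy chooses stop."""
    for _ in range(max_steps):
        action = toy_policy(box, target)
        if action == "stop":
            break
        box = apply_action(box, action)
    return box
```

Calling `track_one_frame((50.0, 50.0, 20.0, 20.0), (60.0, 45.0, 20.0, 20.0))` walks the box toward the target centre one small step at a time.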

3
Q

What is the difference between ADNet and MDNet?

A

ADNet: Predicts actions that nudge the bounding box into the correct position.
MDNet: Samples N candidate bounding boxes around the suggested center and scores each one.
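
The sampling side of this contrast can be sketched as follows. This is a minimal illustration, assuming Gaussian perturbations of position and scale; the parameter values (`n`, `pos_sigma`, `scale_sigma`) and the helper names are assumptions, and the IoU-based scoring in the usage note stands in for the network's classification score.

```python
import random


def sample_candidates(center_box, n=256, pos_sigma=0.3, scale_sigma=0.05, seed=0):
    """Draw n candidate (cx, cy, w, h) boxes around center_box."""
    rng = random.Random(seed)
    cx, cy, w, h = center_box
    candidates = []
    for _ in range(n):
        # Keep the scale factor positive so boxes never collapse.
        s = max(0.1, 1.0 + rng.gauss(0.0, scale_sigma))
        candidates.append((
            cx + rng.gauss(0.0, pos_sigma) * w,
            cy + rng.gauss(0.0, pos_sigma) * h,
            w * s,
            h * s,
        ))
    return candidates


def iou(a, b):
    """Intersection over union of two (cx, cy, w, h) boxes."""
    iw = min(a[0] + a[2] / 2, b[0] + b[2] / 2) - max(a[0] - a[2] / 2, b[0] - b[2] / 2)
    ih = min(a[1] + a[3] / 2, b[1] + b[3] / 2) - max(a[1] - a[3] / 2, b[1] - b[3] / 2)
    inter = max(0.0, iw) * max(0.0, ih)
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def pick_best(candidates, score_fn):
    """Return the candidate with the highest score."""
    return max(candidates, key=score_fn)
```

For example, `pick_best(sample_candidates((100.0, 100.0, 40.0, 40.0)), lambda c: iou(c, true_box))` returns the sampled box that best overlaps a known `true_box`; in a real tracker the score would come from the network rather than from ground truth.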

4
Q

What are some of the challenges in object tracking? Give a general idea of how to address them.

A

If there is only one instance of the target's class, we can track it using strong class-level features. If there are several instances of the same class (e.g. two faces), the tracker can easily drift and start following the wrong instance.
This can be handled by forcing the network to learn more specific features of the particular object being tracked.

5
Q

What does it mean to do tracking by learning transitions?

A

Tracking is framed as a reinforcement learning problem. The input is the cropped image of the object you want to track. The output is the action to take to “nudge” the bounding box in the direction where it captures that object.

6
Q

Describe the training process of a tracking network that learns transitions.

A

The training data consists of state-action pairs. You start with an object to track in an image. Then you offset the cropped region, generating training data in which the offset crop is the state and the inverse offset is the action.

It is also possible to do online training: when you mark the target to track, you can generate training data (~300 samples) and train online on the specific object you want to track.
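
The data-generation step can be sketched as below. This is a simplified illustration, assuming boxes in `(cx, cy, w, h)` form and uniformly random shifts; `make_training_pairs` and `max_shift` are hypothetical names, and in practice the state would be the image crop taken at the offset box rather than the box coordinates themselves.

```python
import random


def make_training_pairs(gt_box, n=300, max_shift=0.3, seed=0):
    """Return (offset_box, action) pairs where the action undoes the offset."""
    rng = random.Random(seed)
    cx, cy, w, h = gt_box
    pairs = []
    for _ in range(n):
        # Shift the box by a random fraction of its own size.
        dx = rng.uniform(-max_shift, max_shift) * w
        dy = rng.uniform(-max_shift, max_shift) * h
        offset_box = (cx + dx, cy + dy, w, h)
        action = (-dx, -dy)  # the inverse offset is the training label
        pairs.append((offset_box, action))
    return pairs
```

Applying each stored action to its offset box moves the box back onto the original target, which is the behaviour the network is trained to reproduce.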

7
Q

Describe the network architecture of MDNet.

A

The network is shared except for a final layer, which is specific to each sequence. During training, multiple last layers are trained, one for each sequence. Then, during online training, a new last layer is initialized and trained on the “real” data.
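
The shared-plus-per-sequence layout can be sketched with a toy model. This is only a structural illustration in plain Python, assuming a single shared linear layer with ReLU and a two-output head (target vs. background) per sequence; the class and method names are hypothetical and no training loop is shown.

```python
import random


class MultiDomainNet:
    """Toy model: one shared layer plus one output head per sequence."""

    def __init__(self, in_dim, hidden_dim, seed=0):
        self.rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        self.shared = self._init(hidden_dim, in_dim)
        self.heads = {}  # sequence name -> head weights (2 x hidden_dim)

    def _init(self, rows, cols):
        """Small random weight matrix stored as a list of rows."""
        return [[self.rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

    def add_head(self, name):
        """Create a fresh head for one sequence."""
        self.heads[name] = self._init(2, self.hidden_dim)

    def reset_for_online_tracking(self, name="online"):
        """Discard all offline heads, keep the shared layer, start a new head."""
        self.heads = {}
        self.add_head(name)

    def forward(self, x, name):
        """Shared layer with ReLU, then the head belonging to `name`."""
        hidden = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in self.shared]
        return [sum(w * v for w, v in zip(row, hidden)) for row in self.heads[name]]
```

During offline training one would call `add_head` once per training sequence; before tracking a new video, `reset_for_online_tracking` throws those heads away while leaving `shared` untouched.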

8
Q

Describe two ways of implementing attention in tracking-based applications.

A

Reciprocative Learning: Put high importance on the features inside the tracking box.

VITAL: Remove the most prominent features in the bounding box, forcing the network to learn less “important”, less general, but more target-specific features.
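
The effect of removing the most prominent features can be illustrated with a very small sketch. The function below is a hand-written stand-in that zeroes the strongest responses in a flat feature list; it is an assumption made for illustration and does not reproduce how VITAL actually generates its masks.

```python
def mask_top_features(features, k=1):
    """Zero the k largest-magnitude entries of a flat feature list."""
    order = sorted(range(len(features)), key=lambda i: abs(features[i]), reverse=True)
    dropped = set(order[:k])
    return [0.0 if i in dropped else f for i, f in enumerate(features)]
```

With the dominant responses suppressed, a classifier trained on the masked features has to rely on the remaining, weaker ones.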

9
Q

Explain the main idea behind SiamFC.

A

One branch extracts a representation of the tracking target. A second branch, which shares its weights with the first, extracts a representation of search regions of different sizes in the next frame. The target representation is then compared with the regions from the frame by cross-correlation, giving a matching score for each position.
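
The comparison step can be sketched as a sliding cross-correlation. This is a toy version that assumes single-channel 2D feature grids stored as nested lists; in the real tracker the features come from a learned convolutional embedding, and the function names here are hypothetical.

```python
def cross_correlate(search, template):
    """Slide template over search (both 2D lists) and return the score map."""
    sh, sw = len(search), len(search[0])
    th, tw = len(template), len(template[0])
    scores = []
    for y in range(sh - th + 1):
        row = []
        for x in range(sw - tw + 1):
            # Dot product between the template and the window at (y, x).
            row.append(sum(
                search[y + i][x + j] * template[i][j]
                for i in range(th) for j in range(tw)
            ))
        scores.append(row)
    return scores


def best_match(scores):
    """Return the (row, col) position of the highest score."""
    positions = ((y, x) for y in range(len(scores)) for x in range(len(scores[0])))
    return max(positions, key=lambda p: scores[p[0]][p[1]])
```

The position returned by `best_match` is where the search region looks most like the target, which is taken as the target's new location.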
