MAVOT: Memory-Augmented Video Object Tracking

Project Page

Paper [arXiv]

Overview

This webpage contains visual demonstrations for the paper MAVOT: Memory-Augmented Video Object Tracking. Four sections illustrate different aspects of our tracker. Section 1 compares MAVOT with other trackers. Section 2 visualizes the memory module of MAVOT. Section 3 shows the potential deformability of MAVOT. Section 4 gives more examples of MAVOT.

overview

Section 1: Comparison with other trackers

Compared with other trackers, MAVOT is robust against total occlusion, large-scale shape changes, motion blur and chaotic backgrounds. Here we compare our results with state-of-the-art trackers: Continuous Convolution Operator Tracker (CCOT), Scale-and-State Aware Tracker (SSAT), Sum of Template And Pixel-wise LEarners (Staple), SiameseFC-AlexNet (SiamFC-A) and Hierarchical Convolutional Features for Visual Tracking (HCF).


fish4. The swimming fish looks very similar to the background, which makes tracking difficult even for humans. Its rapid shape changes under low light make it a very hard case. Nevertheless, MAVOT tracks the fish with higher accuracy than the other state-of-the-art trackers.

girl. The girl is totally occluded by a passing man. While the state-of-the-art trackers switch to tracking the occluder, MAVOT recovers the original target and follows it through a very long sequence (1500 frames).

gymnastics4. While performing, the gymnast exhibits dramatic shape changes, yet MAVOT still locates her with high accuracy. The background contains several similar-looking gymnasts, which confuse the other trackers into tracking a background gymnast later in the sequence. In contrast, MAVOT keeps tracking the target gymnast to the very end.

motocross2. The motorbike moves and turns at very high speed, causing motion blur along with rapid changes in shape. Despite these difficulties, MAVOT keeps tracking the motorbike to the end.

road. The racing bike passes under several occluders (trees) in a long sequence (558 frames). Although the bike is at times totally occluded by trees or other structures, MAVOT resumes tracking the target as soon as it reappears.

wiper. The car being tracked is poorly illuminated and frequently partially occluded by the wipers, and there is a large motion in the middle of the video. While some trackers lose track, MAVOT continues to track the car, recovering after a few frames of incorrect tracking caused by the large motion.


Section 2: Memory Visualization

The following videos show how the memory content changes during tracking. Our write-protection method ensures that the memory does not fill up too quickly. Once the memory is full, our least-used mechanism selects a memory location, erases its original content and writes the new memory there. The blinks in the memory blob in the videos correspond to changes of memory content.
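The behaviour described above can be illustrated with a small sketch. It is not the implementation used in MAVOT: the class name `LeastUsedMemory`, the cosine-similarity write gate, the `write_threshold` value and the per-slot usage counters are all assumptions made for illustration, since this page does not specify how write protection or usage is computed. The sketch only shows the general idea of refusing redundant writes and overwriting the least-used slot once the memory is full.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)


class LeastUsedMemory:
    """Hypothetical fixed-size memory with a write gate and least-used replacement."""

    def __init__(self, num_slots, write_threshold=0.9):
        self.slots = [None] * num_slots   # stored items, None means empty
        self.usage = [0] * num_slots      # how often each slot was read
        self.write_threshold = write_threshold  # assumed gate, not from the paper

    def _best_match(self, query):
        """Return (index, similarity) of the most similar stored item, or (None, -inf)."""
        best, best_sim = None, -math.inf
        for i, item in enumerate(self.slots):
            if item is None:
                continue
            sim = cosine(query, item)
            if sim > best_sim:
                best, best_sim = i, sim
        return best, best_sim

    def read(self, query):
        """Look up the closest item and count the access as a use of that slot."""
        best, sim = self._best_match(query)
        if best is not None:
            self.usage[best] += 1
        return best, sim

    def write(self, item):
        """Store item unless a very similar one exists; return the slot used or None."""
        best, sim = self._best_match(item)
        if best is not None and sim >= self.write_threshold:
            return None  # write-protected: memory already holds a near-duplicate
        if None in self.slots:
            slot = self.slots.index(None)            # fill an empty slot first
        else:
            slot = self.usage.index(min(self.usage))  # overwrite the least-used slot
        self.slots[slot] = list(item)
        self.usage[slot] = 0
        return slot
```

With two slots, writing `[1.0, 0.0]` fills slot 0, a near-duplicate such as `[0.99, 0.01]` is rejected by the gate, and once both slots are occupied a new, dissimilar item replaces whichever slot has been read least often.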


Section 3: Deformability

The following videos illustrate the potential deformability of MAVOT. The tracker shown here is a variant of the evaluated MAVOT. It can track objects with large changes in shape and size, even under poor illumination.

bag

helicopter

singer3

tunnel


Section 4: More demo sequences

The following videos provide further qualitative results for MAVOT in different scenarios, including large motion, changes in shape and size, occlusion and other cases.

ball2

butterfly

fish3

marching

sphere

traffic