Abstract |
---|
Prevailing Multiple Object Tracking (MOT) works that follow the Tracking-by-Detection (TBD) paradigm focus mostly on either object detection in the first step or data association in the second step. In this paper, we approach the MOT problem from a different perspective by directly obtaining the embedded spatial-temporal information of trajectories from raw video data. To this end, we propose a joint trajectory locating and attributes encoding framework for real-time, online MOT. We first introduce a trajectory attribute representation scheme designed for each tracked target (instead of each object), in which the extracted Trajectory Map (TM) encodes the spatial-temporal attributes of a trajectory across a window of consecutive video frames. Next, we present a Temporal Priors Embedding (TPE) methodology to infer these attributes with a logical reasoning strategy based on long-term feature dynamics. The proposed MOT framework projects multiple attributes of tracked targets (e.g., presence, enter/exit, location, scale, and motion) into a continuous TM to perform one-shot regression for real-time MOT. Experimental results show that our proposed video-based method runs at 33 FPS and is more accurate and robust than detection-based tracking methods and a few other State-of-the-Art (SOTA) approaches on the MOT16/17/20 benchmarks. |
Year | DOI | Venue |
---|---|---|
2021 | 10.1145/3474085.3475304 | International Multimedia Conference |
DocType | Citations | PageRank |
---|---|---|
Conference | 0 | 0.34 |
References | Authors |
---|---|
0 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Xingyu Wan | 1 | 1 | 1.02 |
Sanping Zhou | 2 | 0 | 0.34 |
Jinjun Wang | 3 | 0 | 0.34 |
Rongye Meng | 4 | 0 | 0.34 |