Abstract
---
This paper proposes the Global-Local Temporal Representation (GLTR) to exploit multi-scale temporal cues in video sequences for video person Re-Identification (ReID). GLTR is constructed by first modeling the short-term temporal cues among adjacent frames, then capturing the long-term relations among inconsecutive frames. Specifically, the short-term temporal cues are modeled by parallel dilated convolutions with different temporal dilation rates to represent the motion and appearance of pedestrians. The long-term relations are captured by a temporal self-attention model to alleviate occlusions and noise in video sequences. The short- and long-term temporal cues are aggregated into the final GLTR by a simple single-stream CNN. GLTR shows substantial superiority over existing features learned with body part cues or metric learning on four widely used video ReID datasets. For instance, it achieves a Rank-1 accuracy of 87.02% on the MARS dataset without re-ranking, better than the current state of the art.
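The two components the abstract describes can be sketched as follows: parallel dilated convolutions along the temporal axis for short-term cues, and scaled dot-product self-attention over frames for long-term relations. This is a minimal NumPy illustration of the idea, not the paper's implementation; the kernel size, the dilation rates (1, 2, 3), the random depthwise weights, and the concatenate-then-average aggregation are all assumptions made for the sketch.

```python
import numpy as np

def dilated_temporal_conv(feats, weights, dilation):
    """Depthwise 1D convolution along time with a given dilation rate.
    feats: (T, D) per-frame features; weights: (K, D) temporal kernel."""
    T, D = feats.shape
    K = weights.shape[0]
    pad = dilation * (K - 1) // 2          # 'same' padding in time
    padded = np.pad(feats, ((pad, pad), (0, 0)))
    out = np.zeros_like(feats)
    for t in range(T):
        for k in range(K):
            out[t] += weights[k] * padded[t + k * dilation]
    return out

def temporal_self_attention(feats):
    """Long-term relations via softmax-normalized frame-to-frame affinities."""
    d = feats.shape[1]
    scores = feats @ feats.T / np.sqrt(d)         # (T, T) affinities
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)       # softmax over frames
    return attn @ feats                            # re-weighted frame features

def gltr_sketch(frame_feats, rng):
    """Aggregate short-term (dilated convs) and long-term (attention) cues."""
    T, D = frame_feats.shape
    branches = [
        dilated_temporal_conv(frame_feats, 0.1 * rng.standard_normal((3, D)), r)
        for r in (1, 2, 3)                         # parallel dilation rates (assumed)
    ]
    short_term = sum(branches) / len(branches)
    long_term = temporal_self_attention(frame_feats)
    # Concatenate both cues per frame, then average-pool over time.
    return np.concatenate([short_term, long_term], axis=1).mean(axis=0)

rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 16))               # 8 frames, 16-D features
v = gltr_sketch(feats, rng)
print(v.shape)                                     # (32,)
```

In the actual model the convolution and attention weights are learned end-to-end within a CNN; here they are random solely to demonstrate the tensor flow from a frame sequence to a single sequence-level descriptor.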
Year | DOI | Venue
---|---|---
2019 | 10.1109/ICCV.2019.00406 | 2019 IEEE/CVF International Conference on Computer Vision (ICCV 2019)

DocType | Volume | Issue
---|---|---
Conference | 2019 | 1

ISSN | Citations | PageRank
---|---|---
1550-5499 | 7 | 0.40

References | Authors
---|---
15 | 5
Name | Order | Citations | PageRank |
---|---|---|---
Jianing Li | 1 | 21 | 5.35 |
Shiliang Zhang | 2 | 1213 | 66.09 |
Jingdong Wang | 3 | 4198 | 156.76 |
Wen Gao | 4 | 11374 | 741.77 |
Qi Tian | 5 | 6443 | 331.75 |