Title
Description-Based Person Search With Multi-Grained Matching Networks
Abstract
Description-based person search aims to retrieve images of a person from an image database given a textual description of that person. It is a challenging task because the visual images and the textual descriptions belong to different modalities. To fully capture the relevance between person images and textual descriptions, we propose a multi-grained framework with three branches for visual-textual matching. Specifically, in the global-grained branch, we extract global contexts from the entire images and descriptions. In the fine-grained branch, we adopt visual human parsing and linguistic parsing to split images and descriptions into semantic components related to different body parts. We design two attention mechanisms, segmentation-based and linguistics-based attention, to align visual and textual semantic components for fine-grained matching. To further exploit the spatial relations between fine-grained semantic components, we construct a body graph in the coarse-grained branch and apply graph convolutional networks to aggregate fine-grained components into coarse-grained representations. The visual and textual representations learned by the three branches are complementary to each other, which enhances visual-textual matching performance. Experimental results on the CUHK-PEDES dataset show that our approach performs favorably against state-of-the-art description-based person search methods.
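The abstract outlines a three-branch architecture: global embeddings of whole images and descriptions, attention-aligned part-level (fine-grained) embeddings, and coarse-grained embeddings obtained by running graph convolutions over a body graph of parts. The PyTorch sketch below illustrates how such a matcher could be wired together; the module names, feature dimensions, body-graph topology, attention temperature, and the way the three similarities are combined are all illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of a three-branch (global / fine / coarse) visual-textual
# matcher. Dimensions, graph topology, and similarity fusion are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiGrainedMatcher(nn.Module):
    def __init__(self, vis_dim=2048, txt_dim=768, embed_dim=512, num_parts=5):
        super().__init__()
        # Global-grained branch: one embedding per whole image / description.
        self.vis_global = nn.Linear(vis_dim, embed_dim)
        self.txt_global = nn.Linear(txt_dim, embed_dim)
        # Fine-grained branch: project part-level visual features (e.g. from
        # human parsing) and phrase-level textual features into a shared space.
        self.vis_part = nn.Linear(vis_dim, embed_dim)
        self.txt_part = nn.Linear(txt_dim, embed_dim)
        # Coarse-grained branch: one graph-convolution step that aggregates
        # part features over a fixed body graph (adjacency A).
        self.register_buffer("A", self._body_graph(num_parts))
        self.gcn = nn.Linear(embed_dim, embed_dim)

    @staticmethod
    def _body_graph(n):
        # Chain-shaped body graph with self-loops, row-normalized; the
        # paper's actual topology may differ.
        A = torch.eye(n)
        for i in range(n - 1):
            A[i, i + 1] = A[i + 1, i] = 1.0
        return A / A.sum(dim=1, keepdim=True)

    def coarse(self, parts):
        # parts: (B, P, D) part embeddings -> aggregate neighbors, then pool.
        h = F.relu(self.gcn(torch.einsum("pq,bqd->bpd", self.A, parts)))
        return h.mean(dim=1)

    def forward(self, img_feat, img_parts, txt_feat, txt_parts):
        # img_feat: (B, vis_dim), img_parts: (B, P, vis_dim)
        # txt_feat: (B, txt_dim), txt_parts: (B, P, txt_dim)
        vg = F.normalize(self.vis_global(img_feat), dim=-1)
        tg = F.normalize(self.txt_global(txt_feat), dim=-1)
        vp = F.normalize(self.vis_part(img_parts), dim=-1)
        tp = F.normalize(self.txt_part(txt_parts), dim=-1)
        # Fine-grained similarity: attend each textual component over the
        # visual parts, then average the attended similarities.
        sims = torch.einsum("bpd,bqd->bpq", tp, vp)
        attn = torch.softmax(sims / 0.1, dim=-1)
        fine = (attn * sims).sum(-1).mean(-1)
        # Coarse-grained similarity from graph-aggregated part features.
        vc = F.normalize(self.coarse(vp), dim=-1)
        tc = F.normalize(self.coarse(tp), dim=-1)
        # Sum the three complementary similarities.
        return (vg * tg).sum(-1) + fine + (vc * tc).sum(-1)


if __name__ == "__main__":
    model = MultiGrainedMatcher()
    sim = model(torch.randn(2, 2048), torch.randn(2, 5, 2048),
                torch.randn(2, 768), torch.randn(2, 5, 768))
    print(sim.shape)  # torch.Size([2])
```

In a full system, the image inputs would come from a CNN backbone plus a human-parsing model and the text inputs from a sentence encoder plus a linguistic parser, as the abstract describes.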
Year
2021
DOI
10.1016/j.displa.2021.102039
Venue
DISPLAYS
Keywords
Description-based person search, Visual-textual matching, Cross-modal matching, Attention mechanism, Multi-grained matching networks
DocType
Journal
Volume
69
ISSN
0141-9382
Citations
0
PageRank
0.34
References
0
Authors
4
Name | Order | Citations | PageRank
Ji Zhu | 1 | 0 | 0.34
Hua Yang | 2 | 2 | 3.43
Jia Wang | 3 | 0 | 0.34
Wenjun Zhang | 4 | 1789 | 177.28