Title
DSSF: Dynamic Semantic Sampling and Fusion for One-Stage Human-Object Interaction Detection
Abstract
Human-object interaction (HOI) detection is a fundamental task for machines to understand high-level scenarios. Most conventional HOI detectors are computationally inefficient because of their decoupled two-stage design. Although recent one-stage methods overcome the limitation of cascaded object detection by detecting HOI triplets directly and in parallel, they still suffer from insufficient feature extraction and a lack of HOI-specific interactive semantic context. This article therefore proposes an improved one-stage framework in which dynamic semantic sampling and fusion (DSSF) plays the key role. Observing that numerous HOI classes have very few positive samples, we tackle the long-tailed challenge by sampling dynamically according to interaction semantic regularities. Meanwhile, dynamic semantic fusion (DSF) explicitly enhances the semantic expressiveness of interaction points with aggregated contextual information. Moreover, we design a feature fusion module for the parallel branches to reduce conflicts in multitask optimization, along with a point matching strategy (PMS) that filters out low-probability HOI pairs. Without introducing extra features, DSSF outperforms previous state-of-the-art methods by a large margin on two challenging HOI detection benchmarks: V-COCO and HICO-DET.
Year: 2022
DOI: 10.1109/TIM.2022.3176899
Venue: IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT
Keywords: Feature extraction, Semantics, Object detection, Detectors, Visualization, Proposals, Neural networks, Human-object interaction (HOI), long-tailed, one stage, semantic fusion
DocType: Journal
Volume: 71
ISSN: 0018-9456
Citations: 0
PageRank: 0.34
References: 0
Authors: 3
Name          Order  Citations  PageRank
Dongzhou Gu   1      0          0.68
Shiwei Ma     2      136        21.79
Shuang Cai    3      0          0.68