Human-Object Interaction Detection via Disentangled Transformer - Citegraph

Paper Info

Title
Human-Object Interaction Detection via Disentangled Transformer

Abstract
Human-Object Interaction (HOI) Detection tackles the problem of joint localization and classification of human-object interactions. Existing HOI transformers either adopt a single decoder for triplet prediction, or utilize two parallel decoders to detect individual objects and interactions separately and compose triplets by a matching process. In contrast, we decouple the triplet prediction into human-object pair detection and interaction classification. Our main motivation is that detecting the human-object instances and classifying interactions accurately require learning representations that focus on different regions. To this end, we present Disentangled Transformer, where both encoder and decoder are disentangled to facilitate learning of the two sub-tasks. To associate the predictions of the disentangled decoders, we first generate a unified representation for HOI triplets with a base decoder, and then utilize it as the input feature of each disentangled decoder. Extensive experiments show that our method outperforms prior work on two public HOI benchmarks by a sizeable margin. Code will be available.
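
The association step described in the abstract can be illustrated with a toy data-flow sketch. This is not the authors' implementation: the three "decoders" below are simple stand-in functions rather than transformer layers, and every name, dimension, and class count is a hypothetical choice made for illustration. The only point it shows is that when both branches consume the same base-decoder query, the pair prediction and the interaction prediction share a query index, so triplets can be composed without a separate matching step.

```python
import math
import random
from dataclasses import dataclass

# Hypothetical sizes, chosen only for this illustration.
NUM_OBJECT_CLASSES = 3
NUM_VERBS = 4
QUERY_DIM = 16


@dataclass
class Triplet:
    human_box: tuple      # normalized box coordinates in (0, 1)
    object_box: tuple     # normalized box coordinates in (0, 1)
    object_class: int
    verb_scores: list     # one score per verb class


def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def base_decoder(num_queries, seed=0):
    """Stand-in for the base decoder: one unified embedding per HOI query."""
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1) for _ in range(QUERY_DIM)]
            for _ in range(num_queries)]


def instance_decoder(query):
    """Stand-in for the pair-detection branch (boxes and object class)."""
    human_box = tuple(_sigmoid(v) for v in query[0:4])
    object_box = tuple(_sigmoid(v) for v in query[4:8])
    object_class = max(range(NUM_OBJECT_CLASSES), key=lambda c: query[8 + c])
    return human_box, object_box, object_class


def interaction_decoder(query):
    """Stand-in for the interaction-classification branch (verb scores)."""
    return [_sigmoid(v) for v in query[-NUM_VERBS:]]


def predict_triplets(num_queries=5):
    """Both branches read the same query, so outputs align by query index."""
    triplets = []
    for query in base_decoder(num_queries):
        human_box, object_box, object_class = instance_decoder(query)
        verb_scores = interaction_decoder(query)
        triplets.append(Triplet(human_box, object_box, object_class, verb_scores))
    return triplets


if __name__ == "__main__":
    for i, t in enumerate(predict_triplets()):
        print(i, t.object_class, [round(s, 2) for s in t.verb_scores])
```

In a real model each branch would presumably attend to image features as well as the shared query; the sketch omits that and keeps only the index-aligned composition.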

Year	2022
DOI	10.1109/CVPR52688.2022.01896
Venue	IEEE Conference on Computer Vision and Pattern Recognition
Keywords	Scene analysis and understanding; Deep learning architectures and techniques; Recognition: detection, categorization, retrieval; Segmentation, grouping and shape analysis
DocType	Conference
Volume	2022
Issue	1
Citations	0
PageRank	0.34
References	0
Authors	7

Authors (7 rows)

Name	Order	Citations	PageRank
Desen Zhou	1	64	2.90
Zhichao Liu	2	0	0.34
Jian Wang	3	7	6.40
Leshan Wang	4	0	0.34
Tao Hu	5	0	0.68
Er-rui Ding	6	142	29.31
Jingdong Wang	7	0	0.34
