Title
No-Frills Human-Object Interaction Detection: Factorization, Appearance and Layout Encodings, and Training Techniques.
Abstract
We show that with an appropriate factorization, and encodings of layout and appearance constructed from outputs of pretrained object detectors, a relatively simple model outperforms more sophisticated approaches on human-object interaction detection. Our model includes factors for detection scores, human and object appearance, and coarse (box-pair configuration) and optionally fine-grained layout (human pose). We also develop training techniques that improve learning efficiency by: (i) eliminating train-inference mismatch; (ii) rejecting easy negatives during mini-batch training; and (iii) using a ratio of negatives to positives that is two orders of magnitude larger than existing approaches while constructing training mini-batches. We conduct a thorough ablation study to understand the importance of different factors and training techniques using the challenging HICO-Det dataset.
Year
Venue
DocType
2018
arXiv: Computer Vision and Pattern Recognition
Journal
Volume
Citations 
PageRank 
abs/1811.05967
0
0.34
References 
Authors
0
3
Name
Order
Citations
PageRank
Tanmay Gupta122.41
Alexander G. Schwing269651.78
Derek Hoiem34998302.66