Abstract | ||
---|---|---|
In this paper, we investigate the cause of the high false positive rate in Visual Relationship Detection (VRD). We observe that during training, the relationship proposal distribution is highly imbalanced: most negative relationship proposals are easy to identify, e.g., those arising from inaccurate object detection, which leads to under-fitting of the low-frequency, difficult proposals. This paper presents Spatially-Aware Balanced negative pRoposal sAmpling (SABRA), a robust VRD framework that alleviates the influence of false positives. To effectively optimize the model under an imbalanced distribution, SABRA adopts a Balanced Negative Proposal Sampling (BNPS) strategy for mini-batch sampling. BNPS divides proposals into five well-defined sub-classes and generates a balanced training distribution according to the inverse frequency. BNPS gives an easier optimization landscape and significantly reduces the number of false positives. To further resolve the low-frequency, challenging false positive proposals with high spatial ambiguity, we improve the spatial modeling ability of SABRA in two respects: a simple and efficient multi-head heterogeneous graph attention network (MH-GAT) that models the global spatial interactions of objects, and a spatial mask decoder that learns the local spatial configuration. SABRA outperforms state-of-the-art methods by a large margin on two human-object interaction (HOI) datasets and one general VRD dataset. |
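
The abstract says BNPS builds a balanced training distribution from the inverse frequency of each proposal sub-class. The sketch below illustrates only that general idea of inverse-frequency sampling and is not the authors' implementation: the function names, the two placeholder labels (`"easy"`, `"hard"`), and the choice to sample with replacement are all assumptions made for illustration. The paper's five sub-classes are not named in the abstract and are not reproduced here.

```python
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """Give each proposal a weight of 1 / (size of its sub-class)."""
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


def sample_balanced_batch(labels, batch_size, seed=0):
    """Draw proposal indices with replacement so each sub-class is equally likely."""
    rng = random.Random(seed)
    weights = inverse_frequency_weights(labels)
    return rng.choices(range(len(labels)), weights=weights, k=batch_size)


# Hypothetical, highly imbalanced pool: placeholder labels, not the paper's sub-classes.
labels = ["easy"] * 90 + ["hard"] * 10
weights = inverse_frequency_weights(labels)
batch = sample_balanced_batch(labels, batch_size=1000)

# Count how often each sub-class appears in the sampled mini-batch.
batch_counts = Counter(labels[i] for i in batch)
print(batch_counts)
```

Because every sub-class ends up with the same total weight, the rare sub-class is drawn about as often as the common one, even though it makes up only 10% of the pool in this toy example.
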
Year | Venue | DocType |
---|---|---|
2021 | British Machine Vision Conference | Conference |
Citations | PageRank | References |
---|---|---|
0 | 0.34 | 0 |
Authors | ||
---|---|---|
11 |
Name | Order | Citations | PageRank |
---|---|---|---|
Daisheng Jin | 1 | 0 | 0.34 |
Xiao Ma | 2 | 1 | 2.41 |
Chongzhi Zhang | 3 | 2 | 1.04 |
Yizhuo Zhou | 4 | 0 | 0.34 |
Jiashu Tao | 5 | 0 | 0.34 |
Mingyuan Zhang | 6 | 0 | 1.01 |
Haiyu Zhao | 7 | 65 | 6.28 |
Shuai Yi | 8 | 167 | 14.21 |
Zhoujun Li | 9 | 964 | 115.99 |
Xianglong Li | 10 | 0 | 0.68 |
Hongsheng Li | 11 | 1516 | 85.29 |