Abstract
---
Visual relationship recognition, a challenging task of distinguishing the interactions between object pairs, has received much attention recently. Since most visual relationships are semantic concepts defined by humans, much human knowledge, or many priors, are hidden in them, which have not been fully exploited by existing methods. In this work, we propose a novel visual relationship recognition model using language and position guided attention: language and position information are first extracted and vectorized, and then both are used to guide the generation of attention maps. With the guided attention, the hidden human knowledge can be better exploited to enhance the selection of spatial and channel features. Experiments on VRD [2] and VGR [1] show that, with the language and position guided attention module, our proposed model achieves state-of-the-art performance.
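The guided-attention idea in the abstract — vectorized language and position cues steering channel-wise and spatial feature selection — can be sketched as follows. This is a minimal illustration, not the paper's actual architecture: the shapes, the weight matrices `w_c` and `w_s`, and the sigmoid gating are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def guided_attention(features, lang_vec, pos_vec, w_c, w_s):
    """Gate a feature map with channel and spatial attention maps
    generated from a concatenated language/position guidance vector.
    All weights here are hypothetical stand-ins for learned layers."""
    c, h, w = features.shape
    guide = np.concatenate([lang_vec, pos_vec])     # guidance vector
    chan_att = sigmoid(w_c @ guide)                 # (c,) channel attention
    spat_att = sigmoid(w_s @ guide).reshape(h, w)   # (h, w) spatial attention
    # broadcast both attention maps over the feature tensor
    return features * chan_att[:, None, None] * spat_att[None, :, :]

# toy dimensions, for illustration only
C, H, W, D_LANG, D_POS = 4, 3, 3, 5, 2
feats = rng.standard_normal((C, H, W))
lang = rng.standard_normal(D_LANG)    # e.g. word-embedding of object labels
pos = rng.standard_normal(D_POS)      # e.g. encoded relative box positions
w_c = rng.standard_normal((C, D_LANG + D_POS))
w_s = rng.standard_normal((H * W, D_LANG + D_POS))
out = guided_attention(feats, lang, pos, w_c, w_s)
print(out.shape)  # (4, 3, 3)
```

Because both attention maps pass through a sigmoid, every gate lies in (0, 1), so the module can only attenuate features, never amplify them — one common design choice for attention-based feature selection.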
Year | DOI | Venue |
---|---|---|
2019 | 10.1109/icassp.2019.8683464 | 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Keywords | Field | DocType |
Visual Relationship Recognition, Visual Attention, Deep Neural Networks | Pattern recognition, Computer science, Communication channel, Visual attention, Human–computer interaction, Artificial intelligence, Human knowledge, Prior probability | Conference
ISSN | Citations | PageRank |
1520-6149 | 0 | 0.34 |
References | Authors
0 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Hao Zhou | 1 | 8 | 2.80 |
Chuanping Hu | 2 | 356 | 18.33
Chongyang Zhang | 3 | 84 | 21.63 |
Shengyang Shen | 4 | 0 | 0.34 |