Abstract
---|
Humans can easily understand storylines and character relationships in movies, but automatic relationship analysis from videos remains challenging. In this paper, we introduce a deep video understanding system that infers relationships between movie characters from multimodal features. The proposed system first extracts visual and text features from full-length movies. With these multimodal features, we then apply graph-based relationship reasoning models to infer the characters' relationships. We evaluate the proposed system on the High-Level Video Understanding (HLVU) dataset, achieving 53% accuracy on question-answering tests.
Year | DOI | Venue |
---|---|---|
2020 | 10.1145/3395035.3425639 | ICMI-MLMI |
DocType | Citations | PageRank
---|---|---|
Conference | 0 | 0.34
References | Authors
---|---|
0 | 8
Name | Order | Citations | PageRank |
---|---|---|---|
Yang Lu | 1 | 0 | 0.34 |
Asri Rizki Yuliani | 2 | 0 | 0.34 |
Keisuke Ishikawa | 3 | 0 | 0.34 |
Ronaldo Prata Amorim | 4 | 0 | 0.34 |
Roland Hartanto | 5 | 0 | 0.34 |
Nakamasa Inoue | 6 | 72 | 14.28 |
Kuniaki Uto | 7 | 32 | 10.40 |
Koichi Shinoda | 8 | 463 | 65.14 |