Title
Multimodal Named Entity Recognition for Short Social Media Posts
Abstract
We introduce a new task called Multimodal Named Entity Recognition (MNER) for noisy user-generated data such as tweets or Snapchat captions, which comprise short text with accompanying images. These social media posts often come in inconsistent or incomplete syntax and lexical notations with very limited surrounding textual contexts, bringing significant challenges for NER. To this end, we create a new dataset for MNER called SnapCaptions (Snapchat image-caption pairs submitted to public and crowd-sourced stories with fully annotated named entities). We then build upon the state-of-the-art Bi-LSTM word/character based NER models with 1) a deep image network which incorporates relevant visual context to augment textual information, and 2) a generic modality-attention module which learns to attenuate irrelevant modalities while amplifying the most informative ones to extract contexts from, adaptive to each sample and token. The proposed MNER model with modality attention significantly outperforms the state-of-the-art text-only NER models by successfully leveraging provided visual contexts, opening up potential applications of MNER on myriads of social media platforms.
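The modality-attention idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each token already has one vector per modality (word, character, visual) of equal dimension, scores each modality with a linear map over the concatenated vectors, and fuses the vectors with softmax weights. The names `modality_attention`, `weights`, and `bias` are hypothetical, and the parameters here are random rather than learned.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def modality_attention(modalities, weights, bias):
    """Fuse per-token modality vectors into a single vector.

    modalities: list of M vectors, each of length d (e.g. word, character, visual).
    weights:    M rows of length M * d (hypothetical projection, assumed learned).
    bias:       list of M floats (hypothetical bias, assumed learned).
    Returns (fused_vector, attention_weights).
    """
    # Concatenate all modality vectors for this token.
    concat = [value for vec in modalities for value in vec]
    # One scalar score per modality, computed from the concatenated input.
    scores = [
        sum(w * x for w, x in zip(row, concat)) + b
        for row, b in zip(weights, bias)
    ]
    # Softmax turns scores into weights that sum to 1, so an uninformative
    # modality can be attenuated while an informative one is amplified.
    alphas = softmax(scores)
    dim = len(modalities[0])
    fused = [
        sum(a * vec[i] for a, vec in zip(alphas, modalities))
        for i in range(dim)
    ]
    return fused, alphas


if __name__ == "__main__":
    rng = random.Random(0)
    d = 4
    # Stand-in embeddings for one token; real ones would come from trained encoders.
    word, char, visual = ([rng.uniform(-1, 1) for _ in range(d)] for _ in range(3))
    W = [[rng.uniform(-0.5, 0.5) for _ in range(3 * d)] for _ in range(3)]
    b = [0.0, 0.0, 0.0]
    fused, alphas = modality_attention([word, char, visual], W, b)
    print("attention weights:", alphas)
    print("fused vector:", fused)
```

Because the weights are recomputed from each token's own inputs, the mix of modalities can differ per sample and per token, which is the adaptivity the abstract refers to. In a full model the fused vector would feed the Bi-LSTM tagger.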
Year: 2018
DOI: 10.18653/v1/N18-1078
Venue: North American Chapter of the Association for Computational Linguistics
DocType: Conference
Volume: abs/1802.07862
Citations: 4
PageRank: 0.47
References: 20
Authors: 3
Name                Order  Citations  PageRank
Seungwhan Moon      1      19         3.11
Leonardo Neves      2      8          6.10
Vitor R. Carvalho   3      672        36.38