Title
MCIC: Multimodal Conversational Intent Classification for E-commerce Customer Service
Abstract
Conversational intent classification (CIC) plays a significant role in dialogue understanding, but most previous work focuses only on the text modality. In real E-commerce customer service conversations, however, users often send images (screenshots and photos) alongside text, which makes multimodal CIC a challenging task for customer service systems. Understanding the intent of a multimodal conversation requires understanding the content of both the text and the images. In this paper, we construct a large-scale dataset for multimodal CIC in the Chinese E-commerce scenario, named MCIC, which contains more than 30,000 multimodal dialogues with image categories, OCR text (the text contained in images), and intent labels. To fuse visual and textual information effectively, we design two vision-language baselines that integrate either the images or their OCR text with the dialogue utterances. Experimental results verify that both text and images are important for CIC in E-commerce customer service.
Year
2022
DOI
10.1007/978-3-031-17120-8_58
Venue
NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, NLPCC 2022, PT I
Keywords
Conversational intent classification, Multimodal dataset
DocType
Conference
Volume
13551
ISSN
0302-9743
Citations
0
PageRank
0.34
References
0
Authors
7
Name          Order  Citations  PageRank
Shaozu Yuan   1      0          3.04
Xin Shen      2      0          0.34
Yuming Zhao   3      0          0.34
Hang Liu      4      0          0.34
Zhiling Yan   5      0          1.69
Ruixue Liu    6      0          1.69
Meng Chen     7      0          0.34