Title |
---|
MCIC: Multimodal Conversational Intent Classification for E-commerce Customer Service |
Abstract |
---|
Conversational intent classification (CIC) plays a significant role in dialogue understanding, but most previous work focuses only on the text modality. In real E-commerce customer service conversations, however, users often send images (screenshots and photos) alongside text, which makes multimodal CIC a challenging task for customer service systems. Understanding the intent of a multimodal conversation requires understanding the content of both the text and the images. In this paper, we construct a large-scale dataset for multimodal CIC in the Chinese E-commerce scenario, named MCIC, which contains more than 30,000 multimodal dialogues annotated with image categories, OCR text (the text contained in images), and intent labels. To fuse visual and textual information effectively, we design two vision-language baselines that integrate either images or OCR text with the dialogue utterances. Experimental results verify that both text and images are important for CIC in E-commerce customer service. |
Year | DOI | Venue |
---|---|---|
2022 | 10.1007/978-3-031-17120-8_58 | Natural Language Processing and Chinese Computing, NLPCC 2022, Part I |
Keywords | DocType | Volume |
---|---|---|
Conversational intent classification, Multimodal dataset | Conference | 13551 |
ISSN | Citations | PageRank |
---|---|---|
0302-9743 | 0 | 0.34 |
References | Authors |
---|---|
0 | 7 |
Name | Order | Citations | PageRank |
---|---|---|---|
Shaozu Yuan | 1 | 0 | 3.04 |
Xin Shen | 2 | 0 | 0.34 |
Yuming Zhao | 3 | 0 | 0.34 |
Hang Liu | 4 | 0 | 0.34 |
Zhiling Yan | 5 | 0 | 1.69 |
Ruixue Liu | 6 | 0 | 1.69 |
Meng Chen | 7 | 0 | 0.34 |