Title
Exploring ASR-free end-to-end modeling to improve spoken language understanding in a cloud-based dialog system
Abstract
Spoken language understanding (SLU) in dialog systems is generally performed by a natural language understanding (NLU) model that operates on the hypotheses produced by an automatic speech recognition (ASR) system. However, when new spoken dialog applications are built from scratch in real user environments, which often have sub-optimal audio characteristics, ASR performance can suffer from factors such as a paucity of training data or a mismatch between the training and test data. To address this issue, this paper proposes an ASR-free, end-to-end (E2E) modeling approach to SLU for a cloud-based, modular spoken dialog system (SDS). We evaluate the effectiveness of our approach on crowdsourced data collected from non-native English speakers interacting with a conversational language learning application. Experimental results show that our approach is particularly promising in situations with low ASR accuracy. In addition, fusing the scores from the E2E system with those of a sophisticated CNN-based SLU system that uses more accurate ASR hypotheses further improves overall SLU accuracy from 85.6% to 86.5%.
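The abstract does not say how the scores of the two systems are fused. As a rough illustration only, the sketch below assumes a simple linear interpolation of per-label scores. The function names, the `weight` parameter, and the example labels and scores are all hypothetical and are not taken from the paper.

```python
from typing import Dict


def fuse_scores(
    cnn_scores: Dict[str, float],
    e2e_scores: Dict[str, float],
    weight: float = 0.5,
) -> Dict[str, float]:
    """Linearly interpolate per-label scores from two SLU systems.

    ASSUMPTION: linear interpolation is one common form of score-level
    fusion; the paper's actual fusion method is not given in the abstract.
    `weight` is the share given to the CNN-based system.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    labels = set(cnn_scores) | set(e2e_scores)
    # A label missing from one system contributes a score of 0.0 there.
    return {
        label: weight * cnn_scores.get(label, 0.0)
        + (1.0 - weight) * e2e_scores.get(label, 0.0)
        for label in labels
    }


def predict(fused: Dict[str, float]) -> str:
    """Return the label with the highest fused score."""
    return max(fused, key=fused.get)


# Hypothetical scores for a single utterance (not data from the paper).
cnn = {"yes": 0.55, "no": 0.45}
e2e = {"yes": 0.30, "no": 0.70}
fused = fuse_scores(cnn, e2e, weight=0.5)
label = predict(fused)
```

In practice the interpolation weight would be tuned on held-out data; with `weight=1.0` the sketch reduces to the CNN-based system alone.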
Year
2017
DOI
10.1109/ASRU.2017.8268987
Venue
2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
Keywords
end-to-end, spoken language understanding
Field
Dialog box, Computer science, Language acquisition, Natural language understanding, Natural language processing, Artificial intelligence, Dialog system, Test data, Modular design, Spoken language, Cloud computing
DocType
Conference
ISBN
978-1-5090-4789-5
Citations
2
PageRank
0.46
References
0
Authors
7
Name                      Order  Citations  PageRank
Qian Yao                  1      527        51.55
Rutuja Ubale              2      2          3.17
Vikram Ramanarayanan      3      70         13.97
Patrick Lange             4      9          8.42
David Suendermann-Oeft    5      3          2.17
Keelan Evanini            6      79         20.23
Eugene Tsuprun            7      3          1.16