Joint Discriminative Decoding of Words and Semantic Tags for Spoken Language Understanding - Citegraph

Paper Info

Title
Joint Discriminative Decoding of Words and Semantic Tags for Spoken Language Understanding

Abstract
Most Spoken Language Understanding (SLU) systems today employ a cascade approach, where the best hypothesis from an Automatic Speech Recognizer (ASR) is fed into understanding modules such as slot sequence classifiers and intent detectors. The output of these modules is then fed into downstream components such as an interpreter and/or a knowledge broker. These statistical models are usually trained individually to optimize the error rate of their respective outputs. In such approaches, errors from one module irreversibly propagate into other modules, causing a serious degradation in the overall performance of the SLU system. Thus it is desirable to jointly optimize all the statistical models. As a first step towards this, we propose a joint decoding framework in which we predict the optimal word sequence and slot sequence (semantic tag sequence) jointly, given the input acoustic stream. The improved recognition output is then used for an utterance classification task; specifically, we focus on intent detection. On an SLU task, we show a 1.5% absolute reduction (7.6% relative reduction) in word error rate (WER) and a 1.2% absolute improvement in F-measure for slot prediction compared to a very strong cascade baseline comprising a state-of-the-art large-vocabulary ASR system followed by a conditional random field (CRF) based slot sequence tagger. Similarly, for intent detection, we show a 1.2% absolute reduction (12% relative reduction) in classification error rate.
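
The abstract quotes each error-rate gain both as an absolute and as a relative reduction. Since a relative reduction is the absolute reduction divided by the baseline rate, the two figures together imply an approximate baseline. The sketch below works that arithmetic through in Python; the function name `implied_baseline` is hypothetical, and the resulting baselines are back-of-the-envelope inferences from rounded figures, not numbers stated in the abstract.

```python
def implied_baseline(absolute_reduction: float, relative_reduction_pct: float) -> float:
    """Baseline rate (in %) implied by an absolute and a relative reduction.

    relative = absolute / baseline, so baseline = absolute / relative.
    The inputs are rounded in the abstract, so the result is approximate.
    """
    return absolute_reduction / (relative_reduction_pct / 100.0)


# Figures quoted in the abstract: (absolute points, relative percent).
wer_baseline = implied_baseline(1.5, 7.6)       # word error rate
intent_baseline = implied_baseline(1.2, 12.0)   # intent classification error

print(f"implied baseline WER: about {wer_baseline:.1f}%")
print(f"implied WER after joint decoding: about {wer_baseline - 1.5:.1f}%")
print(f"implied baseline intent error: about {intent_baseline:.1f}%")
```

Read this way, the relative figures make clear that the same 1.2-point change is a much larger fraction of the intent error rate than the 1.5-point change is of the WER.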

Year	2013
DOI	10.1109/TASL.2013.2256894
Venue	IEEE Transactions on Audio, Speech & Language Processing
Keywords	speech recognition, speech coding, random processes, natural language processing, decoding, statistical analysis
Field	Speech coding, Computer science, Artificial intelligence, Natural language processing, Discriminative model, Spoken language, Conditional random field, Pattern recognition, Word error rate, Speech recognition, Statistical model, Decoding methods, Vocabulary
DocType	Journal
Volume	21
Issue	8
ISSN	1558-7916
Citations	8
PageRank	0.48
References	19
Authors	4

Authors (4 rows)

Name	Order	Citations	PageRank
Anoop Deoras	1	240	29.36
Gokhan Tur	2	931	83.35
Ruhi Sarikaya	3	698	64.49
Dilek Z. Hakkani-Tür	4	387	43.99
