Title
Leveraging word confusion networks for named entity modeling and detection from conversational telephone speech
Abstract
Named Entity (NE) detection from Conversational Telephone Speech (CTS) is important for business applications. However, the output of Automatic Speech Recognition (ASR) inevitably contains errors, which makes NE detection from CTS more difficult than from written text. One option for detecting NEs is to use a statistical NE model. To capture the nature of ASR errors, the NE model is usually trained on ASR one-best results rather than on manually transcribed text, and is then applied to the ASR one-best results of speech that contains NEs. To make NE detection more robust to ASR errors, we propose using Word Confusion Networks (WCNs), which are sequences of bundled words, for both NE modeling and detection, treating the word bundles rather than individual words as the units. We realize this by clustering similar word bundles that may originate from the same word. We trained NE models that predict the NE tag sequence from the sequence of word bundles using the maximum entropy principle. Because the clustering of word bundles is performed before NE modeling, the proposed method can be combined with any NE modeling method. We conducted experiments using real-life call-center data. The experimental results showed that using WCNs improved the accuracy of NE detection regardless of the NE modeling method.
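To make the idea of word bundles as units concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not say how bundle similarity is measured or how clusters are formed, so the Jaccard overlap, the greedy assignment, the 0.5 threshold, and all names (`Bundle`, `one_best`, `cluster_bundles`) are assumptions made purely for illustration.

```python
from typing import Dict, List

# A word bundle maps each competing word hypothesis to its posterior.
# (Illustrative representation; the paper's data format is not given.)
Bundle = Dict[str, float]


def one_best(wcn: List[Bundle]) -> List[str]:
    """Pick the highest-posterior word of each bundle (the one-best path)."""
    return [max(bundle, key=bundle.get) for bundle in wcn]


def jaccard(a: Bundle, b: Bundle) -> float:
    """Word-set overlap of two bundles, ignoring posteriors (an assumption)."""
    union = set(a) | set(b)
    if not union:
        return 0.0
    return len(set(a) & set(b)) / len(union)


def cluster_bundles(bundles: List[Bundle], threshold: float = 0.5) -> List[int]:
    """Greedy clustering: join the first cluster whose representative bundle
    overlaps by at least `threshold`, otherwise open a new cluster."""
    representatives: List[Bundle] = []
    labels: List[int] = []
    for bundle in bundles:
        for cluster_id, rep in enumerate(representatives):
            if jaccard(bundle, rep) >= threshold:
                labels.append(cluster_id)
                break
        else:
            # No existing cluster was similar enough.
            representatives.append(bundle)
            labels.append(len(representatives) - 1)
    return labels


# Two toy utterances in which the same spoken word is recognized differently.
wcn_a = [{"call": 0.9, "tall": 0.1}, {"tokyo": 0.6, "kyoto": 0.4}]
wcn_b = [{"kyoto": 0.7, "tokyo": 0.3}, {"office": 1.0}]

labels = cluster_bundles(wcn_a + wcn_b)
```

In this toy data the one-best words of the second bundle of `wcn_a` and the first bundle of `wcn_b` differ ("tokyo" versus "kyoto"), yet both bundles receive the same cluster label. A tagger that consumes cluster labels rather than one-best words would therefore see the same unit in both utterances, which is the kind of robustness to ASR errors the abstract describes.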
Year: 2012
DOI: 10.1016/j.specom.2011.11.002
Venue: Speech Communication
Keywords: asr one-best result, asr error, leveraging word confusion network, statistical ne model, conversational telephone speech, ne detection, ne modeling method, ne modeling, ne model, independent word, word bundle, entity modeling, ne tag sequence, maximum entropy model
Field: Confusion, Computer science, Speech recognition, Named entity, Natural language processing, Artificial intelligence, Principle of maximum entropy, Cluster analysis
DocType: Journal
Volume: 54
Issue: 3
ISSN: 0167-6393
Citations: 3
PageRank: 0.42
References: 37
Authors
5
Name                  Order  Citations  PageRank
Gakuto Kurata         1      107        19.06
Nobuyasu Itoh         2      65         13.19
Masafumi Nishimura    3      112        22.77
Abhinav Sethy         4      363        31.16
Bhuvana Ramabhadran   5      1779       153.83