Title
A target-oriented phonotactic front-end for spoken language recognition
Abstract
This paper presents a strategy to optimize the phonotactic front-end for spoken language recognition. This is achieved by selecting a subset of phones from an existing phone recognizer's phone inventory such that only the phones that best discriminate each of the target languages are selected. Each such phone subset is then used to construct a target-oriented phone tokenizer (TOPT). In this study, we examine different approaches to constructing such phone tokenizers for the front-end of a Parallel Phone Recognizers followed by Vector Space Modeling (PPR-VSM) system. We show that the target-oriented phone tokenizers derived from language-specific phone recognizers are more effective than the original parallel phone recognizers. Our experimental results also show that the target-oriented phone tokenizers derived from universal phone recognizers achieve better performance than those derived from language-specific phone recognizers. Using the proposed target-oriented phone tokenizers as the phonotactic front-end, the language recognition system performance is significantly improved without the need for additional training samples. We achieve equal error rates (EERs) of 1.27%, 1.42% and 2.73% on the NIST 1996, 2003 and 2007 LRE databases, respectively, for 30-s closed-set tests. This system is one of the subsystems in IIR's submission to NIST 2007 LRE.
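The phone-subset selection described in the abstract can be illustrated with a minimal sketch. The abstract does not state the selection criterion, so the score used here (absolute difference between a phone's relative frequency in the target language and in all other languages pooled) is only an assumed stand-in, and the function names and toy data are hypothetical.

```python
from collections import Counter


def phone_relative_freqs(utterances):
    """Relative frequency of each phone over a list of phone-token sequences."""
    counts = Counter(p for utt in utterances for p in utt)
    total = sum(counts.values())
    return {p: c / total for p, c in counts.items()}


def select_target_phones(utts_by_language, target, k):
    """Return the k phones that score highest for `target`.

    ASSUMPTION: the score is |freq in target - freq in the other languages|.
    This is an illustrative criterion, not the one used in the paper.
    """
    target_freq = phone_relative_freqs(utts_by_language[target])
    other_utts = [
        utt
        for lang, utts in utts_by_language.items()
        if lang != target
        for utt in utts
    ]
    other_freq = phone_relative_freqs(other_utts)
    inventory = set(target_freq) | set(other_freq)
    score = {
        p: abs(target_freq.get(p, 0.0) - other_freq.get(p, 0.0))
        for p in inventory
    }
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(inventory, key=lambda p: (-score[p], p))
    return ranked[:k]


# Hypothetical toy data: decoded phone sequences per language.
toy = {
    "lang_a": [["a", "b", "a", "c"], ["a", "a", "b"]],
    "lang_b": [["b", "c", "c", "d"], ["c", "c", "b"]],
}

print(select_target_phones(toy, "lang_a", 2))  # ['a', 'c']
```

In this toy case the phone "b" is equally frequent in both languages and is ranked last, which is the intuition behind keeping only the phones that separate a target language from the rest.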
Year
2009
DOI
10.1109/TASL.2009.2016731
Venue
IEEE Transactions on Audio, Speech & Language Processing
Keywords
target-oriented phonotactic front-end, spoken language recognition, existing phone recognizer, phone subset, target-oriented phone, language recognition, parallel phone, phone tokenizers, parallel phone recognizer, phonotactic feature, feature selection, phone inventory, original parallel phone recognizers, proposed target-oriented phone tokenizers, universal phone recognizers, universal phone recognizer, language-specific phone recognizers, target-oriented phone tokenizer, system performance, vector space model, speech processing, vectors, speech recognition, indexing terms, front end, databases
Field
Speech processing, Feature selection, Computer science, Word error rate, Speech recognition, Feature extraction, Phone, NIST, Artificial intelligence, Natural language processing, Lexical analysis, Spoken language
DocType
Journal
Volume
17
Issue
7
ISSN
1558-7916
Citations
5
PageRank
0.44
References
34
Authors
4
Name             Order  Citations  PageRank
Rong Tong        1      108        11.33
Bin Ma           2      600        47.26
Haizhou Li       3      3678       334.61
Eng Siong Chng   4      970        106.33