Title
Strategies for Vietnamese keyword search
Abstract
We propose strategies for a state-of-the-art Vietnamese keyword search (KWS) system developed at the Institute for Infocomm Research (I2R). The KWS system exploits acoustic features that characterize the creaky voice quality peculiar to lexical tones in Vietnamese, a minimal-resource transliteration framework that alleviates out-of-vocabulary issues caused by foreign loan words, and a proposed system combination scheme, FusionX. We show that the proposed creaky voice quality features complement pitch-related features, reaching fusion gains of 17.7% relative (6.9% absolute). To the best of our knowledge, the proposed transliteration framework is the first reported rule-based system for Vietnamese; it outperforms statistical baselines by 14.93% to 36.73% relative on foreign loan word search tasks. Using FusionX to combine three sub-systems, the actual term-weighted value (ATWV) reaches 0.4742, exceeding the ATWV = 0.3 benchmark for IARPA Babel participants in the NIST OpenKWS13 Evaluation.
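For readers unfamiliar with the metric quoted in the abstract, the sketch below shows how a term-weighted value is commonly computed from per-keyword detection counts. It follows the standard NIST definition (one minus the average over keywords of the miss probability plus a weighted false-alarm probability) and is not the scoring tool used in the evaluation. The names `TermCounts`, `atwv`, and `speech_seconds` are hypothetical, and the weight of 999.9 is the conventional setting, assumed here rather than taken from the abstract.

```python
from dataclasses import dataclass

BETA = 999.9  # conventional false-alarm weight in the NIST TWV definition (assumed)


@dataclass
class TermCounts:
    """Detection counts for one keyword at the chosen decision threshold."""
    n_true: int      # reference occurrences of the keyword
    n_correct: int   # detections that match a reference occurrence
    n_spurious: int  # detections with no matching reference (false alarms)


def atwv(counts, speech_seconds, beta=BETA):
    """Return 1 - mean(P_miss + beta * P_fa) over keywords that occur in the reference."""
    # Keywords with no reference occurrence have an undefined miss rate and are skipped.
    scored = [c for c in counts if c.n_true > 0]
    if not scored:
        raise ValueError("no keyword has reference occurrences")
    total = 0.0
    for c in scored:
        p_miss = 1.0 - c.n_correct / c.n_true
        # Non-target trials are approximated as one per second of speech.
        p_fa = c.n_spurious / (speech_seconds - c.n_true)
        total += p_miss + beta * p_fa
    return 1.0 - total / len(scored)
```

Under this definition a system that finds every occurrence with no false alarms scores 1.0, and a system that returns nothing scores 0.0. Because of the large weight, a handful of false alarms per keyword can push the value below zero.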
Year
2014
DOI
10.1109/ICASSP.2014.6854377
Venue
Acoustics, Speech and Signal Processing
Keywords
information retrieval, knowledge based systems, natural language processing, sensor fusion, speech recognition, ATWV, FusionX scheme, Institute for Infocomm Research, KWS system, NIST OpenKWS13 Evaluation, Vietnamese keyword search, acoustic features, actual term-weighted value, creaky voice quality feature, fusion gain, lexical tones, minimal-resource transliteration framework, pitch-related feature, rule-based system, audio indexing, deep neural networks (DNN), glottalization, large vocabulary continuous speech recognition (LVCSR), low-resourced languages, spoken term detection
DocType
Conference
ISSN
1520-6149
Citations
2
PageRank
0.41
References
0
Authors
8
Name | Order | Citations | PageRank
Nancy F. Chen | 1 | 120 | 28.98
Sunil Sivadas | 2 | 31 | 2.19
Boon Pang Lim | 3 | 62 | 8.89
Hoang Gia Ngo | 4 | 8 | 4.21
Haihua Xu | 5 | 55 | 11.41
Van Tung Pham | 6 | 40 | 8.42
Bin Ma | 7 | 44 | 4.45
Haizhou Li | 8 | 3678 | 334.61