Title
Novel Applications Of Neural Networks In Speech Technology Systems: Search Space Reduction And Prosodic Modeling
Abstract
Neural networks (NNs) have been extensively used in speech technology systems. In this paper, we present two novel applications of NNs in speech recognition and text-to-speech systems. In very large vocabulary speech recognition systems using the hypothesis-verification paradigm, the verification stage is usually the most time-consuming. State-of-the-art systems combine fixed-size hypothesized search spaces with advanced pruning techniques. We propose a novel strategy to dynamically calculate the hypothesized search space, using neural networks as the estimation module and designing the input feature set with a careful greedy-based selection approach. The main achievement is a statistically significant relative decrease in error rate of 33.53%, together with a relative decrease in average computational demands of up to 19.40%. Prosodic modeling is one of the most important tasks in developing a new text-to-speech synthesizer, especially in a female-voice, high-quality, restricted-domain system. Our twofold objective is to obtain accurate predictors for both the fundamental frequency (F0) curve and phoneme duration by minimizing the model estimation error in a Spanish text-to-speech system by means of a neural network estimator, which has proved to be an excellent tool for this modeling. The resulting system predicts prosody with very good results (for duration: 15.5 ms RMS error and a correlation factor of 0.8975; for F0: 19.80 Hz RMS error and a relative RMS error of 0.43), which clearly improve on our previous rule-based system.
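The abstract mentions a greedy-based selection of the input feature set and reports relative decreases, but does not describe the procedure itself. The following is a minimal sketch of generic greedy forward selection, not the authors' method: the names `greedy_forward_selection`, `score`, and `relative_decrease` are hypothetical, and `score` stands in for whatever validation error a trained estimator would produce for a given feature subset.

```python
from typing import Callable, List, Sequence, Tuple


def greedy_forward_selection(
    candidates: Sequence[str],
    score: Callable[[Sequence[str]], float],
    max_features: int,
) -> Tuple[List[str], float]:
    """Repeatedly add the single feature that most lowers the error from `score`.

    `score` is an assumed callback returning an error (lower is better)
    for a feature subset; it is not defined by the source.
    """
    selected: List[str] = []
    best_error = float("inf")
    while len(selected) < max_features:
        remaining = [f for f in candidates if f not in selected]
        if not remaining:
            break
        # Try each remaining feature on top of the current subset.
        trials = [(score(selected + [f]), f) for f in remaining]
        trial_error, trial_feature = min(trials)
        if trial_error >= best_error:
            break  # no candidate improves the error, so stop early
        selected.append(trial_feature)
        best_error = trial_error
    return selected, best_error


def relative_decrease(baseline: float, new: float) -> float:
    """Relative decrease in percent, usable for error rates or computational cost."""
    return 100.0 * (baseline - new) / baseline
```

In practice `score` would train and validate the estimator on the given subset, which makes each greedy step costly; the early stop keeps the number of evaluations bounded when no remaining feature helps.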
Year
2009
Venue
INTELLIGENT AUTOMATION AND SOFT COMPUTING
Keywords
Speech recognition, neural networks, search space reduction, hypothesis-verification systems, greedy methods, feature set selection, prosody, F0 modeling, duration modeling, text-to-speech, parameter coding
Field
Prosody, Fundamental frequency, Computer science, Time delay neural network, Artificial intelligence, Artificial neural network, Speech technology, Speech synthesis, Pattern recognition, Speech recognition, Root-mean-square deviation, Machine learning, Estimator
DocType
Journal
Volume
15
Issue
4
ISSN
1079-8587
Citations
0
PageRank
0.34
References
7
Authors
10
Name                   Order  Citations  PageRank
J. MACIAS-GUARASA      1      33         4.51
Juan Manuel Montero    2      218        31.51
J. FERREIROS           3      112        14.84
Ricardo De Córdoba     4      142        25.58
R. SAN-SEGUNDO         5      139        14.28
J. GUTIERREZ-ARRIOLA   6      2          0.70
L. F. D'HARO           7      33         2.83
F. FERNANDEZ           8      0          0.34
R. BARRA               9      1          0.69
J. M. PARDO            10     0          0.34