Title
Analysis-by-synthesis low-rate multimode harmonic speech coding
Abstract
This paper presents an analysis-by-synthesis multimode harmonic coder (AbS-MHC) that employs new techniques to improve both the speech model accuracy and the parameter estimation robustness in the low rate harmonic coding framework. To improve the speech model accuracy, an enhanced frequency domain transition model is used in conjunction with the sinusoidal model based harmonic coding of voiced/unvoiced speech signals. To achieve robust parameter estimation, a generalized analysis-by-synthesis parameter estimation scheme in the harmonic coding framework is proposed. This scheme uses a time scale signal modification technique to allow for waveform matching in harmonic coding. This concept is demonstrated in our AbS-MHC coder with a specific method for efficient closed-loop pitch estimation and speech classification. The speech quality of the unquantized AbS-MHC coder is better than that of G.723 at 6.3 kbps.
ily controlled due to the open-loop parameter estimation typical of harmonic coders. Therefore, in designing the proposed analysis-by-synthesis multimode harmonic coder (AbS-MHC), we try to overcome the above limitations of harmonic coders by employing new techniques to improve both the speech model accuracy and the parameter estimation robustness. To improve the speech model accuracy, an enhanced frequency domain transition model is used in conjunction with the sinusoidal model based harmonic coding of voiced/unvoiced speech signals. To achieve robust parameter estimation, a generalized analysis-by-synthesis parameter estimation scheme in the harmonic coding framework is proposed. This scheme employs a time scale signal modification technique to allow for waveform matching in harmonic coding. This concept is demonstrated in our AbS-MHC coder with a specific method for efficient closed-loop pitch estimation and speech classification.
Subjective test results show that the speech quality of the unquantized AbS-MHC coder exceeds that of the G.723 coder at 6.3 kbps. Initial efforts towards a fully quantized 4 kbps coder have produced speech quality comparable to that of the G.723 coder operating at 5.3 kbps. In particular, for modified IRS filtered speech, the speech quality of the 4 kbps AbS-MHC coder is better than that of G.723 at 5.3 kbps.
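The idea of closed-loop pitch estimation mentioned above can be illustrated with a minimal sketch. This is not the AbS-MHC algorithm: it leaves out the time scale signal modification step and the speech classification, and every name in it (`harmonic_synthesis_error`, `closed_loop_pitch`, the candidate grid, the fixed number of harmonics) is a hypothetical choice made for illustration. For each candidate pitch, the sketch fits a harmonic sinusoidal model to the frame, resynthesizes the frame, and keeps the candidate whose synthesized waveform is closest to the original.

```python
import math

def harmonic_synthesis_error(frame, f0, fs, num_harmonics):
    """Fit harmonics of f0 to the frame by projection, resynthesize,
    and return the squared waveform error (illustrative only)."""
    n_samples = len(frame)
    synth = [0.0] * n_samples
    for k in range(1, num_harmonics + 1):
        w = 2.0 * math.pi * k * f0 / fs  # radian frequency of harmonic k
        if w >= math.pi:
            break  # skip harmonics at or above the Nyquist frequency
        # Projection onto cos/sin; exact when the frame holds whole periods.
        a = 2.0 / n_samples * sum(x * math.cos(w * n) for n, x in enumerate(frame))
        b = 2.0 / n_samples * sum(x * math.sin(w * n) for n, x in enumerate(frame))
        for n in range(n_samples):
            synth[n] += a * math.cos(w * n) + b * math.sin(w * n)
    return sum((x - s) ** 2 for x, s in zip(frame, synth))

def closed_loop_pitch(frame, candidates, fs, num_harmonics=4):
    """Pick the candidate pitch (Hz) with the smallest synthesis error."""
    return min(candidates,
               key=lambda f0: harmonic_synthesis_error(frame, f0, fs, num_harmonics))

# Assumed test frame: 40 ms at 8 kHz, four harmonics of a 100 Hz pitch.
fs = 8000
amps = [1.0, 0.5, 0.3, 0.2]
frame = [sum(amp * math.cos(2.0 * math.pi * k * 100.0 * n / fs + 0.3 * k)
             for k, amp in enumerate(amps, start=1))
         for n in range(320)]
print(closed_loop_pitch(frame, [80.0, 100.0, 125.0, 160.0, 200.0], fs))  # prints 100.0
```

An open-loop estimator would pick the pitch from the input signal alone; the closed-loop variant above judges each candidate by the waveform it actually produces, which is the property the abstract attributes to analysis-by-synthesis parameter estimation.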
Year
1999
Venue
EUROSPEECH
Keywords
speech coding, parameter estimation, analysis by synthesis, frequency domain
Field
Vector sum excited linear prediction, Speech coding, Computer science, Harmonic, Speech recognition, Sub-band coding, Multi-mode optical fiber, Harmonic Vector Excitation Coding, Codec2, Linear predictive coding
DocType
Conference
Citations
3
PageRank
0.57
References
3
Authors
3
Name               Order  Citations  PageRank
Chunyan Li         1      3          0.57
Allen Gersho       2      3          0.57
Vladimir Cuperman  3      56         11.32