Abstract | ||
---|---|---|
In this paper we describe the compression of diphone inventories used for acoustic synthesis in a concatenative synthesis system. The inventory compression is based on a codebook drawn from the Gaussian mean vectors of phoneme HMMs. There are two encoding/synthesis schemes, one speaker-dependent and one speaker-independent. The advantage of the latter is that the HMMs can potentially be shared by a recognizer and a synthesizer. We describe the steps to encode the inventories as well as the acoustic synthesis using them. Using the proposed method, a diphone inventory with 1175 units can be compressed down to 19 kB. We show that the synthesis quality with HMM-encoded inventories matches the quality of synthesis with AMR- or SPEEX-encoded inventories at noticeably smaller inventory sizes. |
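
The codebook idea in the abstract can be illustrated with a generic nearest-neighbour vector quantizer: each feature frame is stored as the index of its closest codebook vector, and decoding looks the vectors up again. This is only a minimal sketch under stated assumptions, not the authors' encoder. The function names, the two-dimensional toy vectors, and the squared Euclidean distance are all hypothetical; in the paper the codebook entries would be the Gaussian mean vectors of the phoneme HMMs.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance (an assumed metric, not taken from the paper)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode(frames: List[Vector], codebook: List[Vector]) -> List[int]:
    """Replace each feature frame by the index of its nearest codebook entry."""
    return [
        min(range(len(codebook)), key=lambda i: squared_distance(frame, codebook[i]))
        for frame in frames
    ]


def decode(indices: List[int], codebook: List[Vector]) -> List[Vector]:
    """Recover approximate frames by looking the indices up in the codebook."""
    return [codebook[i] for i in indices]


# Toy data: hypothetical stand-ins for Gaussian mean vectors and feature frames.
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 0.0)]
frames = [(0.1, -0.2), (0.9, 1.2), (3.5, 0.3), (1.1, 0.8)]

indices = encode(frames, codebook)          # one small integer per frame
reconstructed = decode(indices, codebook)   # lossy reconstruction
print(indices)
```

Storing one small index per frame instead of a full feature vector is what makes such an inventory compact; if the codebook is shared with a recognizer, as the speaker-independent scheme suggests, it adds no storage of its own on the synthesis side.
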
Year | DOI | Venue |
---|---|---|
2011 | 10.1109/ICASSP.2011.5947574 | Acoustics, Speech and Signal Processing |
Keywords | Field | DocType |
Gaussian processes, data compression, hidden Markov models, speech coding, speech synthesis, vectors, AMR-encoded inventory, Gaussian mean vector, HMM based diphone inventory encoding, SPEEX-encoded inventory, acoustic synthesis, codebook, concatenative synthesis system, diphone inventory compression, low-resource device, phoneme HMM, speaker dependent, speaker independent | Concatenative synthesis, Speech synthesis, Speech coding, Diphone, Pattern recognition, Computer science, Speech recognition, Artificial intelligence, Data compression, Hidden Markov model, Encoding (memory), Codebook | Conference |
ISSN | ISBN | Citations |
1520-6149 | 978-1-4577-0537-3 (E-ISBN) | 3 |
PageRank | References | Authors |
0.40 | 5 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Guntram Strecha | 1 | 19 | 4.17 |
Matthias Wolff | 2 | 68 | 14.17 |