Title
Voice Timbre Control Based On Perceived Age In Singing Voice Conversion
Abstract
The perceived age of a singing voice is the age of the singer as perceived by the listener, and is one of the notable characteristics that determine perceptions of a song. In this paper, we describe an investigation of acoustic features that affect the perceived age, and a novel voice timbre control technique based on the perceived age for singing voice conversion (SVC). Singers can sing expressively by controlling prosody and voice timbre, but the varieties of voices that singers can produce are limited by physical constraints. Previous work has attempted to overcome this limitation through the use of statistical voice conversion. This technique makes it possible to convert the singing voice timbre of an arbitrary source singer into that of an arbitrary target singer. However, it is still difficult to intuitively control singing voice characteristics by manipulating parameters corresponding to specific physical traits, such as gender and age. In this paper, we first investigate the factors that play a part in the listener's perception of the singer's age. Then, we apply a multiple-regression Gaussian mixture model (MR-GMM) to SVC for the purpose of controlling voice timbre based on the perceived age, and we propose SVC based on a modified MR-GMM for manipulating the perceived age while maintaining the singer's individuality. The experimental results show that 1) the perceived age of singing voices corresponds relatively well to the actual age of the singer, 2) prosodic features have a larger effect on the perceived age than spectral features, 3) the individuality of a singer is influenced more heavily by segmental features than by prosodic features, and 4) the proposed voice timbre control method makes it possible to change the singer's perceived age without adversely affecting the perceived individuality.
Year: 2014
DOI: 10.1587/transinf.E97.D.1419
Venue: IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS
Keywords: singing voice, voice conversion, perceived age, spectral and prosodic features, subjective evaluations
Field: Prosody, Voice analysis, Computer science, Speech recognition, Singing, Perception, Timbre, Mixture model
DocType: Journal
Volume: E97D
Issue: 6
ISSN: 1745-1361
Citations: 7
PageRank: 0.51
References: 13
Authors: 8
Name | Order | Citations | PageRank
Kazuhiro Kobayashi | 1 | 66 | 9.91
Tomoki Toda | 2 | 1874 | 167.18
Hironori Doi | 3 | 45 | 3.34
Tomoyasu Nakano | 4 | 115 | 16.84
Masataka Goto | 5 | 2258 | 213.22
Graham Neubig | 6 | 989 | 130.31
Sakriani Sakti | 7 | 257 | 65.02
Satoshi Nakamura | 8 | 1099 | 194.59