Title
Personalizing TTS Voices for Progressive Dysarthria
Abstract
Patients with amyotrophic lateral sclerosis (ALS) experience progressive speech deterioration due to muscle paralysis, eventually losing the ability to communicate verbally. Text-to-speech synthesis (TTS) is an important technology for speech-generating devices: it lets users communicate with generic electronic voices, but usually without the users' own vocal identity. Our work aims to personalize TTS voices for people with ALS-induced dysarthria by integrating the machine learning and speech processing techniques of voice conversion (VC) and TTS. This is challenging because only small quantities of dysarthric speech are available from individual patients. Our system includes timbre and prosody conversion for VC, neural TTS to generate the source speech, and a neural feature converter to interface VC and TTS. We collected speech data from 4 ALS target speakers with mild to severe dysarthria. Subjective listening tests showed that, on average, our approach improved speech intelligibility by about 72% over the target speakers’ speech, the converted voice was 2 to 3 times more similar to the ALS targets than to the TTS sources, and the converted speech quality was in the fair-to-good range of the MOS scale.
Year
2021
DOI
10.1109/BHI50953.2021.9508522
Venue
2021 IEEE EMBS International Conference on Biomedical and Health Informatics (BHI)
Keywords
voice conversion, feature conversion, neural TTS, dysarthria, amyotrophic lateral sclerosis
DocType
Conference
ISSN
2641-3590
ISBN
978-1-6654-4770-6
Citations
0
PageRank
0.34
References
0
Authors
4
Name                     Order  Citations  PageRank
Yunxin Zhao              1      807        121.74
Minguang Song            2      0          2.37
Yanghao Yue              3      0          0.34
Mili Kuruvilla-Dugdale   4      0          0.34