Title
TaL: A Synchronised Multi-Speaker Corpus of Ultrasound Tongue Imaging, Audio, and Lip Videos
Abstract
We present the Tongue and Lips corpus (TaL), a multi-speaker corpus of audio, ultrasound tongue imaging, and lip videos. TaL consists of two parts: TaL1 is a set of six recording sessions of one professional voice talent, a male native speaker of English; TaL80 is a set of recording sessions of 81 native speakers of English without voice talent experience. Overall, the corpus contains 24 hours of parallel ultrasound, video, and audio data, of which approximately 13.5 hours are speech. This paper describes the corpus and presents benchmark results for the tasks of speech recognition, speech synthesis (articulatory-to-acoustic mapping), and automatic synchronisation of ultrasound to audio. The TaL corpus is publicly available under the CC BY-NC 4.0 license.
Year
2021
DOI
10.1109/SLT48900.2021.9383619
Venue
2021 IEEE Spoken Language Technology Workshop (SLT)
Keywords
Ultrasound tongue imaging, video lip imaging, silent speech, articulography, corpora
DocType
Conference
ISSN
2639-5479
Citations
0
PageRank
0.34
References
0
Authors
7
Name | Order | Citations | PageRank
Manuel Sam Ribeiro | 1 | 7 | 4.32
Jennifer Sanger | 2 | 0 | 0.34
Jing-Xuan Zhang | 3 | 13 | 3.92
Aciel Eshky | 4 | 0 | 0.34
Alan Wrench | 5 | 0 | 0.34
Korin Richmond | 6 | 531 | 46.14
Steve Renals | 7 | 2570 | 293.02