Dynamic 3-D visualization of vocal tract shaping during speech. - Citegraph

Paper Info

Title
Dynamic 3-D visualization of vocal tract shaping during speech.

Abstract
Noninvasive imaging is widely used in speech research as a means to investigate the shaping and dynamics of the vocal tract during speech production. 3-D dynamic MRI would be a major advance, as it would provide 3-D dynamic visualization of the entire vocal tract. We present a novel method for the creation of 3-D dynamic movies of vocal tract shaping based on the acquisition of 2-D dynamic data from parallel slices and temporal alignment of the image sequences using audio information. Multiple sagittal 2-D real-time movies with synchronized audio recordings are acquired for English vowel-consonant-vowel stimuli /ala/, /aɹa/, /asa/, and /aʃa/. Audio data are aligned using mel-frequency cepstral coefficients (MFCC) extracted from windowed intervals of the speech signal. Sagittal image sequences acquired from all slices are then aligned using dynamic time warping (DTW). The aligned image sequences enable dynamic 3-D visualization by creating synthesized movies of the moving airway in the coronal planes, visualizing desired tissue surfaces and tube-shaped vocal tract airway after manual segmentation of targeted articulators and smoothing. The resulting volumes allow for dynamic 3-D visualization of salient aspects of lingual articulation, including the formation of tongue grooves and sublingual cavities, with a temporal resolution of 78 ms.
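The alignment step described in the abstract can be illustrated with a minimal dynamic time warping sketch. This is not the authors' implementation: the function names `dtw_path` and `warp_to_reference` are hypothetical, MFCC extraction is omitted, and the toy 1-D feature vectors below stand in for per-window MFCC vectors. It assumes a plain Euclidean frame distance and the standard three-step (diagonal, vertical, horizontal) recursion.

```python
import math


def dtw_path(ref, other):
    """Align two sequences of feature vectors; return (total_cost, path).

    path is a list of (ref_index, other_index) pairs from start to end.
    """
    n, m = len(ref), len(other)
    inf = float("inf")
    # cost[i][j]: minimal accumulated distance aligning ref[:i] with other[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(ref[i - 1], other[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])

    # Backtrack from the end, always moving to the cheapest predecessor.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


def warp_to_reference(path, n_ref):
    """For each reference frame, pick one matched frame index in the other sequence."""
    mapping = [None] * n_ref
    for i, j in path:
        mapping[i] = j  # if several frames match reference frame i, the last one wins
    return mapping


# Toy stand-ins for MFCC vectors from two slices' audio tracks (assumed data).
ref_feats = [(0.0,), (1.0,), (2.0,), (3.0,)]
other_feats = [(0.0,), (0.0,), (1.0,), (2.0,), (3.0,)]  # same utterance, slower onset

total, path = dtw_path(ref_feats, other_feats)
frame_map = warp_to_reference(path, len(ref_feats))
```

In a pipeline like the one the abstract outlines, `frame_map` would be used to pick, for each frame of a reference slice, the image frame of another slice recorded at the matching point in the utterance, so that frames from all parallel slices can be stacked into one volume per time step.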

Year	2013
DOI	10.1109/TMI.2012.2230017
Venue	IEEE Trans. Med. Imaging
Keywords	biomechanics,3d dynamic movies,audio information,3d dynamic visualization,2d dynamic data,3d dynamic mri,image sequence temporal alignment,vocal tract shaping,articulation,dtw,english vowel-consonant-vowel stimuli,biomedical mri,real-time magnetic resonance imaging (mri),dynamic time warping,parallel slices,image sequences,sagittal 2d real time movies,noninvasive imaging,mfcc,retrospective gating,speech,medical image processing,tube shaped vocal tract airway,speech production,mel frequency cepstral coefficients,magnetic resonance imaging,real time systems,image reconstruction,mel frequency cepstral coefficient
Field	Computer vision,Mel-frequency cepstrum,Speech Production Measurement,Dynamic time warping,Visualization,Segmentation,Computer science,Speech recognition,Artificial intelligence,Dynamic contrast-enhanced MRI,Speech production,Vocal tract
DocType	Journal
Volume	32
Issue	5
ISSN	1558-254X
Citations	1
PageRank	0.40
References	10
Authors	5

Authors (5 rows)

Name	Order	Citations	PageRank
Yinghua Zhu	1	25	1.43
Yoon-Chul Kim	2	25	3.87
Michael I. Proctor	3	39	4.63
Shrikanth Narayanan	4	5558	439.23
Krishna S. Nayak	5	27	6.60
