MRI-Based Vocal Tract Representations for the Three-Dimensional Finite Element Synthesis of Diphthongs - Citegraph

Paper Info

Title
MRI-Based Vocal Tract Representations for the Three-Dimensional Finite Element Synthesis of Diphthongs

Abstract
The synthesis of diphthongs in three dimensions (3D) involves the simulation of acoustic waves propagating through a complex 3D vocal tract geometry that deforms over time. Accurate 3D vocal tract geometries can be extracted from Magnetic Resonance Imaging (MRI), but due to long acquisition times, only static sounds can currently be studied with an adequate spatial resolution. In this work, 3D dynamic vocal tract representations are built to generate diphthongs, based on a set of cross-sections extracted from MRI-based vocal tract geometries of static vowel sounds. A diphthong can then be easily generated by interpolating the location, orientation and shape of these cross-sections, thus avoiding the interpolation of full 3D geometries. Two options are explored to extract the cross-sections. The first one is based on an adaptive grid (AG), which extracts the cross-sections perpendicular to the vocal tract midline, whereas the second one resorts to a semi-polar grid (SPG) strategy, which fixes the cross-section orientations. The finite element method (FEM) has been used to solve the mixed wave equation and synthesize diphthongs [$\alpha i$] and [$\alpha u$] in the dynamic 3D vocal tracts. The outputs from a 1D acoustic model based on the Transfer Matrix Method have also been included for comparison. The results show that the SPG and AG provide very close solutions in 3D, whereas significant differences are observed when using them in 1D. The SPG dynamic vocal tract representation is recommended for 3D simulations because it helps to prevent the collision of adjacent cross-sections.
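
The abstract describes generating a diphthong by interpolating the location, orientation and shape of cross-sections between two static vowels. The sketch below illustrates that idea only. It assumes plain linear interpolation, a point-to-point correspondence between contours, and a hypothetical `CrossSection` container; none of these details are stated in the abstract, and the paper's actual scheme may differ.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, ...]
Point2 = Tuple[float, ...]


@dataclass
class CrossSection:
    """Hypothetical container for one vocal tract cross-section."""
    center: Vec3            # location of the section in 3D space
    normal: Vec3            # unit normal describing its orientation
    contour: List[Point2]   # outline expressed in the section's own plane


def lerp(a, b, t):
    """Component-wise linear interpolation between two tuples."""
    return tuple((1.0 - t) * x + t * y for x, y in zip(a, b))


def normalize(v):
    """Rescale a vector to unit length (assumes it is not the zero vector)."""
    length = sum(x * x for x in v) ** 0.5
    return tuple(x / length for x in v)


def interpolate_section(s0: CrossSection, s1: CrossSection, t: float) -> CrossSection:
    """Blend two matching cross-sections; t=0 gives s0, t=1 gives s1."""
    if len(s0.contour) != len(s1.contour):
        raise ValueError("contours must have the same number of points")
    return CrossSection(
        center=lerp(s0.center, s1.center, t),
        # Assumption: blend the normals linearly, then renormalize.
        normal=normalize(lerp(s0.normal, s1.normal, t)),
        contour=[lerp(p, q, t) for p, q in zip(s0.contour, s1.contour)],
    )


def interpolate_tract(vowel_a: List[CrossSection],
                      vowel_b: List[CrossSection],
                      t: float) -> List[CrossSection]:
    """Interpolate every cross-section pair of two static vowel geometries."""
    return [interpolate_section(a, b, t) for a, b in zip(vowel_a, vowel_b)]
```

Sweeping `t` from 0 to 1 over the duration of the diphthong would yield one intermediate set of cross-sections per time step, from which a 3D geometry could then be rebuilt. With a fixed-orientation grid such as the SPG, the two normals of a pair coincide and the orientation blend leaves them unchanged.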

Year	DOI	Venue
2019	10.1109/TASLP.2019.2942439	IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP)
Keywords	Field	DocType
Three-dimensional displays,Geometry,Solid modeling,Interpolation,Magnetic resonance imaging,Shape,Finite element analysis	Pattern recognition,Computer science,Finite element method,Speech recognition,Artificial intelligence,Diphthong,Vocal tract	Journal
Volume	Issue	ISSN
27	12	2329-9290
Citations	PageRank	References
0	0.34	9
Authors
4

Authors (4 rows)

Name	Order	Citations	PageRank
Marc Arnela	1	7	2.66
Saeed Dabbaghchian	2	2	1.08
Oriol Guasch	3	7	3.67
Olov Engwall	4	197	30.71
