Title
Phonetic Segmentation Using Knowledge from Visual and Perceptual Domain.
Abstract
Accurate, automatic phonetic segmentation is crucial for several speech-based applications, such as phone-level articulation analysis and error detection, speech synthesis, annotation, speech recognition and emotion recognition. In this paper we examine the effectiveness of visual features, obtained by processing the spectrogram image of a speech utterance, for phonetic segmentation. Further, we propose a mechanism to combine knowledge from the visual and perceptual domains for automatic phonetic segmentation. This process can be considered analogous to manual phonetic segmentation. The technique was evaluated on the TIMIT American English corpus. Experimental results show significant improvements in phonetic segmentation, especially at the lower tolerances of 5, 10 and 15 ms, with an absolute improvement of 8.29% observed on TIMIT at a 10 ms tolerance.
Year
2017
DOI
10.1007/978-3-319-64206-2_44
Venue
Lecture Notes in Artificial Intelligence
Keywords
Unsupervised phonetic segmentation, Edge detection, Multi-taper, Visual phonetic segmentation
Field
TIMIT, Speech synthesis, Segmentation, Spectrogram, Edge detection, Computer science, Utterance, Speech recognition, Phone, American English, Natural language processing, Artificial intelligence
DocType
Conference
Volume
10415
ISSN
0302-9743
Citations
0
PageRank
0.34
References
11
Authors
3
Name, Order, Citations, PageRank
Bhavik B. Vachhani, 1, 22, 4.69
Chitralekha Bhat, 2, 2, 3.13
Sunil Kumar Kopparapu, 3, 42, 25.18