Abstract |
---|
A new approach for text-independent phoneme segmentation at the sampling-point level is proposed in this paper. The algorithm consists of two phases. First, the voiced sections in the speech data are detected using the vocal fold vibration information contained in the electroglottograph (EGG) signal; a Hilbert envelope feature is adopted to achieve sampling-point-level detection accuracy. Second, the voiced sections and the remaining sections are treated separately. Each voiced section is divided into several candidate phonemes using the Viterbi algorithm, and adjacent candidate phonemes are then merged based on Hotelling's T-square test. In the remaining sections, unvoiced consonants are distinguished from silence based on a singularity exponent feature. Comparison experiments show that the proposed method performs better than existing methods across a variety of tolerances and is more robust to noise. |
Year | DOI | Venue |
---|---|---|
2016 | 10.1109/TASLP.2016.2533865 | IEEE/ACM Trans. Audio, Speech & Language Processing |
Keywords | Field | DocType |
Speech, Speech processing, Vibrations, IEEE transactions, Feature extraction, Hidden Markov models, Viterbi algorithm | Electroglottograph, Speech processing, Pattern recognition, Segmentation, Computer science, Singularity, Feature extraction, Speech recognition, Artificial intelligence, Sampling (statistics), Hidden Markov model, Viterbi algorithm | Journal |
Volume | Issue | ISSN |
24 | 6 | 2329-9290 |
Citations | PageRank | References |
3 | 0.39 | 16 |
Authors |
---|
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Lijiang Chen | 1 | 304 | 23.22 |
Xia Mao | 2 | 188 | 21.89 |
Hong Yan | 3 | 3628 | 335.04 |
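
The abstract states that each voiced section is divided into candidate phonemes using the Viterbi algorithm, but this record does not give the paper's state model, features, or probabilities. As a rough illustration only, the sketch below runs a generic log-domain Viterbi decoder on a toy two-state model and reads candidate boundaries off the decoded path. The function names (`viterbi`, `boundaries`), the two-state layout, and every probability value are assumptions made for this example, not details taken from the paper.

```python
import math


def viterbi(log_start, log_trans, log_emit):
    """Return the most likely state sequence for a frame sequence.

    log_start[s]    : log-probability of starting in state s
    log_trans[p][s] : log-probability of moving from state p to state s
    log_emit[t][s]  : log-likelihood of frame t under state s
    """
    n_states = len(log_start)
    score = [log_start[s] + log_emit[0][s] for s in range(n_states)]
    back = []  # back[t-1][s] = best predecessor of state s at frame t
    for t in range(1, len(log_emit)):
        new_score, pointers = [], []
        for s in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda p: score[p] + log_trans[p][s])
            new_score.append(score[best_prev] + log_trans[best_prev][s]
                             + log_emit[t][s])
            pointers.append(best_prev)
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def boundaries(path):
    """Frame indices where the decoded state changes (candidate boundaries)."""
    return [t for t in range(1, len(path)) if path[t] != path[t - 1]]


# Toy model (all values are hypothetical): two states with "sticky"
# transitions, and per-frame likelihoods that favour state 0 for the
# first three frames and state 1 for the last three.
log_start = [math.log(0.5), math.log(0.5)]
log_trans = [[math.log(0.9), math.log(0.1)],
             [math.log(0.1), math.log(0.9)]]
p_state0 = [0.9, 0.8, 0.9, 0.2, 0.1, 0.2]
log_emit = [[math.log(p), math.log(1.0 - p)] for p in p_state0]

path = viterbi(log_start, log_trans, log_emit)
print(path, boundaries(path))
```

The sticky transition matrix penalises frequent state changes, so an isolated frame that weakly favours the other state does not by itself produce a boundary. The merging of adjacent candidates that the abstract attributes to Hotelling's T-square test is a separate step and is not shown here.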