Non-native pronunciation variation modeling using an indirect data driven method - Citegraph

Paper Info

Title
Non-native pronunciation variation modeling using an indirect data driven method

Abstract
In this paper, we propose a pronunciation variation modeling method that improves the performance of a non-native automatic speech recognition (ASR) system without degrading the performance of a native ASR system. The proposed method is based on an indirect data-driven approach, in which pronunciation variability is investigated from the training speech data, and variant rules are then derived and applied to the ASR pronunciation dictionary to compensate for that variability. To this end, native utterances are first recognized using a phoneme recognizer, and the variant phoneme patterns of native speech are obtained by aligning the recognized and reference phonetic sequences. The reference sequences are transcribed using each of three methods: canonical, knowledge-based, and hand-labeled. Similarly, the variant phoneme patterns of non-native speech are obtained by recognizing non-native utterances and comparing the recognized phoneme sequences with the reference phonetic transcriptions. Finally, variant rules are derived from the native and non-native variant phoneme patterns using decision trees and applied to adapt the dictionary for non-native and native ASR systems. In this paper, Korean spoken by native Chinese speakers is considered as the non-native speech. Non-native ASR experiments show that an ASR system using the dictionary constructed by the proposed method reduces the average word error rate (WER) by a relative 18.5% compared to the baseline ASR system using a canonically transcribed dictionary. In addition, the WER of a native ASR system using the proposed dictionary is reduced by a relative 1.1% compared to the baseline native ASR system with a canonically constructed dictionary.
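The alignment step described in the abstract, comparing a recognized phoneme sequence against a reference transcription to collect variant phoneme patterns, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a plain unit-cost Levenshtein alignment, and the function names `align` and `variant_patterns` and the phoneme symbols in the example are hypothetical. The decision-tree rule derivation that follows in the paper is not shown.

```python
from collections import Counter


def align(ref, hyp):
    """Align two phoneme lists with unit-cost edit distance (an assumption).

    Returns a list of (ref_phoneme, hyp_phoneme) pairs, where None marks a gap:
    (p, None) is a deletion and (None, p) is an insertion.
    """
    n, m = len(ref), len(hyp)
    # cost[i][j] = edit distance between ref[:i] and hyp[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(diag, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Trace back from the bottom-right corner to recover the aligned pairs.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and cost[i][j] == cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])):
            pairs.append((ref[i - 1], hyp[j - 1]))  # match or substitution
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            pairs.append((ref[i - 1], None))  # reference phoneme deleted
            i -= 1
        else:
            pairs.append((None, hyp[j - 1]))  # extra phoneme inserted
            j -= 1
    pairs.reverse()
    return pairs


def variant_patterns(pairs):
    """Count the aligned pairs where the recognized phoneme differs from the reference."""
    return Counter((r, h) for r, h in pairs if r != h)


# Hypothetical example: the reference /m/ is recognized as /n/.
reference = ["k", "a", "m", "s", "a"]
recognized = ["k", "a", "n", "s", "a"]
patterns = variant_patterns(align(reference, recognized))
```

In a pipeline like the one the abstract outlines, counts of this kind would be accumulated over many utterances before any rules are derived from them; how the paper weights or filters the patterns is not stated in this record.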

Year	DOI	Venue
2007	10.1109/ASRU.2007.4430114	ASRU
Keywords	Field	DocType
nonnative pronunciation variation modeling,reference phonetic sequence,indirect data driven method,asr pronunciation dictionary,non-native speech recognition,natural languages,word error rate,pronunciation variation,learning (artificial intelligence),vocabulary,dictionaries,native speech phoneme pattern,chinese native speakers,speaker recognition,indirect data-driven approach,dictionary adaptation,phoneme recognizer,automatic speech recognition system,speech recognition,decision trees,decision tree,native speaker,knowledge base,automatic speech recognition,learning artificial intelligence	Pronunciation,Transcription (linguistics),Decision tree,Data-driven,Computer science,Word error rate,Speech recognition,Natural language,Speaker recognition,Natural language processing,Artificial intelligence,Vocabulary	Conference
ISBN	Citations	PageRank
978-1-4244-1746-9	8	0.67
References	Authors
7	3

Authors (3 rows)

Name	Order	Citations	PageRank
Mina Kim	1	8	0.67
Yoo Rhee Oh	2	27	3.41
Hong Kook Kim	3	258	51.67
