Title
Acoustic Model Training With Detecting Transcription Errors In The Training Data
Abstract
As the target of Automatic Speech Recognition (ASR) has moved from clean read speech to spontaneous conversational speech, we need to prepare orthographic transcripts of spontaneous conversational speech to train acoustic models (AMs). However, it is expensive and slow to manually transcribe such speech word by word. We propose a framework to train an AM based on easy-to-make rough transcripts in which fillers and small word fragments are not precisely transcribed and some transcription errors are included. By focusing on the phone duration in the result of forced alignment between the rough transcripts and the utterances, we can automatically detect the erroneous parts in the rough transcripts. A preliminary experiment showed that we can detect the erroneous parts with moderately high recall and precision. Through ASR experiments with conversational telephone speech, we confirmed that automatic detection helped improve the performance of the AM trained with both conventional ML criteria and state-of-the-art boosted MMI criteria.
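The detection idea in the abstract, flagging parts of a rough transcript whose forced-alignment phone durations look implausible, can be illustrated with a minimal sketch. The abstract does not state the actual decision rule, so everything below is an assumption made for illustration: the `AlignedPhone` record, the function names, the 10 ms frame unit, and the `min_frames` / `max_frames` thresholds are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class AlignedPhone:
    """One phone segment from a forced alignment (hypothetical record layout)."""
    phone: str        # phone label
    word_index: int   # index of the transcript word this phone belongs to
    start_frame: int  # first frame of the segment (assumed 10 ms frames)
    end_frame: int    # frame just past the end of the segment

    @property
    def frames(self) -> int:
        return self.end_frame - self.start_frame


def flag_suspect_words(
    alignment: List[AlignedPhone],
    min_frames: int = 3,   # assumed threshold, not from the paper
    max_frames: int = 50,  # assumed threshold, not from the paper
) -> Set[int]:
    """Return indices of words that contain a phone of implausible duration.

    A phone squeezed to the minimum length, or stretched far beyond a typical
    length, suggests the transcript does not match the audio at that point.
    """
    suspects: Set[int] = set()
    for seg in alignment:
        if seg.frames <= min_frames or seg.frames >= max_frames:
            suspects.add(seg.word_index)
    return suspects


def keep_reliable_words(words: List[str], alignment: List[AlignedPhone]) -> List[str]:
    """Drop the words flagged as suspect, keeping the rest for AM training."""
    suspects = flag_suspect_words(alignment)
    return [w for i, w in enumerate(words) if i not in suspects]


# Toy example: "uh" is squeezed to 3 frames and "right" has a 65-frame phone.
words = ["so", "uh", "yeah", "right"]
alignment = [
    AlignedPhone("s", 0, 0, 8),
    AlignedPhone("ow", 0, 8, 20),
    AlignedPhone("ah", 1, 20, 23),
    AlignedPhone("y", 2, 23, 30),
    AlignedPhone("ae", 2, 30, 42),
    AlignedPhone("r", 3, 42, 50),
    AlignedPhone("ay", 3, 50, 115),
    AlignedPhone("t", 3, 115, 121),
]
reliable = keep_reliable_words(words, alignment)
```

In this sketch a flagged word is simply dropped. Whether the authors exclude, down-weight, or re-transcribe the detected parts is not stated in the abstract.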
Year
2011
Venue
12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5
Keywords
Acoustic model, Rough transcripts, Lightly supervised training
Field
Training set, Pattern recognition, Computer science, Speech recognition, Artificial intelligence, Acoustic model
DocType
Conference
Citations
1
PageRank
0.38
References
8
Authors
3
Name, Order, Citations, PageRank
Gakuto Kurata, 1, 107, 19.06
Nobuyasu Itoh, 2, 65, 13.19
Masafumi Nishimura, 3, 112, 22.77