Acoustic Model Training With Detecting Transcription Errors In The Training Data

Paper Info

Title
Acoustic Model Training With Detecting Transcription Errors In The Training Data

Abstract
As the target of Automatic Speech Recognition (ASR) has moved from clean read speech to spontaneous conversational speech, we need to prepare orthographic transcripts of spontaneous conversational speech to train acoustic models (AMs). However, it is expensive and slow to manually transcribe such speech word by word. We propose a framework to train an AM based on easy-to-make rough transcripts in which fillers and small word fragments are not precisely transcribed and some transcription errors are included. By focusing on the phone duration in the result of forced alignment between the rough transcripts and the utterances, we can automatically detect the erroneous parts in the rough transcripts. A preliminary experiment showed that we can detect the erroneous parts with moderately high recall and precision. Through ASR experiments with conversational telephone speech, we confirmed that automatic detection helped improve the performance of the AM trained with both conventional ML criteria and state-of-the-art boosted MMI criteria.
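The abstract says that erroneous parts are found from the phone durations produced by forced alignment, but it does not give the actual detection rule. The following is only a minimal sketch of that idea, under assumptions that are not from the paper: the alignment is available as per-phone start and end frames (10 ms frames assumed), and a word is treated as suspect if any of its phones is squeezed to a very short duration or stretched to a very long one. The names `AlignedPhone` and `flag_suspect_words` and the thresholds `min_frames` and `max_frames` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class AlignedPhone:
    """One phone from a forced alignment (hypothetical record layout)."""
    word_index: int   # position of the parent word in the rough transcript
    phone: str        # phone label
    start_frame: int  # first frame of the phone (10 ms frames assumed)
    end_frame: int    # frame just past the end of the phone

    @property
    def duration(self) -> int:
        return self.end_frame - self.start_frame


def flag_suspect_words(alignment: Iterable[AlignedPhone],
                       min_frames: int = 3,
                       max_frames: int = 50) -> List[int]:
    """Return sorted indices of words containing an implausibly short or long phone.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    suspects = set()
    for p in alignment:
        if p.duration <= min_frames or p.duration >= max_frames:
            suspects.add(p.word_index)
    return sorted(suspects)


# Toy alignment: word 1 is squeezed to minimum-length phones and
# word 2 contains one phone stretched over 80 frames.
alignment = [
    AlignedPhone(0, "hh", 0, 8),
    AlignedPhone(0, "ay", 8, 25),
    AlignedPhone(1, "dh", 25, 28),
    AlignedPhone(1, "ah", 28, 31),
    AlignedPhone(2, "k", 31, 40),
    AlignedPhone(2, "ae", 40, 120),
    AlignedPhone(2, "t", 120, 127),
]
suspect_words = flag_suspect_words(alignment)
```

In a setup like this, the flagged words could be excluded from the statistics used to train the AM. Whether the paper excludes, down-weights, or otherwise handles the detected parts is not stated in the abstract.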

Year	Venue	Keywords
2011	12th Annual Conference of the International Speech Communication Association 2011 (INTERSPEECH 2011), Vols 1-5	Acoustic model, Rough transcripts, Lightly supervised training
Field	DocType	Citations
Training set, Pattern recognition, Computer science, Speech recognition, Artificial intelligence, Acoustic model	Conference	1
PageRank	References	Authors
0.38	8	3

Authors (3 rows)

Name	Order	Citations	PageRank
Gakuto Kurata	1	107	19.06
Nobuyasu Itoh	2	65	13.19
Masafumi Nishimura	3	112	22.77
