Abstract |
---|
Conversational speech recognition has served as a flagship speech recognition task since the release of the Switchboard corpus in the 1990s. In this paper, we measure a human error rate on the widely used NIST 2000 test set for commercial bulk transcription. The error rate of professional transcribers is 5.9% for the Switchboard portion of the data, in which newly acquainted pairs of people discus... |
Year | DOI | Venue |
---|---|---|
2017 | 10.1109/TASLP.2017.2756440 | IEEE/ACM Transactions on Audio, Speech, and Language Processing |
Keywords | Field | DocType |
Speech recognition, Error analysis, Spatial analysis, Recurrent neural networks, NIST, Acoustics | Transcription (linguistics), Computer science, Word error rate, Recurrent neural network, Speech recognition, Smoothing, Discriminative model, Language model, Acoustic model, Test set | Journal |
Volume | Issue | ISSN |
25 | 12 | 2329-9290 |
Citations | PageRank | References |
11 | 0.56 | 41 |
Authors |
---|
8 |
Name | Order | Citations | PageRank |
---|---|---|---|
Wayne Xiong | 1 | 22 | 1.86 |
Jasha Droppo | 2 | 861 | 68.35 |
Xuedong Huang | 3 | 1390 | 283.19 |
Frank Seide | 4 | 1489 | 101.15 |
Michael L. Seltzer | 5 | 1027 | 69.42 |
Andreas Stolcke | 6 | 6690 | 712.46 |
Dong Yu | 7 | 6264 | 475.73 |
Geoffrey Zweig | 8 | 3406 | 320.25 |