Abstract |
---|
Is pushing numbers on a single benchmark valuable in automatic speech recognition? Research results in acoustic modeling are typically evaluated based on performance on a single dataset. While the research community has coalesced around various benchmarks, we set out to understand generalization performance in acoustic modeling across datasets -- in particular, whether models trained on a single dataset transfer to other (possibly out-of-domain) datasets. Further, we demonstrate that when a large enough set of benchmarks is used, average word error rate (WER) performance over them provides a good proxy for performance on real-world data. Finally, we show that training a single acoustic model on the most widely-used datasets -- combined -- reaches competitive performance on both research and real-world benchmarks. |
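The abstract's claim rests on averaging WER over several benchmarks. As a minimal sketch (not the paper's code; the benchmark names and transcripts below are made up), WER can be computed as word-level edit distance normalized by reference length, then averaged over the pooled utterances of all benchmarks:

```python
def wer(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + insertions + deletions) / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming edit distance over words, one row at a time.
    prev_row = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        row = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            row.append(min(prev_row[j] + 1,          # deletion
                           row[j - 1] + 1,           # insertion
                           prev_row[j - 1] + cost))  # substitution / match
        prev_row = row
    return prev_row[-1] / max(len(ref), 1)

# Hypothetical (reference, hypothesis) pairs per benchmark, for illustration only.
benchmarks = {
    "bench_a": [("the cat sat", "the cat sat")],
    "bench_b": [("hello world", "hello word")],
}
avg = sum(wer(ref, hyp)
          for pairs in benchmarks.values()
          for ref, hyp in pairs) / sum(len(p) for p in benchmarks.values())
```

Averaging over pooled utterances weights each benchmark by its size; averaging per-benchmark WERs instead would weight all benchmarks equally, which is a design choice the sketch leaves open.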
Year | DOI | Venue
---|---|---
2021 | 10.21437/Interspeech.2021-1758 | Interspeech

DocType | Citations | PageRank
---|---|---
Conference | 2 | 0.38

References | Authors
---|---
0 | 8
Name | Order | Citations | PageRank
---|---|---|---
Tatiana Likhomanenko | 1 | 24 | 5.47 |
Qiantong Xu | 2 | 34 | 7.42 |
Vineel Pratap | 3 | 16 | 2.69 |
Paden Tomasello | 4 | 3 | 1.42 |
Jacob Kahn | 5 | 20 | 2.38 |
Gilad Avidov | 6 | 2 | 0.38 |
Ronan Collobert | 7 | 4002 | 308.61 |
Gabriel Synnaeve | 8 | 27 | 7.73 |