ESPnet-SE: End-to-End Speech Enhancement and Separation Toolkit Designed for ASR Integration - Citegraph

Paper Info

Title
ESPnet-SE: End-to-End Speech Enhancement and Separation Toolkit Designed for ASR Integration

Abstract
We present ESPnet-SE, a toolkit designed for the quick development of speech enhancement and speech separation systems in a single framework, along with an optional downstream speech recognition module. ESPnet-SE is a new project that integrates a rich set of automatic speech recognition related models, resources, and systems to support and validate the proposed front-end implementation (i.e., speech enhancement and separation). It can process both single-channel and multi-channel data, with functionalities including dereverberation, denoising, and source separation. We provide all-in-one recipes covering data pre-processing, feature extraction, training, and evaluation pipelines for a wide range of benchmark datasets. This paper describes the design of the toolkit and several important functionalities, especially the speech recognition integration, which differentiates ESPnet-SE from other open-source toolkits, and reports experimental results on major benchmark datasets.

Year	2021
DOI	10.1109/SLT48900.2021.9383615
Venue	2021 IEEE Spoken Language Technology Workshop (SLT)
Keywords	Open-source, end-to-end, speech enhancement, source separation, speech recognition
DocType	Conference
ISSN	2639-5479
Citations	0
PageRank	0.34
References	0
Authors	11

Authors (11 rows)

Name	Order	Citations	PageRank
Chenda Li	1	4	3.83
Jing Shi	2	5	5.80
Wangyou Zhang	3	12	5.44
Aswin Shanmugam Subramanian	4	7	4.21
Xuankai Chang	5	0	0.68
Naoyuki Kamo	6	0	0.68
Moto Hira	7	0	0.34
Tomoki Hayashi	8	96	18.49
Christoph Boeddeker	9	3	3.84
Zhuo Chen	10	153	24.33
Shinji Watanabe	11	1158	139.38
