Dual-Path RNN for Long Recording Speech Separation - Citegraph

Paper Info

Title
Dual-Path RNN for Long Recording Speech Separation

Abstract
Continuous speech separation (CSS) is an emerging task in speech separation that aims to separate overlap-free targets from a long, partially overlapped recording. A straightforward extension of previously proposed sentence-level separation models to this task is to segment the long recording into fixed-length blocks and perform separation on them independently. However, such a simple extension does not fully address the cross-block dependencies, and the separation performance may not be satisfactory. In this paper, we focus on how the block-level separation performance can be improved by exploring methods to utilize the cross-block information. Based on the recently proposed dual-path RNN (DPRNN) architecture, we investigate how DPRNN can help the block-level separation through its interleaved intra- and inter-block modules. Experimental results show that DPRNN is able to significantly outperform the baseline block-level model in both offline and block-online configurations under certain settings.
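
The baseline described in the abstract, which cuts the long recording into fixed-length blocks and processes each block independently, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the function names, the block length and hop values, the overlap-add averaging used to stitch blocks back together, and the placeholder `separator`, which stands in for a real separation model and returns a single output stream.

```python
from typing import Callable, List, Sequence


def split_into_blocks(signal: Sequence[float], block_len: int, hop: int) -> List[List[float]]:
    """Cut a long signal into fixed-length blocks, zero-padding the last one."""
    if block_len <= 0 or hop <= 0 or hop > block_len:
        raise ValueError("need 0 < hop <= block_len")
    blocks = []
    start = 0
    while True:
        block = list(signal[start:start + block_len])
        block += [0.0] * (block_len - len(block))  # zero-pad the tail block
        blocks.append(block)
        if start + block_len >= len(signal):
            break
        start += hop
    return blocks


def overlap_add(blocks: List[List[float]], hop: int, out_len: int) -> List[float]:
    """Stitch blocks back together, averaging samples where blocks overlap."""
    block_len = len(blocks[0])
    total = (len(blocks) - 1) * hop + block_len
    acc = [0.0] * total
    weight = [0] * total
    for i, block in enumerate(blocks):
        offset = i * hop
        for j, value in enumerate(block):
            acc[offset + j] += value
            weight[offset + j] += 1
    # hop <= block_len guarantees every position is covered at least once
    return [a / w for a, w in zip(acc, weight)][:out_len]


def separate_blockwise(
    signal: Sequence[float],
    separator: Callable[[List[float]], List[float]],
    block_len: int,
    hop: int,
) -> List[float]:
    """Apply `separator` to each block independently (no cross-block context)."""
    blocks = split_into_blocks(signal, block_len, hop)
    processed = [separator(block) for block in blocks]
    return overlap_add(processed, hop, len(signal))


if __name__ == "__main__":
    recording = [float(i) for i in range(10)]
    # Identity placeholder: a real model would return separated speech here.
    output = separate_blockwise(recording, lambda block: block, block_len=4, hop=2)
    print(output)
```

Because each call to `separator` sees only its own block, nothing in this sketch carries information across block boundaries, which is the limitation the abstract says the inter-block modules of DPRNN are meant to address.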

Year	DOI	Venue
2021	10.1109/SLT48900.2021.9383514	2021 IEEE Spoken Language Technology Workshop (SLT)
Keywords	DocType	ISSN
Continuous speech separation, long recording speech separation, dual-path RNN	Conference	2639-5479
Citations	PageRank	References
0	0.34	0
Authors
12

Authors (12 rows)

Name	Order	Citations	PageRank
Chenda Li	1	4	3.83
Yi Luo	2	120	13.05
Cong Han	3	7	4.56
Jinyu Li	4	0	0.34
Takuya Yoshioka	5	585	49.20
Tianyan Zhou	6	12	4.79
Marc Delcroix	7	699	62.07
Keisuke Kinoshita	8	494	54.81
Christoph Boeddeker	9	3	3.84
Yanmin Qian	10	295	44.44
Shinji Watanabe	11	1158	139.38
Zhuo Chen	12	153	24.33
