Abstract |
---|
In this paper, we propose a novel cross-parallel transformer neural network (CPTNN) for end-to-end speech enhancement in the time domain. The proposed structure comprises an encoder, a cross-parallel transformer module (CPTM), a masking module and a decoder. The encoder first maps the input waveform of noisy speech into feature representations. The CPTM consists of four residually connected cross-parallel transformer blocks, each utilizing local and global transformers to simultaneously extract local and global features, which are then fused by a cross-attention-based transformer to obtain a better contextual feature representation. The masking module generates a mask that is multiplied with the encoder output, producing the masked encoder features, which the decoder finally uses to reconstruct the enhanced speech. Experiments on a benchmark dataset show that our CPTNN outperforms state-of-the-art methods on most evaluation criteria while having the fewest model parameters. |
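The core idea of the fusion step described above — using cross-attention so that features from one branch (local) attend to features from the other branch (global) — can be illustrated with a minimal single-head sketch. This is not the paper's implementation: the function name, feature dimensions, and the omission of learned query/key/value projections and multi-head structure are all simplifying assumptions here.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_fuse(local_feats, global_feats):
    """Single-head cross-attention sketch (hypothetical helper):
    local features act as queries, global features as keys and values.
    Learned W_q/W_k/W_v projections of a real transformer are omitted."""
    d_k = local_feats.shape[-1]
    scores = local_feats @ global_feats.T / np.sqrt(d_k)  # (T_local, T_global)
    weights = softmax(scores, axis=-1)                    # attend over global frames
    return weights @ global_feats                         # (T_local, d) fused output

rng = np.random.default_rng(0)
local_feats = rng.standard_normal((10, 64))   # 10 frames, 64-dim (illustrative sizes)
global_feats = rng.standard_normal((20, 64))  # 20 frames, 64-dim
fused = cross_attention_fuse(local_feats, global_feats)
print(fused.shape)  # (10, 64): one fused vector per local frame
```

Each local frame receives a weighted summary of the entire global sequence, which is one plausible reading of how the cross-attention-based transformer combines the two parallel feature streams into a single contextual representation.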
Year | DOI | Venue |
---|---|---|
2022 | 10.1109/IWAENC53105.2022.9914777 | 2022 International Workshop on Acoustic Signal Enhancement (IWAENC) |
Keywords | DocType | ISBN
---|---|---|
Cross-parallel transformer, local and global information, cross-attention, low model complexity, speech enhancement | Conference | 978-1-6654-6868-8
Citations | PageRank | References
---|---|---|
0 | 0.34 | 9
Authors |
---|
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Kai Wang | 1 | 0 | 1.01 |
Bengbeng He | 2 | 0 | 1.01 |
Wei-Ping Zhu | 3 | 0 | 1.01 |