A Computation-Efficient Neural Network for VAD using Multi-Channel Feature - Citegraph

Paper Info

Title
A Computation-Efficient Neural Network for VAD using Multi-Channel Feature

Abstract
Voice activity detection serves as an essential pre-processor in modern speech processing systems. It classifies audio segments into speech and nonspeech. Many state-of-the-art methods have been proposed to increase the detection accuracy. However, there are still significant limitations to retaining high performance while keeping low computation complexity, especially in handling unseen noises. This paper proposes a computation-efficient neural network using a multi-channel audio feature. The audio feature is contextual-aware with positional information and is represented in a three-channel way, similar to RGB pictures, which enables convolutional kernels to capture more information simultaneously. Meanwhile, we introduce channel attention inverted blocks to build a computation-efficient neural network. Our proposed method shows superior performance with extremely few floating point operations as compared with baseline methods.

Year	Venue	Keywords
2022	2022 30th European Signal Processing Conference (EUSIPCO)	voice activity detection, channel attention, computation-efficient, deep neural network
DocType	ISSN	ISBN
Conference	2219-5491	978-1-6654-6799-5
Citations	PageRank	References
0	0.34	11
Authors
3

Authors (3 rows)

Name	Order	Citations	PageRank
Runze Wang	1	0	0.34
Iman Moazzen	2	0	0.34
Wei-Ping Zhu	3	0	1.01
