Title
A Computation-Efficient Neural Network for VAD using Multi-Channel Feature
Abstract
Voice activity detection serves as an essential pre-processor in modern speech processing systems. It classifies audio segments into speech and nonspeech. Many state-of-the-art methods have been proposed to increase the detection accuracy. However, there are still significant limitations to retaining high performance while keeping low computation complexity, especially in handling unseen noises. This paper proposes a computation-efficient neural network using a multi-channel audio feature. The audio feature is contextual-aware with positional information and is represented in a three-channel way, similar to RGB pictures, which enables convolutional kernels to capture more information simultaneously. Meanwhile, we introduce channel attention inverted blocks to build a computation-efficient neural network. Our proposed method shows superior performance with extremely few floating point operations as compared with baseline methods.
Year
2022
Venue
2022 30th European Signal Processing Conference (EUSIPCO)
Keywords
voice activity detection, channel attention, computation-efficient, deep neural network
DocType
Conference
ISSN
2219-5491
ISBN
978-1-6654-6799-5
Citations
0
PageRank
0.34
References
11
Authors
3
Name, Order, Citations, PageRank
Runze Wang, 1, 0, 0.34
Iman Moazzen, 2, 0, 0.34
Wei-Ping Zhu, 3, 0, 1.01