Title
Convolutional maxout neural networks for speech separation.
Abstract
Speech separation based on deep neural networks (DNNs) has been widely studied recently and has achieved considerable success. However, previous studies are mostly based on fully-connected neural networks. To capture the local information of speech signals, we propose using convolutional maxout neural networks (CMNNs) to separate speech from noise by estimating the ideal ratio mask of the time-frequency units. The proposed CMNN is applied in the frequency domain. Through local filtering and max-pooling, convolutional neural networks can model the local structure of speech signals. Maxout is used instead of the sigmoid function to address the saturation problem. In addition, dropout is integrated into the network to improve generalization. The proposed system outperforms a traditional DNN-based system in both objective speech quality and intelligibility.
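The building blocks named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The mask formula is the commonly used ideal ratio mask definition with an assumed exponent `beta = 0.5`. The filter values, the pooling size, and the function names `ideal_ratio_mask`, `conv1d_valid`, `maxout`, and `max_pool` are hypothetical choices made for illustration only. Dropout and training are omitted.

```python
def ideal_ratio_mask(speech_energy, noise_energy, beta=0.5):
    """Ideal ratio mask for one time-frequency unit: (S / (S + N)) ** beta.

    beta = 0.5 is a common choice and an assumption here, not a value
    taken from the paper.
    """
    total = speech_energy + noise_energy
    if total == 0.0:
        return 0.0  # silent unit; this convention is an assumption
    return (speech_energy / total) ** beta


def conv1d_valid(x, kernel, bias=0.0):
    """Local filtering: slide a kernel along the frequency axis, no padding."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k)) + bias
            for i in range(len(x) - k + 1)]


def maxout(feature_maps):
    """Maxout: element-wise maximum over a group of linear feature maps."""
    return [max(values) for values in zip(*feature_maps)]


def max_pool(x, size):
    """Non-overlapping max-pooling; a trailing remainder is dropped."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, size)]


# Toy spectrum over five frequency bins (made-up numbers).
spectrum = [1.0, 3.0, 2.0, 5.0, 4.0]

# Two hypothetical filters forming one maxout group.
maps = [conv1d_valid(spectrum, [1.0, -1.0]),
        conv1d_valid(spectrum, [-1.0, 1.0])]

hidden = maxout(maps)          # [2.0, 1.0, 3.0, 1.0]
pooled = max_pool(hidden, 2)   # [2.0, 3.0]

# Training target for a unit with speech energy 1.0 and noise energy 3.0.
target = ideal_ratio_mask(1.0, 3.0)  # 0.5
```

Because maxout takes the maximum of several linear responses, it has no flat saturating region the way a sigmoid does, which is the property the abstract refers to.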
Year: 2015
DOI: 10.1109/ISSPIT.2015.7394335
Venue: ISSPIT
Keywords: convolutional maxout neural network, speech separation, deep neural network, local information capture, time-frequency unit, local filtering, max-pooling, objective speech quality, objective speech intelligibility
Field: Speech processing, Pattern recognition, Computer science, Voice activity detection, Convolutional neural network, Speech recognition, Time delay neural network, Artificial intelligence, Deep learning, Artificial neural network, Acoustic model, Intelligibility (communication)
DocType: Conference
Citations: 3
PageRank: 0.47
References: 13
Authors (6)
Name              Order  Citations  PageRank
Like Hui          1      8          2.92
Meng Cai          2      68         8.24
Cong Guo          3      3          0.47
Liang He          4      67         17.35
Wei-Qiang Zhang   5      136        31.22
Jia Liu           6      277        50.34