Deep Convolutional Neural Networks for Large-scale Speech Tasks. - Citegraph

Paper Info

Title
Deep Convolutional Neural Networks for Large-scale Speech Tasks.

Abstract
Convolutional Neural Networks (CNNs) are an alternative type of neural network that can be used to reduce spectral variations and model spectral correlations which exist in signals. Since speech signals exhibit both of these properties, we hypothesize that CNNs are a more effective model for speech compared to Deep Neural Networks (DNNs). In this paper, we explore applying CNNs to large vocabulary continuous speech recognition (LVCSR) tasks. First, we determine the appropriate architecture to make CNNs effective compared to DNNs for LVCSR tasks. Specifically, we focus on how many convolutional layers are needed, what an appropriate number of hidden units is, and what the best pooling strategy is. Second, we investigate how to incorporate speaker-adapted features, which cannot directly be modeled by CNNs as they do not obey locality in frequency, into the CNN framework. Third, given the importance of sequence training for speech tasks, we introduce a strategy to use ReLU+dropout during Hessian-free sequence training of CNNs. Experiments on 3 LVCSR tasks indicate that a CNN with the proposed speaker-adapted and ReLU+dropout ideas allows for a 12%–14% relative improvement in WER over a strong DNN system, achieving state-of-the-art results in these 3 tasks.
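The abstract's claim that convolution plus pooling can reduce spectral variation can be illustrated with a minimal sketch. This is not the paper's architecture: the 10-band frame, the 3-tap kernel, the pooling size of 4, and all function names are assumptions chosen only for illustration. It applies one filter along the frequency axis, then ReLU, then non-overlapping max pooling, and shows that a spectral peak shifted by one band yields the same pooled output as long as the shift stays inside one pooling window.

```python
def conv1d_freq(frame, kernel):
    """Valid 1-D cross-correlation of one frame along the frequency axis."""
    k = len(kernel)
    return [
        sum(frame[i + j] * kernel[j] for j in range(k))
        for i in range(len(frame) - k + 1)
    ]


def relu(values):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in values]


def max_pool(values, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [
        max(values[i:i + size])
        for i in range(0, len(values) - size + 1, size)
    ]


def conv_layer(frame, kernel, pool_size):
    """Convolution -> ReLU -> max pooling, all along frequency."""
    return max_pool(relu(conv1d_freq(frame, kernel)), pool_size)


# Hypothetical 10-band frames: the same spectral peak, one band apart.
frame = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
shifted = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
kernel = [0.5, 1.0, 0.5]  # illustrative filter weights, not learned

out_a = conv_layer(frame, kernel, pool_size=4)
out_b = conv_layer(shifted, kernel, pool_size=4)
print(out_a, out_b)  # both are [1.0, 0.0]
```

The invariance is local: a shift that moves the peak across a pooling-window boundary changes the pooled output, which is one reason the choice of pooling strategy matters.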

Year	DOI	Venue
2015	10.1016/j.neunet.2014.08.005	Neural Networks
Keywords	Field	DocType
Deep learning,Neural networks,Speech recognition	Locality,Convolutional neural network,Computer science,Pooling,Speech recognition,Artificial intelligence,Deep learning,Artificial neural network,Vocabulary,Machine learning,Deep neural networks	Journal
Volume	Issue	ISSN
64	1	0893-6080
Citations	PageRank	References
89	3.39	26
Authors
7

Authors (7 rows)

Name	Order	Citations	PageRank
Tara N. Sainath	1	3497	232.43
B. Kingsbury	2	4175	335.43
George Saon	3	825	80.99
Hagen Soltau	4	795	67.33
Abdel-rahman Mohamed	5	3772	266.13
George E. Dahl	6	4734	416.42
Bhuvana Ramabhadran	7	1779	153.83
