Abstract |
---|
We present a speaker-independent isolated word recognition approach that combines audio derivation with a convolutional neural network (CNN). In contrast to the sophisticated phonetic features traditionally extracted from audio, we use the spectrogram of the audio as training data for the CNN, which transforms the isolated word recognition problem into an image recognition problem. Deep learning demands large amounts of training data, but building such corpora is costly and reduces system-development efficiency. We present an audio-level data derivation approach that makes it possible to obtain a high recognition rate from only a small set of collected seed audio. Derivation is achieved by formant perturbation, pitch shifting, time stretching, and volume perturbation while preserving semantic content. The approach presented in this paper reduces the amount of seed data that deep learning requires for isolated word recognition. Results show that the accuracy improvement with derived data is significant, and only 7.57%-15.14% of the seed data is needed to achieve the same level of accuracy. |
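The abstract's data-derivation idea can be sketched in code: generate several perturbed copies of each seed utterance so the CNN sees more training variety. Below is a minimal pure-Python illustration of two of the named transforms, volume perturbation and time stretching; the function names, parameter ranges, and the naive linear-interpolation resampler are illustrative assumptions, not the paper's actual implementation (which also applies formant perturbation and proper pitch shifting).

```python
# Illustrative sketch of audio-level data derivation (assumed details,
# not the paper's exact method): each seed waveform yields several
# volume-perturbed, time-stretched variants.
import random


def volume_perturb(samples, low=0.8, high=1.2, rng=None):
    """Scale the whole utterance by one random gain factor (assumed range)."""
    rng = rng or random.Random()
    gain = rng.uniform(low, high)
    return [s * gain for s in samples]


def time_stretch(samples, rate):
    """Naive linear-interpolation resampling: rate > 1 shortens the audio.
    (A real time stretcher preserves pitch, e.g. via a phase vocoder;
    this crude resample only demonstrates the derivation idea.)"""
    n_out = int(len(samples) / rate)
    out = []
    for i in range(n_out):
        pos = i * rate
        j = int(pos)
        frac = pos - j
        nxt = samples[min(j + 1, len(samples) - 1)]
        out.append(samples[j] * (1 - frac) + nxt * frac)
    return out


def derive(seed, n_variants=4, rng=None):
    """Produce several perturbed copies of one seed utterance."""
    rng = rng or random.Random(0)
    variants = []
    for _ in range(n_variants):
        rate = rng.uniform(0.9, 1.1)  # assumed +/-10% stretch range
        variants.append(volume_perturb(time_stretch(seed, rate), rng=rng))
    return variants
```

Each derived variant keeps the word's semantic content while varying duration and loudness, which is what lets a small seed corpus stand in for a much larger one.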
Year | DOI | Venue |
---|---|---|
2017 | 10.1109/ICTAI.2017.00060 | 2017 IEEE 29th International Conference on Tools with Artificial Intelligence (ICTAI) |
Keywords | Field | DocType |
---|---|---|
convolutional neural network, audio derivation, limited training set, spectrogram, isolated word recognition | Audio time-scale/pitch modification, Pattern recognition, Computer science, Spectrogram, Convolutional neural network, Word recognition, Feature extraction, Artificial intelligence, Deep learning, Formant, Hidden Markov model | Conference |
ISSN | ISBN | Citations |
---|---|---|
1082-3409 | 978-1-5386-3877-4 | 0 |
PageRank | References | Authors |
---|---|---|
0.34 | 0 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jingjing Zhang | 1 | 139 | 19.09 |
Shuangjiu Xiao | 2 | 41 | 14.18 |
Huichao Zhang | 3 | 0 | 0.34 |
Lan Jiang | 4 | 0 | 0.34 |