Title
Improvement of Speech/Music Classification for 3GPP EVS Based on LSTM.
Abstract
Competition in smartphone speech recognition technology is intensifying with the spread of Internet of Things (IoT) devices. Robust speech recognition requires detecting speech signals in a variety of acoustic environments. Speech/music classification, which allows signal processing to be optimized according to the classification result, has been widely adopted as an essential part of various electronics applications, such as multi-rate audio codecs, automatic speech recognition, and multimedia document indexing. In this paper, we propose a new technique that uses long short-term memory (LSTM) to improve the robustness of the speech/music classifier in the enhanced voice service (EVS) codec, which has been adopted as a voice-over-LTE (VoLTE) speech codec. For effective speech/music classification, the feature vectors used with the LSTM are chosen from the features of the EVS. To cope with the diversity of music data, a large amount of data is used for training. Experiments show that LSTM-based speech/music classification outperforms the conventional EVS speech/music classification algorithm across various conditions and types of speech/music data, especially at low signal-to-noise ratios (SNRs).
Year
2018
DOI
10.3390/sym10110605
Venue
SYMMETRY-BASEL
Keywords
speech/music classification, Enhanced Voice Service, long short-term memory, big data
Field
Combinatorics, Speech recognition, Mathematics
DocType
Journal
Volume
10
Issue
11
ISSN
2073-8994
Citations
0
PageRank
0.34
References
6
Authors (2)
Name            Order   Citations   PageRank
Sang-Ick Kang   1       25          4.81
Sangmin Lee     2       4           5.58