Title
Analysis and classification of acoustic scenes with wavelet transform-based mel-scaled features
Abstract
Analyzing audio from real-life environments and categorizing it into different acoustic scenes can make context-aware devices and applications more efficient. Unlike speech, such signals have overlapping frequency content and span a much larger audible frequency range. They are also less structured than speech or music signals. The wavelet transform has good time-frequency localization ability owing to its variable-length basis functions. Consequently, it facilitates the extraction of more characteristic information from environmental audio. This paper attempts to classify acoustic scenes by a novel use of wavelet-based mel-scaled features. The design of the proposed framework is based on experiments conducted on two datasets that have the same scene classes but differ in sample length and amount of data (in hours). The framework outperformed two benchmark systems, one based on mel-frequency cepstral coefficients and Gaussian mixture models and the other based on log mel-band energies and a multi-layer perceptron. We also present an investigation into the use of different training and test sample durations for acoustic scene classification.
Year
2020
DOI
10.1007/s11042-019-08279-5
Venue
Multimedia Tools and Applications
Keywords
DCASE, Environmental sounds, Haar function, MFCC, SVM
DocType
Journal
Volume
79
Issue
11
ISSN
1380-7501
Citations
1
PageRank
0.36
References
0
Authors
2
Name              Order  Citations  PageRank
Shefali Waldekar  1      1          0.36
Goutam Saha       2      255        23.17