Title
A TWO-STAGE APPROACH TO DEVICE-ROBUST ACOUSTIC SCENE CLASSIFICATION
Abstract
To improve device robustness, a highly desirable feature of a competitive data-driven acoustic scene classification (ASC) system, a novel two-stage system based on fully convolutional neural networks (CNNs) is proposed. The two-stage system leverages an ad-hoc score combination of two CNN classifiers: (i) the first CNN classifies acoustic inputs into one of three broad classes, and (ii) the second CNN classifies the same inputs into one of ten finer-grained classes. Three different CNN architectures are explored to implement the two-stage classifiers, and a frequency sub-sampling scheme is investigated. Novel data augmentation schemes for ASC are also investigated. Evaluated on DCASE 2020 Task 1a, the proposed ASC system attains state-of-the-art accuracy on the development set: the best system, a two-stage fusion of CNN ensembles, delivers an 81.9% average accuracy across multi-device test data and obtains a significant improvement on unseen devices. Finally, neural saliency analysis with class activation mapping (CAM) gives new insights into the patterns learnt by the models.
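The two-stage score combination described in the abstract can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not specify the combination rule, so the product of the broad-class and fine-class probabilities used here is an assumption, as are the function names `combine_scores` and `predict` and the mapping `FINE_TO_BROAD` from the ten scene labels to three broad groups (indoor, outdoor, transportation).

```python
# Hypothetical mapping from ten fine-grained scenes to three broad classes.
# The grouping is an assumption made for illustration.
FINE_TO_BROAD = {
    "airport": "indoor",
    "shopping_mall": "indoor",
    "metro_station": "indoor",
    "street_pedestrian": "outdoor",
    "public_square": "outdoor",
    "street_traffic": "outdoor",
    "park": "outdoor",
    "tram": "transportation",
    "bus": "transportation",
    "metro": "transportation",
}


def combine_scores(broad_probs, fine_probs):
    """Weight each fine-class probability by the probability of its
    parent broad class, then renormalize so the scores sum to 1.

    The product rule is an assumed choice of score combination.
    """
    combined = {
        fine: p * broad_probs[FINE_TO_BROAD[fine]]
        for fine, p in fine_probs.items()
    }
    total = sum(combined.values())
    if total <= 0.0:
        raise ValueError("combined scores sum to zero")
    return {fine: score / total for fine, score in combined.items()}


def predict(broad_probs, fine_probs):
    """Return the fine-grained class with the highest combined score."""
    combined = combine_scores(broad_probs, fine_probs)
    return max(combined, key=combined.get)


# Example: the ten-class model slightly prefers "airport", but the
# three-class model is confident the clip is "transportation".
broad = {"indoor": 0.1, "outdoor": 0.1, "transportation": 0.8}
fine = {name: 0.05625 for name in FINE_TO_BROAD}
fine["airport"] = 0.30
fine["metro"] = 0.25

combined = combine_scores(broad, fine)
label = predict(broad, fine)
```

In this example the first-stage output overrides the second stage's marginal preference for an indoor scene, which is the kind of correction a broad-class classifier could provide when the fine-grained classifier is confused by an unfamiliar recording device.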
Year
2021
DOI
10.1109/ICASSP39728.2021.9414835
Venue
2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021)
Keywords
Acoustic scene classification, robustness, convolutional neural networks, data augmentation, class activation mapping
DocType
Conference
Citations
0
PageRank
0.34
References
7
Authors
16
Name                      Order  Citations  PageRank
Hu Hu                     1      2          0.69
Chao-Han Huck Yang        2      0          1.69
Xianjun Xia               3      12         3.02
Xue Bai                   4      119        14.75
Xin Tang                  5      0          0.68
Yajian Wang               6      0          0.34
Shutong Niu               7      0          0.68
Li Chai                   8      80         22.25
Juanjuan Li               9      0          0.34
Hongning Zhu              10     0          0.34
Feng Bao                  11     0          0.34
Yuanjun Zhao              12     14         6.42
Sabato Marco Siniscalchi  13     310        30.21
Yannan Wang               14     7          5.71
Qing-Feng LIU             15     172        22.25
Chin-Hui Lee              16     6101       852.71