An Audio Scene Classification Framework With Embedded Filters And A Dct-Based Temporal Module - Citegraph

Paper Info

Title
An Audio Scene Classification Framework With Embedded Filters And A Dct-Based Temporal Module

Abstract
Deep convolutional neural networks (DCNNs) have recently improved the performance of acoustic scene classification. However, the input features of the network are usually based on predefined, hand-tailored filters, which may not suit the specific task. To overcome this, we propose a hybrid framework that jointly trains the front-end filters and the back-end DCNN. In addition, a novel temporal module based on the discrete cosine transform (DCT) is inserted after the high-level feature map of the network, enabling us to utilize temporal information without reducing the number of training samples. Our single system, composed of the fine-tuned wavelet front-end and the DCNN back-end with the integrated DCT-based temporal module, achieves an accuracy of 79.20% on the evaluation set of DCASE17, an improvement of around 3% and 8% over the scalogram-DCNN and FBank-DCNN systems, respectively.
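
The abstract does not spell out how the DCT-based temporal module is built, so the following is only a minimal sketch of the general idea: applying a DCT along the time axis of a feature map and keeping the first few coefficients as a compact summary of how each channel evolves. The function names (`dct_ii`, `temporal_dct`), the channels-by-frames layout, and the choice of an unnormalized DCT-II are assumptions made for illustration and are not taken from the paper.

```python
import math


def dct_ii(seq):
    """Unnormalized DCT-II of a 1-D sequence (illustrative, O(n^2))."""
    n = len(seq)
    return [
        sum(x * math.cos(math.pi * (t + 0.5) * k / n) for t, x in enumerate(seq))
        for k in range(n)
    ]


def temporal_dct(feature_map, num_coeffs):
    """Keep the first `num_coeffs` DCT coefficients of each channel over time.

    feature_map: list of channels, each a list of values over time frames.
    This layout is a hypothetical choice, not the paper's.
    """
    return [dct_ii(channel)[:num_coeffs] for channel in feature_map]


# Toy feature map: 2 channels x 4 time frames.
features = [
    [1.0, 1.0, 1.0, 1.0],  # constant over time
    [0.0, 1.0, 2.0, 3.0],  # rising over time
]
summary = temporal_dct(features, num_coeffs=2)
```

In this toy case the constant channel puts all of its energy in coefficient 0, while the rising channel also has a nonzero coefficient 1, which is the kind of temporal trend a low-order DCT summary can expose without splitting a recording into more training samples.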

Year	2019
DOI	10.1109/icassp.2019.8683636
Venue	2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Keywords	Acoustic scene classification, embedded filters, joint-training, DCT-based temporal module
Field	Pattern recognition, Computer science, Convolutional neural network, Discrete cosine transform, Artificial intelligence, Wavelet
DocType	Conference
ISSN	1520-6149
Citations	0
PageRank	0.34
References	0
Authors	3

Authors (3 rows)

Name	Order	Citations	PageRank
Hangting Chen	1	2	2.39
Pengyuan Zhang	2	50	19.46
Yonghong Yan	3	10	6.40
