Acoustic scene classification using sparse feature learning and event-based pooling - Citegraph

Paper Info

Title
Acoustic scene classification using sparse feature learning and event-based pooling

Abstract
Recently, unsupervised learning algorithms have been successfully used to represent data in many machine recognition tasks. In particular, sparse feature learning algorithms have shown that they can not only discover meaningful structures from raw data but also outperform many hand-engineered features. In this paper, we apply the sparse feature learning approach to acoustic scene classification. We use a sparse restricted Boltzmann machine to capture manifold local acoustic structures from audio data and represent the data in a high-dimensional sparse feature space given the learned structures. For scene classification, we summarize the local features by pooling over audio scene data. While feature pooling is typically performed over uniformly divided segments, we suggest a new pooling method that first detects audio events and then performs pooling only over the detected events, to account for the irregular occurrence of audio events in acoustic scene data. We evaluate the learned features on the IEEE AASP Challenge development set, comparing them with a baseline model using mel-frequency cepstral coefficients (MFCCs). The results show that the learned features outperform MFCCs, that event-based pooling achieves higher accuracy than uniform pooling, and that a combination of the two methods performs better than either one used alone.
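
The contrast between uniform pooling and event-based pooling described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how events are detected, so the energy-threshold detector below, along with the names max_pool, uniform_pooling, detect_events, and event_based_pooling, are assumptions made for illustration. The feature frames stand in for the sparse RBM activations, which are not computed here.

```python
from typing import List, Optional, Sequence, Tuple

Frame = Sequence[float]


def max_pool(frames: Sequence[Frame]) -> List[float]:
    """Element-wise maximum over a non-empty run of feature frames."""
    return [max(column) for column in zip(*frames)]


def uniform_pooling(features: Sequence[Frame], segment_len: int) -> List[List[float]]:
    """Max-pool over consecutive fixed-length segments (the last may be shorter)."""
    return [
        max_pool(features[start:start + segment_len])
        for start in range(0, len(features), segment_len)
    ]


def detect_events(energy: Sequence[float], threshold: float) -> List[Tuple[int, int]]:
    """Return half-open (start, end) frame ranges whose energy exceeds threshold.

    Hypothetical detector: a plain energy threshold, assumed for illustration.
    """
    events: List[Tuple[int, int]] = []
    start: Optional[int] = None
    for i, value in enumerate(energy):
        if value > threshold and start is None:
            start = i  # an event begins
        elif value <= threshold and start is not None:
            events.append((start, i))  # the event ends before frame i
            start = None
    if start is not None:
        events.append((start, len(energy)))  # event runs to the last frame
    return events


def event_based_pooling(
    features: Sequence[Frame], energy: Sequence[float], threshold: float
) -> List[List[float]]:
    """Max-pool only over the frames that fall inside detected events."""
    return [
        max_pool(features[start:end])
        for start, end in detect_events(energy, threshold)
    ]


# Toy data: six frames of two-dimensional features and a per-frame energy value.
features = [[0.1, 0.0], [0.9, 0.2], [0.3, 0.8], [0.0, 0.6], [0.0, 0.0], [0.5, 0.4]]
energy = [0.1, 0.9, 0.8, 0.1, 0.0, 0.7]

uniform = uniform_pooling(features, segment_len=3)
event = event_based_pooling(features, energy, threshold=0.5)
print(uniform)
print(event)
```

In this toy input, uniform pooling includes the low-energy frames between events in its second segment, while event-based pooling summarizes only the two detected events, so the two methods produce different pooled vectors for the second region.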

Year	DOI	Venue
2013	10.1109/WASPAA.2013.6701893	WASPAA
Keywords	Field	DocType
restricted boltzmann machine,sparse restricted boltzmann machine,event detection,sparse feature learning algorithm,boltzmann machines,environmental sound,learning (artificial intelligence),unsupervised learning algorithm,feature learning,mel-frequency cepstral coefficients,cepstral analysis,acoustic scene classification,max-pooling,acoustic signal processing,signal classification,ieee aasp challenge development set,event-based pooling,sparse feature representation,mfcc,machine recognition tasks,learning artificial intelligence	Restricted Boltzmann machine,Mel-frequency cepstrum,Feature vector,Pattern recognition,Computer science,Pooling,Raw data,Speech recognition,Unsupervised learning,Artificial intelligence,Machine recognition,Feature learning	Conference
ISSN	Citations	PageRank
1931-1168	7	0.56
References	Authors
8	3

Authors (3 rows)

Name	Order	Citations	PageRank
Kyogu Lee	1	263	38.85
Ziwon Hyung	2	27	1.96
Juhan Nam	3	261	25.12
