Paper Info

Title
AudioPairBank: towards a large-scale tag-pair-based audio content analysis

Abstract
Recently, sound recognition has been used to identify sounds, such as the sound of a car, or a river. However, sounds have nuances that may be better described by adjective-noun pairs such as “slow car” and verb-noun pairs such as “flying insects,” which are underexplored. Therefore, this work investigates the relationship between audio content and both adjective-noun pairs and verb-noun pairs. Due to the lack of datasets with these kinds of annotations, we collected and processed the AudioPairBank corpus consisting of a combined total of 1123 pairs and over 33,000 audio files. In this paper, we include previously unavailable documentation of the challenges and implications of collecting audio recordings with these types of labels. We have also shown the degree of correlation between the audio content and the labels through classification experiments, which yielded 70% accuracy. The results and study in this paper encourage further exploration of the nuances in sounds and are meant to complement similar research performed on images and text in multimedia analysis.

Year	2018
DOI	10.1186/s13636-018-0137-5
Venue	EURASIP Journal on Audio, Speech, and Music Processing
Keywords	Sound event database, Audio content analysis, Machine learning, Signal processing
Field	Sound recognition, Audio content analysis, Computer science, Speech recognition, Documentation
DocType	Journal
Volume	2018
Issue	1
ISSN	1687-4722
Citations	2
PageRank	0.38
References	18
Authors	6

Authors (6 rows)

Name	Order	Citations	PageRank
Sebastian Säger	1	2	0.38
Benjamin Elizalde	2	359	22.38
Damian Borth	3	764	49.45
Christian Schulze	4	2	0.38
Bhiksha Raj	5	2094	204.63
Ian R. Lane	6	259	33.64