Title
Sound Event Detection Using Point-Labeled Data
Abstract
Sound Event Detection (SED) in audio scenes is a task studied by a growing number of researchers. Recent SED systems often use deep learning models. Building these systems typically requires a large amount of carefully annotated, strongly labeled data, where the exact time span of a sound event (e.g. the 'dog bark' starts at 1.2 seconds and ends at 2.0 seconds) in an audio scene (e.g. a recording of a city park) is indicated. However, manually labeling sound events with their time boundaries within a recording is very time-consuming. One way to address this is to collect data with weak labels, which contain only the names of the sound classes present in the audio file, without time boundary information for the events in the file. Weakly labeled sound event detection has therefore become popular recently. However, there is still a large performance gap between models built on weakly labeled data and ones built on strongly labeled data, especially for predicting the time boundaries of sound events. In this work, we introduce a new type of sound event label, which is easier for people to provide than strong labels. We call them 'point labels'. To create a point label, a user simply listens to the recording and hits the space bar when they hear a sound event (e.g. 'dog bark'). This is much easier than specifying exact time boundaries. We illustrate methods to train an SED model on point-labeled data. Our results show that a model trained on point-labeled audio data significantly outperforms models trained on weakly labeled data and is comparable to a model trained on strongly labeled data.
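The three label types contrasted in the abstract can be made concrete with a small sketch. This is only an illustration of how the annotations differ, not the authors' training method, which the abstract does not describe. The class names, the helper functions, the 0.1-second frame hop, the 3-second clip length, and the 1.55-second key-press time are all assumptions introduced here; only the 'dog bark' event spanning 1.2 to 2.0 seconds comes from the abstract.

```python
from dataclasses import dataclass


@dataclass
class StrongLabel:
    """Hypothetical container: class name plus exact onset/offset in seconds."""
    name: str
    onset: float
    offset: float


@dataclass
class PointLabel:
    """Hypothetical container: class name plus a single key-press time in seconds."""
    name: str
    time: float


def strong_to_frames(label, n_frames, hop):
    """Mark every frame whose center falls inside [onset, offset)."""
    return [
        1 if label.onset <= (i + 0.5) * hop < label.offset else 0
        for i in range(n_frames)
    ]


def point_to_frames(label, n_frames, hop):
    """Mark only the single frame that contains the key-press time."""
    idx = min(int(label.time / hop), n_frames - 1)
    return [1 if i == idx else 0 for i in range(n_frames)]


def weak_label(labels):
    """A weak label keeps only the set of class names, with no timing."""
    return sorted({label.name for label in labels})


# Assumed setup: a 3-second clip split into 0.1-second frames.
hop = 0.1
n_frames = 30

strong = StrongLabel("dog bark", onset=1.2, offset=2.0)  # example from the abstract
point = PointLabel("dog bark", time=1.55)                # assumed key-press time

strong_frames = strong_to_frames(strong, n_frames, hop)
point_frames = point_to_frames(point, n_frames, hop)
weak = weak_label([strong])

print("strong:", strong_frames)
print("point: ", point_frames)
print("weak:  ", weak)
```

Under these assumptions, the strong label supervises every frame of the event, the point label supervises a single frame somewhere inside it, and the weak label carries no frame-level information at all.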
Year
2019
DOI
10.1109/WASPAA.2019.8937213
Venue
2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)
Keywords
Sound event detection, Point labels, Weak labels, Deep learning
Field
Computer science, Speech recognition, Artificial intelligence, Deep learning, Labeled data, Acoustics, Sound event detection, Performance gap
DocType
Conference
ISSN
1931-1168
ISBN
978-1-7281-1124-7
Citations
0
PageRank
0.34
References
9
Authors
2
Name            Order   Citations   PageRank
Bongjun Kim     1       0           0.34
Bryan Pardo     2       830         63.92