Title
Duration-Controlled LSTM for Polyphonic Sound Event Detection.
Abstract
This paper presents a new hybrid approach called duration-controlled long short-term memory (LSTM) for polyphonic sound event detection (SED). It builds upon a state-of-the-art SED method that performs frame-by-frame detection using a bidirectional LSTM recurrent neural network (BLSTM), and incorporates a duration-controlled modeling technique based on a hidden semi-Markov model. The proposed appr...
Year: 2017
DOI: 10.1109/TASLP.2017.2740002
Venue: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Keywords: Hidden Markov models, Logic gates, Speech, Recurrent neural networks, Speech processing, Computer architecture, Event detection
Field: Speech processing, F1 score, Pattern recognition, Computer science, Voice activity detection, Word error rate, Recurrent neural network, Speech recognition, Non-negative matrix factorization, Artificial intelligence, Thresholding, Hidden Markov model
DocType: Journal
Volume: 25
Issue: 11
ISSN: 2329-9290
Citations: 11
PageRank: 0.67
References: 23
Authors: 6
Name              Order  Citations  PageRank
Tomoki Hayashi    1      96         18.49
Shinji Watanabe   2      1158       139.38
Tomoki Toda       3      1874       167.18
Takaaki Hori      4      408        45.58
Jonathan Le Roux  5      839        68.14
Kazuya Takeda     6      1301       195.60