Abstract |
---|
This paper presents a new hybrid approach called duration-controlled long short-term memory (LSTM) for polyphonic sound event detection (SED). It builds upon a state-of-the-art SED method that performs frame-by-frame detection using a bidirectional LSTM recurrent neural network (BLSTM), and incorporates a duration-controlled modeling technique based on a hidden semi-Markov model. The proposed appr... |
Year | DOI | Venue |
---|---|---|
2017 | 10.1109/TASLP.2017.2740002 | IEEE/ACM Transactions on Audio, Speech, and Language Processing |
Keywords | Field | DocType |
Hidden Markov models, Logic gates, Speech, Recurrent neural networks, Speech processing, Computer architecture, Event detection | Speech processing, F1 score, Pattern recognition, Computer science, Voice activity detection, Word error rate, Recurrent neural network, Speech recognition, Non-negative matrix factorization, Artificial intelligence, Thresholding, Hidden Markov model | Journal |
Volume | Issue | ISSN |
25 | 11 | 2329-9290 |
Citations | PageRank | References |
11 | 0.67 | 23 |
Authors | 6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Tomoki Hayashi | 1 | 96 | 18.49 |
Shinji Watanabe | 2 | 1158 | 139.38 |
Tomoki Toda | 3 | 1874 | 167.18 |
Takaaki Hori | 4 | 408 | 45.58 |
Jonathan Le Roux | 5 | 839 | 68.14 |
Kazuya Takeda | 6 | 1301 | 195.60 |