Title
AENet: Learning Deep Audio Features for Video Analysis.
Abstract
We propose a new deep network for audio event recognition, called AENet. In contrast to speech, sounds coming from audio events may be produced by a wide variety of sources. Furthermore, distinguishing them often requires analyzing an extended time period due to the lack of clear subword units that are present in speech. In order to incorporate this long-time frequency structure of audio events, w...
Year
2018
DOI
10.1109/TMM.2017.2751969
Venue
IEEE Transactions on Multimedia
Keywords
Feature extraction, Hidden Markov models, Mel frequency cepstral coefficient, Visualization, Speech, Network architecture
DocType
Journal
Volume
20
Issue
3
ISSN
1520-9210
Citations
18
PageRank
0.91
References
7
Authors
3
Name              Order  Citations  PageRank
Naoya Takahashi   1      34         9.44
Michael Gygli     2      232        14.18
Luc Van Gool      3      27566      1819.51