AENet: Learning Deep Audio Features for Video Analysis. - Citegraph

Paper Info

Title
AENet: Learning Deep Audio Features for Video Analysis.

Abstract
We propose a new deep network for audio event recognition, called AENet. In contrast to speech, sounds coming from audio events may be produced by a wide variety of sources. Furthermore, distinguishing them often requires analyzing an extended time period due to the lack of clear subword units that are present in speech. In order to incorporate this long-time frequency structure of audio events, w...

Year	2018
DOI	10.1109/TMM.2017.2751969
Venue	IEEE Transactions on Multimedia
Keywords	Feature extraction, Hidden Markov models, Mel frequency cepstral coefficient, Visualization, Speech, Network architecture
DocType	Journal
Volume	20
Issue	3
ISSN	1520-9210
Citations	18
PageRank	0.91
References	7
Authors	3

Authors (3 rows)

Name	Order	Citations	PageRank
Naoya Takahashi	1	34	9.44
Michael Gygli	2	232	14.18
Luc Van Gool	3	27566	1819.51
