Title
Content-oriented multimedia document understanding through cross-media correlation.
Abstract
This paper presents a novel method for multimedia document content analysis that models the correlations among multimodal data. We hypothesize that exploiting the correlations between different modalities from the same data source yields better multimedia content understanding than exploring any single modality alone. We divide the task into two stages: multimedia data fusion and multimodal correlation propagation. In the first stage, we reorganize the training multimedia data into Modality semAntic Documents (MADs) after extracting quantized multimodal features, and then use multivariate Gaussian distributions to characterize the continuous quantities in latent topic modeling. Model parameters are learned asymmetrically to initialize multimodal correlations in the latent topic space. In the second stage, we construct a Multimodal Correlation Network (MCN) from the initialized multimodal correlations, and propose a new mechanism that propagates inter-modality correlations and intra-modality similarities in the MCN, exploiting cross-modal complementarity to facilitate multimedia content analysis. Experimental results on image-audio retrieval over a 10-category dataset and on content-oriented web page recommendation with the USTODAY dataset demonstrate the effectiveness of our method.
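The abstract only outlines the propagation stage, so the sketch below is a rough illustration of the general idea: diffusing an initial cross-modal (image-audio) correlation matrix through intra-modality similarity graphs. The update rule, the blending weight alpha, the row normalization, and the function names (propagate_correlations, normalize_rows) are assumptions made for illustration, not the paper's actual formulation.

```python
# Illustrative sketch only (not the paper's exact MCN update rule):
# propagate inter-modality correlations through intra-modality similarities.
import numpy as np

def normalize_rows(M):
    """Row-normalize a non-negative matrix so each row sums to 1."""
    s = M.sum(axis=1, keepdims=True)
    s[s == 0] = 1.0
    return M / s

def propagate_correlations(S_img, S_aud, C0, alpha=0.5, n_iters=20):
    """Iteratively diffuse the initial image-audio correlation matrix C0
    through the intra-modality similarity graphs S_img and S_aud.

    S_img : (n_img, n_img) image-image similarities
    S_aud : (n_aud, n_aud) audio-audio similarities
    C0    : (n_img, n_aud) initial cross-modal correlations
    """
    P_img = normalize_rows(S_img)
    P_aud = normalize_rows(S_aud)
    C = C0.copy()
    for _ in range(n_iters):
        # Blend the propagated correlations with the initial estimate.
        C = alpha * P_img @ C @ P_aud.T + (1.0 - alpha) * C0
    return C

# Toy usage with random data (hypothetical sizes).
rng = np.random.default_rng(0)
S_img = np.abs(rng.standard_normal((6, 6)))
S_aud = np.abs(rng.standard_normal((4, 4)))
C0 = np.abs(rng.standard_normal((6, 4)))
print(propagate_correlations(S_img, S_aud, C0).round(3))
```

In the method described by the abstract, the initial correlations would come from the asymmetrically learned latent-topic representations rather than random data as used in this toy example.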
Year
2015
DOI
10.1007/s11042-014-2044-9
Venue
Multimedia Tools and Applications
Keywords
Multimedia documents, Multimodal, MAD, MCN, Correlation propagation
DocType
Journal
Volume
74
Issue
18
ISSN
1573-7721
Citations
5
PageRank
0.38
References
43
Authors
5
Name | Order | Citations | PageRank
Tong Lu | 1 | 372 | 67.17
Yukang Jin | 2 | 5 | 0.38
Feng Su | 3 | 170 | 18.63
Palaiahnakote Shivakumara | 4 | 774 | 64.90
Chew Lim Tan | 5 | 4484 | 284.26