Abstract |
---|
Owing to the semantic gap across modalities, automatic retrieval of multimedia information remains a major challenge, and an effective joint model is needed to bridge this gap and capture the relationships between modalities. In this work, we develop a deep image annotation and classification by fusing multi-modal semantic topics (DAC_mmst) model, which discovers visual and non-visual topics by jointly modeling an image and its loosely related text for deep image annotation, while simultaneously learning and predicting the class label. More specifically, DAC_mmst relies on a non-parametric Bayesian model to estimate the number of visual topics that best explains the image. To evaluate the effectiveness of the proposed algorithm, we collect a real-world dataset and conduct various experiments. The results show that DAC_mmst performs favorably in perplexity, image annotation, and classification accuracy compared with several state-of-the-art methods. |
Year | DOI | Venue |
---|---|---|
2018 | 10.3837/tiis.2018.01.019 | KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS |
Keywords | Field | DocType
---|---|---|
multi-modal topic model, image annotation, image classification, nonparametric Bayesian statistics, variational inference algorithm | Automatic image annotation, Information retrieval, Computer science, Modal, Distributed computing | Journal
Volume | Issue | ISSN
---|---|---|
12 | 1 | 1976-7277
Citations | PageRank | References
---|---|---|
0 | 0.34 | 0
Authors |
---|
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Yongheng Chen | 1 | 18 | 4.50 |
Fu-Quan Zhang | 2 | 9 | 13.68 |
Wanli Zuo | 3 | 342 | 42.73 |