Title
MSCAN: Multimodal Self-and-Collaborative Attention Network for image aesthetic prediction tasks
Abstract
With the ever-expanding volume of visual images on the Internet, automatic image aesthetic prediction is becoming increasingly important in the computer vision field. Because image aesthetic assessment is a highly subjective and complex task, some researchers resort to user comments to aid aesthetic prediction. However, these methods achieve only limited success because 1) they rely heavily on convolution to extract visual features, which makes it difficult to capture the spatial interactions of visual elements in image composition; and 2) they treat image feature extraction and textual feature extraction as two distinct tasks, ignoring the inter-relationships between the two modalities. We address these challenges by proposing a Multimodal Self-and-Collaborative Attention Network (MSCAN). More specifically, the self-attention module computes the response at a position by attending to all positions in the image, so it can effectively encode the spatial interactions of visual elements. To model the complex relations between image and textual features, a co-attention module jointly performs textual-guided visual attention and visual-guided textual attention. The attended multimodal features are then aggregated and fed into a two-layer MLP to obtain the aesthetic values. Extensive experiments on two large benchmarks demonstrate that the proposed MSCAN outperforms state-of-the-art methods by a large margin on unified aesthetic prediction tasks.
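To make the pipeline described in the abstract concrete, the following is a minimal PyTorch-style sketch of the three components it mentions: non-local self-attention over spatial positions, bidirectional co-attention between visual and textual features, and a two-layer MLP regressor. This is not the authors' implementation; the module names, feature dimensions, head counts, and the multi-head co-attention formulation are illustrative assumptions.

# Minimal sketch of the attention blocks described in the abstract.
# NOT the authors' released code; shapes and formulations are assumptions.
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    """Non-local style self-attention: each spatial position attends to all positions."""
    def __init__(self, dim):
        super().__init__()
        self.query = nn.Linear(dim, dim)
        self.key = nn.Linear(dim, dim)
        self.value = nn.Linear(dim, dim)
        self.scale = dim ** -0.5

    def forward(self, x):                        # x: (B, N, D) visual features flattened over space
        q, k, v = self.query(x), self.key(x), self.value(x)
        attn = torch.softmax(q @ k.transpose(1, 2) * self.scale, dim=-1)
        return x + attn @ v                      # residual connection

class CoAttention(nn.Module):
    """Textual-guided visual attention and visual-guided textual attention."""
    def __init__(self, dim, heads=4):            # dim must be divisible by heads
        super().__init__()
        self.t2v = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.v2t = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, vis, txt):                 # vis: (B, N, D), txt: (B, T, D)
        vis_att, _ = self.t2v(vis, txt, txt)     # visual features attended under textual guidance
        txt_att, _ = self.v2t(txt, vis, vis)     # textual features attended under visual guidance
        return vis_att.mean(1), txt_att.mean(1)  # pooled multimodal features

class MSCANHead(nn.Module):
    """Aggregate attended features and regress aesthetic values with a two-layer MLP."""
    def __init__(self, dim):
        super().__init__()
        self.self_attn = SelfAttention(dim)
        self.co_attn = CoAttention(dim)
        self.mlp = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, vis, txt):
        vis = self.self_attn(vis)
        v, t = self.co_attn(vis, txt)
        return self.mlp(torch.cat([v, t], dim=-1))

# Example usage with a 7x7 visual feature map (49 positions) and 20 text tokens:
# scores = MSCANHead(256)(torch.randn(2, 49, 256), torch.randn(2, 20, 256))  # -> (2, 1)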
Year: 2021
DOI: 10.1016/j.neucom.2020.10.046
Venue: Neurocomputing
Keywords: Photo aesthetic assessment, Multimodal learning, Self-attention mechanism, Co-attention mechanism
DocType: Journal
Volume: 430
ISSN: 0925-2312
Citations: 0
PageRank: 0.34
References: 0
Authors: 4
Name            Order   Citations   PageRank
Xiaodan Zhang   1       2           3.41
Xinbo Gao       2       5534        344.56
Lihuo He        3       179         19.11
Wen Lu          4       25          3.35