Title
Visual Attention Consistency Under Image Transforms for Multi-Label Image Classification
Abstract
Human visual perception shows good consistency for many multi-label image classification tasks under certain spatial transforms, such as scaling, rotation, flipping and translation. This has motivated the data augmentation strategy widely used in CNN classifier training: transformed images are included in training under the assumption that they share the class labels of their original images. In this paper, we further propose the assumption of perceptual consistency of visual attention regions for classification under such transforms, i.e., the attention region for a classification follows the same transform if the input image is spatially transformed. While the attention regions of CNN classifiers can be derived as an attention heatmap in the middle layers of the network, we find that their consistency under many transforms is not preserved. To address this problem, we propose a two-branch network that takes an original image and its transformed image as inputs, and we introduce a new attention consistency loss that measures the attention heatmap consistency between the two branches. This new loss is then combined with the multi-label image classification loss for network training. In experiments on three datasets, the proposed network achieves new state-of-the-art classification performance.
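The consistency idea in the abstract can be illustrated with a minimal sketch. It assumes a horizontal flip as the transform and a mean squared difference as the distance between heatmaps; the abstract specifies neither, and the names `hflip`, `attention_consistency_loss`, `total_loss`, and the `weight` parameter are hypothetical, chosen only for illustration.

```python
# Illustrative sketch only: the transform (horizontal flip), the distance
# (mean squared difference), and all names here are assumptions, not the
# paper's stated formulation.

def hflip(heatmap):
    """Mirror a 2-D heatmap (a list of rows) left to right."""
    return [list(reversed(row)) for row in heatmap]


def attention_consistency_loss(heatmap_orig, heatmap_trans, transform=hflip):
    """Mean squared difference between transform(heatmap_orig) and heatmap_trans.

    heatmap_orig:  attention heatmap from the branch fed the original image.
    heatmap_trans: attention heatmap from the branch fed the transformed image.
    If attention follows the transform exactly, the loss is zero.
    """
    expected = transform(heatmap_orig)
    total, count = 0.0, 0
    for row_expected, row_trans in zip(expected, heatmap_trans):
        for e, t in zip(row_expected, row_trans):
            total += (e - t) ** 2
            count += 1
    return total / count


def total_loss(classification_loss, consistency_loss, weight=1.0):
    """Combine the two terms; the weighting scheme is an assumption."""
    return classification_loss + weight * consistency_loss
```

In this sketch, a heatmap pair that is perfectly consistent under the flip contributes nothing to the combined loss, so only the classification term drives training for that pair.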
Year
2019
DOI
10.1109/CVPR.2019.00082
Venue
2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Keywords
Recognition: Detection, Categorization, Retrieval
Field
Computer vision, Pattern recognition, Computer science, Visual attention, Artificial intelligence, Contextual image classification
DocType
Conference
ISSN
1063-6919
ISBN
978-1-7281-3294-5
Citations
8
PageRank
0.42
References
12
Authors
5
Name            Order  Citations  PageRank
Hao Guo         1      19         4.03
Kang Zheng      2      42         7.41
Xiaochuan Fan   3      52         5.01
Hongkai Yu      4      52         11.49
Song Wang       5      954        79.55