Title
MVANet: Multi-Task Guided Multi-View Attention Network for Chinese Food Recognition
Abstract
Food recognition plays a critical role in various health-care applications. However, it poses many challenges to current approaches because food dishes vary widely in appearance and the ingredients of foods in the same category are not uniformly composed. First, current methods primarily focus on the appearance of foods without considering their semantic information, so they easily attend to the wrong areas of food images. Second, these methods lack dynamic weighting of multiple semantic features in the modeling process. This paper therefore proposes a novel Multi-View Attention Network (MVANet) within a multi-task learning framework that incorporates multiple semantic features into the food recognition task from both ingredient recognition and recipe modeling. It also uses a multi-view attention mechanism to automatically adjust the weights of different semantic features and enables different tasks to interact with each other so as to obtain a more comprehensive feature representation. Experiments on the ChineseFoodNet and VIREO Food-172 benchmark databases validate the proposed method, showing a clear performance improvement with a smaller parameter size.
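The abstract does not say what form the attention takes, so the following is only a minimal sketch of the general idea of dynamically weighting several semantic feature vectors. It assumes a plain softmax over dot-product scores; the names `fuse_views`, `views`, and `query` are hypothetical, and this is not the MVANet architecture itself.

```python
import math


def softmax(scores):
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_views(views, query):
    """Fuse equal-length feature vectors using softmax attention weights.

    views: one feature vector per semantic view (hypothetical inputs).
    query: vector used to score each view (an assumption; in a trained
           network this would be a learned parameter).
    """
    # Score each view by its dot product with the query.
    scores = [sum(q * v for q, v in zip(query, view)) for view in views]
    weights = softmax(scores)
    # The fused feature is the weighted sum of the view features.
    dim = len(query)
    fused = [sum(w * view[i] for w, view in zip(weights, views))
             for i in range(dim)]
    return weights, fused


# Toy vectors standing in for dish, ingredient, and recipe features.
views = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]
weights, fused = fuse_views(views, query)
```

In this toy case the third view scores highest against the query and therefore receives the largest weight; the weights always sum to one.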
Year
2021
DOI
10.1109/TMM.2020.3028478
Venue
IEEE TRANSACTIONS ON MULTIMEDIA
Keywords
Task analysis, Semantics, Feature extraction, Image recognition, Deep learning, Shape, Fuses, Food recognition, convolutional neural network, multi-task learning, multi-view attention
DocType
Journal
Volume
23
ISSN
1520-9210
Citations
0
PageRank
0.34
References
0
Authors
6
Name            Order   Citations   PageRank
Haozan Liang    1       0           0.34
Guihua Wen      2       16          8.69
Yang Hu         3       0           1.35
Mingnan Luo     4       0           0.34
Pei Yang        5       58          17.32
Yingxue Xu      6       0           1.69