Title
ChefGAN: Food Image Generation from Recipes
Abstract
Although significant progress has been made in generating images from text using generative adversarial networks (GANs), it remains challenging to handle long text that contains complex semantic information, such as recipes. This paper focuses on generating images with high visual realism and semantic consistency from the complex text of recipes. To achieve this, we propose a GAN-based method termed ChefGAN. The key idea of ChefGAN is that a joint image-recipe embedding model is used before the generation task to provide high-quality representations of recipes, and it acts as an extra regularization during generation to improve semantic consistency. Two modules are designed for this purpose: an image-text embedding module (ITEM) and a cascaded image generation module (CIGM). The generation process is carried out in three steps: (1) Two encoders in ITEM are trained simultaneously to generate similar representations for each image-recipe pair. (2) CIGM generates images according to the representations from ITEM's text encoder. (3) The generated image is fed into ITEM's image encoder to calculate its similarity with the given recipe. This process provides an additional regularization effect beyond that of the discriminator. To facilitate convergence, we apply a two-stage training strategy in CIGM, which first generates a low-resolution image and then a high-resolution one. Compared with other representative state-of-the-art methods, ChefGAN demonstrates better performance in both visual realism and semantic consistency.
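Step (3) of the abstract can be illustrated with a small sketch of how a similarity score might enter the generator's objective. The abstract does not state the similarity measure, the form of the loss, or any weighting, so the cosine similarity, the `1 - similarity` penalty, the `weight` parameter, and the function names below are all assumptions made for illustration, not the authors' formulation.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def generator_loss(adversarial_loss, recipe_emb, generated_image_emb, weight=1.0):
    """Adversarial term plus a semantic-consistency penalty.

    Assumed form: the penalty is 1 - cos(recipe embedding, embedding of the
    generated image), so it is 0 when the two embeddings point the same way.
    """
    penalty = 1.0 - cosine_similarity(recipe_emb, generated_image_emb)
    return adversarial_loss + weight * penalty


# Hypothetical 2-D embeddings: the first image embedding aligns with the
# recipe embedding, the second is orthogonal to it.
recipe = [1.0, 0.0]
aligned_loss = generator_loss(0.5, recipe, [2.0, 0.0])
orthogonal_loss = generator_loss(0.5, recipe, [0.0, 3.0], weight=2.0)
```

Under these assumptions, an image whose embedding aligns with the recipe embedding adds no penalty, while a mismatched image raises the loss in proportion to `weight`.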
Year: 2020
DOI: 10.1145/3394171.3413636
Venue: MM '20: The 28th ACM International Conference on Multimedia, Seattle, WA, USA, October 2020
DocType: Conference
ISBN: 978-1-4503-7988-5
Citations: 1
PageRank: 0.35
References: 0
Authors: 5
Name          Order  Citations  PageRank
Siyuan Pan    1      1          1.36
Ling Dai      2      15         2.74
Xuhong Hou    3      47         4.03
Huating Li    4      22         5.14
Bin Sheng     5      25         8.13