Text-to-Image Generation Grounded by Fine-Grained User Attention

Paper Info

Title
Text-to-Image Generation Grounded by Fine-Grained User Attention

Abstract
Localized Narratives [28] is a dataset with detailed natural language descriptions of images paired with mouse traces that provide a sparse, fine-grained visual grounding for phrases. We propose TRECS, a sequential model that exploits this grounding to generate images. TRECS uses descriptions to retrieve segmentation masks and predict object labels aligned with mouse traces. These alignments are u...

Field	Value
Year	2021
DOI	10.1109/WACV48630.2021.00028
Venue	2021 IEEE Winter Conference on Applications of Computer Vision (WACV)
Keywords	Measurement, Image segmentation, Visualization, Computer vision, Grounding, Conferences, Natural languages
DocType	Conference
ISSN	2472-6737
ISBN	978-1-6654-0477-8
Citations	0
PageRank	0.34
References	5
Authors	4

Authors (4 rows)

Name	Order	Citations	PageRank
Jing Yu Koh	1	0	0.34
Jason Baldridge	2	933	69.95
Honglak Lee	3	6247	398.39
Yinfei Yang	4	0	0.34
