Topic-Grained Text Representation-Based Model for Document Retrieval. - Citegraph

Paper Info

Title
Topic-Grained Text Representation-Based Model for Document Retrieval.

Abstract
Document retrieval enables users to find the documents they need accurately and quickly. To satisfy the requirement of retrieval efficiency, prevalent deep neural methods adopt a representation-based matching paradigm, which saves online matching time by pre-storing document representations offline. However, this paradigm consumes vast local storage space, especially when storing each document as word-grained representations. To tackle this, we present TGTR, a Topic-Grained Text Representation-based Model for document retrieval. Following the representation-based matching paradigm, TGTR stores the document representations offline to ensure retrieval efficiency, while significantly reducing the storage requirements by using novel topic-grained representations rather than traditional word-grained ones. Experimental results demonstrate that TGTR is consistently competitive with word-grained baselines on TREC CAR and MS MARCO in terms of retrieval accuracy, while requiring less than 1/10 of their storage space. Moreover, TGTR overwhelmingly surpasses global-grained baselines in terms of retrieval accuracy.

Year	DOI	Venue
2022	10.1007/978-3-031-15934-3_64	International Conference on Artificial Neural Networks and Machine Learning (ICANN)
DocType	Citations	PageRank
Conference	0	0.34
References	Authors
0	8

Authors (8 rows)

Name	Order	Citations	PageRank
Mengxue Du	1	0	0.34
Shasha Li	2	2	4.09
Jie Yu	3	41	10.55
Jun Ma	4	32	14.39
Bin Ji	5	0	2.03
Huijun Liu	6	0	1.69
Wuhang Lin	7	0	0.34
Zibo Yi	8	0	0.68
