Title | ||
---|---|---|
Hyperlink-induced Pre-training for Passage Retrieval in Open-domain Question Answering |
Abstract | ||
---|---|---|
To alleviate the data scarcity problem in training question answering systems, recent works propose additional intermediate pre-training for dense passage retrieval (DPR). However, a large discrepancy remains between the upstream signals these methods provide and the downstream question-passage relevance, which limits the improvement. To bridge this gap, we propose HyperLink-induced Pre-training (HLP), a method that pre-trains the dense retriever with the text relevance induced by hyperlink-based topology within Web documents. We demonstrate that the hyperlink-based structures of dual-link and co-mention provide effective relevance signals for large-scale pre-training that better facilitate downstream passage retrieval. We investigate the effectiveness of our approach across a wide range of open-domain QA datasets under zero-shot, few-shot, multi-hop, and out-of-domain scenarios. Experiments show that HLP outperforms BM25 by up to 7 points, and other pre-training methods by more than 10 points, in top-20 retrieval accuracy under the zero-shot scenario. Furthermore, HLP significantly outperforms other pre-training methods under the other scenarios. |
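
The abstract names two hyperlink structures, dual-link and co-mention, without defining them in this record. The sketch below illustrates one plausible reading, stated here as an assumption: a dual-link pair is two passages whose home documents link to each other, and a co-mention pair is two passages from different documents that both link to the same third document. The toy `passages` mapping and the function names `dual_link_pairs` and `co_mention_pairs` are hypothetical and are not taken from the paper or its code.

```python
from itertools import combinations

# Hypothetical toy data: passage id -> (home document id, set of document ids it links to).
passages = {
    "p1": ("A", {"B", "C"}),
    "p2": ("B", {"A"}),
    "p3": ("D", {"C"}),
    "p4": ("E", set()),
}


def dual_link_pairs(passages):
    """Assumed reading of 'dual-link': each passage links to the other's document."""
    pairs = set()
    for a, b in combinations(sorted(passages), 2):
        doc_a, links_a = passages[a]
        doc_b, links_b = passages[b]
        if doc_b in links_a and doc_a in links_b:
            pairs.add((a, b))
    return pairs


def co_mention_pairs(passages):
    """Assumed reading of 'co-mention': both passages link to a shared third document."""
    pairs = set()
    for a, b in combinations(sorted(passages), 2):
        doc_a, links_a = passages[a]
        doc_b, links_b = passages[b]
        # Exclude the two home documents so the shared target is a third party.
        shared = (links_a & links_b) - {doc_a, doc_b}
        if doc_a != doc_b and shared:
            pairs.add((a, b))
    return pairs


print(dual_link_pairs(passages))
print(co_mention_pairs(passages))
```

Under this reading, each returned pair could serve as a pseudo question-passage pair for pre-training a retriever. How the paper actually selects, filters, or weights such pairs is not stated in this record.
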
Year | DOI | Venue |
---|---|---|
2022 | 10.18653/v1/2022.acl-long.493 | Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022), Vol 1: (Long Papers) |
DocType | Volume | Citations |
---|---|---|
Conference | Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) | 0 |
PageRank | References | Authors |
---|---|---|
0.34 | 0 | 13 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jiawei Zhou | 1 | 0 | 0.68 |
Xiaoguang Li | 2 | 141 | 19.54 |
Lifeng Shang | 3 | 485 | 30.96 |
Lan Luo | 4 | 1 | 0.69 |
Ke Zhan | 5 | 1 | 1.03 |
Enrui Hu | 6 | 1 | 1.37 |
Xinyu Zhang | 7 | 0 | 0.34 |
Hao Jiang | 8 | 0 | 0.34 |
Zhao Cao | 9 | 6 | 3.85 |
Fan Yu | 10 | 1 | 1.03 |
Xin Jiang | 11 | 150 | 32.43 |
Qun Liu | 12 | 2149 | 203.11 |
Lei Chen | 13 | 6239 | 395.84 |