Abstract | ||
---|---|---|
Neural passage retrieval is a new and promising approach in open retrieval question answering. In this work, we stress-test the Dense Passage Retriever (DPR), a state-of-the-art (SOTA) open domain neural retrieval model, on closed and specialized target domains such as COVID-19, and find that it lags behind standard BM25 in this important real-world setting. To make DPR more robust under domain shift, we explore its fine-tuning with synthetic training examples, which we generate from unlabeled target domain text using a text-to-text generator. In our experiments, this noisy but fully automated target domain supervision gives DPR a sizable advantage over BM25 in out-of-domain settings, making it a more viable model in practice. Finally, an ensemble of BM25 and our improved DPR model yields the best results, further pushing the SOTA for open retrieval QA on multiple out-of-domain test sets. |
Year | DOI | Venue |
---|---|---|
2021 | 10.1145/3404835.3463085 | Research and Development in Information Retrieval |
Keywords | DocType | Citations |
Open retrieval question answering, Neural passage retrieval, Weak supervision, Out-of-domain neural IR | Conference | 1 |
PageRank | References | Authors |
0.48 | 0 | 8 |
Name | Order | Citations | PageRank |
---|---|---|---|
Revanth Gangi Reddy | 1 | 1 | 1.83 |
Bhavani Iyer | 2 | 1 | 0.82 |
Md. Arafat Sultan | 3 | 85 | 9.26 |
R. Zhang | 4 | 490 | 42.39 |
Avi Sil | 5 | 1 | 1.83 |
Vittorio Castelli | 6 | 928 | 129.71 |
Tahira Naseem | 7 | 1 | 3.19 |
Salim Roukos | 8 | 6248 | 845.50 |