Title
A Manifold Learning Method to Passage Retrieval for Open-Domain Question Answering
Abstract
The passage retriever plays an important role in obtaining answers in an open-domain textual question answering system: it selects candidate contexts from a large collection of documents and feeds them to the machine reader. Traditional de facto methods, such as TF-IDF or BM25, usually construct sparse vectors to match the co-occurrence of words between passages and questions. Some more advanced methods model word-level contextual semantic similarities to match the text. In this work, we present a method that encodes text with short sliding windows with built-in continuity and applies a manifold learning method to the result to model a continuous representation of semantics, so as to represent similarity features at the passage level and reduce the directional sparsity difference caused by differences in text length. Compared with the traditional Lucene BM25 system in top-20 paragraph retrieval, the accuracy of our method is 5%-16% higher and the recall rate is 8%-16% higher.
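As a rough illustration of two ingredients named in the abstract, the Python sketch below builds overlapping sliding windows over a token sequence and scores passages with BM25 for a top-k baseline. It is not the authors' implementation: the window size, stride, function names, and toy inputs are assumptions made for illustration, the manifold learning step is omitted, and the scorer uses a common non-negative IDF variant with the usual defaults k1 = 1.2 and b = 0.75.

```python
import math
from collections import Counter


def sliding_windows(tokens, size=4, stride=2):
    """Split tokens into overlapping windows (size and stride are assumed values)."""
    if len(tokens) <= size:
        return [tokens[:]]
    last_start = len(tokens) - size
    starts = list(range(0, last_start + 1, stride))
    if starts[-1] != last_start:
        starts.append(last_start)  # make sure the tail of the text is covered
    return [tokens[s:s + size] for s in starts]


def bm25_scores(query, passages, k1=1.2, b=0.75):
    """Score each tokenized passage against a tokenized query with BM25."""
    n = len(passages)
    avg_len = sum(len(p) for p in passages) / n
    # Document frequency: number of passages that contain each term.
    df = Counter(term for p in passages for term in set(p))
    scores = []
    for p in passages:
        tf = Counter(p)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: longer passages need more matches to score high.
            norm = tf[term] + k1 * (1 - b + b * len(p) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def top_k(query, passages, k=20):
    """Return the indices of the k highest-scoring passages, best first."""
    scores = bm25_scores(query, passages)
    ranked = sorted(range(len(passages)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]
```

With short windows, adjacent windows share tokens, which is one simple way to read "built-in continuity"; the BM25 length normalization term shows where passage length enters a sparse baseline.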
Year
2021
DOI
10.1109/DSC53577.2021.00052
Venue
2021 IEEE Sixth International Conference on Data Science in Cyberspace (DSC)
Keywords
passages,TF-IDF,advanced methods model word-level contextual semantics similarities,short sliding windows,manifold learning method,passage-level,text length,traditional Lucene BM25 system,passage retrieval,open-domain question,open-domain textual question answering system,candidate contexts,machine reader,traditional defacto methods,sparse vectors
DocType
Conference
ISBN
978-1-6654-1816-4
Citations
0
PageRank
0.34
References
12
Authors
3
Name, Order, Citations, PageRank
Ruidong Ding, 1, 0, 0.34
Bin Zhou, 2, 34130.99
Hongkui Tu, 3, 0, 0.34