Abstract |
---|
When pre-trained on large unsupervised textual corpora, language models are able to store and retrieve factual knowledge to some extent, making it possible to use them directly for zero-shot cloze-style question answering. However, storing factual knowledge in a fixed number of weights of a language model clearly has limitations. Previous approaches have successfully provided access to information outside the model weights using supervised architectures that combine an information retrieval system with a machine reading component. In this paper, we go a step further and integrate information from a retrieval system with a pre-trained language model in a purely unsupervised way. We report that augmenting pre-trained language models in this way dramatically improves performance and that the resulting system, despite being unsupervised, is competitive with a supervised machine reading baseline. Furthermore, processing query and context with different segment tokens allows BERT to utilize its Next Sentence Prediction pre-trained classifier to determine whether the context is relevant or not, substantially improving BERT's zero-shot cloze-style question-answering performance and making its predictions robust to noisy contexts. |
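The segment-token idea described in the abstract can be illustrated with a minimal sketch, assuming a HuggingFace `transformers` environment: the retrieved context is passed as segment A and the cloze-style query as segment B, so BERT's Next Sentence Prediction head scores the relevance of the context while the masked LM head fills in the blank. The checkpoint name, example sentences, and decision rule below are illustrative assumptions, not the authors' exact setup.

```python
# Sketch (not the authors' code): context as segment A, cloze query as segment B.
# The NSP head judges context relevance; the MLM head predicts the masked answer.
import torch
from transformers import BertTokenizer, BertForPreTraining

tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
model = BertForPreTraining.from_pretrained("bert-base-cased")
model.eval()

context = "Dante was born in Florence in 1265."  # retrieved passage (segment A); example only
query = "Dante was born in [MASK]."              # cloze query (segment B); example only

# Passing a text pair gives the two inputs distinct token_type_ids (segment tokens).
inputs = tokenizer(context, query, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# NSP head: index 0 = "segment B follows segment A", i.e. the context looks relevant.
nsp_probs = torch.softmax(outputs.seq_relationship_logits, dim=-1)
context_is_relevant = nsp_probs[0, 0] > nsp_probs[0, 1]

# MLM head: prediction for the [MASK] position in the query.
mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
predicted_id = outputs.prediction_logits[0, mask_pos].argmax(dim=-1)
print(bool(context_is_relevant), tokenizer.decode(predicted_id))
```

Under this kind of scheme, a low NSP score can be used to fall back to the query alone, which is one way a system could stay robust to noisy retrieved contexts.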
Year | DOI | Venue
---|---|---
2020 | 10.24432/C5201W | AKBC

DocType | Citations | PageRank
---|---|---
Conference | 0 | 0.34
References | Authors
---|---
0 | 7
Name | Order | Citations | PageRank |
---|---|---|---
Fabio Petroni | 1 | 59 | 11.09 |
Patrick Lewis | 2 | 8 | 4.26 |
Aleksandra Piktus | 3 | 0 | 2.37 |
Tim Rocktäschel | 4 | 516 | 34.81 |
Yuxiang Wu | 5 | 18 | 5.07 |
Alexander H. Miller | 6 | 221 | 9.83 |
Sebastian Riedel | 7 | 1625 | 103.73 |