- Transformer Feed-Forward Layers Are Key-Value Memories. (2021)
- Generalization through Memorization: Nearest Neighbor Language Models. (2020)
- code2seq: Generating Sequences from Structured Representations of Code. (2019)
- GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding. (2019)
- Simulating Action Dynamics with Neural Process Networks. (2018)