Abstract |
---|
We compare and analyze sequential, random access, and stack memory architectures for recurrent neural network language models. Our experiments on the Penn Treebank and Wikitext-2 datasets show that stack-based memory architectures consistently achieve the best performance in terms of held-out perplexity. We also propose a generalization of existing continuous stack models (Joulin & Mikolov, 2015; Grefenstette et al., 2015) that allows a variable number of pop operations more naturally and further improves performance. We further evaluate these language models in terms of their ability to capture non-local syntactic dependencies on a subject-verb agreement dataset (Linzen et al., 2016) and establish new state-of-the-art results using memory-augmented language models. Our results demonstrate the value of stack-structured memory for explaining the distribution of words in natural language, in line with linguistic theories claiming a context-free backbone for natural language. |
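The continuous stack referenced in the abstract treats push and pop as soft, differentiable operations, so the memory can be trained end to end with backpropagation. Below is a minimal NumPy sketch of that general idea (in the spirit of Joulin & Mikolov, 2015); the function name, fixed stack depth, and three-way push/pop/no-op weighting are illustrative assumptions, not the paper's exact formulation, and the variable-pop generalization proposed in the paper is not shown.

```python
import numpy as np

def soft_stack_step(stack, push_val, probs):
    """One soft update of a continuous stack (illustrative sketch only).

    stack    : (depth, dim) array, row 0 is the stack top
    push_val : (dim,) candidate value to push
    probs    : (3,) probabilities for [push, pop, no-op], summing to 1
    """
    depth, dim = stack.shape
    p_push, p_pop, p_noop = probs
    # Stack shifted down as if pushing, and shifted up as if popping,
    # padding the vacated slot with zeros.
    pushed = np.vstack([push_val[None, :], stack[:-1]])
    popped = np.vstack([stack[1:], np.zeros((1, dim))])
    # Each cell becomes a convex combination of the three possible outcomes.
    return p_push * pushed + p_pop * popped + p_noop * stack

# Toy usage: a 4-deep stack of 2-d vectors, a likely push then a likely pop.
stack = np.zeros((4, 2))
stack = soft_stack_step(stack, np.array([1.0, 0.0]), np.array([0.9, 0.05, 0.05]))
stack = soft_stack_step(stack, np.array([0.0, 1.0]), np.array([0.1, 0.8, 0.1]))
print(stack[0])  # soft stack top
```

In a full language model, the action probabilities and the pushed value would be produced by the recurrent controller at each time step, and the soft stack top would be fed back into the next prediction.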
Year | Venue | Field |
---|---|---|
2018 | International Conference on Learning Representations | Perplexity, Recurrent neural network language models, Computer science, Natural language, Natural language processing, Treebank, Artificial intelligence, Syntax, Language model, Machine learning, Stack-based memory allocation, Random access
DocType | Citations | PageRank
---|---|---|
Conference | 3 | 0.39
References | Authors
---|---|
5 | 7
Name | Order | Citations | PageRank |
---|---|---|---|
Dani Yogatama | 1 | 855 | 42.43 |
Yishu Miao | 2 | 178 | 11.44 |
Gábor Melis | 3 | 3 | 0.73 |
Ling Wang | 4 | 884 | 52.37 |
Adhiguna Kuncoro | 5 | 181 | 8.49 |
Chris Dyer | 6 | 5438 | 232.28
Phil Blunsom | 7 | 3130 | 152.18 |