Title
Memory Architectures in Recurrent Neural Network Language Models
Abstract
We compare and analyze sequential, random access, and stack memory architectures for recurrent neural network language models. Our experiments on the Penn Treebank and Wikitext-2 datasets show that stack-based memory architectures consistently achieve the best performance in terms of held-out perplexity. We also propose a generalization of existing continuous stack models (Joulin & Mikolov, 2015; Grefenstette et al., 2015) that allows a variable number of pop operations more naturally and further improves performance. We further evaluate these language models on their ability to capture non-local syntactic dependencies using a subject-verb agreement dataset (Linzen et al., 2016) and establish new state-of-the-art results with memory-augmented language models. Our results demonstrate the value of stack-structured memory for explaining the distribution of words in natural language, in line with linguistic theories that posit a context-free backbone for natural language.
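The abstract refers to the continuous stack of Joulin & Mikolov (2015), in which push, pop, and no-op actions are blended with soft weights rather than applied discretely. The following is a minimal NumPy sketch of one such soft stack update, for illustration only: the function name, parameter shapes, and the one-value-per-cell simplification are assumptions, and the paper's proposed multi-pop generalization is not shown.

```python
# Minimal sketch of a Joulin & Mikolov (2015)-style continuous stack update.
# Not the authors' implementation; names and shapes are illustrative assumptions.
import numpy as np

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def continuous_stack_step(stack, hidden, W_action, W_push):
    """One soft stack update (assumes stack depth >= 2).

    stack:    (k,) current stack values, index 0 is the top
    hidden:   (d,) RNN hidden state at this time step
    W_action: (3, d) maps the hidden state to push/pop/no-op scores
    W_push:   (d,) maps the hidden state to the value pushed on top
    """
    a_push, a_pop, a_noop = softmax(W_action @ hidden)
    push_val = 1.0 / (1.0 + np.exp(-(W_push @ hidden)))  # sigmoid of pushed value

    new_stack = np.zeros_like(stack)
    # Top cell: blend of the pushed value, the cell below (pop), and the old top (no-op).
    new_stack[0] = a_push * push_val + a_pop * stack[1] + a_noop * stack[0]
    # Deeper cells shift down on push, up on pop, and stay in place on no-op.
    for i in range(1, len(stack)):
        below = stack[i + 1] if i + 1 < len(stack) else 0.0
        new_stack[i] = a_push * stack[i - 1] + a_pop * below + a_noop * stack[i]
    return new_stack

# Example usage with random parameters (stack depth 8, hidden size 16).
rng = np.random.default_rng(0)
stack = np.zeros(8)
h = rng.standard_normal(16)
stack = continuous_stack_step(stack, h, rng.standard_normal((3, 16)), rng.standard_normal(16))
```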
Year
2018
Venue
International Conference on Learning Representations
Field
Perplexity, Recurrent neural network language models, Computer science, Natural language, Natural language processing, Treebank, Artificial intelligence, Syntax, Language model, Machine learning, Stack-based memory allocation, Random access
DocType
Conference
Citations
3
PageRank
0.39
References
5
Authors
7
Name                Order  Citations  PageRank
Dani Yogatama       1      855        42.43
Yishu Miao          2      178        11.44
Gábor Melis         3      3          0.73
Ling Wang           4      884        52.37
Adhiguna Kuncoro    5      181        8.49
Chris Dyer          6      5438       232.28
Phil Blunsom        7      3130       152.18