Abstract |
---|
In this work we explore recent advances in Recurrent Neural Networks for large-scale Language Modeling, a task central to language understanding. We extend current models to deal with two key challenges present in this task: corpora and vocabulary sizes, and the complex, long-term structure of language. We perform an exhaustive study of techniques such as character Convolutional Neural Networks and Long Short-Term Memory on the One Billion Word Benchmark. Our best single model significantly improves the state-of-the-art perplexity from 51.3 down to 30.0 (whilst reducing the number of parameters by a factor of 20), while an ensemble of models sets a new record by improving perplexity from 41.0 down to 23.7. We also release these models for the NLP and ML community to study and improve upon. |
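Since the abstract reports results in terms of perplexity, a minimal sketch of how that metric is computed from per-token log-probabilities may help readers interpret the 51.3 → 30.0 figures. The function name and the toy values below are illustrative only and are not taken from the paper or its released models.

```python
import math

def perplexity(log_probs):
    """Perplexity = exp of the average per-token negative log-likelihood.

    `log_probs` holds the natural-log probability a language model assigned
    to each token of a held-out corpus (hypothetical values for illustration).
    """
    avg_nll = -sum(log_probs) / len(log_probs)
    return math.exp(avg_nll)

# Toy example: a model that assigns probability 1/30 to every token
# (log(1/30) ~= -3.401) scores a perplexity of 30, matching the
# single-model figure quoted in the abstract.
toy_log_probs = [math.log(1 / 30.0)] * 1000
print(round(perplexity(toy_log_probs), 1))  # -> 30.0
```

Lower perplexity means the model assigns higher probability to the held-out text, so the drop from 51.3 to 30.0 (and to 23.7 for the ensemble) reflects a substantially better predictive fit.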
Year | Venue | Field
---|---|---|
2016 | arXiv: Computation and Language | Perplexity, Convolutional neural network, Computer science, Recurrent neural network, Artificial intelligence, Natural language processing, Vocabulary, Language understanding, Machine learning, Language model

DocType | Volume | Citations
---|---|---|
Journal | abs/1602.02410 | 153

PageRank | References | Authors
---|---|---|
5.35 | 35 | 5
Name | Order | Citations | PageRank |
---|---|---|---|
Rafal Józefowicz | 1 | 1512 | 60.77 |
Oriol Vinyals | 2 | 9419 | 418.45 |
Mike Schuster | 3 | 2303 | 111.71 |
Noam Shazeer | 4 | 1089 | 43.70 |
Yonghui Wu | 5 | 1065 | 72.78 |