Q-Value Weighted Regression: Reinforcement Learning with Limited Data | 0 | 0.34 | 2022 |
Hierarchical Transformers Are More Efficient Language Models | 0 | 0.34 | 2022 |
Rethinking Attention with Performers | 0 | 0.34 | 2021 |
Sparse is Enough in Scaling Transformers | 0 | 0.34 | 2021 |
Model Based Reinforcement Learning for Atari | 0 | 0.34 | 2020 |
Reformer: The Efficient Transformer | 4 | 0.39 | 2020 |
Universal Transformers | 0 | 0.34 | 2019 |
Sample Efficient Text Summarization Using a Single Pre-Trained Transformer | 2 | 0.36 | 2019 |
Parallel Scheduled Sampling | 0 | 0.34 | 2019 |
Area Attention | 0 | 0.34 | 2019 |
Generating Wikipedia by Summarizing Long Sequences | 18 | 0.62 | 2018 |
Unsupervised Cipher Cracking Using Discrete GANs | 3 | 0.37 | 2018 |
Discrete Autoencoders for Sequence Models | 1 | 0.35 | 2018 |
Tensor2Tensor for Neural Machine Translation | 19 | 0.70 | 2018 |
Image Transformer | 0 | 0.34 | 2018 |
Fast Decoding in Sequence Models using Discrete Latent Variables | 11 | 0.57 | 2018 |
Attention Is All You Need | 432 | 6.52 | 2017 |
Learning to Remember Rare Events | 3 | 0.38 | 2017 |
Depthwise Separable Convolutions for Neural Machine Translation | 16 | 0.74 | 2017 |
Regularizing Neural Networks by Penalizing Confident Output Distributions | 42 | 1.24 | 2017 |
One Model To Learn Them All | 29 | 0.76 | 2017 |
Machine Learning with Guarantees using Descriptive Complexity and SMT Solvers | 2 | 0.38 | 2016 |
TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems | 743 | 29.70 | 2016 |
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation | 432 | 14.10 | 2016 |
Can Active Memory Replace Attention? | 10 | 0.60 | 2016 |
Graph Searching Games and Width Measures for Directed Graphs. | 2 | 0.37 | 2015 |
Multi-task Sequence to Sequence Learning | 117 | 3.21 | 2015 |
Characterising Choiceless Polynomial Time with First-Order Interpretations | 2 | 0.41 | 2015 |
Adding Gradient Noise Improves Learning for Very Deep Networks | 56 | 2.61 | 2015 |
Neural GPUs Learn Algorithms | 3 | 0.39 | 2015 |
Sentence Compression by Deletion with LSTMs | 50 | 1.62 | 2015 |
A Unified Approach to Boundedness Properties in MSO | 1 | 0.36 | 2015 |
Model-Theoretic Properties of ω-Automatic Structures | 0 | 0.34 | 2014 |
Grammar as a Foreign Language | 237 | 10.73 | 2014 |
Directed Width Measures and Monotonicity of Directed Graph Searching | 1 | 0.35 | 2014 |
Experiments with reduction finding | 4 | 0.46 | 2013 |
Model Checking the Quantitative μ-Calculus on Linear Hybrid Systems | 1 | 0.36 | 2012 |
The Field of Reals is not ω-Automatic | 2 | 0.38 | 2012 |
Learning Games from Videos Guided by Descriptive Complexity | 6 | 0.56 | 2012 |
Model Checking the Quantitative μ-Calculus on Linear Hybrid Systems | 0 | 0.34 | 2011 |
Expressing cardinality quantifiers in monadic second-order logic over chains | 1 | 0.36 | 2011 |
Information Tracking in Games on Graphs | 15 | 0.71 | 2010 |
Expressing Cardinality Quantifiers in Monadic Second-Order Logic over Trees | 10 | 0.61 | 2010 |
New algorithm for weak monadic second-order logic on inductive structures | 4 | 0.47 | 2010 |
Cardinality quantifiers in MLO over trees | 2 | 0.43 | 2009 |
Directed Graphs of Entanglement Two | 4 | 0.44 | 2009 |
Synthesis for Structure Rewriting Systems | 2 | 0.42 | 2009 |
Cardinality and counting quantifiers on ω-automatic structures | 10 | 0.66 | 2008 |
Model Checking Games for the Quantitative μ-Calculus | 6 | 0.52 | 2008 |