STABLEMOE: Stable Routing Strategy for Mixture of Experts | 0 | 0.34 | 2022 |
Swin Transformer V2: Scaling Up Capacity and Resolution | 0 | 0.34 | 2022 |
Controllable Natural Language Generation with Contrastive Prefixes | 0 | 0.34 | 2022 |
Kformer: Knowledge Injection in Transformer Feed-Forward Layers | 0 | 0.34 | 2022 |
Knowledge Neurons in Pretrained Transformers | 0 | 0.34 | 2022 |
XLM-E: Cross-lingual Language Model Pre-training via ELECTRA | 2 | 0.36 | 2022 |
CLIP Models are Few-shot Learners: Empirical Studies on VQA and Visual Entailment | 0 | 0.34 | 2022 |
THE-X: Privacy-Preserving Transformer Inference with Homomorphic Encryption | 0 | 0.34 | 2022 |
Multilingual Machine Translation Systems from Microsoft for WMT21 Shared Task | 0 | 0.34 | 2021 |
Adapt-and-Distill: Developing Small, Fast and Effective Pretrained Language Models for Domains | 0 | 0.34 | 2021 |
Self-Attention Attribution: Interpreting Information Interactions Inside Transformer | 0 | 0.34 | 2021 |
Learning to Sample Replacements for ELECTRA Pre-Training | 0 | 0.34 | 2021 |
Allocating Large Vocabulary Capacity for Cross-Lingual Language Model Pre-Training | 0 | 0.34 | 2021 |
mT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs | 0 | 0.34 | 2021 |
Zero-Shot Cross-Lingual Transfer of Neural Machine Translation with Multilingual Pretrained Encoders | 0 | 0.34 | 2021 |
Learning natural language interfaces with neural models | 0 | 0.34 | 2021 |
InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training | 0 | 0.34 | 2021 |
MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers | 0 | 0.34 | 2021 |
Memory-Efficient Differentiable Transformer Architecture Search | 0 | 0.34 | 2021 |
Can Monolingual Pretrained Models Help Cross-Lingual Classification? | 0 | 0.34 | 2020 |
UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training | 0 | 0.34 | 2020 |
Harvesting and Refining Question-Answer Pairs for Unsupervised QA | 0 | 0.34 | 2020 |
MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers | 0 | 0.34 | 2020 |
Investigating Learning Dynamics of BERT Fine-Tuning | 0 | 0.34 | 2020 |
Cross-Lingual Natural Language Generation Via Pre-Training | 0 | 0.34 | 2020 |
Unified Language Model Pre-training for Natural Language Understanding and Generation | 8 | 0.44 | 2019 |
Visualizing and Understanding the Effectiveness of BERT | 7 | 0.49 | 2019 |
Multitask learning for biomedical named entity recognition with cross-sharing structure | 0 | 0.34 | 2019 |
Inspecting Unification of Encoding and Matching with Transformer: A Case Study of Machine Reading Comprehension | 0 | 0.34 | 2019 |
Confidence Modeling For Neural Semantic Parsing | 1 | 0.35 | 2018 |
Coarse-To-Fine Decoding For Neural Semantic Parsing | 14 | 0.51 | 2018 |
Data-to-Text Generation with Content Selection and Planning | 3 | 0.39 | 2018 |
Proactive Resource Management for LTE in Unlicensed Spectrum: A Deep Learning Perspective | 13 | 0.64 | 2018 |
Learning to Generate Product Reviews from Attributes | 19 | 0.66 | 2017 |
Learning to Paraphrase for Question Answering | 13 | 0.55 | 2017 |
Unsupervised Word and Dependency Path Embeddings for Aspect Term Extraction | 19 | 0.70 | 2016 |
Long Short-Term Memory-Networks for Machine Reading | 106 | 3.45 | 2016 |
Solving and Generating Chinese Character Riddles | 0 | 0.34 | 2016 |
Adaptive Multi-Compositionality for Recursive Neural Network Models | 4 | 0.41 | 2016 |
Splusplus: A Feature-Rich Two-stage Classifier for Sentiment Analysis of Tweets | 4 | 0.45 | 2015 |
A joint segmentation and classification framework for sentence level sentiment classification | 19 | 0.59 | 2015 |
A hybrid neural model for type classification of entity mentions | 7 | 0.45 | 2015 |
A statistical parsing framework for sentiment classification | 12 | 1.51 | 2015 |
Question Answering Over Freebase With Multi-Column Convolutional Neural Networks | 78 | 1.94 | 2015 |
Ranking With Recursive Neural Networks And Its Application To Multi-Document Summarization | 40 | 1.25 | 2015 |
Adaptive Recursive Neural Network For Target-Dependent Twitter Sentiment Classification | 56 | 1.53 | 2014 |
A Joint Segmentation and Classification Framework for Sentiment Analysis | 3 | 0.38 | 2014 |
The Automated Acquisition of Suggestions from Tweets | 7 | 0.58 | 2013 |
Unraveling The Origin Of Exponential Law In Intra-Urban Human Mobility | 31 | 1.43 | 2013 |
MoodLens: an emoticon-based sentiment analysis system for Chinese tweets | 112 | 3.27 | 2012 |