| Name | Affiliation | Papers |
|---|---|---|
| WEIZHU CHEN | Microsoft Research Asia, Beijing, China | 68 |
| Collaborators | Citations | PageRank |
|---|---|---|
| 221 | 597 | 38.77 |
| Referers | Referees | References |
|---|---|---|
| 1375 | 737 | 456 |
| Title | Citations | PageRank | Year |
|---|---|---|---|
| Finding the Dominant Winning Ticket in Pre-Trained Language Models | 0 | 0.34 | 2022 |
| ALLSH: Active Learning Guided by Local Sensitivity and Hardness | 0 | 0.34 | 2022 |
| A Good Prompt Is Worth Millions of Parameters: Low-resource Prompt-based Learning for Vision-Language Models | 0 | 0.34 | 2022 |
| MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation | 0 | 0.34 | 2022 |
| PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance | 0 | 0.34 | 2022 |
| DialogVED: A Pre-trained Latent Variable Encoder-Decoder Model for Dialog Response Generation | 0 | 0.34 | 2022 |
| CAMERO: Consistency Regularized Ensemble of Perturbed Language Models with Weight Sharing | 0 | 0.34 | 2022 |
| Controllable Natural Language Generation with Contrastive Prefixes | 0 | 0.34 | 2022 |
| No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models | 0 | 0.34 | 2022 |
| LoRA: Low-Rank Adaptation of Large Language Models | 0 | 0.34 | 2022 |
| OmniTab: Pretraining with Natural and Synthetic Data for Few-shot Table-based Question Answering | 0 | 0.34 | 2022 |
| Adversarial Retriever-Ranker for Dense Text Retrieval | 0 | 0.34 | 2022 |
| What Makes Good In-Context Examples for GPT-3? | 0 | 0.34 | 2022 |
| TAPEX: Table Pre-training via Learning a Neural SQL Executor | 0 | 0.34 | 2022 |
| XLM-K: Improving Cross-Lingual Language Model Pre-training with Multilingual Knowledge | 0 | 0.34 | 2022 |
| A Token-level Reference-free Hallucination Detection Benchmark for Free-form Text Generation | 0 | 0.34 | 2022 |
| Token-wise Curriculum Learning for Neural Machine Translation | 0 | 0.34 | 2021 |
| MixKD: Towards Efficient Distillation of Large-scale Language Models | 0 | 0.34 | 2021 |
| Adversarial Regularization as Stackelberg Game - An Unrolled Optimization Approach | 0 | 0.34 | 2021 |
| GLGE - A New General Language Generation Evaluation Benchmark | 0 | 0.34 | 2021 |
| Poolingformer: Long Document Modeling with Pooling Attention | 0 | 0.34 | 2021 |
| Few-Shot Named Entity Recognition - An Empirical Baseline Study | 0 | 0.34 | 2021 |
| BANG: Bridging Autoregressive and Non-autoregressive Generation with Large Scale Pretraining | 0 | 0.34 | 2021 |
| Finetuning Pretrained Transformers into RNNs | 0 | 0.34 | 2021 |
| ARCH - Efficient Adversarial Regularized Training with Caching | 0 | 0.34 | 2021 |
| NeurIPS 2020 EfficientQA Competition - Systems, Analyses and Lessons Learned | 0 | 0.34 | 2021 |
| Memory-Efficient Differentiable Transformer Architecture Search | 0 | 0.34 | 2021 |
| Reader-Guided Passage Reranking for Open-Domain Question Answering | 0 | 0.34 | 2021 |
| Contextual Bandit Applications in a Customer Support Bot | 1 | 0.43 | 2021 |
| DeBERTa: Decoding-enhanced BERT with Disentangled Attention | 0 | 0.34 | 2021 |
| On the Variance of the Adaptive Learning Rate and Beyond | 4 | 0.43 | 2020 |
| Understanding the Difficulty of Training Transformers | 0 | 0.34 | 2020 |
| Exploiting Structured Knowledge in Text via Graph-Guided Representation Learning | 0 | 0.34 | 2020 |
| The Microsoft Toolkit Of Multi-Task Deep Neural Networks For Natural Language Understanding | 0 | 0.34 | 2020 |
| Parameter-free Sentence Embedding via Orthogonal Basis | 1 | 0.35 | 2019 |
| Lessons from Real-World Reinforcement Learning in a Customer Support Bot | 0 | 0.34 | 2019 |
| Learning to Attend On Essential Terms: An Enhanced Retriever-Reader Model for Open-domain Question Answering | 1 | 0.35 | 2019 |
| Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding | 4 | 0.39 | 2019 |
| Learning to Attend On Essential Terms: An Enhanced Retriever-Reader Model for Scientific Question Answering | 2 | 0.36 | 2018 |
| Zero-training Sentence Embedding via Orthogonal Basis | 0 | 0.34 | 2018 |
| IncSQL: Training Incremental Text-to-SQL Parsers with Non-Deterministic Oracles | 2 | 0.36 | 2018 |
| Limited-memory Common-directions Method for Distributed Optimization and its Application on Empirical Risk Minimization | 1 | 0.36 | 2017 |
| FusionNet: Fusing via Fully-Aware Attention with Application to Machine Comprehension | 18 | 0.66 | 2017 |
| ReasoNet: Learning to Stop Reading in Machine Comprehension | 64 | 2.10 | 2016 |
| Large-scale L-BFGS using MapReduce | 12 | 0.75 | 2014 |
| Transfer Understanding from Head Queries to Tail Queries | 6 | 0.40 | 2014 |
| Beyond ten blue links: enabling user click modeling in federated web search | 36 | 0.99 | 2012 |
| A noise-aware click model for web search | 10 | 0.56 | 2012 |
| Personalized click model through collaborative filtering | 27 | 0.88 | 2012 |
| Short text conceptualization using a probabilistic knowledgebase | 96 | 3.22 | 2011 |