Title
Kformer: Knowledge Injection in Transformer Feed-Forward Layers
Abstract
Recent years have witnessed a diverse set of knowledge injection models for pre-trained language models (PTMs); however, most previous studies neglect the large amount of implicit knowledge that PTMs already store in their own parameters. A recent study [2] observed knowledge neurons in the Feed-Forward Network (FFN) that are responsible for expressing factual knowledge. In this work, we propose a simple model, Kformer, which exploits both the knowledge stored in PTMs and external knowledge via knowledge injection in the Transformer FFN layers. Empirical results on two knowledge-intensive tasks, commonsense reasoning (i.e., SocialIQA) and medical question answering (i.e., MedQA-USMLE), demonstrate that Kformer yields better performance than other knowledge injection techniques such as concatenation or attention-based injection. We believe the proposed simple model and empirical findings may help the community develop more powerful knowledge injection methods (code available at https://github.com/zjunlp/Kformer).
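The abstract only names the mechanism, so the following minimal PyTorch sketch illustrates what FFN-layer knowledge injection can look like: retrieved knowledge embeddings are projected into the feed-forward layer's key and value spaces and treated as extra inner slots alongside the original weights. The module and parameter names (KnowledgeFFN, know_key, know_val, d_knowledge) are illustrative assumptions, not the authors' released implementation.

```python
# Illustrative sketch of knowledge injection in a Transformer FFN layer
# (an assumption-laden toy example, not the official Kformer code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class KnowledgeFFN(nn.Module):
    def __init__(self, d_model: int, d_ff: int, d_knowledge: int):
        super().__init__()
        self.w1 = nn.Linear(d_model, d_ff)      # standard FFN "keys"
        self.w2 = nn.Linear(d_ff, d_model)      # standard FFN "values"
        # project external knowledge embeddings into the FFN key/value spaces
        self.know_key = nn.Linear(d_knowledge, d_model)
        self.know_val = nn.Linear(d_knowledge, d_model)

    def forward(self, x: torch.Tensor, knowledge: torch.Tensor) -> torch.Tensor:
        # x:         (batch, seq_len, d_model) hidden states
        # knowledge: (batch, n_knowledge, d_knowledge) retrieved knowledge embeddings
        k = self.know_key(knowledge)            # (batch, n_knowledge, d_model)
        v = self.know_val(knowledge)            # (batch, n_knowledge, d_model)
        # inner activations: the original FFN slots plus one slot per knowledge item
        inner = F.gelu(torch.cat([self.w1(x), x @ k.transpose(1, 2)], dim=-1))
        h_ffn, h_know = inner.split([self.w1.out_features, k.size(1)], dim=-1)
        # original slots go through w2; knowledge slots weight the injected values
        return self.w2(h_ffn) + h_know @ v

# usage: ffn = KnowledgeFFN(768, 3072, 768); out = ffn(torch.randn(2, 16, 768), torch.randn(2, 8, 768))
```

One appeal of injecting at the FFN, as opposed to concatenating retrieved text into the model input, is that the extra knowledge items do not lengthen the sequence processed by self-attention.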
Year
2022
DOI
10.1007/978-3-031-17120-8_11
Venue
NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, NLPCC 2022, PT I
Keywords
Transformer, Feed Forward Network, Knowledge injection
DocType
Conference
Volume
13551
ISSN
0302-9743
Citations
0
PageRank
0.34
References
0
Authors
6
Name, Order, Citations, PageRank
Yunzhi Yao, 1, 0, 1.01
Shaohan Huang, 2, 5710.29
Li Dong, 3, 0, 0.34
Li Dong, 4, 58231.86
Huanhuan Chen, 5, 731101.79
Ningyu Zhang, 6, 6318.56