Abstract
---
Recent years have witnessed a diverse set of knowledge injection models for pre-trained language models (PTMs); however, most previous studies neglect the PTMs' own ability, namely the large amount of implicit knowledge already stored in their parameters. A recent study [2] observed knowledge neurons in the Feed-Forward Network (FFN) that are responsible for expressing factual knowledge. In this work, we propose a simple model, Kformer, which exploits both the knowledge stored in PTMs and external knowledge via knowledge injection in the Transformer FFN layers. Empirical results on two knowledge-intensive tasks, commonsense reasoning (i.e., SocialIQA) and medical question answering (i.e., MedQA-USMLE), demonstrate that Kformer yields better performance than other knowledge injection approaches such as concatenation or attention-based injection. We believe the proposed simple model and empirical findings may help the community develop more powerful knowledge injection methods (code available at https://github.com/zjunlp/Kformer).
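The abstract describes injecting external knowledge directly into the Transformer's FFN layers, which act as key-value memories. The following is a minimal, hypothetical PyTorch sketch of that idea: retrieved knowledge embeddings are projected into extra FFN "keys" and "values" that participate in the same lookup the FFN already performs. All class, dimension, and projection names here are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class KnowledgeInjectedFFN(nn.Module):
    """Sketch of FFN-level knowledge injection in the spirit of Kformer.

    A standard Transformer FFN computes f(x @ W1) @ W2. Here, N external
    knowledge embeddings are projected into extra key/value rows that are
    concatenated with the FFN's own weights, so retrieved knowledge joins
    the layer's key-value computation. Names and sizes are assumptions.
    """

    def __init__(self, d_model: int, d_ff: int, d_knowledge: int):
        super().__init__()
        self.w1 = nn.Linear(d_model, d_ff)   # original FFN keys
        self.w2 = nn.Linear(d_ff, d_model)   # original FFN values
        self.act = nn.GELU()
        # Hypothetical projections mapping knowledge embeddings into the
        # FFN's key and value spaces.
        self.proj_k = nn.Linear(d_knowledge, d_model)
        self.proj_v = nn.Linear(d_knowledge, d_model)

    def forward(self, x: torch.Tensor, knowledge: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model); knowledge: (batch, N, d_knowledge)
        k = self.proj_k(knowledge)                   # (batch, N, d_model)
        v = self.proj_v(knowledge)                   # (batch, N, d_model)
        # Inner activations: original FFN keys plus knowledge keys.
        h_ffn = self.w1(x)                           # (batch, seq, d_ff)
        h_know = torch.einsum("bsd,bnd->bsn", x, k)  # (batch, seq, N)
        h = self.act(torch.cat([h_ffn, h_know], dim=-1))
        d_ff = self.w2.in_features
        # Mix original FFN values with the projected knowledge values.
        out = self.w2(h[..., :d_ff])
        out = out + torch.einsum("bsn,bnd->bsd", h[..., d_ff:], v)
        return out

# Illustrative usage with assumed sizes:
ffn = KnowledgeInjectedFFN(d_model=768, d_ff=3072, d_knowledge=768)
x = torch.randn(2, 16, 768)
knowledge = torch.randn(2, 8, 768)  # 8 retrieved knowledge embeddings
y = ffn(x, knowledge)               # (2, 16, 768)
```

In the paper, this kind of injection is applied only in the top FFN layers, leaving the lower layers' computation unchanged.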
Year | DOI | Venue |
---|---|---|
2022 | 10.1007/978-3-031-17120-8_11 | Natural Language Processing and Chinese Computing (NLPCC 2022), Part I
Keywords | DocType | Volume
---|---|---
Transformer, Feed Forward Network, Knowledge injection | Conference | 13551
ISSN | Citations | PageRank
---|---|---
0302-9743 | 0 | 0.34
References | Authors
---|---
0 | 6
Name | Order | Citations | PageRank |
---|---|---|---
Yunzhi Yao | 1 | 0 | 1.01 |
Shaohan Huang | 2 | 57 | 10.29 |
Li Dong | 3 | 0 | 0.34 |
Furu Wei | 4 | 582 | 31.86
Huanhuan Chen | 5 | 731 | 101.79 |
Ningyu Zhang | 6 | 63 | 18.56 |