Title
Revisiting Text and Knowledge Graph Joint Embeddings: The Amount of Shared Information Matters!
Abstract
Jointly learning embeddings from text and a Knowledge Graph benefits both word and entity/relation embeddings by taking advantage of both large-scale unstructured content (text) and high-quality structured data (the Knowledge Graph). Current techniques use anchors to associate entities in the Knowledge Graph with the corresponding words in the text corpus; these anchors are then used to generate additional learning samples during the embedding learning process. However, we show in this paper that such techniques yield suboptimal results, as they fail to control the amount of information shared between the two data sources during joint learning. Moreover, the additional learning samples often incur significant computational overhead. To unlock the potential of such joint embeddings, we propose JOINER, a new joint text and Knowledge Graph embedding method based on regularization. JOINER not only preserves co-occurrence between words in a text corpus and relations between entities in a Knowledge Graph, but also provides the flexibility to control the amount of information shared between the two data sources via regularization. Our method does not generate additional learning samples, which makes it computationally efficient. Our extensive empirical evaluation on real datasets shows the superiority of JOINER across different evaluation tasks, including analogical reasoning, link prediction, and relation extraction. Compared to state-of-the-art techniques that generate additional learning samples from a set of anchors, our method yields better results (up to 4.3% absolute improvement) with significantly less computational overhead (76% less learning time overhead).
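The abstract does not spell out JOINER's objective, so the following is only a minimal sketch of how a regularization term could control the amount of information shared between the two sources without generating extra learning samples. It assumes the regularizer is a squared L2 distance between each anchored word vector and its entity vector. The names `anchor_penalty`, `joint_loss`, and `lam`, as well as the toy vectors and identifiers, are hypothetical and not taken from the paper.

```python
# Illustrative sketch only: the exact JOINER objective is not given in the abstract.
# Assumption: anchored word/entity vectors are pulled together by a squared
# L2 penalty, scaled by a hypothetical weight `lam`.

def anchor_penalty(word_vecs, entity_vecs, anchors, lam):
    """Return lam * sum of squared distances over (word, entity) anchor pairs."""
    total = 0.0
    for word, entity in anchors:
        w, e = word_vecs[word], entity_vecs[entity]
        total += sum((wi - ei) ** 2 for wi, ei in zip(w, e))
    return lam * total


def joint_loss(text_loss, kg_loss, word_vecs, entity_vecs, anchors, lam):
    """Add the anchor regularizer to the text loss and the Knowledge Graph loss."""
    return text_loss + kg_loss + anchor_penalty(word_vecs, entity_vecs, anchors, lam)


# Toy data (hypothetical): one anchor linking a word to an entity identifier.
word_vecs = {"paris": [1.0, 0.0], "city": [0.5, 0.5]}
entity_vecs = {"Q90": [0.0, 0.0]}
anchors = [("paris", "Q90")]

# lam = 0 means the two embedding spaces are learned independently;
# a larger lam forces anchored pairs closer, i.e. more shared information.
no_sharing = joint_loss(2.0, 3.0, word_vecs, entity_vecs, anchors, lam=0.0)
some_sharing = joint_loss(2.0, 3.0, word_vecs, entity_vecs, anchors, lam=0.5)
```

In this sketch the text and Knowledge Graph losses are passed in as plain numbers; in a real system they would come from the respective word and graph embedding models, and the penalty would only touch the anchored vectors.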
Year: 2019
DOI: 10.1109/BigData47090.2019.9005462
Venue: 2019 IEEE International Conference on Big Data (Big Data)
Keywords: Word embeddings, Knowledge Graph embeddings, regularization
Field: Overhead (computing), Analogical reasoning, Knowledge graph, Embedding, Computer science, Text corpus, Regularization (mathematics), Artificial intelligence, Data model, Machine learning, Relationship extraction
DocType: Conference
ISSN: 2639-1589
Citations: 0
PageRank: 0.34
References: 0
Authors: 3
Name           Order   Citations   PageRank
Paolo Rosso    1       9           2.19
Dingqi Yang    2       542         28.79
o de troyer    3       1708        134.92