Title
Learning Rare Word Representations using Semantic Bridging.
Abstract
We propose a methodology that adapts graph embedding techniques (DeepWalk (Perozzi et al., 2014) and node2vec (Grover and Leskovec, 2016)) and cross-lingual vector space mapping approaches (Least Squares and Canonical Correlation Analysis) to merge corpus-based and ontological sources of lexical knowledge. We also perform a comparative analysis of these algorithms to identify the best combination for the proposed system. We then apply the system to the task of extending the vocabulary of an existing word embedding with rare and unseen words. We show that our technique can provide considerable extra coverage (over 99%), leading to a consistent performance gain on the Rare Word Similarity dataset (around 10% absolute gain with w2v-gn-500K; cf. §3.3).
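The Least Squares mapping step named in the abstract can be illustrated with a minimal sketch. It assumes, beyond what the abstract states, that words shared by the graph embedding and the corpus embedding serve as anchors, and that the learned linear map is then applied to graph vectors of words missing from the corpus vocabulary. All names (`fit_least_squares_map`, `induce_vectors`) and the random toy data are hypothetical and do not come from the paper.

```python
import numpy as np


def fit_least_squares_map(source_vecs, target_vecs):
    """Return W minimising ||source_vecs @ W - target_vecs||_F."""
    W, *_ = np.linalg.lstsq(source_vecs, target_vecs, rcond=None)
    return W


def induce_vectors(graph_vecs, W):
    """Project graph-space vectors into the corpus embedding space."""
    return graph_vecs @ W


# Toy data (assumption): 50 anchor words present in both spaces.
rng = np.random.default_rng(0)
n_shared, d_graph, d_corpus = 50, 8, 5
true_map = rng.normal(size=(d_graph, d_corpus))
graph_shared = rng.normal(size=(n_shared, d_graph))
corpus_shared = graph_shared @ true_map  # noise-free, for illustration only

# Learn the mapping from the anchor words.
W = fit_least_squares_map(graph_shared, corpus_shared)

# Three "rare" words that have graph vectors but no corpus vectors.
rare_graph = rng.normal(size=(3, d_graph))
rare_corpus_est = induce_vectors(rare_graph, W)
```

With real embeddings the anchor pairs are noisy, so the recovered map is only an approximation. Canonical Correlation Analysis, the other mapping approach the abstract mentions, would replace `fit_least_squares_map` in this sketch.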
Year: 2017
Venue: CoRR
DocType: Journal
Volume: abs/1707.07554
Citations: 0
PageRank: 0.34
References: 0
Authors: 5
Name                      Order   Citations   PageRank
Victor Prokhorov          1       2           2.38
Mohammad Taher Pilehvar   2       376         25.70
Dimitri Kartsaklis        3       204         15.08
Pietro Lió                4       0           0.34
Nigel Collier             5       18          5.07