Title
mwetoolkit+sem: Integrating Word Embeddings in the mwetoolkit for Semantic MWE Processing.
Abstract
This paper presents mwetoolkit+sem: an extension of the mwetoolkit that estimates semantic compositionality scores for multiword expressions (MWEs) based on word embeddings. First, we describe our implementation of vector-space operations working on distributional vectors. The compositionality score is based on the cosine distance between the MWE vector and the composition of the vectors of its member words. Our generic system can handle several types of word embeddings and MWE lists, and may combine individual word representations using several composition techniques. We evaluate our implementation on a dataset of 1042 English noun compounds (Farahmand et al., 2015), comparing different configurations of the underlying word embeddings and word-composition models. We show that our vector-based scores model non-compositionality better than standard association measures such as log-likelihood.
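The scoring idea in the abstract can be illustrated with a minimal sketch. It is not the mwetoolkit+sem implementation or its API: the function names and toy vectors are hypothetical, additive composition is assumed as just one of the possible composition techniques, and cosine similarity is used as the score (higher meaning more compositional) where the abstract speaks of cosine distance.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def compose_additive(vectors):
    """Assumed composition technique: element-wise sum of the member-word vectors."""
    return [sum(components) for components in zip(*vectors)]

def compositionality_score(mwe_vector, member_vectors):
    """Compare the vector of the whole MWE with the composition of its members."""
    return cosine(mwe_vector, compose_additive(member_vectors))

# Hypothetical two-dimensional toy vectors, for illustration only.
compositional = compositionality_score([1.0, 1.0], [[1.0, 0.0], [0.0, 1.0]])
idiomatic = compositionality_score([0.0, 1.0], [[1.0, 0.0], [1.0, 0.0]])
print(compositional > idiomatic)
```

In this toy setting the first expression, whose vector points in the same direction as the sum of its members, scores higher than the second, whose vector is orthogonal to them.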
Year
2016
Venue
LREC 2016 - Tenth International Conference on Language Resources and Evaluation
Keywords
Lexical semantics, multiword expressions, compositionality, word embeddings
Field
Computer science, Speech recognition, Natural language processing, Artificial intelligence
DocType
Conference
Citations
0
PageRank
0.34
References
13
Authors
3
Name
Order
Citations
PageRank
Silvio Cordeiro172.44
Carlos Ramisch216122.91
Aline Villavicencio328635.24