Title
Massively Multilingual Word Embeddings.
Abstract
We introduce new methods for estimating and evaluating embeddings of words in more than fifty languages in a single shared embedding space. Our estimation methods, multiCluster and multiCCA, use dictionaries and monolingual data; they do not require parallel data. Our new evaluation method, multiQVEC-CCA, is shown to correlate better than previous ones with two downstream tasks (text categorization and parsing). We also describe a web portal for evaluation that will facilitate further research in this area, along with open-source releases of all our methods.
Year
2016
Venue
arXiv: Computation and Language
Field
Embedding, Computer science, Natural language processing, Artificial intelligence, Parsing, Text categorization, Machine learning
DocType
Journal
Volume
abs/1602.01925
Citations
45
PageRank
1.33
References
24
Authors
6
Name              Order  Citations  PageRank
Waleed Ammar      1      339        18.48
George Mulcaire   2      89         3.42
Yulia Tsvetkov    3      347        33.83
Guillaume Lample  4      651        22.75
Chris Dyer        5      5438       232.28
Noah A. Smith     6      5867       314.27