Title
ETNLP: A Toolkit for Extraction, Evaluation and Visualization of Pre-trained Word Embeddings.
Abstract
In this paper, we introduce a comprehensive toolkit, ETNLP, which can evaluate, extract, and visualize multiple sets of pre-trained word embeddings. First, for evaluation, ETNLP analyses the quality of pre-trained embeddings based on an input word analogy list. Second, for extraction, ETNLP provides a subset of the embeddings to be used in downstream NLP tasks. Finally, ETNLP has a visualization module for exploring the embedded words interactively. We demonstrate the effectiveness of ETNLP on our pre-trained word embeddings in Vietnamese. Specifically, we create a large Vietnamese word analogy list to evaluate the embeddings. We then use the pre-trained embeddings for the named entity recognition (NER) task in Vietnamese and achieve new state-of-the-art results on a benchmark dataset for the NER task. A video demonstration of ETNLP is available at https://vimeo.com/317599106. The source code and data are available at https://github.com/vietnlp/etnlp.
Year: 2019
Venue: arXiv: Computation and Language
DocType: Journal
Volume: abs/1903.04433
Citations: 0
PageRank: 0.34
References: 19
Authors: 4
Name | Order | Citations | PageRank
Xuan-Son Vu | 1 | 6 | 5.88
Thanh Vu | 2 | 40 | 6.87
Son N. Tran | 3 | 8 | 4.22
Lili Jiang | 4 | 34 | 9.18