Title | ||
---|---|---|
ETNLP: A Toolkit for Extraction, Evaluation and Visualization of Pre-trained Word Embeddings. |
Abstract | ||
---|---|---|
In this paper, we introduce a comprehensive toolkit, ETNLP, which can evaluate, extract, and visualize multiple sets of pre-trained word embeddings. First, for evaluation, ETNLP analyses the quality of pre-trained embeddings based on an input word analogy list. Second, for extraction, ETNLP provides a subset of the embeddings to be used in downstream NLP tasks. Finally, ETNLP has a visualization module for exploring the embedded words interactively. We demonstrate the effectiveness of ETNLP on our pre-trained word embeddings in Vietnamese. Specifically, we create a large Vietnamese word analogy list to evaluate the embeddings. We then utilize the pre-trained embeddings for the named entity recognition (NER) task in Vietnamese and achieve new state-of-the-art results on a benchmark dataset for the NER task. A video demonstration of ETNLP is available at https://vimeo.com/317599106. The source code and data are available at https://github.com/vietnlp/etnlp. |
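
The evaluation and extraction steps described in the abstract can be illustrated with a minimal sketch. This is not ETNLP's actual API: the function names (`solve_analogy`, `analogy_accuracy`, `extract_subset`), the toy vectors, and the choice of the common vector-offset (3CosAdd) scoring rule are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def solve_analogy(emb, a, b, c):
    """Answer 'a is to b as c is to ?' with the vector-offset rule (assumed here):
    pick the word closest to b - a + c, excluding the three query words."""
    target = [y - x + z for x, y, z in zip(emb[a], emb[b], emb[c])]
    candidates = (w for w in emb if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(emb[w], target))

def analogy_accuracy(emb, analogies):
    """Fraction of (a, b, c, d) items answered correctly.
    Items containing out-of-vocabulary words are skipped."""
    covered = [q for q in analogies if all(w in emb for w in q)]
    if not covered:
        return 0.0
    correct = sum(solve_analogy(emb, a, b, c) == d for a, b, c, d in covered)
    return correct / len(covered)

def extract_subset(emb, vocab):
    """Keep only the embeddings for words in a task vocabulary (hypothetical helper)."""
    return {w: emb[w] for w in vocab if w in emb}

# Toy, hand-made 3-dimensional vectors (not real pre-trained embeddings).
toy_emb = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man": [0.1, 0.9, 0.1],
    "woman": [0.1, 0.2, 0.8],
    "apple": [0.5, 0.5, 0.5],
}
toy_analogies = [("man", "woman", "king", "queen")]

accuracy = analogy_accuracy(toy_emb, toy_analogies)
subset = extract_subset(toy_emb, ["king", "queen", "banana"])
```

In practice the embeddings would be loaded from a pre-trained model file and the analogy list from a text file; the toy data above only shows the shape of the computation.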
Year | Venue | DocType |
---|---|---|
2019 | arXiv: Computation and Language | Journal |
Volume | Citations | PageRank |
---|---|---|
abs/1903.04433 | 0 | 0.34 |
References | Authors | |
---|---|---|
19 | 4 | |
Name | Order | Citations | PageRank |
---|---|---|---|
Xuan-Son Vu | 1 | 6 | 5.88 |
Thanh Vu | 2 | 40 | 6.87 |
Son N. Tran | 3 | 8 | 4.22 |
Lili Jiang | 4 | 34 | 9.18 |