Abstract | ||
---|---|---|
Word embeddings are widely studied and used because of the performance improvements they provide in natural language processing (NLP) applications. The algorithms that generate word embeddings learn low-dimensional, dense vector spaces that encode semantic relations among words in an unsupervised manner from large unannotated corpora. However, these vector spaces usually do not have interpretable dimensions, which makes their semantic structure harder for researchers to comprehend. Learning new, interpretable word embeddings is therefore an active research area, motivated by the goals of better understanding the inner structure of word embeddings and further improving their utility. In this study, a semantic category dataset (ANKAT) containing more than 4000 unique Turkish words grouped under 62 different categories is compiled to quantitatively evaluate the interpretability of word embeddings. An interpretability analysis method based on this dataset is proposed and tested on five different embedding spaces. |
Year | Venue | Keywords |
---|---|---|
2018 | Signal Processing and Communications Applications Conference | Word Representations, Interpretability, Semantic Structure, Natural Language Processing |
Field | DocType | ISSN |
Interpretability, ENCODE, Turkish, Vector space, Embedding, Pattern recognition, Computer science, Natural language processing, Artificial intelligence, Encyclopedia, Semantics | Conference | 2165-0608 |
Citations | PageRank | References |
0 | 0.34 | 0 |
Authors | ||
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Lutfi Kerem Senel | 1 | 6 | 1.11 |
Veysel Yücesoy | 2 | 2 | 3.09 |
Aykut Koc | 3 | 12 | 9.01 |
Tolga Çukur | 4 | 36 | 8.84 |