Abstract | ||
---|---|---|
Mining entity synonym sets (i.e., sets of terms referring to the same entity) is an important task for many entity-leveraging applications. Previous work either ranks terms based on their similarity to a given query term, or treats the problem as a two-phase task (i.e., detecting synonymy pairs, followed by organizing these pairs into synonym sets). However, these approaches fail to model the holistic semantics of a set and suffer from error propagation. Here we propose a new framework, named SynSetMine, that efficiently generates entity synonym sets from a given vocabulary, using example sets from external knowledge bases as distant supervision. SynSetMine consists of two novel modules: (1) a set-instance classifier that jointly learns how to represent a permutation-invariant synonym set and whether to include a new instance (i.e., a term) in the set, and (2) a set generation algorithm that enumerates the vocabulary only once and applies the learned set-instance classifier to detect all entity synonym sets in it. Experiments on three real datasets from different domains demonstrate both the effectiveness and the efficiency of SynSetMine for mining entity synonym sets. |
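
The single-pass set generation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function `generate_synonym_sets`, the `threshold` parameter, the greedy "best-scoring set" rule, and the lookup-based `toy_score` are all assumptions introduced here for illustration. In the paper the score would come from the learned set-instance classifier; here a hand-written table stands in for it.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-in for a learned set-instance classifier:
# a hand-written table mapping surface forms to a canonical key.
TOY_KEYS = {"usa": "us", "united states": "us", "u.s.": "us",
            "uk": "gb", "britain": "gb"}


def _key(term: str) -> str:
    """Return the toy canonical key of a term (falls back to the term itself)."""
    return TOY_KEYS.get(term.lower(), term.lower())


def toy_score(synset: Sequence[str], term: str) -> float:
    """Fraction of set members sharing the term's key.

    The value is a mean over members, so it does not depend on their order
    (a simple form of permutation invariance).
    """
    matches = sum(1 for member in synset if _key(member) == _key(term))
    return matches / len(synset)


def generate_synonym_sets(
    vocabulary: Sequence[str],
    score: Callable[[Sequence[str], str], float],
    threshold: float = 0.5,
) -> List[List[str]]:
    """Visit each term once; add it to the best-scoring set or start a new one."""
    sets: List[List[str]] = []
    for term in vocabulary:
        best_idx, best_score = -1, threshold
        for i, synset in enumerate(sets):
            s = score(synset, term)
            if s > best_score:  # must strictly exceed the threshold
                best_idx, best_score = i, s
        if best_idx >= 0:
            sets[best_idx].append(term)
        else:
            sets.append([term])
    return sets


vocab = ["USA", "UK", "United States", "Britain", "U.S."]
print(generate_synonym_sets(vocab, toy_score))
```

Because each term is compared only against the sets built so far, the vocabulary is traversed once, which is the efficiency property the abstract emphasizes.
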
Year | Venue | Field |
---|---|---|
2018 | National Conference on Artificial Intelligence | Propagation of uncertainty, Computer science, Synonym, Permutation, Invariant (mathematics), Artificial intelligence, Natural language processing, Classifier (linguistics), Vocabulary, Semantics |
DocType | Volume | Citations |
Journal | abs/1811.07032 | 2 |
PageRank | References | Authors |
0.38 | 23 | 6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jiaming Shen | 1 | 23 | 3.21 |
Ruiliang Lyu | 2 | 2 | 0.38 |
Xiang Ren | 3 | 885 | 60.08 |
Michelle Vanni | 4 | 51 | 10.16 |
Brian M. Sadler | 5 | 3179 | 286.72 |
Jiawei Han | 6 | 43085 | 3824.48 |