Abstract | ||
---|---|---|
While there has been much research on automatically constructing structured Knowledge Bases (KBs), most of it has focused on generating facts to populate a KB. However, a useful KB must go beyond facts. For example, glosses (short natural language definitions) have been found to be very useful in tasks such as Word Sense Disambiguation. Yet the important problem of Automatic Gloss Finding, i.e., assigning glosses to entities in an initially gloss-free KB, is relatively unexplored. We address that gap in this paper. In particular, we propose GLOFIN, a hierarchical semi-supervised learning (SSL) algorithm for this problem which makes effective use of limited amounts of supervision and available ontological constraints. To the best of our knowledge, GLOFIN is the first system for this task. Through extensive experiments on real-world datasets, we demonstrate GLOFIN's effectiveness. It is encouraging to see that GLOFIN outperforms other state-of-the-art SSL algorithms, especially in low-supervision settings. We also demonstrate GLOFIN's robustness to noise through experiments on a wide variety of KBs, ranging from user-contributed (e.g., Freebase) to automatically constructed (e.g., NELL). To facilitate further research in this area, we have made the datasets and code used in this paper publicly available. |
Year | DOI | Venue |
---|---|---|
2015 | 10.1145/2684822.2685288 | WSDM |
Keywords | Field | DocType |
web mining | Data mining, Ontology, Web mining, Information retrieval, Computer science, Robustness (computer science), Natural language, Knowledge base, Word-sense disambiguation | Conference |
Citations | PageRank | References |
12 | 0.59 | 39 |
Authors | ||
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Bhavana Bharat Dalvi | 1 | 201 | 17.31 |
Einat Minkov | 2 | 441 | 29.04 |
Partha Pratim Talukdar | 3 | 980 | 65.47 |
William W. Cohen | 4 | 10178 | 1243.74 |