Abstract | ||
---|---|---|
This paper proposes a framework for automatically constructing a lightweight ontology from a corpus of Chinese domain Web documents. A hybrid method is used for domain lightweight ontology learning: rule-based, statistics-based, and cluster-based methods are combined to complete two sub-tasks, concept extraction and taxonomic relationship extraction. First, multiword terms are identified using a set of rules together with a Named Entity Module, and three statistical methods are employed jointly to rank the domain concepts. Second, clustering and subsumption methods are combined to construct the taxonomy. Concepts are clustered into several groups, using three similarity measures defined to capture semantic, spatial, and co-occurrence information between concepts. The subsumption method is then applied to construct a taxonomic structure for each concept group, since taxonomic relations exist only between similar concepts. Third, definitions of the concepts extracted in the first step are collected from an online Chinese encyclopedia. On this collection of concept definitions, a rule-based method and a set of lexico-syntactic patterns are applied to extract taxonomic relationships and refine the taxonomy. Finally, we evaluate the method against a gold standard in the domain of football games and compare it with several classical algorithms. The experimental results show the effectiveness of the method. |
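
The abstract does not say which subsumption formulation is used, so the following is only a minimal sketch of one common co-occurrence variant, assumed here for illustration: concept x is taken to subsume concept y when P(x|y) is at least 0.8 and P(y|x) is below 1. The function name `subsumption_pairs`, the 0.8 threshold, and the toy documents are hypothetical and are not taken from the paper. In the paper's pipeline this step would run separately on each concept group produced by clustering.

```python
from collections import defaultdict


def subsumption_pairs(doc_concepts, threshold=0.8):
    """Return sorted (parent, child) pairs where parent subsumes child.

    doc_concepts: list of documents, each a list of concept strings.
    Assumed rule (not confirmed by the source): x subsumes y when
    P(x|y) >= threshold and P(y|x) < 1, using document co-occurrence.
    """
    df = defaultdict(int)  # number of documents containing a concept
    co = defaultdict(int)  # number of documents containing an ordered pair
    for concepts in doc_concepts:
        unique = set(concepts)
        for c in unique:
            df[c] += 1
        for a in unique:
            for b in unique:
                if a != b:
                    co[(a, b)] += 1

    pairs = []
    for (x, y), n_xy in co.items():
        p_x_given_y = n_xy / df[y]  # how often x appears when y appears
        p_y_given_x = n_xy / df[x]  # how often y appears when x appears
        if p_x_given_y >= threshold and p_y_given_x < 1.0:
            pairs.append((x, y))
    return sorted(pairs)


# Toy concept group (hypothetical data, one list of concepts per document).
docs = [
    ["football", "goalkeeper"],
    ["football", "striker"],
    ["football", "goalkeeper", "striker"],
    ["football"],
]
result = subsumption_pairs(docs)
print(result)
```

In this toy group, "football" occurs in every document that mentions "goalkeeper" or "striker" but not the reverse, so it is proposed as the parent of both, while the two narrower concepts do not subsume each other.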
Year | DOI | Venue |
---|---|---|
2018 | 10.1007/s13042-017-0661-0 | Int. J. Machine Learning & Cybernetics |
Keywords | Field | DocType |
Ontology learning, Concept extraction, Taxonomic relationships extraction, Hybrid-based method | Lightweight ontology, Data mining, Statistic, Computer science, Named entity, Artificial intelligence, Encyclopedia, Natural language processing, Concept extraction, Cluster analysis, Ontology learning | Journal |
Volume | Issue | ISSN |
9 | 9 | 1868-8071 |
Citations | PageRank | References |
0 | 0.34 | 48 |
Authors | ||
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jing Qiu | 1 | 60 | 14.01 |
Lin Qi | 2 | 10 | 7.28 |
Jianliang Wang | 3 | 0 | 0.34 |
Zhang Guanghua | 4 | 0 | 1.35 |