Abstract | ||
---|---|---|
Semantic data mining (SDM) uses annotated data and interconnected background knowledge to generate rules that are easily interpreted by the end user. However, SDM algorithms have high computational complexity, resulting in long running times even on relatively small data sets. Network analysis algorithms, by contrast, are among the most scalable data mining algorithms. This paper proposes an effective SDM approach that combines semantic data mining and network analysis. The proposed approach uses network analysis to extract the most relevant part of the interconnected background knowledge, and then applies a semantic data mining algorithm to the pruned background knowledge. Application to an acute lymphoblastic leukemia data set demonstrates that the approach is well motivated, is more efficient, and yields rules that are comparable to or better than those obtained by applying the incorporated SDM algorithm without network reduction in data preprocessing. |
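
The pruning step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the network analysis method, so this example assumes a personalized PageRank-style ranking that restarts at the target-class examples. The function names, the `keep_fraction` parameter, and the toy graph are hypothetical, and the subsequent SDM rule learning step is not shown.

```python
from collections import defaultdict


def personalized_pagerank(edges, seeds, damping=0.85, iterations=50):
    """Power-iteration PageRank that restarts at the seed nodes.

    Assumption: the background knowledge and the example annotations are
    merged into one undirected graph given as (node, node) pairs.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    nodes = sorted(neighbors)
    seeds = set(seeds) & set(nodes)
    if not seeds:
        raise ValueError("at least one seed must occur in the graph")
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    score = dict(restart)
    for _ in range(iterations):
        # Restart mass goes to the seeds; the rest flows along the edges.
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            share = damping * score[n] / len(neighbors[n])
            for m in neighbors[n]:
                nxt[m] += share
        score = nxt
    return score


def prune_background_knowledge(edges, seeds, keep_fraction=0.5):
    """Return the highest-ranked non-seed nodes (hypothetical helper)."""
    seeds = set(seeds)
    scores = personalized_pagerank(edges, seeds)
    terms = [n for n in scores if n not in seeds]
    terms.sort(key=lambda n: scores[n], reverse=True)
    keep = max(1, int(len(terms) * keep_fraction))
    return set(terms[:keep])


# Toy graph: examples e1, e2 are annotated with terms; terms form a hierarchy.
edges = [
    ("e1", "t_a"), ("e2", "t_a"), ("e2", "t_b"),
    ("t_a", "t_root"), ("t_b", "t_root"),
    ("t_c", "t_root"), ("t_d", "t_c"),
]
kept = prune_background_knowledge(edges, seeds={"e1", "e2"})
print(sorted(kept))
```

Only the terms in `kept` would then be passed to the SDM algorithm, which is where the running-time saving in such a pipeline would come from.
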
Year | DOI | Venue |
---|---|---|
2016 | 10.1007/978-3-319-31744-1_65 | BIOINFORMATICS AND BIOMEDICAL ENGINEERING (IWBBIO 2016) |
Field | DocType | Volume |
---|---|---|
Inductive logic programming, Data mining, Small data, Ranking, Computer science, Data pre-processing, Network analysis, Semantic computing, Semantic data model, Scalability | Conference | 9656 |
ISSN | Citations | PageRank |
---|---|---|
0302-9743 | 2 | 0.36 |
References | Authors |
---|---|
5 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jan Kralj | 1 | 11 | 5.56 |
Anze Vavpetic | 2 | 52 | 6.49 |
Michel Dumontier | 3 | 898 | 93.35 |
Nada Lavrac | 4 | 2004 | 635.45 |