Abstract | ||
---|---|---|
As researchers analyze huge amounts of data that are annotated by large biomedical ontologies, one of the major challenges for data mining and machine learning is to leverage both ontologies and data together in a systematic and scalable way. In this paper, we address two interesting and related problems for mining biomedical ontologies and data: i) how to discover semantic associations with the help of formal ontologies, and ii) how to identify potential errors in the ontologies with the help of data. By representing both ontologies and data using RDF hyper graphs, and subsequently transforming the hyper graphs to corresponding bipartite forms, we provide a generalized data mining method that scales beyond what existing ontology-based approaches can provide. We show that the proposed method captures semantic associations while seamlessly incorporating domain knowledge from ontologies, by evaluating it on a real-world electronic health dataset and NCBO ontologies. We also show that our data mining methods can discover and suggest corrections for misinformation in biomedical ontologies. |
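
The abstract mentions transforming RDF hyper graphs into bipartite form but does not spell out the construction. The following is a minimal sketch of one common way to do this, assuming each triple is treated as a single hyperedge over its subject, predicate, and object; it is not necessarily the paper's exact method. The triples, the `stmtN` naming, and the helper names `to_bipartite` and `statements_by_term` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical toy triples (subject, predicate, object); not taken from the paper's data.
triples = [
    ("patient1", "hasDiagnosis", "Diabetes"),
    ("patient1", "takesDrug", "Metformin"),
    ("Diabetes", "subClassOf", "MetabolicDisease"),
]


def to_bipartite(triples):
    """Return (statement_nodes, value_nodes, edges) for a list of triples.

    Each triple is treated as one hyperedge over its three terms. In the
    bipartite form, the hyperedge becomes a statement node that is linked
    to the value nodes of its subject, predicate, and object, with the
    role kept as an edge label.
    """
    statement_nodes = []
    value_nodes = set()
    edges = []
    for i, (s, p, o) in enumerate(triples):
        stmt = f"stmt{i}"  # assumed naming scheme for statement nodes
        statement_nodes.append(stmt)
        for role, term in (("subject", s), ("predicate", p), ("object", o)):
            value_nodes.add(term)
            edges.append((stmt, term, role))
    return statement_nodes, value_nodes, edges


def statements_by_term(edges):
    """Map each value node to the set of statement nodes it participates in."""
    index = defaultdict(set)
    for stmt, term, _role in edges:
        index[term].add(stmt)
    return index


statements, values, edges = to_bipartite(triples)
index = statements_by_term(edges)
```

Because every edge joins a statement node to a value node, the result is an ordinary bipartite graph, so standard graph algorithms can be applied to it. In this toy example, `index["Diabetes"]` links the data-level statement (a diagnosis) and the ontology-level statement (a subclass axiom) through a shared node, which is the kind of joint view of data and ontology the abstract describes.
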
Year | DOI | Venue |
---|---|---|
2013 | 10.1109/ICMLA.2013.31 | ICMLA (1) |
Keywords | Field | DocType |
data mining,rdf hyper graph,biomedical ontology,data mining method,mining biomedical ontologies,generalized data mining method,hyper graph,ncbo ontology,large biomedical ontology,semantic association,rdf hypergraphs,learning artificial intelligence,graph theory | Ontology (information science),Data science,Ontology,Domain knowledge,Information retrieval,Computer science,Open Biomedical Ontologies,IDEF5,Ontology components,RDF,Web Ontology Language | Conference |
Citations | PageRank | References |
5 | 0.42 | 15 |
Authors | ||
5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Haishan Liu | 1 | 68 | 6.22 |
Dejing Dou | 2 | 892 | 90.86 |
Ruoming Jin | 3 | 1637 | 91.73 |
Paea LePendu | 4 | 294 | 21.32 |
Nigam Shah | 5 | 212 | 20.11 |