Title
BioDIFF: an effective fast change detection algorithm for biological annotations
Abstract
Warehousing heterogeneous, dynamic biological data is a key technique for biological data integration as it greatly improves performance. However, it requires complex maintenance procedures to update the warehouse in light of the changes to the sources. Consequently, a key issue to address is how to detect changes to the underlying biological data sources. In this paper, we present an algorithm called BIODIFF for detecting exact changes to biological annotations. In our approach, we transform heterogeneous biological data to XML format and then detect changes between two versions of the XML representation of the biological data. Our algorithm extends X-Diff, a published XML change detection algorithm. X-Diff, being designed for any type of XML data, does not exploit the semantics of biological data to reduce the data set of bipartite mapping. We have implemented BIODIFF in Java. We have conducted an extensive performance study using data from EMBL, GenBank, SwissProt and PDB. Our experimental results show that BIODIFF runs 1.5 to 6 times faster than X-Diff.
Year: 2007
DOI: 10.1007/978-3-540-71703-4_25
Venue: DASFAA
Keywords: xml format, xml data, xml change detection algorithm, underlying biological data source, biological data, xml representation, biological annotation, heterogeneous biological data, effective fast change detection, dynamic biological data, biological data integration, change detection
Field: Biological data, Data mining, XML, Information retrieval, Computer science, Bipartite graph, Exploit, Java, Change detection algorithms, GenBank, Semantics, Database
DocType: Conference
Volume: 4443
ISSN: 0302-9743
Citations: 1
PageRank: 0.41
References: 11
Authors: 3
Name | Order | Citations | PageRank
Yang Song | 1 | 71 | 22.04
Sourav S. Bhowmick | 2 | 1519 | 272.35
C. Forbes Dewey, Jr. | 3 | 12 | 2.60