Title
On the Reachability of Trustworthy Information from Integrated Exploratory Biological Queries
Abstract
Levels of curation across biological databases are widely recognized as being highly variable, depending on provenance and type. In spite of ambiguous quality, searches against biological sources, such as those for sequence homology, remain a frontline strategy for biomedical scientists studying molecular data. In the following, we investigate the accessibility of well-curated data retrieved from exploratory queries across multiple sources. We present the architecture and design of a lightweight data integration platform conducive to graph-theoretic analysis. Using data collected via this framework, we examine the reachability of evidence-supported annotations across triangulated sources in the face of uncertainty, using a simple random sampling model oriented around fault tolerance. We characterize the accessibility of high-quality data from uncertain queries and the levels of redundancy across data sources, and find that, in general, encountering non-experimentally verified annotations is nearly as likely as encountering experimentally verified annotations, with the exception of a group of proteins whose link structure is dominated by experimental evidence. Finally, we discuss the prospect of determining overall accessibility of relevant information based on metadata about a query and its results.
Year: 2009
DOI: 10.1007/978-3-642-02879-3_6
Venue: DILS
Keywords: trustworthy information, integrated exploratory biological queries, ambiguous quality, biological databases, encountering non-experimentally, biological source, well-curated data, molecular data, high-quality data, data source, overall accessibility, lightweight data integration platform, biological database, data integrity, fault tolerant, data retrieval, data collection, simple random sampling
Field: Data integration, Metadata, Data mining, Simple random sample, Computer science, Biological database, Reachability, Fault tolerance, Redundancy (engineering), Triangulation, Database
DocType: Conference
Volume: 5647
ISSN: 0302-9743
Citations: 2
PageRank: 0.36
References: 11
Authors: 3
Name | Order | Citations | PageRank
Eithon Cadag | 1 | 235 | 13.60
P Tarczy-Hornoch | 2 | 553 | 53.80
Peter J. Myler | 3 | 58 | 4.78