Title
XML schemas for common bioinformatic data types and their application in workflow systems.
Abstract
BACKGROUND: Today, there is a growing need in bioinformatics to combine available software tools into chains, thus building complex applications from existing single-task tools. To create such workflows, the tools involved have to be able to work with each other's data – therefore, a common set of well-defined data formats is needed. Unfortunately, current bioinformatic tools use a great variety of heterogeneous formats. RESULTS: Acknowledging the need for common formats, the Helmholtz Open BioInformatics Technology network (HOBIT) identified several basic data types used in bioinformatics and developed appropriate format descriptions, formally defined by XML schemas, and incorporated them in a Java library (BioDOM). These schemas currently cover sequence, sequence alignment, RNA secondary structure and RNA secondary structure alignment formats in a form that is independent of any specific program, thus enabling seamless interoperation of different tools. All XML formats are available at http://bioschemas.sourceforge.net, and the BioDOM library can be obtained at http://biodom.sourceforge.net. CONCLUSION: The HOBIT XML schemas and the BioDOM library simplify adding XML support to newly created and existing bioinformatic tools, enabling these tools to interoperate seamlessly in workflow scenarios.
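To illustrate what "formally defined by XML schemas" can mean for a tool in a workflow, the following is a minimal sketch that validates a sequence record against a schema using the standard Java validation API (javax.xml.validation). The schema, the element name "sequence", the attribute "id", and the class SequenceSchemaCheck are hypothetical and invented for this example; they are not the HOBIT schemas and do not use the BioDOM API, neither of which is described in detail here.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical illustration only: TOY_XSD is NOT one of the HOBIT schemas.
class SequenceSchemaCheck {

    // Toy schema (assumption): a <sequence> element with a required "id"
    // attribute whose text content is restricted to nucleotide letters.
    static final String TOY_XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xs:simpleType name=\"residues\">"
      + "<xs:restriction base=\"xs:string\">"
      + "<xs:pattern value=\"[ACGTUN]+\"/>"
      + "</xs:restriction>"
      + "</xs:simpleType>"
      + "<xs:element name=\"sequence\">"
      + "<xs:complexType>"
      + "<xs:simpleContent>"
      + "<xs:extension base=\"residues\">"
      + "<xs:attribute name=\"id\" type=\"xs:string\" use=\"required\"/>"
      + "</xs:extension>"
      + "</xs:simpleContent>"
      + "</xs:complexType>"
      + "</xs:element>"
      + "</xs:schema>";

    // Compiles the toy schema; a failure here means the schema itself is broken.
    static Schema loadSchema() {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return factory.newSchema(new StreamSource(new StringReader(TOY_XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("toy schema could not be compiled", e);
        }
    }

    // Returns true if the document is well-formed and valid against the toy schema.
    static boolean isValid(String xml) throws IOException {
        Validator validator = loadSchema().newValidator();
        try {
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // With no custom ErrorHandler, validation errors surface as exceptions.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(isValid("<sequence id=\"s1\">ACGT</sequence>"));
        System.out.println(isValid("<sequence>ACGT</sequence>"));
    }
}
```

A producing tool and a consuming tool that both check their data against the same schema can reject malformed input at the boundary instead of failing later in the chain, which is the interoperability benefit the abstract describes.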
Year: 2006
DOI: 10.1186/1471-2105-7-490
Venue: BMC Bioinformatics
Keywords: sequence alignment, xml schema, data type, rna secondary structure, algorithms, bioinformatics, microarrays
Field: World Wide Web, XML, Computer science, Interoperability, Interoperation, XML schema, Data type, Bioinformatics, Workflow, cXML, Document type definition
DocType: Journal
Volume: 7
Issue: 1
ISSN: 1471-2105
Citations: 41
PageRank: 1.25
References: 25
Authors: 8
Name | Order | Citations | PageRank
Philipp N. Seibel | 1 | 78 | 2.49
Jan Krüger | 2 | 76 | 8.32
Sven Hartmeier | 3 | 86 | 2.18
Knut Schwarzer | 4 | 86 | 4.61
Kai Löwenthal | 5 | 41 | 1.25
Henning Mersch | 6 | 58 | 4.24
Thomas Dandekar | 7 | 411 | 23.17
Robert Giegerich | 8 | 1616 | 130.26