Abstract | ||
---|---|---|
While an increasing number of genomes (including the human genome) have been fully or nearly fully sequenced, the budding yeast Saccharomyces cerevisiae was the first eukaryote whose genome was fully sequenced. Given its ease of genetic manipulation and the fact that many of its genes are strikingly similar to human genes, the yeast genome has been studied extensively through a wide range of biological experiments (e.g., microarray experiments). As a result, a wide variety of yeast genome data types have been generated and made accessible through many resources (e.g., SGD, MIPS, and YPD). While these resources serve many specific needs of individual researchers, we can reap more benefits by integrating these disparate datasets to facilitate larger-context data mining. However, such integrated analysis is hampered by the heterogeneous formats that are used for data distribution. With the increasing use of eXtensible Markup Language (XML) in the bioinformatics domain, we demonstrate how to use XML to standardize the exchange of a variety of types of yeast data between different resources. In particular, we propose a standard XML format called "Yeast Hub XML" (YHX). This format consists of: i) metadata and ii) data. While the former describes the resource and data structure, the latter is used to represent the data. In addition, we apply various XML-related technologies including XPath and XSLT to query, integrate, and transform multiple XML datasets. We have implemented a prototype yeast hub server that allows sharing, querying, and integration of different types and formats of yeast genome data that are located in disparate sources. |
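
The metadata/data split and the XPath-style querying described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the element and attribute names (`yhx`, `metadata`, `resource`, `fields`, `field`, `data`, `record`), the resource name and URL, and the record values are hypothetical and are not the published YHX schema. The sketch uses only Python's standard `xml.etree.ElementTree`, which supports a limited XPath subset.

```python
import xml.etree.ElementTree as ET

# Hypothetical YHX-like document. Element names, attributes, and values
# are illustrative assumptions, not the published YHX schema or real data.
YHX_SAMPLE = """\
<yhx>
  <metadata>
    <resource name="ExampleResource" url="http://example.org/yeast"/>
    <fields>
      <field name="orf" type="string"/>
      <field name="essential" type="boolean"/>
    </fields>
  </metadata>
  <data>
    <record><orf>YAL001C</orf><essential>true</essential></record>
    <record><orf>YAL002W</orf><essential>false</essential></record>
    <record><orf>YAL003W</orf><essential>true</essential></record>
  </data>
</yhx>
"""


def field_names(root):
    """Read the data structure (field names) from the metadata section."""
    return [f.get("name") for f in root.findall("./metadata/fields/field")]


def essential_orfs(root):
    """Select records whose <essential> child has the text 'true'."""
    matches = root.findall("./data/record[essential='true']")
    return [r.findtext("orf") for r in matches]


root = ET.fromstring(YHX_SAMPLE)
print(field_names(root))
print(essential_orfs(root))
```

Keeping the field descriptions in the metadata section means a consumer can discover a dataset's structure before reading any records, which is what would make records from different resources comparable in such a design.
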
Year | Venue | Keywords |
---|---|---|
2004 | METMBS '04: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON MATHEMATICS AND ENGINEERING TECHNIQUES IN MEDICINE AND BIOLOGICAL SCIENCES | data mining,data structure,genetics,human genome |
Field | DocType | Citations |
---|---|---|
Genome,XML,Computer science,Yeast,Computational biology,Genetics | Conference | 1 |
PageRank | References | Authors |
---|---|---|
0.73 | 12 | 6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Kei-hoi Cheung | 1 | 664 | 60.65 |
Deyun Pan | 2 | 34 | 3.71 |
Andrew Smith | 3 | 27 | 6.55 |
Michael Seringhaus | 4 | 10 | 1.42 |
Shawn M. Douglas | 5 | 1 | 1.07 |
Mark Gerstein | 6 | 354 | 45.41 |