Abstract | ||
---|---|---|
PSI-MI has been endorsed by the protein informatics community as a standard XML data exchange format for protein-protein interaction datasets. While many public databases support the standard, there is a degree of heterogeneity in the way the proposed XML schema is interpreted and instantiated by different data providers. Analysis of schema instantiation in large collections of XML data is a challenging task that is unsupported by existing tools. In this study we use DescribeX, a novel visualization technique for (semi-)structured XML formats, to quantitatively and qualitatively analyze PSI-MI XML collections at the instance level, with the goal of gaining insights into schema usage and studying specific questions such as the adequacy of controlled vocabularies, the detection of common instance patterns, and the evolution of different data collections. Our analysis shows that DescribeX enhances understanding of the instance-level structure of PSI-MI data sources and is a useful tool for standards designers, software developers, and PSI-MI data providers. |
Year | DOI | Venue |
---|---|---|
2007 | 10.2390/biecoll-jib-2007-70 | Journal of Integrative Bioinformatics |
Keywords | Field | DocType |
data exchange,xml schema,protein protein interaction,controlled vocabulary,data collection,software development | Data mining,World Wide Web,Efficient XML Interchange,Streaming XML,XML Schema (W3C),XML validation,Computer science,Document Structure Description,XML database,XML schema,Bioinformatics,XML Schema Editor | Journal |
Volume | Issue | ISSN |
4 | 3 | 1613-4516 |
Citations | PageRank | References |
5 | 0.48 | 9 |
Authors | ||
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Reza Samavi | 1 | 73 | 12.60 |
Mariano P. Consens | 2 | 1203 | 387.78 |
Shahan Khatchadourian | 3 | 102 | 8.57 |
Thodoros Topaloglou | 4 | 263 | 177.45 |