Title
Exploring PSI-MI XML Collections Using DescribeX
Abstract
PSI-MI has been endorsed by the protein informatics community as a standard XML data exchange format for protein-protein interaction datasets. While many public databases support the standard, there is a degree of heterogeneity in the way the proposed XML schema is interpreted and instantiated by different data providers. Analysis of schema instantiation in large collections of XML data is a challenging task that is unsupported by existing tools. In this study we use DescribeX, a novel visualization technique for (semi-)structured XML formats, to quantitatively and qualitatively analyze PSI-MI XML collections at the instance level, with the goal of gaining insights about schema usage and studying specific questions such as the adequacy of controlled vocabularies, the detection of common instance patterns, and the evolution of different data collections. Our analysis shows that DescribeX enhances understanding of the instance-level structure of PSI-MI data sources and is a useful tool for standards designers, software developers, and PSI-MI data providers.
Year: 2007
DOI: 10.2390/biecoll-jib-2007-70
Venue: Journal of Integrative Bioinformatics
Keywords: data exchange, XML schema, protein-protein interaction, controlled vocabulary, data collection, software development
Field: Data mining, World Wide Web, Efficient XML Interchange, Streaming XML, XML Schema (W3C), XML validation, Computer science, Document Structure Description, XML database, XML schema, Bioinformatics, XML Schema Editor
DocType: Journal
Volume: 4
Issue: 3
ISSN: 1613-4516
Citations: 5
PageRank: 0.48
References: 9
Authors: 4
Name
Order
Citations
PageRank
Reza Samavi17312.60
Mariano P. Consens21203387.78
Shahan Khatchadourian31028.57
Thodoros Topaloglou4263177.45