Title
Grid-based Data Access to Nucleotide Sequence Database
Abstract
The International Nucleotide Sequence Database Collaboration (INSDC) exchanges sequence data daily among its three member organizations in the USA, the UK and Japan. This paper studies how this sequence database, held in MySQL, can best take advantage of the increased transfer bandwidth of a Grid-optimized data communication protocol. Within the context of the UK Government project Grid-oriented Storage (GOS) and the EC project EuroAsiaGrid, we have developed the GOS File System (GOS-FS) in our lab. It melds distributed file system technology with high-performance data transfer techniques to meet the needs of WAN/Grid-based virtual organizations. A real-world test shows that the INSDC sequence database backup operation, mysqldump, run over the GOS-FS protocol outperforms the same operation over the classic NFS protocol by a factor of six on the link between Cambridge and Tokyo. In addition, the multi-streamed GOS-FS protocol remains fully compatible with existing IP infrastructures.
Year
2007
DOI
10.1007/s00354-007-0026-4
Venue
New Generation Comput.
Keywords
Nucleotide Sequence Database, Grid Computing, Life Science, Grid-based Data Access, MySQL
Field
Distributed File System, SSH File Transfer Protocol, File system, Sequence database, Grid computing, Computer science, Data access, Database, Network File System, Communications protocol
DocType
Journal
Volume
25
Issue
4
ISSN
0288-3635
Citations
1
PageRank
0.35
References
3
Authors
9
Name
Order
Citations
PageRank
Frank Zhigang Wang1346.87
Sining Wu2397.14
Na Helian36015.31
Zhiwei Xu41563162.88
Yuhui Deng533139.56
Vineet R. Khare624016.13
Chenhan Liao7141.37
Chris Thompson83329.48
Michael Parker9163.54