Title
Deployment of a Multi-site Cloud Environment for Molecular Virtual Screenings
Abstract
With the constant increase in the number and variety of small-molecule chemical compounds, drug discovery is becoming a resource-intensive endeavor. Performing molecular simulations of protein-ligand binding by virtual screening has become an integral part of the discovery process. Cloud computing is an efficient choice for executing these large-scale screenings, given that large compute allocations are not accessible to many researchers. This research focused on developing a multi-site cloud environment that combines small allocations of virtual machines in multiple locations connected through a virtual networking system (ViNe), and compared two parallelization approaches: Message Passing Interface (MPI) and MapReduce using Hadoop. Virtual screenings were conducted using DOCK, a protein-ligand molecular interaction simulation program. Multiple DOCK test simulations were run through MPI and Hadoop to assess the performance and flexibility of the environment. These tests indicated that MPI and MapReduce offer comparable scalability, and that network latency has a significant influence on low-accuracy simulations. Furthermore, differences in performance at individual cloud resource sites were reduced on average because of the larger combined pool of resources. This project prototyped and assessed a fully functional multi-site cloud environment for virtual screenings, which can be used to guide small laboratories in deploying their own cloud-based screenings.
Year
2015
DOI
10.1109/eScience.2015.49
Venue
e-Science
Keywords
Computational biochemistry, distributed computing, Hadoop, MPI, cloud computing
DocType
Conference
ISSN
2325-372X
Citations
1
PageRank
0.35
References
16
Authors
6
Name                   Order  Citations  PageRank
Anthony Nguyen         1      1          0.35
Andréa M. Matsunaga    2      123        11.47
Maurício O. Tsugawa    3      116        15.17
Susumu Date            4      133        28.14
Kohei Ichikawa         5      69         19.79
Haga, J.H.             6      7          5.01