Title
Blind prediction of cyclohexane–water distribution coefficients from the SAMPL5 challenge
Abstract
In the recent SAMPL5 challenge, participants submitted predictions for cyclohexane/water distribution coefficients for a set of 53 small molecules. Distribution coefficients (log D) replace the hydration free energies that were a central part of the past five SAMPL challenges. A wide variety of computational methods were represented by the 76 submissions from 18 participating groups. Here, we analyze submissions by a variety of error metrics and provide details for a number of reference calculations we performed. As in the SAMPL4 challenge, we assessed the ability of participants to evaluate not just their statistical uncertainty but also their model uncertainty: how well they can predict the magnitude of their model or force field error for specific predictions. Unfortunately, this remains an area where prediction and analysis need improvement. In SAMPL4, the top-performing submissions achieved a root-mean-squared error (RMSE) around 1.5 kcal/mol. If we anticipate accuracy in log D predictions to be similar to the hydration free energy predictions in SAMPL4, the expected error here would be around 1.54 log units. Only a few submissions had an RMSE below 2.5 log units in their predicted log D values. However, distribution coefficients introduced complexities not present in past SAMPL challenges, including tautomer enumeration, that are likely to be important in predicting biomolecular properties of interest to drug discovery; some decrease in accuracy would therefore be expected. Overall, the SAMPL5 distribution coefficient challenge provided great insight into the importance of modeling a variety of physical effects. We believe these types of measurements will be a promising source of data for future blind challenges, especially in view of the relatively straightforward nature of the experiments and the level of insight provided.
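The abstract compares errors quoted in kcal/mol with errors quoted in log units and ranks submissions by RMSE. The following is a minimal sketch of both calculations, not the authors' analysis code. It assumes a temperature of 298.15 K and the standard relation between a transfer free energy and a base-10 logarithm (division by RT ln 10). The function names and input values are hypothetical. The abstract does not state how the 1.54 log unit estimate was derived, and this sketch does not attempt to reproduce it.

```python
import math

# Molar gas constant in kcal/(mol*K).
R_KCAL_PER_MOL_K = 1.987204e-3


def kcal_to_log_units(delta_g_kcal, temperature_k=298.15):
    """Convert a free energy magnitude in kcal/mol to log10 units.

    Assumes the standard relation: log10 units = delta_G / (R * T * ln 10).
    The default temperature is an assumption, not a value from the abstract.
    """
    return delta_g_kcal / (R_KCAL_PER_MOL_K * temperature_k * math.log(10))


def rmse(predicted, experimental):
    """Root-mean-squared error between two equal-length sequences."""
    if len(predicted) != len(experimental) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    # Hypothetical log D values, for illustration only.
    predicted_log_d = [1.2, -0.5, 3.1]
    experimental_log_d = [0.8, -1.4, 2.0]
    print(f"RMSE: {rmse(predicted_log_d, experimental_log_d):.2f} log units")
    print(f"1.5 kcal/mol = {kcal_to_log_units(1.5):.2f} log units at 298.15 K")
```

Under these assumptions, a single free energy error of 1.5 kcal/mol corresponds to roughly 1.1 log units, so the larger figure quoted in the abstract presumably reflects additional assumptions that are not given here.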
Year: 2016
DOI: https://doi.org/10.1007/s10822-016-9954-8
Venue: Journal of Computer-Aided Molecular Design
Keywords: SAMPL, Distribution coefficient, Blind challenge, Free energy, Alchemical, Molecular simulation
Field: Magnitude (mathematics), Computer science, Molecular simulation, Mean squared error, Statistics
DocType: Journal
Volume: 30
Issue: 11
ISSN: 0920-654X
Citations: 1
PageRank: 0.35
References: 0
Authors: 6
Name                 Order  Citations  PageRank
Caitlin C. Bannan    1      1          0.35
Kalistyn H. Burley   2      1          0.35
Michael Chiu         3      1          0.35
Michael R. Shirts    4      107        8.64
Michael K. Gilson    5      707        64.90
David L. Mobley      6      219        20.01