Title
Classifying the World Anti-Doping Agency’s 2005 Prohibited List using the Chemistry Development Kit fingerprint
Abstract
We used the freely available Chemistry Development Kit (CDK) fingerprint to classify 5235 representative molecules taken from ten banned classes in the World Anti-Doping Agency’s (WADA) 2005 Prohibited List, including molecules taken from the corresponding activity classes in the MDL Drug Data Report (MDDR). We used both Random Forest and k-Nearest Neighbours (kNN) algorithms to generate classifiers. The kNN classifiers with k = 1 gave a very slightly better Matthews Correlation Coefficient than the Random Forest classifiers; the latter, however, predicted fewer false positives. The performance of the kNN classifiers tended to decline with increasing k. The performance of the CDK fingerprint was essentially equivalent to that of Unity 2D. Our results suggest that it will be possible to use freely available chemoinformatics tools to aid the fight against drugs in sport, while minimising the risk of wrongfully penalising innocent athletes.
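As a rough illustration of the two ideas the abstract relies on, the sketch below shows a kNN vote over binary fingerprints and the Matthews Correlation Coefficient computed from confusion-matrix counts. It is not the authors' implementation: the use of Tanimoto similarity, the representation of a fingerprint as a set of on-bit positions, the function names, and all the toy data and counts are assumptions made for this example only.

```python
from collections import Counter
from math import sqrt


def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def knn_predict(query, training, k=1):
    """Majority vote among the k training fingerprints most similar to the query."""
    ranked = sorted(training, key=lambda item: tanimoto(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy, made-up fingerprints (sets of on-bit positions); not data from the paper.
training = [
    ({1, 2, 3, 4}, "banned"),
    ({1, 2, 3, 9}, "banned"),
    ({10, 11, 12}, "allowed"),
    ({10, 12, 13}, "allowed"),
]
prediction = knn_predict({1, 2, 4, 9}, training, k=1)

# Hypothetical confusion-matrix counts, chosen only to exercise the formula.
score = mcc(tp=90, tn=95, fp=5, fn=10)
```

The MCC ranges from -1 to 1 and uses all four confusion-matrix cells, which is why two classifiers can score almost the same MCC while differing in how many false positives they produce.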
Year: 2006
DOI: 10.1007/11875741_17
Venue: CompLife
Keywords: world anti-doping agency, cdk fingerprint, available chemoinformatics tool, chemistry development kit fingerprint, knn classifiers with k, random forest, knn classifier, available chemistry development kit, better matthews correlation coefficient, mdl drug data report, random forest classifier
Field: Matthews correlation coefficient, Chemistry, Fingerprint, Artificial intelligence, Engineering, Chemical space, Random forest, Cheminformatics, Machine learning, False positive paradox
DocType: Conference
Volume: 4216
ISSN: 0302-9743
ISBN: 3-540-45767-4
Citations: 3
PageRank: 0.41
References: 7
Authors (2)
Name                 Order  Citations  PageRank
Edward O. Cannon     1      25         2.49
John B O Mitchell    2      384        32.48