Title
Prediction of oxidoreductase-catalyzed reactions based on atomic properties of metabolites.
Abstract
Our knowledge of metabolism is far from complete, and the gaps in our knowledge are being revealed by metabolomic detection of small molecules not previously known to exist in cells. An important challenge is to determine the reactions in which these compounds participate, which can lead to the identification of gene products responsible for novel metabolic pathways. To address this challenge, we investigate how machine learning can be used to predict potential substrates and products of oxidoreductase-catalyzed reactions.

We examined 1956 oxidation/reduction reactions in the KEGG database. The vast majority of these reactions (1626) can be divided into 12 subclasses, each of which is marked by a particular type of functional group transformation. For a given transformation, the local structures of reaction centers in substrates and products can be characterized by patterns. These patterns are not unique to reactants but are widely distributed among KEGG metabolites. To distinguish reactants from non-reactants, we trained classifiers (linear-kernel Support Vector Machines) using negative and positive examples. The input to a classifier is a set of atomic features that can be determined from the 2D chemical structure of a compound. Depending on the subclass of reaction, the accuracy of prediction for positives (negatives) is 64 to 93% (44 to 92%) when asking if a compound is a substrate and 71 to 98% (50 to 92%) when asking if a compound is a product. Sensitivity analysis reveals that this performance is robust to variations of the training data. Our results suggest that metabolic connectivity can be predicted with reasonable accuracy from the presence or absence of local structural motifs in compounds and their readily calculated atomic features.

Classifiers reported here can be used freely for noncommercial purposes via a Java program available upon request.
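The classification step described in the abstract (a linear-kernel SVM separating reactants from non-reactants, scored separately on positives and negatives) can be illustrated with a minimal sketch. This is not the authors' implementation. The two-dimensional feature vectors are made-up, standardized stand-ins for atomic features; the function names are hypothetical; and the model is fit by plain subgradient descent on the regularized hinge loss rather than by a dedicated SVM solver.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def train_linear_svm(
    X: Sequence[Vector],
    y: Sequence[int],
    lam: float = 0.01,
    lr: float = 0.1,
    epochs: int = 200,
) -> Tuple[List[float], float]:
    """Fit weights w and bias b by subgradient descent on the
    L2-regularized hinge loss. Labels must be +1 or -1."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # The regularization term shrinks w on every step.
            w = [wj * (1.0 - lr * lam) for wj in w]
            if margin < 1.0:
                # Hinge loss is active: push the boundary away from this example.
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(w: Vector, b: float, x: Vector) -> int:
    """Return +1 (predicted reactant) or -1 (predicted non-reactant)."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0.0 else -1


def per_class_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy on positives, accuracy on negatives)."""
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == -1]
    return (
        sum(p == 1 for p in pos) / len(pos),
        sum(p == -1 for p in neg) / len(neg),
    )


# Hypothetical toy data: each row is a standardized feature vector for one
# candidate reaction center; +1 marks a known reactant, -1 a non-reactant.
X_toy = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (-1.0, -1.0), (-2.0, -1.0), (-1.0, -2.0)]
y_toy = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(X_toy, y_toy)
predictions = [predict(w, b, x) for x in X_toy]
acc_pos, acc_neg = per_class_accuracy(y_toy, predictions)
print(f"accuracy on positives: {acc_pos:.2f}, on negatives: {acc_neg:.2f}")
```

Reporting accuracy separately for positives and negatives, as the abstract does, keeps a classifier from looking good merely by favoring the larger class. The figures quoted in the abstract come from the paper's own evaluation, not from this toy example, which is scored on its training data only.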
Year: 2006
DOI: 10.1093/bioinformatics/btl535
Venue: Bioinformatics
Keywords: atomic feature, kegg metabolites, metabolic connectivity, kegg database, important challenge, atomic property, local structural motif, oxidoreductase-catalyzed reaction, novel metabolic pathway, functional group transformation, local structure
Field: Oxidoreductase, Catalysis, Training set, Data mining, Computer science, Metabolic pathway, Support vector machine, Metabolomics, KEGG, Bioinformatics, Classifier (linguistics)
DocType: Journal
Volume: 22
Issue: 24
ISSN: 1367-4811
Citations: 6
PageRank: 0.63
References: 10
Authors: 4
Name, Order, Citations, PageRank
Fangping Mu, 1, 46, 4.62
Pat J Unkefer, 2, 29, 2.38
Clifford J Unkefer, 3, 23, 2.20
William S. Hlavacek, 4, 277, 24.15