Title
Pippin: A random forest-based method for identifying presynaptic and postsynaptic neurotoxins.
Abstract
Presynaptic and postsynaptic neurotoxins are two types of neurotoxins from venomous animals and are functionally important molecules in the neurosciences; however, their experimental characterization is difficult, time-consuming, and costly. Bioinformatics tools that can identify presynaptic and postsynaptic neurotoxins would therefore be very useful for understanding their functions and mechanisms. In this study, we propose Pippin, a novel machine learning-based method that allows users to rapidly and accurately identify these two types of neurotoxins. Pippin was developed using the random forest (RF) algorithm and evaluated on an up-to-date dataset. A variety of sequence and motif features were combined, and a two-step feature-selection algorithm was employed to determine the optimal feature subset for presynaptic and postsynaptic neurotoxin prediction. Extensive benchmark tests show that Pippin achieved significantly better predictive performance than six other commonly used machine-learning algorithms: the naive Bayes classifier, the multinomial naive Bayes classifier (MNBC), AdaBoost, Bagging, K-nearest neighbors, and XGBoost. Additionally, we developed an online webserver for Pippin to facilitate public use. To the best of our knowledge, this is the first webserver for presynaptic and postsynaptic neurotoxin prediction.
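The abstract does not list the individual sequence features that were combined. As a minimal sketch of what a sequence-derived feature can look like, the snippet below computes amino acid composition, a widely used encoding in protein classification. Its use here is an assumption for illustration and is not confirmed as part of Pippin; the function name and the toy sequence are hypothetical.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so every vector has the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the 20 relative residue frequencies of a protein sequence.

    Illustrative only: the source does not state that Pippin uses this encoding.
    """
    # Ignore any character that is not one of the 20 standard residues.
    residues = [r for r in sequence.upper() if r in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    counts = Counter(residues)
    total = len(residues)
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical toy sequence, not taken from the Pippin dataset.
features = amino_acid_composition("MKTLLCAAC")
```

A fixed-length vector of this kind could be concatenated with other descriptors and passed to a feature-selection step and a classifier such as a random forest.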
Year
2020
DOI
10.1142/S0219720020500080
Venue
JOURNAL OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY
Keywords
Toxin prediction, sequence analysis, machine learning, random forest, feature selection
DocType
Journal
Volume
18
Issue
2
ISSN
0219-7200
Citations
0
PageRank
0.34
References
0
Authors
6
Name
Order
Citations
PageRank
Pengyu Li101.01
He Zhang200.34
Xuyang Zhao300.34
Cangzhi Jia430.72
Fuyi Li59711.25
Jiangning Song637441.93