Title
MaNGA: a novel multi-objective multi-niche genetic algorithm for QSAR modelling.
Abstract
Quantitative structure-activity relationship (QSAR) modelling is currently used in multiple fields to relate the structural properties of compounds to their biological activities. The technique is also used in drug design to predict the parameters that determine drug behaviour. To this end, a sophisticated process involving several analytical steps concatenated in series is employed to identify and fine-tune the optimal set of predictors from a large dataset of molecular descriptors (MDs). The search for the optimal model requires optimizing multiple objectives at the same time, as the aim is to obtain the minimal set of features that maximizes the goodness of fit and the applicability domain (AD). Hence, a multi-objective optimization strategy that improves multiple parameters in parallel can be applied. Here we propose a new multi-niche multi-objective genetic algorithm that simultaneously enables stable feature selection and yields robust, validated regression models with maximized AD. We benchmarked our method on two simulated datasets. Moreover, we analyzed an aquatic acute toxicity dataset and compared the performance of single- and multi-objective fitness functions on different regression models. Our results show that our multi-objective algorithm is a valid alternative to the classical QSAR modelling strategy for continuous response values, since it automatically finds the model with the best compromise between statistical robustness, predictive performance, widest AD, and the smallest number of MDs.
Availability and implementation: The Python implementation of MaNGA is available at https://github.com/Greco-Lab/MaNGA.
Supplementary information: Supplementary data are available at Bioinformatics online.
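The "best compromise" between competing objectives mentioned in the abstract is commonly formalised as Pareto dominance. The following is a minimal sketch of that idea only, not MaNGA's actual implementation: the `Candidate` type, the three objectives chosen (number of descriptors, R², AD coverage) and all numeric values are illustrative assumptions, not results from the paper.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """A hypothetical candidate QSAR model (all values are made up)."""
    name: str
    n_descriptors: int   # objective to minimise
    r2: float            # goodness of fit, objective to maximise
    ad_coverage: float   # fraction of compounds inside the AD, to maximise


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if `a` is no worse than `b` on every objective
    and strictly better on at least one."""
    no_worse = (
        a.n_descriptors <= b.n_descriptors
        and a.r2 >= b.r2
        and a.ad_coverage >= b.ad_coverage
    )
    strictly_better = (
        a.n_descriptors < b.n_descriptors
        or a.r2 > b.r2
        or a.ad_coverage > b.ad_coverage
    )
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Illustrative candidates; the numbers are invented for this example.
candidates = [
    Candidate("A", 5, 0.80, 0.90),
    Candidate("B", 8, 0.85, 0.95),
    Candidate("C", 8, 0.78, 0.85),  # worse than A on every objective
    Candidate("D", 3, 0.70, 0.92),
]

front = pareto_front(candidates)
print([c.name for c in front])
```

In this sketch, a multi-objective genetic algorithm would use such a dominance test to rank individuals, whereas a single-objective fitness function would collapse the objectives into one score before comparing them.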
Year: 2020
DOI: 10.1093/bioinformatics/btz521
Venue: BIOINFORMATICS
Field: Quantitative structure–activity relationship, Data mining, Computer science, Niche, Genetic algorithm
DocType: Journal
Volume: 36
Issue: 1
ISSN: 1367-4803
Citations: 0
PageRank: 0.34
References: 0
Authors: 5
Name              Order  Citations  PageRank
Angela Serra      1      17         6.86
Serli Önlü        2      0          0.34
Paola Festa       3      287        25.32
Vittorio Fortino  4      0          1.01
Dario Greco       5      56         7.88