Title
Direct AUC optimization of regulatory motifs.
Abstract
Motivation: The discovery of transcription factor binding site (TFBS) motifs is essential for untangling the complex mechanisms of genetic variation under different developmental and environmental conditions. Among the many computational approaches for de novo identification of TFBS motifs, discriminative motif learning (DML) methods have proven promising for harnessing the large and growing body of high-throughput binding data. However, they have to sacrifice accuracy for speed and can fail to fully utilize the information in the input sequences.
Results: We propose a novel algorithm called CDAUC for optimizing DML-learned motifs based on the area under the receiver operating characteristic curve (AUC) criterion, which has been widely used in the literature to evaluate the significance of extracted motifs. We show that when the considered AUC loss function is optimized in a coordinate-wise manner, the cost function of each resulting sub-problem is a piecewise constant function whose optimal value can be found exactly and efficiently. Further, a key step of each iteration of CDAUC can be solved efficiently as a computational geometry problem. Experimental results on real-world high-throughput datasets illustrate that CDAUC outperforms competing methods for refining DML motifs while being one order of magnitude faster. Preliminary results also suggest that CDAUC may be useful for improving the interpretability of convolutional kernels generated by emerging deep learning approaches for predicting TF sequence specificities.
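The coordinate-wise idea described in the abstract can be illustrated with a small sketch. This is not CDAUC itself: it assumes a plain linear scorer over fixed feature vectors instead of a motif scanned along a sequence, and it evaluates candidates by brute force instead of the computational geometry step the authors mention. The function names (`score`, `auc`, `optimize_coordinate`, `coordinate_descent_auc`) are hypothetical. What it does show is why each one-dimensional sub-problem is piecewise constant: with all other weights fixed, every positive/negative pair changes its ranking at no more than one value of the free weight, so testing one point per interval between those breakpoints finds the exact optimum.

```python
from itertools import product


def score(w, x):
    """Linear score of feature vector x under weights w (an assumed scorer)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def auc(w, pos, neg):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    total = 0.0
    for p, n in product(pos, neg):
        sp, sn = score(w, p), score(w, n)
        total += 1.0 if sp > sn else 0.5 if sp == sn else 0.0
    return total / (len(pos) * len(neg))


def optimize_coordinate(w, k, pos, neg):
    """Exactly maximise AUC over w[k] while all other weights stay fixed."""
    w = list(w)
    # For each pair the score difference is a + t * b, with t the free weight.
    # Its sign can only change at t = -a / b, so AUC is piecewise constant in t.
    breakpoints = set()
    for p, n in product(pos, neg):
        b = p[k] - n[k]
        if b != 0:
            a = score(w, p) - score(w, n) - w[k] * b
            breakpoints.add(-a / b)
    pts = sorted(breakpoints)
    candidates = []
    if pts:
        # One representative per interval: both unbounded ends plus midpoints.
        candidates = [pts[0] - 1.0, pts[-1] + 1.0]
        candidates += [(lo + hi) / 2.0 for lo, hi in zip(pts, pts[1:])]
    # Start from the current value so the AUC can never decrease.
    best_t, best_auc = w[k], auc(w, pos, neg)
    for t in candidates:
        trial = w[:k] + [t] + w[k + 1:]
        val = auc(trial, pos, neg)
        if val > best_auc:
            best_t, best_auc = t, val
    return best_t, best_auc


def coordinate_descent_auc(w, pos, neg, sweeps=5):
    """Cycle through the coordinates until a full sweep brings no change."""
    w = list(w)
    for _sweep in range(sweeps):
        changed = False
        for k in range(len(w)):
            t, _val = optimize_coordinate(w, k, pos, neg)
            if t != w[k]:
                w[k] = t
                changed = True
        if not changed:
            break
    return w


if __name__ == "__main__":
    pos = [(1.0, 0.0), (2.0, 1.0)]  # toy positive examples (made up)
    neg = [(0.0, 1.0), (1.0, 2.0)]  # toy negative examples (made up)
    w = coordinate_descent_auc([0.0, 1.0], pos, neg)
    print(w, auc(w, pos, neg))
```

Breakpoints themselves need not be tested: at a breakpoint the tied pairs each count 0.5, which is the average of the two neighbouring intervals, so one of those neighbours is always at least as good. The brute-force evaluation here costs one full AUC computation per interval, which is only practical for toy inputs.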
Year
2017
DOI
10.1093/bioinformatics/btx255
Venue
BIOINFORMATICS
Field
Data mining, Interpretability, DNA binding site, Computer science, Computational geometry, Constant function, Artificial intelligence, Deep learning, Discriminative model
DocType
Journal
Volume
33
Issue
14
ISSN
1367-4803
Citations
1
PageRank
0.36
References
16
Authors
3
Name, Order, Citations, PageRank
Lin Zhu, 1, 74, 4.93
Hong-bo Zhang, 2, 4, 1.47
De-Shuang Huang, 3, 5532, 357.50