NASCUP: Nucleic Acid Sequence Classification by Universal Probability

Paper Info

Title
NASCUP: Nucleic Acid Sequence Classification by Universal Probability

Abstract
Motivated by the need for fast and accurate classification of unlabeled nucleotide sequences on a large scale, we propose a new classification method that captures the probabilistic structure of a sequence family as a compact context-tree model and uses it efficiently to test proximity and membership of a query sequence. The proposed nucleic acid sequence classification by universal probability (NASCUP) method crucially utilizes the notion of universal probability from information theory in model-building and classification processes, delivering BLAST-like accuracy in orders-of-magnitude reduced runtime for large-scale databases. A comprehensive experimental study involving seven public databases for functional non-coding RNA classification and microbial taxonomy classification demonstrates the advantages of NASCUP over widely-used alternatives in efficiency, accuracy, and scalability across all datasets considered. [availability: http://data.snu.ac.kr/nascup]
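The abstract describes building one probabilistic context model per sequence family and assigning a query to the family under which it is most probable. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes a fixed-depth context (rather than a full context tree), the Krichevsky-Trofimov add-1/2 estimate as the probability assignment, and input restricted to the symbols A, C, G, T. The names `train_counts`, `log_prob`, `classify`, and the parameter `depth` are hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"  # assumption: sequences contain only these four symbols


def train_counts(sequences, depth=3):
    """Count next-symbol occurrences for every length-`depth` context in a family."""
    counts = defaultdict(lambda: dict.fromkeys(ALPHABET, 0))
    for seq in sequences:
        for i in range(depth, len(seq)):
            counts[seq[i - depth:i]][seq[i]] += 1
    return counts


def log_prob(query, counts, depth=3):
    """Log2-probability of `query` using the add-1/2 (KT-style) estimate
    computed from frozen family counts. Unseen contexts fall back to uniform."""
    total = 0.0
    for i in range(depth, len(query)):
        ctx = counts.get(query[i - depth:i]) or {}
        n_sym = ctx.get(query[i], 0)
        n_all = sum(ctx.values())
        # KT estimate for a 4-letter alphabet: (n_sym + 1/2) / (n_all + 4/2)
        total += math.log2((n_sym + 0.5) / (n_all + len(ALPHABET) / 2))
    return total


def classify(query, models, depth=3):
    """Return the family name whose model gives the query the highest log-probability."""
    return max(models, key=lambda name: log_prob(query, models[name], depth))


# Toy families, invented purely for illustration.
families = {
    "at_rich": ["ATATATATATATAT", "AATTAATTAATTAT"],
    "gc_rich": ["GCGCGCGCGCGCGC", "GGCCGGCCGGCCGC"],
}
models = {name: train_counts(seqs) for name, seqs in families.items()}
print(classify("ATATATAT", models))
```

Scoring a query takes a single pass over its symbols per family model, which hints at why a model-based approach can be faster than alignment against every database sequence; the actual NASCUP model and scoring rule are described in the paper itself.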

Year	Venue	Field
2015	CoRR	Information theory, Data mining, Anomaly detection, Computer science, Nucleic acid sequence, Bioinformatics
DocType	Volume	Citations
Journal	abs/1511.04944	0
PageRank	References	Authors
0.34	1	5

Cited by (0 rows)

References (1 row)

Authors (5 rows)

Name	Order	Citations	PageRank
Sunyoung Kwon	1	9	2.31
Gyuwan Kim	2	1	2.04
Byunghan Lee	3	110	7.98
Sungroh Yoon	4	566	78.80
Young-Han Kim	5	318	48.11
