Title
A new method for DNA sequencing error verification and correction via an on-disk index tree
Abstract
Existing sequencing error correction techniques require large amounts of expensive memory. In this work, we introduce a new disk-based sequencing error correction method to address this problem. The key idea is to use a special on-disk index structure, called the BoND-tree, to store and access a large set of k-mers and their associated metadata on disk. With the BoND-tree, a set of special box queries that retrieve the relevant k-mers and their counts can be processed efficiently. A comprehensive voting mechanism is then used to identify and correct an erroneous base in a genome sequence. Experiments demonstrate that the proposed method is promising in terms of both accuracy and scalability for verifying and correcting sequencing errors.
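The pipeline outlined in the abstract (k-mer counts, box queries, voting) can be illustrated with a minimal sketch. It is not the authors' implementation: an in-memory Counter stands in for the on-disk BoND-tree, and the function names, the single-position form of the box query, and the min_ratio threshold are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter

BASES = "ACGT"

def count_kmers(reads, k):
    """Count every k-mer in the reads (in-memory stand-in for the on-disk index)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def box_query(counts, kmer, pos):
    """Assumed box query: counts of all k-mers equal to `kmer` except possibly at `pos`."""
    return {b: counts.get(kmer[:pos] + b + kmer[pos + 1:], 0) for b in BASES}

def vote_base(counts, read, i, k):
    """Sum, per candidate base, the counts from every k-mer window covering position i."""
    votes = Counter()
    first = max(0, i - k + 1)
    last = min(i, len(read) - k)
    for start in range(first, last + 1):
        kmer = read[start:start + k]
        for base, c in box_query(counts, kmer, i - start).items():
            votes[base] += c
    return votes

def correct_read(counts, read, k, min_ratio=2.0):
    """Replace a base when another base collects clearly more votes (hypothetical rule)."""
    corrected = list(read)
    for i in range(len(read)):
        votes = vote_base(counts, read, i, k)
        if not votes:  # read shorter than k: nothing to vote on
            continue
        best, best_votes = max(votes.items(), key=lambda kv: kv[1])
        if best != read[i] and best_votes >= min_ratio * max(votes[read[i]], 1):
            corrected[i] = best
    return "".join(corrected)

# Usage: five clean copies of a read plus one copy with a single substitution.
reads = ["ACGTTGCAAGCT"] * 5 + ["ACGTTCCAAGCT"]
counts = count_kmers(reads, 4)
print(correct_read(counts, "ACGTTCCAAGCT", 4))  # prints ACGTTGCAAGCT
```

In this sketch each box query touches only four candidate k-mers, which is the kind of narrow range lookup an index tree can serve from disk without loading all counts into memory.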
Year
2015
Venue
BCB
Field
Metadata, Data mining, Voting, Computer science, Error detection and correction, Whole genome sequencing, DNA sequencing, Bioinformatics, Scalability
DocType
Conference
ISBN
978-1-4503-3853-0
Citations
0
PageRank
0.34
References
3
Authors
6
Name | Order | Citations | PageRank
Yarong Gu | 1 | 0 | 0.68
Xianying Liu | 2 | 0 | 0.34
Qiang Zhu | 3 | 398 | 60.85
Youchao Dong | 4 | 0 | 0.34
C. Titus Brown | 5 | 137 | 15.25
Sakti Pramanik | 6 | 770 | 204.19