Title | ||
---|---|---|
COPE: an accurate k-mer-based pair-end reads connection tool to facilitate genome assembly. |
Abstract | ||
---|---|---|
Motivation: The boost of next-generation sequencing technologies provides us with an unprecedented opportunity for elucidating genetic mysteries, yet the short-read length hinders us from better assembling the genome from scratch. New protocols now exist that can generate overlapping pair-end reads. By joining the 3′ ends of each read pair, one is able to construct longer reads for assembling. However, effectively joining two overlapped pair-end reads remains a challenging task. Result: In this article, we present an efficient tool called Connecting Overlapped Pair-End (COPE) reads, to connect overlapping pair-end reads using k-mer frequencies. We evaluated our tool on 30× simulated pair-end reads from Arabidopsis thaliana with 1% base error. COPE connected over 99% of reads with 98.8% accuracy, which is, respectively, 10 and 2% higher than the recently published tool FLASH. When COPE is applied to real reads for genome assembly, the resulting contigs are found to have fewer errors and give a 14-fold improvement in the N50 measurement when compared with the contigs produced using unconnected reads. |
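
The idea of joining a read pair at its overlapping 3′ ends can be illustrated with a minimal Python sketch. This is not COPE's algorithm: the abstract says COPE uses k-mer frequencies to connect reads, whereas the sketch below uses a naive longest-overlap search with a mismatch threshold. The function names (`reverse_complement`, `connect_pair`) and the parameters `min_overlap` and `max_mismatch_rate` are hypothetical and chosen only for illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def connect_pair(read1, read2, min_overlap=10, max_mismatch_rate=0.1):
    """Naively join two overlapping pair-end reads into one longer read.

    read2 is assumed to come from the opposite strand, so it is
    reverse-complemented first. Overlap lengths are tried from longest to
    shortest; the first one whose mismatch count is within the allowed
    rate is accepted. Returns None if no acceptable overlap is found.
    (Illustrative assumption: COPE itself relies on k-mer frequencies.)
    """
    mate = reverse_complement(read2)
    longest = min(len(read1), len(mate))
    for overlap in range(longest, min_overlap - 1, -1):
        tail = read1[-overlap:]   # 3' end of read 1
        head = mate[:overlap]     # start of the reoriented read 2
        mismatches = sum(a != b for a, b in zip(tail, head))
        if mismatches <= max_mismatch_rate * overlap:
            # Keep read 1's bases in the overlap, then append the rest of the mate.
            return read1 + mate[overlap:]
    return None


# Usage: two 20-base reads sampled from the ends of a 30-base fragment.
fragment = "ACGTTGCAAGGCTTAACCGGATCGATTACG"
read1 = fragment[:20]
read2 = reverse_complement(fragment[10:])
connected = connect_pair(read1, read2)
```

Accepting the first overlap that passes a mismatch threshold is the simplest possible rule; it ignores base qualities and can pick a wrong overlap in repetitive sequence, which is the kind of ambiguity that makes joining overlapped reads a challenging task.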
Year | DOI | Venue |
---|---|---|
2012 | 10.1093/bioinformatics/bts563 | BIOINFORMATICS |
Field | DocType | Volume |
Genome,Data mining,Hybrid genome assembly,Computer science,Genomics,Contig,Contig Mapping,Bioinformatics,Sequence assembly,k-mer | Journal | 28 |
Issue | ISSN | Citations |
22 | 1367-4803 | 12 |
PageRank | References | Authors |
1.04 | 3 | 11 |
Name | Order | Citations | PageRank |
---|---|---|---|
Binghang Liu | 1 | 45 | 3.43 |
Jianying Yuan | 2 | 45 | 3.43 |
Siu-ming Yiu | 3 | 1026 | 92.90 |
Zhenyu Li | 4 | 45 | 3.43 |
Yinlong Xie | 5 | 23 | 1.67 |
Yanxiang Chen | 6 | 51 | 6.23 |
Yujian Shi | 7 | 45 | 3.43 |
Hao Zhang | 8 | 56 | 4.88 |
Yingrui Li | 9 | 554 | 72.28 |
Tak-Wah Lam | 10 | 1860 | 164.96 |
Ruibang Luo | 11 | 113 | 9.92 |