Abstract | ||
---|---|---|
Summary: The ability to generate high-quality genome sequences is a cornerstone of modern biological research. Even with recent advancements in sequencing technologies, many genome assemblies still do not reach reference-grade quality. Here, we introduce ntJoin, a tool that leverages structural synteny between a draft assembly and reference sequence(s) to contiguate and correct the former with respect to the latter. Instead of alignments, ntJoin uses a lightweight mapping approach based on a graph data structure generated from ordered minimizer sketches. The tool can be used in a variety of applications, including improving a draft assembly with a reference-grade genome, a short-read assembly with a draft long-read assembly, and a draft assembly with an assembly from a closely related species. When scaffolding a human short-read assembly using the reference human genome or a long-read assembly, ntJoin improves the NGA50 length 23- and 13-fold, respectively, in under 13 min, using <11 GB of RAM. Compared to existing reference-guided scaffolders, ntJoin generates highly contiguous assemblies faster and using less memory. |
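
The abstract mentions "ordered minimizer sketches" without defining them. As a rough illustration only, the sketch below computes one for a single sequence: for each window of `w` consecutive k-mers it keeps the k-mer with the smallest hash, and reports the selected k-mers in sequence order. The function names, the use of BLAKE2b as the hash, and the parameters `k` and `w` are assumptions made for this example; this is not ntJoin's actual implementation, hash function, or parameter choice.

```python
import hashlib


def kmer_hash(kmer):
    """Deterministic 64-bit hash of a k-mer (illustrative choice, not ntJoin's)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def ordered_minimizers(seq, k, w):
    """Return [(position, k-mer), ...] minimizers of seq in sequence order.

    Each window of w consecutive k-mers contributes its smallest-hash k-mer
    (leftmost on ties); consecutive windows that pick the same position are
    collapsed into a single entry.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    hashes = [kmer_hash(km) for km in kmers]
    sketch = []
    for start in range(len(kmers) - w + 1):
        # Position of the minimum hash in this window, leftmost on ties.
        pos = min(range(start, start + w), key=lambda i: (hashes[i], i))
        if not sketch or sketch[-1][0] != pos:
            sketch.append((pos, kmers[pos]))
    return sketch


if __name__ == "__main__":
    for pos, kmer in ordered_minimizers("ACGTTGCATGCCGTAAGCTTAGC", k=5, w=4):
        print(pos, kmer)
```

Two sequences that share a long enough stretch will select the same minimizers within it, in the same order, which is what makes such sketches usable as a lightweight substitute for alignment.
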
Year | DOI | Venue |
---|---|---|
2020 | 10.1093/bioinformatics/btaa253 | BIOINFORMATICS |
DocType | Volume | Issue |
---|---|---|
Journal | 36 | 12 |
ISSN | Citations | PageRank |
---|---|---|
1367-4803 | 0 | 0.34 |
References | Authors |
---|---|
0 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Lauren Coombe | 1 | 3 | 1.80 |
Vladimir Nikolić | 2 | 0 | 1.01 |
Justin Chu | 3 | 11 | 4.70 |
Inanç Birol | 4 | 4 | 3.83 |
René L. Warren | 5 | 95 | 25.03 |