Abstract |
---|
The string graph for a collection of next-generation reads is a lossless data representation that is fundamental for de novo assemblers based on the overlap-layout-consensus paradigm. In this paper, we explore a novel approach to computing the string graph, based on the FM-index and the Burrows-Wheeler Transform (BWT). We describe a simple algorithm, FSG, that uses only the FM-index representation of the collection of reads to construct the string graph, without accessing the input reads. FSG has been integrated into the SGA assembler as a stand-alone module to construct the string graph. The integrated assembler has been assessed on a standard benchmark, showing that FSG is significantly faster than SGA at this step while maintaining a moderate use of main memory, and that running FSG on multiple threads gives practical advantages. |
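
The abstract relies on querying an FM-index of the read collection instead of the reads themselves. The following is a minimal Python sketch of that underlying idea, not the FSG algorithm: it builds a BWT for a few toy reads with a naive suffix sort and counts occurrences of a pattern by backward search. The function names, the `$` separator, and the toy reads are assumptions made for illustration only, and the pattern must not contain `$`.

```python
def bwt_from_text(text):
    """Return the BWT of text using a naive (quadratic) suffix sort."""
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # The character preceding each sorted suffix (wraps around for i == 0).
    return "".join(text[i - 1] for i in sa)


def build_fm_index(bwt):
    """Build the C array and the occurrence table for a BWT string."""
    alphabet = sorted(set(bwt))
    # C[c] = number of characters in the text strictly smaller than c.
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += bwt.count(c)
    # occ[c][i] = number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return C, occ


def backward_search(pattern, C, occ, n):
    """Count occurrences of pattern using only the FM-index."""
    lo, hi = 0, n  # half-open interval of sorted-suffix ranks
    for ch in reversed(pattern):
        if ch not in C:
            return 0
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


def count_in_reads(reads, pattern):
    """Hypothetical helper: index the reads and count pattern occurrences."""
    # Assumption: reads are joined with '$', which sorts before A, C, G, T.
    text = "".join(r + "$" for r in reads)
    bwt = bwt_from_text(text)
    C, occ = build_fm_index(bwt)
    return backward_search(pattern, C, occ, len(bwt))


reads = ["ACGT", "CGTA", "GTAC"]
print(count_in_reads(reads, "CGT"))
```

Once the index is built, the search touches only `C` and `occ`, never the read strings, which is the property the abstract highlights. A production index would use a linear-time suffix array construction and a compressed occurrence structure rather than the naive tables used here.
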

Field | Value |
---|---|
Year | 2016 |
DOI | 10.1007/978-3-319-38782-6_3 |
Venue | Bioinformatics Research and Applications, ISBRA 2016 |
DocType | Conference |
Volume | 9683 |
ISSN | 0302-9743 |
Citations | 0 |
PageRank | 0.34 |
References | 18 |
Authors | 5 |

Name | Order | Citations | PageRank |
---|---|---|---|
Paola Bonizzoni | 1 | 502 | 52.23 |
Gianluca Della Vedova | 2 | 342 | 36.39 |
Yuri Pirola | 3 | 128 | 15.79 |
Marco Previtali | 4 | 22 | 5.45 |
Raffaella Rizzi | 5 | 130 | 13.58 |