Abstract |
---|
Motivation: The variation graph toolkit (VG) represents genetic variation as a graph. Although each path in the graph is a potential haplotype, most paths are non-biological, unlikely recombinations of true haplotypes. Results: We augment the VG model with haplotype information to identify which paths are more likely to exist in nature. For this purpose, we develop a scalable implementation of the graph extension of the positional Burrows-Wheeler transform. We demonstrate the scalability of the new implementation by building a whole-genome index of the 5008 haplotypes of the 1000 Genomes Project, and an index of all 108 070 Trans-Omics for Precision Medicine Freeze 5 chromosome 17 haplotypes. We also develop an algorithm for simplifying variation graphs for k-mer indexing without losing any k-mers in the haplotypes. |
Year | DOI | Venue |
---|---|---|
2018 | 10.1093/bioinformatics/btz575 | BIOINFORMATICS |
DocType | Volume | Issue |
---|---|---|
Conference | 36 | 2 |
ISSN | Citations | PageRank |
---|---|---|
1367-4803 | 1 | 0.35 |
References | Authors |
---|---|
0 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jouni Sirén | 1 | 222 | 14.85 |
Erik Garrison | 2 | 10 | 2.57 |
Adam Novak | 3 | 16 | 3.74 |
Benedict Paten | 4 | 266 | 24.52 |
Richard Durbin | 5 | 6203 | 1201.66 |