Family reunion via error correction: an efficient analysis of duplex sequencing data. - Citegraph

Paper Info

Title
Family reunion via error correction: an efficient analysis of duplex sequencing data.

Abstract
Duplex sequencing is the most accurate approach for identification of sequence variants present at very low frequencies. Its power comes from pooling together multiple descendants of both strands of original DNA molecules, which allows distinguishing true nucleotide substitutions from PCR amplification and sequencing artifacts. This strategy comes at a cost: sequencing the same molecule multiple times increases dynamic range but significantly diminishes coverage, making whole-genome duplex sequencing prohibitively expensive. Furthermore, every duplex experiment produces a substantial proportion of singleton reads that cannot be used in the analysis and are thrown away. In this paper we demonstrate that a significant fraction of these reads contains PCR or sequencing errors within duplex tags. Correction of such errors allows “reuniting” these reads with their respective families, increasing the output of the method and making it more cost-effective. We combine an error correction strategy with a number of algorithmic improvements in a new version of the duplex analysis software, Du Novo 2.0. It is written in Python, C, AWK, and Bash. It is open source and readily available through Galaxy, Bioconda, and GitHub: https://github.com/galaxyproject/dunovo.
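
The idea of "reuniting" singleton reads with their families can be illustrated with a minimal sketch. This is not Du Novo's actual algorithm or API: the function names `hamming` and `reunite`, the `max_dist` threshold, and the example barcodes are all hypothetical, and the sketch assumes that tags are equal-length strings and that a singleton is reassigned only when exactly one larger family lies within the mismatch threshold.

```python
from collections import Counter


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length barcodes."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def reunite(barcodes: list[str], max_dist: int = 1) -> dict[str, str]:
    """Map each observed barcode to a (possibly corrected) barcode.

    Hypothetical rule for illustration only: a barcode seen once is
    reassigned to a multi-read family if exactly one such family is
    within max_dist mismatches; otherwise it is left unchanged.
    """
    counts = Counter(barcodes)
    families = [b for b, n in counts.items() if n > 1]
    correction = {}
    for barcode, n in counts.items():
        correction[barcode] = barcode
        if n > 1:
            continue  # already part of a family
        candidates = [f for f in families if hamming(barcode, f) <= max_dist]
        if len(candidates) == 1:  # skip ambiguous matches
            correction[barcode] = candidates[0]
    return correction


# Made-up barcodes: "ACGTAT" differs from the "ACGTAC" family by one base.
reads = ["ACGTAC", "ACGTAC", "ACGTAC", "ACGTAT", "TTGCAA", "TTGCAA", "GGGGGG"]
mapping = reunite(reads)
corrected = [mapping[b] for b in reads]
family_sizes = Counter(corrected)
```

Here the singleton `ACGTAT` joins the `ACGTAC` family, while `GGGGGG` has no nearby family and remains a singleton. The pairwise comparison is quadratic in the number of distinct barcodes, so it is suitable only as an illustration on small inputs.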

Field	Value
Year	2020
DOI	10.1186/s12859-020-3419-8
Venue	BMC Bioinformatics
Keywords	Duplex sequence, Low frequency variants, Barcodes, Error correction
DocType	Journal
Volume	21
Issue	1
ISSN	1471-2105
Citations	0
PageRank	0.34
References	0
Authors	8

Authors (8 rows)

Name	Order	Citations	PageRank
Nicholas Stoler	1	0	0.34
Barbara Arbeithuber	2	0	0.34
Gundula Povysil	3	0	0.34
Monika Heinzl	4	0	0.34
Renato Salazar	5	0	0.34
Kateryna D Makova	6	32	3.38
Irene Tiemann-Boege	7	0	0.34
Anton Nekrutenko	8	2	1.49
