Title
Linkage disequilibrium maps to guide contig ordering for genome assembly.
Abstract
Motivation: Efforts to establish reference genome sequences by de novo sequence assembly have to address the difficulty of linking relatively short sequence contigs to form much larger chromosome assemblies. Efficient strategies are required to span gaps and to establish contig order and relative orientation. We consider here the use of linkage disequilibrium (LD) maps of sequenced contigs and the utility of LD for ordering, orienting and positioning linked sequences. LD maps are readily constructed from population data and have at least an order of magnitude higher resolution than linkage maps, giving them the potential to resolve difficult areas in assemblies. We empirically evaluate an LD map-based method using single nucleotide polymorphism genotype data in a 216 kilobase region of human 6p21.3 from which three shorter contigs are formed.

Results: LD map length is the most informative measure of correct order and orientation: the correct arrangement is suggested by the shortest LD map for which the residual error variance is close to one. For regions in strong LD, this method may be less informative for correcting inverted contigs than for identifying correct contig orders. For positioning two contigs that are in linkage disequilibrium with each other, the inter-contig distance may be roughly estimated by this method.
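The search described in the abstract can be sketched as follows: enumerate every order and orientation of the contigs and keep the arrangement with the shortest map. This is only an illustration under stated assumptions. The function names, the marker positions and `toy_map_length` are all hypothetical; in particular, `toy_map_length` is a stand-in score on made-up coordinates, not the LD map length that the authors estimate from genotype data, and the residual error variance check is not modelled.

```python
from itertools import permutations, product


def candidate_arrangements(contigs):
    """Yield every order and orientation of the contigs, skipping mirror images.

    Each arrangement is a tuple of (name, is_reversed) pairs. An arrangement
    and its whole-region reversal describe the same assembly, so only the
    first member of each mirror pair is yielded.
    """
    seen = set()
    for order in permutations(contigs):
        for flips in product((False, True), repeat=len(order)):
            arrangement = tuple(zip(order, flips))
            mirror = tuple((name, not flip) for name, flip in reversed(arrangement))
            if mirror in seen:
                continue
            seen.add(arrangement)
            yield arrangement


def shortest_map(contigs, map_length):
    """Return the arrangement that minimises the supplied map-length function."""
    return min(candidate_arrangements(contigs), key=map_length)


# Hypothetical marker coordinates for three contigs (illustrative values only).
TOY_MARKERS = {
    "A": [0, 10, 20],
    "B": [30, 40, 50],
    "C": [60, 70, 80],
}


def toy_map_length(arrangement):
    """Stand-in score: summed distance between adjacent markers in the arrangement.

    A real analysis would instead fit an LD map to population genotype data
    for each arrangement and use the fitted map length.
    """
    markers = []
    for name, is_reversed in arrangement:
        positions = TOY_MARKERS[name]
        markers.extend(reversed(positions) if is_reversed else positions)
    return sum(abs(b - a) for a, b in zip(markers, markers[1:]))


best = shortest_map(["A", "B", "C"], toy_map_length)
print(best)
```

For three contigs there are 3! x 2^3 = 48 arrangements, which reduce to 24 once mirror images are merged, so exhaustive enumeration is cheap at this scale. Larger contig sets would need a heuristic search instead.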
Year: 2019
DOI: 10.1093/bioinformatics/bty687
Venue: BIOINFORMATICS
Field: Data mining, Linkage disequilibrium, Computer science, Contig, Computational biology, Sequence assembly
DocType: Journal
Volume: 35
Issue: 4
ISSN: 1367-4803
Citations: 0
PageRank: 0.34
References: 3
Authors: 2
Name               Order  Citations  PageRank
Reuben J Pengelly  1      2          1.10
Andrew Collins     2      16         1.85