Title
Complex genome assembly based on long-read sequencing
Abstract
High-quality chromosome-scale genome sequences provide an important basis for downstream genomic analysis. Haplotype-resolved, complete genomes in particular play a key role in genome annotation, mutation detection, evolutionary analysis, gene function research, comparative genomics and related areas. However, whole-genome short-read sequencing struggles to produce a complete assembly for complex genomes with high levels of duplication and heterozygosity. The emergence of long-read sequencing technology has greatly improved the completeness of complex genome assembly. We review a variety of computational methods for complex genome assembly and describe in detail the theories, innovations and shortcomings of collapsed, semi-collapsed and uncollapsed assemblers based on long reads. Among the three methods, uncollapsed assembly is the most correct and complete way to represent genomes. In addition, genome assembly is closely related to haplotype reconstruction: uncollapsed assembly achieves haplotype reconstruction, and haplotype reconstruction in turn promotes uncollapsed assembly. We hope that gapless, telomere-to-telomere and accurate assembly of complex genomes can become truly routine in the future, using only a simple process or a single tool.
Year
2022
DOI
10.1093/BIB/BBAC305
Venue
Briefings in Bioinformatics
Keywords
genome assembly, haplotype, long-read sequencing
DocType
Journal
Volume
23
Issue
5
ISSN
1477-4054
Citations
0
PageRank
0.34
References
0
Authors
6
Name              Order   Citations   PageRank
Tianjiao Zhang    1       0           0.68
Jie Zhou          2       2103        190.17
Wentao Gao        3       0           0.34
Yuran Jia         4       0           0.68
Yanan Wei         5       0           0.34
Guohua Wang       6       235         17.66