Title
ACO: lossless quality score compression based on adaptive coding order
Abstract
With the rapid development of high-throughput sequencing technology, the cost of whole-genome sequencing has dropped rapidly, leading to exponential growth of genome data. How to efficiently compress the DNA data generated by large-scale genome projects has become an important factor restricting the further development of the DNA sequencing industry. Although the compression of DNA bases has improved significantly in recent years, the compression of quality scores remains challenging. In this paper, by reinvestigating the inherent correlations between the quality score and the sequencing process, we propose a novel lossless quality score compressor based on adaptive coding order (ACO). The main objective of ACO is to traverse the quality scores adaptively along the most correlated trajectory according to the sequencing process. Combined with adaptive arithmetic coding and an improved in-context strategy, ACO achieves state-of-the-art quality score compression performance with moderate complexity for next-generation sequencing (NGS) data. This makes ACO a candidate tool for quality score compression. ACO has been adopted by AVS (Audio Video coding Standard Workgroup of China) and is freely available at https://github.com/Yoniming/ACO.
Year
2022
DOI
10.1186/s12859-022-04712-z
Venue
BMC Bioinformatics
Keywords
High-throughput sequencing, Quality score compression, Lossless compression, Adaptive coding order
DocType
Journal
Volume
23
Issue
1
ISSN
1471-2105
Citations
0
PageRank
0.34
References
7
Authors
5
Name            Order  Citations  PageRank
Yi Niu          1      0          0.34
Mingming Ma     2      0          0.34
Fu Li           3      0          0.34
Xianming Liu    4      461        47.55
Guangming Shi   5      2663       184.81