Title
Multidimensional data organization and random access in large-scale DNA storage systems
Abstract
With impressive physical density and molecular-scale coding capacity, DNA is a promising substrate for building long-lasting data archival storage systems. To retrieve data from DNA storage, recent implementations typically rely on large libraries of meticulously designed orthogonal PCR primers, which fundamentally limit the capacity and scalability of practical DNA storage. This work combines nested and semi-nested PCR to enable multidimensional data organization and random access in large DNA storage. Our strategy effectively pushes the limit of DNA storage capacity and dramatically reduces the number of orthogonal primers needed for efficient PCR random access. Our design uses only k * n primers to uniquely address n^k data-encoding oligos. The architecture inherently supports various well-defined PCR random-access patterns that can be tailored to organize and preserve the underlying DNA-encoded data structures and relations in simple database-like formats such as rows, columns, tables, and blocks of data entries. We design in silico PCR experiments on a four-dimensional DNA storage system to illustrate the mechanisms of sixteen different random-access patterns, each requiring no more than two PCR reactions to selectively amplify a target dataset of a given size. To better approximate the physical system, we formulate mathematical models based on empirical distributions to analyze the effect of pipetting, PCR bias, and PCR stochasticity on the performance of multidimensional data queries from large DNA storage. (C) 2021 Elsevier B.V. All rights reserved.
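The primer arithmetic in the abstract (k * n primers addressing n^k oligos, that is, n to the power k) can be illustrated with a small sketch. It assumes each oligo carries one of n primer sites in each of k dimensions, so an address is a k-tuple of indices, and that a query fixes some dimensions and leaves the rest free. The function names and the wildcard convention (`None`) are hypothetical; the sketch does not model nested or semi-nested PCR chemistry, and reading the sixteen access patterns as the 2^4 fixed-or-free choices over four dimensions is an assumption, not something the abstract states.

```python
from itertools import product


def primer_count(n, k):
    """Primers needed: n per dimension, k dimensions."""
    return k * n


def address_space(n, k):
    """Distinct addresses: one of n choices in each of k dimensions."""
    return n ** k


def select(n, k, query):
    """Return every address matched by a query.

    query is a k-tuple; each entry is an index in range(n) that pins
    that dimension, or None to leave the dimension free (assumed convention).
    """
    if len(query) != k:
        raise ValueError("query must have one entry per dimension")
    axes = []
    for q in query:
        if q is None:
            axes.append(range(n))      # free dimension: all n values
        elif 0 <= q < n:
            axes.append([q])           # pinned dimension: one value
        else:
            raise ValueError("index out of range")
    return list(product(*axes))


def access_patterns(k):
    """All fixed-or-free choices over k dimensions (True = pinned)."""
    return list(product([False, True], repeat=k))
```

For example, with n = 10 and k = 4, `primer_count(10, 4)` gives 40 primers for `address_space(10, 4)` = 10000 addresses, and `select(10, 4, (3, None, None, 7))` matches the 100 addresses that share the first and last index.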
Year
2021
DOI
10.1016/j.tcs.2021.09.021
Venue
THEORETICAL COMPUTER SCIENCE
Keywords
DNA storage, Hierarchical memory, Data random access, Nested PCR, Amplification bias, PCR stochasticity
DocType
Journal
Volume
894
ISSN
0304-3975
Citations
0
PageRank
0.34
References
0
Authors
3
Name, Order, Citations, PageRank
Xin Song, 1, 1515.82 (citations and PageRank digits fused in source)
Shalin Shah, 2, 0, 0.34
John H. Reif, 3, 0, 0.34