Title | ||
---|---|---|
Population-scale Genomic Data Augmentation Based on Conditional Generative Adversarial Networks |
Abstract | ||
---|---|---|
Although next-generation sequencing technologies have made it possible to quickly generate large collections of sequences, current genomic data still suffer from small sample sizes, imbalances, and biases due to factors including disease rarity, test affordability, and concerns about privacy and security. To address these limitations, we develop Population-scale Genomic Data Augmentation based on Conditional Generative Adversarial Networks (PG-cGAN), which enhances the amount and diversity of genomic data by transforming samples already in the data rather than collecting new samples. Both the generator and the discriminator in PG-cGAN are built from stacked convolutional layers to capture the underlying population structure. Our results for augmenting genotypes in human leukocyte antigen (HLA) regions show that PG-cGAN can generate new genotypes with similar population structure, variant frequency distributions, and linkage disequilibrium (LD) patterns. Since the input to PG-cGAN is the original genomic data, without assumptions about prior knowledge, it can be extended to enrich many other types of biomedical data and beyond. |
Year | DOI | Venue |
---|---|---|
2020 | 10.1145/3388440.3412475 | BCB |
DocType | Citations | PageRank |
---|---|---|
Conference | 0 | 0.34 |
References | Authors | |
---|---|---|
0 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Junjie Chen | 1 | 0 | 0.34 |
Mohammad Erfan Mowlaei | 2 | 0 | 0.34 |
Xinghua Shi | 3 | 209 | 19.00 |