Title
VarSimLab: A Docker-based Pipeline to Automatically Synthesize Short Reads with Genomic Aberrations
Abstract
Individuals of a species share similar characteristics but are rarely identical because of genomic variation. One important class of genomic variation is structural variation (SV), which includes copy number variation (CNV), the result of amplifications or deletions of genomic regions. SV has been shown to play an important role in phenotypic diversity and evolution. A genome also harbors other aberrations, such as single nucleotide polymorphisms (SNPs) and small insertions and deletions (indels). Although genetic variations contribute to our uniqueness, they can encompass critical developmental genes, leading to gene dosage imbalances, the creation of new genes, and the reshaping of gene structures, which may ultimately result in disease. Understanding the mechanisms by which structural variations form helps us better understand human phenotypic diversity, evolution, and disease susceptibility. Computational tools have been developed to detect genomic variation from next-generation sequencing (NGS) data. However, because the true variants in real samples are not known in advance, evaluating these detection and analysis tools has been hindered by the lack of a gold-standard benchmark. Some multi-variant simulators, such as SInC and SCNVSim, have been developed for whole genome sequencing (WGS) data. However, they are not easy to use and require technical skills to run. Moreover, those simulators only apply genomic variations to a reference file; other software tools, such as the ART simulator, must then be used to generate the sequenced short reads. We have developed a user-friendly automated pipeline, VarSimLab, which offers an integrated web-based suite to simulate structural variations and to generate WGS and whole exome sequencing (WES) short reads. It uses some of the existing tools and packages them into a standard Docker image. Docker is an open-source technology for packaging applications and their dependencies into a standardized software container.
VarSimLab automates the process of simulating tumor genotypes, including SNPs, indels, CNVs, transition/transversion, ploidy, and tumor sub-clones, and of generating short reads. Thanks to Docker, the pipeline is platform-independent and easy for non-technical scientists to use from a web browser. VarSimLab is designed to grow into a full suite of integrated tools for analyzing genomic aberrations.
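The two stages described above (apply variants to a reference, then generate short reads from the altered sequence) can be illustrated with a toy Python sketch. This is not VarSimLab's implementation or the API of SInC, SCNVSim, or ART: the function names `apply_snps`, `apply_cnv`, and `sample_reads` are hypothetical, the reference is a random string, and the reads are error-free, whereas a real read simulator models sequencing errors and base qualities.

```python
import random


def apply_snps(ref, n_snps, rng):
    """Substitute n_snps distinct positions with a different base (hypothetical helper)."""
    seq = list(ref)
    positions = sorted(rng.sample(range(len(seq)), n_snps))
    for pos in positions:
        seq[pos] = rng.choice([b for b in "ACGT" if b != seq[pos]])
    return "".join(seq), positions


def apply_cnv(seq, start, length, copies):
    """Replace seq[start:start+length] with `copies` copies of it (0 means deletion)."""
    segment = seq[start:start + length]
    return seq[:start] + segment * copies + seq[start + length:]


def sample_reads(seq, read_len, n_reads, rng):
    """Draw error-free, fixed-length reads at uniformly random start positions."""
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(len(seq) - read_len + 1)
        reads.append(seq[start:start + read_len])
    return reads


# Toy run: a random 1,000-base "reference", 10 SNPs, one 100-base region amplified to 3 copies.
rng = random.Random(0)
reference = "".join(rng.choice("ACGT") for _ in range(1000))
mutated, snp_positions = apply_snps(reference, 10, rng)
tumor = apply_cnv(mutated, start=200, length=100, copies=3)
reads = sample_reads(tumor, read_len=50, n_reads=200, rng=rng)
```

Because the variant positions are chosen by the simulator, they are known exactly, which is what makes synthesized reads usable as a benchmark for variant detection tools.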
Year: 2017
DOI: 10.1145/3107411.3108188
Venue: BCB
Keywords: Docker, Structural Variations, Copy Number Variation, Reproducibility
Field: Genome, Structural variation, Copy-number variation, Computer science, Gene dosage, Whole genome sequencing, Software, Single-nucleotide polymorphism, Bioinformatics, Indel
DocType: Conference
ISBN: 978-1-4503-4722-8
Citations: 0
PageRank: 0.34
References: 0
Authors: 3
Name                 Order   Citations   PageRank
Abdelrahman Hosny    1       2           1.11
Fatima Zare          2       0           2.37
Sheida Nabavi        3       18          8.68