Title
Graph Data Modelling for Genomic Variants.
Abstract
Genome variant analysis is performed on Variant Call Format (VCF) files. It can take days to process these files for genome analytics due to challenges such as loading the files for each user query and processing them to answer questions of interest. As data sizes grow, timely processing of this data is putting enormous pressure on the computational resources, leading to significant processing delays and may jeopardise the ultimate goal of bringing the genomic discoveries to masses. We believe this problem will not be solved until the underlying data structure to organise and process these files undergoes a transformation. To overcome this problem, we have proposed a graph based system to represent the data in VCF files. This allows the data to be loaded once in a graph model which is then subsequently queried and processed numerous times without any additional computational and data access penalties. This helps reduce data access time by giving a constant time access to any node and addresses performance and scalability challenges that have been a limiting factor for the mass scale adoption of genome analytics. It takes only 2ms to access any data node in our graph model and remains constant for any number of nodes.
Year
DOI
Venue
2019
10.1109/SmartWorld-UIC-ATC-SCALCOM-IOP-SCI.2019.00282
SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI
DocType
Citations 
PageRank 
Conference
0
0.34
References 
Authors
0
2
Name
Order
Citations
PageRank
Sanna Aizad100.34
Ashiq Anjum233338.33