Title
Graph-based approach for gene markers and applications in next-generation sequencing data analysis
Abstract
Next-generation sequencing technology produces very large data sets, and transforming this volume of sequencing data into information meaningful to biologists and life scientists is a major challenge. The initial step is to align these data sets to known reference genomes, which is essential for further data analysis. However, next-generation sequencing read alignment is very time-consuming. To address this challenge, we developed a general framework for building genome-wide unique gene markers. The framework includes a novel graph-theoretical model for extracting robust gene markers and effective methods for evaluating sets of gene markers. Genome-wide unique gene markers will help accelerate the next-generation sequencing read alignment process. We use the E. coli genome as a model genome to illustrate our approach and demonstrate its significance in saving time during read alignment. Further testing of the gene marker sets for read coverage analysis is under way.
Year
2011
DOI
10.1145/2147805.2147886
Venue
BCB
Keywords
tremendous large data set,next-generation sequencing,next-generation sequencing data analysis,data analysis,gene marker,sequencing data,robust gene marker,graph-based approach,e. coli genome,genome-wide unique gene marker,next-generation sequencing technology,alignment process,next generation sequencing,gene markers
Field
Genome,Graph,Data mining,Data set,DNA sequencing theory,Alignment-free sequence analysis,Computer science,Life Scientists,DNA sequencing,Genetic marker
DocType
Conference
Citations
0
PageRank
0.34
References
1
Authors
4
Name
Order
Citations
PageRank
Daniel Johnson11189.98
Kun Wang200.34
Carole L Cramer321.74
Xiuzhen Huang444226.16