Title
Mapping RNA-seq Data to a Transcript Graph via Approximate Pattern Matching to a Hypertext.
Abstract
Graphs are the data structure best suited to summarizing the transcript isoforms produced by a gene. Such graphs may be modeled as a hypertext, that is, a graph whose nodes are texts representing the exons of the gene and whose edges connect consecutive exons of a transcript. Mapping reads obtained by deep transcriptome sequencing to such graphs is crucial to compare reads with an annotation of transcript isoforms and to infer novel events due to alternative splicing at the exonic level. In this paper, we propose an algorithm based on Maximal Exact Matches that efficiently solves the approximate pattern matching of a pattern P to a hypertext H. We implement it in Splicing Graph ALigner (SGAL), a tool that accurately maps RNA-seq reads against a graph representing the annotated and potentially new transcripts of a gene. Moreover, we performed an experimental analysis to compare SGAL with a state-of-the-art tool for spliced alignment (STAR), and to identify novel putative alternative splicing events, such as exon skipping, directly from the mapping of reads to the graph. This analysis shows that our tool maps reads to exons accurately, with good time and space performance. The software is freely available at https://github.com/AlgoLab/galig.
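To make the notion of hypertext concrete, the following is a minimal sketch of approximate pattern matching of a read against a toy graph of exons. It is a naive baseline that spells out every path and runs a standard semi-global edit-distance scan over it; it is not the Maximal Exact Match algorithm proposed in the paper, nor SGAL's interface. The exon sequences, node names, and the functions `spelled_paths`, `best_substring_distance`, and `map_read` are all hypothetical and chosen only for illustration, and the graph is assumed to be acyclic.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical toy hypertext: node id -> exon text, plus directed edges
# between exons that may be consecutive in some transcript.
exons: Dict[str, str] = {"e1": "ACGT", "e2": "GGA", "e3": "TTC"}
edges: Dict[str, List[str]] = {"e1": ["e2", "e3"], "e2": ["e3"], "e3": []}


def spelled_paths(
    exons: Dict[str, str], edges: Dict[str, List[str]]
) -> Iterator[Tuple[List[str], str]]:
    """Yield (path, concatenated text) for every path of an acyclic graph."""

    def extend(path: List[str], text: str) -> Iterator[Tuple[List[str], str]]:
        yield path, text
        for nxt in edges[path[-1]]:
            yield from extend(path + [nxt], text + exons[nxt])

    for start in exons:
        yield from extend([start], exons[start])


def best_substring_distance(pattern: str, text: str) -> int:
    """Minimum edit distance between pattern and any substring of text."""
    # Row 0 is all zeros: a match may start at any position of the text.
    prev = [0] * (len(text) + 1)
    for i, pc in enumerate(pattern, 1):
        cur = [i] + [0] * len(text)
        for j, tc in enumerate(text, 1):
            cur[j] = min(
                prev[j] + 1,               # pattern character left unmatched
                cur[j - 1] + 1,            # extra text character
                prev[j - 1] + (pc != tc),  # match or substitution
            )
        prev = cur
    # A match may also end at any position of the text.
    return min(prev)


def map_read(
    read: str,
    exons: Dict[str, str],
    edges: Dict[str, List[str]],
    max_errors: int,
) -> List[Tuple[int, List[str]]]:
    """Return (distance, path) pairs with distance <= max_errors, best first."""
    hits = []
    for path, text in spelled_paths(exons, edges):
        d = best_substring_distance(read, text)
        if d <= max_errors:
            hits.append((d, path))
    return sorted(hits, key=lambda h: (h[0], len(h[1])))


# A read that joins e1 directly to e3 (e2 is skipped in this toy example).
print(map_read("CGTTT", exons, edges, max_errors=1))
```

Enumerating paths is exponential in the worst case, so this sketch is only meant to show what a mapping to the graph reports: the sequence of exons a read spans and the number of errors of the alignment.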
Year: 2017
DOI: 10.1007/978-3-319-58163-7_3
Venue: ALGORITHMS FOR COMPUTATIONAL BIOLOGY (ALCOB 2017)
Keywords: Approximate sequence analysis, Next-generation sequencing, Alternative splicing, Graph-based alignment
DocType: Conference
Volume: 10252
ISSN: 0302-9743
Citations: 0
PageRank: 0.34
References: 0
Authors: 5
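| Name | Order | Citations | PageRank |
| --- | --- | --- | --- |
| Stefano Beretta | 1 | 27 | 9.68 |
| Paola Bonizzoni | 2 | 502 | 52.23 |
| Luca Denti | 3 | 1 | 1.70 |
| Marco Previtali | 4 | 22 | 5.45 |
| Raffaella Rizzi | 5 | 130 | 13.58 |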