Title
Cache Friendly Burrows-Wheeler Inversion
Abstract
The Burrows-Wheeler transform permutes the symbols of a string such that the permuted string can be compressed effectively with fast, simple techniques. Inversion of the transform is a bottleneck in practice. Inversion takes linear time, but, for each symbol decoded, folklore says that a random access into the transformed string (and so a CPU cache-miss) is necessary. In this paper we show how to mitigate cache misses and so speed inversion. Our main idea is to modify the standard inversion algorithm to detect and record repeated sub strings in the original string as it is recovered. Subsequent occurrences of these repetitions are then copied in a cache friendly way from the already recovered portion of the string, short cutting a series of random accesses by the standard inversion algorithm. We show experimentally that this approach leads to faster runtimes in general, and can drastically reduce inversion time for highly repetitive data.
Year
DOI
Venue
2011
10.1109/CCP.2011.15
Data Compression, Communications and Processing
Keywords
Field
DocType
main idea,permuted string,faster runtimes,inversion time,random access,standard inversion algorithm,cache friendly burrows-wheeler inversion,linear time,sub string,cpu cache-miss,original string,dna,pattern matching,burrows wheeler transform,bwt,data compression
Bottleneck,Burrows–Wheeler transform,Inversion (meteorology),Cache,Computer science,Parallel computing,Algorithm,Theoretical computer science,Suffix array,Time complexity,Data compression,Random access
Conference
ISBN
Citations 
PageRank 
978-0-7695-4528-8
1
0.40
References 
Authors
7
2
Name
Order
Citations
PageRank
Juha Kärkkäinen1135495.20
Simon J. Puglisi2113275.14