Title
Zwift: A Programming Framework for High Performance Text Analytics on Compressed Data.
Abstract
Today's rapidly growing document volumes pose pressing challenges to modern document analytics frameworks, in both space usage and processing time. Recently, a promising method, called text analytics directly on compressed data (TADOC), was proposed for improving both the time and space efficiency of text analytics. The main idea of the technique is to enable direct document analytics on compressed data. This paper focuses on the programming challenges for developing efficient TADOC programs. It presents Zwift, the first programming framework for TADOC, which consists of a Domain Specific Language, a compiler and runtime, and a utility library. Experiments show that Zwift significantly improves programming productivity, while effectively unleashing the power of TADOC, producing code that reduces storage usage by 90.8% and execution time by 41.0% on six text analytics problems.
Year
2018
DOI
10.1145/3205289.3205325
Venue
ICS
Keywords
Compilers, Domain Specific Languages, Text Analytics
Field
Programming productivity, Domain-specific language, Text mining, Software engineering, Computer science, Parallel computing, Compiler, Execution time, Analytics, Software framework
DocType
Conference
ISBN
978-1-4503-5783-8
Citations
2
PageRank
0.38
References
19
Authors
5
Name             Order    Citations    PageRank
Feng Zhang       1        79           14.36
Jidong Zhai      2        340          36.27
Xipeng Shen      3        2025         118.55
Onur Mutlu       4        9446         357.40
Wenguang Chen    5        1014         70.57