Title
Predicate Invention Based RDF Data Compression.
Abstract
RDF is a data representation format for schema-free structured information that is gaining momentum in the context of the Semantic Web and the life sciences. With the continuing proliferation of structured data, RDF compression is becoming increasingly important. In this study, we introduce a novel lossless compression technique for RDF datasets (triples), called PIC (Predicate Invention based Compression). By generating informative predicates and constructing an effective mapping to the original predicates, PIC needs to store only a dramatically reduced number of triples with the newly created predicates, and it restores the original triples efficiently using the mapping. These predicates are automatically generated by a decomposable forward-backward procedure, which consequently supports very fast parallel bit computation. As a semantic compression method for structured data, PIC exploits the semantics in RDF datasets in addition to reducing syntactic verbosity and data redundancy. Experiments on various datasets show competitive results in terms of compression ratio.
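The abstract does not spell out how predicates are invented, so the following is only a toy sketch of the general idea (store fewer triples under invented predicates, plus a mapping that restores the originals), not the PIC algorithm itself. The grouping rule (subjects sharing an identical set of predicate-object pairs), the `inv:` naming scheme, the empty-string object placeholder, and the function names `compress` and `decompress` are all assumptions made for illustration.

```python
from collections import defaultdict


def compress(triples):
    """Toy lossless compression: subjects that share an identical set of
    two or more (predicate, object) pairs are stored as a single triple
    with an invented predicate. Not the paper's actual procedure."""
    pairs_by_subject = defaultdict(set)
    for s, p, o in triples:
        pairs_by_subject[s].add((p, o))

    # Group subjects by their exact set of (predicate, object) pairs.
    subjects_by_pattern = defaultdict(list)
    for s, pairs in pairs_by_subject.items():
        subjects_by_pattern[frozenset(pairs)].append(s)

    mapping = {}    # invented predicate -> original (predicate, object) pairs
    stored = set()  # triples that are actually kept
    for pattern, subjects in subjects_by_pattern.items():
        if len(pattern) > 1 and len(subjects) > 1:
            # Assumption: no original predicate starts with "inv:".
            name = f"inv:{len(mapping)}"
            mapping[name] = pattern
            stored.update((s, name, "") for s in subjects)
        else:
            # Nothing to gain here, so keep the original triples.
            stored.update((s, p, o) for s in subjects for p, o in pattern)
    return stored, mapping


def decompress(stored, mapping):
    """Expand invented predicates back into the original triples."""
    restored = set()
    for s, p, o in stored:
        if p in mapping:
            restored.update((s, op, oo) for op, oo in mapping[p])
        else:
            restored.add((s, p, o))
    return restored


# Hypothetical example data, not taken from the paper.
triples = {
    ("alice", "rdf:type", "Person"), ("alice", "livesIn", "Nanjing"),
    ("bob", "rdf:type", "Person"), ("bob", "livesIn", "Nanjing"),
    ("carol", "rdf:type", "Person"),
}
stored, mapping = compress(triples)
assert decompress(stored, mapping) == triples  # the round trip is lossless
```

In this example, five original triples are stored as three triples plus one mapping entry. The saving grows with the number of subjects that share a pattern, which is the intuition behind storing triples under invented predicates.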
Year: 2018
Venue: JIST
Field: External Data Representation, Computer science, Semantic Web, Theoretical computer science, Data redundancy, Data compression, Data model, RDF, Semantic compression, Lossless compression
DocType: Conference
Citations: 1
PageRank: 0.35
References: 7
Authors: 6
Name | Order | Citations | PageRank
Man Zhu | 1 | 9 | 1.95
Weixin Wu | 2 | 3 | 1.43
Jeff Z. Pan | 3 | 2218 | 158.01
Jingyu Han | 4 | 16 | 4.67
Pengfei Huang | 5 | 1 | 1.36
Qian Liu | 6 | 19 | 1.30