Abstract |
---|
RDF is a data representation format for schema-free structured information that is gaining momentum in the context of the Semantic Web and the life sciences. With the continuing proliferation of structured data, RDF compression is becoming increasingly important. In this study, we introduce a novel lossless compression technique for RDF datasets (triples), called PIC (Predicate Invention based Compression). By generating informative predicates and constructing an effective mapping to the original predicates, PIC only needs to store a dramatically reduced number of triples with the newly created predicates, and can restore the original triples efficiently using the mapping. These predicates are automatically generated by a decomposable forward-backward procedure, which consequently supports very fast parallel bit computation. As a semantic compression method for structured data, PIC goes beyond reducing syntactic verbosity and data redundancy by also exploiting the semantics in the RDF datasets. Experiments on various datasets show competitive results in terms of compression ratio. |
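
The general idea of replacing several triples with one triple over an invented predicate, and restoring them through a stored mapping, can be illustrated with a toy sketch. This is not the PIC algorithm from the paper: the grouping rule (subjects that share exactly the same set of predicate-object pairs), the `inv:N` naming scheme, and the function names `compress` and `decompress` are all assumptions made for illustration, and the forward-backward procedure and bit-level parallelism mentioned in the abstract are not modeled.

```python
from collections import defaultdict


def compress(triples):
    """Toy predicate invention: subjects that share the same set of
    (predicate, object) pairs get one triple with an invented predicate.
    Assumes no original predicate starts with the hypothetical 'inv:' prefix."""
    by_subject = defaultdict(set)
    for s, p, o in triples:
        by_subject[s].add((p, o))

    # Group subjects by their full description.
    groups = defaultdict(list)
    for s, pairs in by_subject.items():
        groups[frozenset(pairs)].append(s)

    mapping = {}      # invented predicate -> list of original (p, o) pairs
    compressed = []
    for pairs, subjects in groups.items():
        if len(pairs) >= 2 and len(subjects) >= 2:
            name = f"inv:{len(mapping)}"
            mapping[name] = sorted(pairs)
            # One triple per subject; the object slot is unused here.
            compressed.extend((s, name, "") for s in subjects)
        else:
            # Not worth inventing a predicate; keep the triples as they are.
            compressed.extend((s, p, o) for s in subjects for p, o in sorted(pairs))
    return compressed, mapping


def decompress(compressed, mapping):
    """Expand invented predicates back into the original triples."""
    restored = []
    for s, p, o in compressed:
        if p in mapping:
            restored.extend((s, p2, o2) for p2, o2 in mapping[p])
        else:
            restored.append((s, p, o))
    return restored


triples = [
    ("alice", "type", "Person"), ("alice", "livesIn", "Paris"),
    ("bob", "type", "Person"), ("bob", "livesIn", "Paris"),
    ("carol", "type", "Person"),
]
compressed, mapping = compress(triples)
restored = decompress(compressed, mapping)
```

In this example the five input triples shrink to three stored triples plus a one-entry mapping, and `decompress` recovers the original set exactly, which is the lossless property the abstract describes.
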
Year | Venue | Field |
---|---|---|
2018 | JIST | External Data Representation,Computer science,Semantic Web,Theoretical computer science,Data redundancy,Data compression,Data model,RDF,Semantic compression,Lossless compression |
DocType | Citations | PageRank |
---|---|---|
Conference | 1 | 0.35 |
References | Authors |
---|---|
7 | 6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Man Zhu | 1 | 9 | 1.95 |
Weixin Wu | 2 | 3 | 1.43 |
Jeff Z. Pan | 3 | 2218 | 158.01 |
Jingyu Han | 4 | 16 | 4.67 |
Pengfei Huang | 5 | 1 | 1.36 |
Qian Liu | 6 | 19 | 1.30 |