Persistent Data Sketching - Citegraph

Paper Info

Title
Persistent Data Sketching

Abstract
A persistent data structure, also known as a multiversion data structure in the database literature, is a data structure that preserves all its previous versions as it is updated over time. Every update (inserting, deleting, or changing a data record) to the data structure creates a new version, while all the versions are kept in the data structure so that any previous version can still be queried. Persistent data structures aim at recording all versions accurately, which results in a space requirement that is at least linear to the number of updates. In many of today's big data applications, in particular for high-speed streaming data, the volume and velocity of the data are so high that we cannot afford to store everything. Therefore, streaming algorithms have received a lot of attention in the research community, which use only sublinear space by sacrificing slightly on accuracy. All streaming algorithms work by maintaining a small data structure in memory, which is usually called a em sketch, summary, or synopsis. The sketch is updated upon the arrival of every element in the stream, thus is ephemeral, meaning that it can only answer queries about the current status of the stream. In this paper, we aim at designing persistent sketches, thereby giving streaming algorithms the ability to answer queries about the stream at any prior time.

Year	DOI	Venue
2015	10.1145/2723372.2749443	ACM SIGMOD Conference
Field	DocType	Citations
Sublinear function,Data mining,Persistent data structure,Data structure,Small data,Streaming algorithm,Computer science,Dynamic data,Big data,Database,Sketch	Conference	10
PageRank	References	Authors
0.59	30	5

Authors (5 rows)

Cited by (10 rows)

References (30 rows)

Name	Order	Citations	PageRank
Zhewei Wei	1	215	20.07
Ge Luo	2	35	1.95
Ke Yi	3	1659	77.79
Xiaoyong Du	4	882	123.29
Ji-Rong Wen	5	4431	265.98

1