Title
Model and Comparison of Membership Testing Approach for Massive Data.
Abstract
In the Big Data era, data sets can be so large that efficiently testing whether a given piece of data already exists in a system has become a great challenge for many applications, and a way to solve this problem is needed. One feasible solution is to construct an in-memory data structure that represents the massive data set. By inspecting and computing over this data structure, it is possible to check whether a given data item is a member of the data set. With this kind of solution, the efficiency of memory usage, among other factors, must be considered. A number of space-efficient approaches, such as the bitmap and the Bloom filter, which can be called "memory-based membership testing approaches", provide practical implementations of this solution. However, there is no recognized model or theoretical performance comparison for these approaches, which makes it difficult to choose a proper approach for a specific scenario. This paper investigates how to compare the performance of different memory-based membership testing approaches. First, a model with corresponding definitions is proposed that formally represents these approaches. Based on the proposed model, evaluation criteria are developed and the corresponding algorithms are articulated. Theoretical comparisons of five memory-based membership testing approaches are given, which can provide effective guidance for choosing an optimal approach for a specific scenario.
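As a rough illustration of one approach the abstract names, a Bloom filter can be sketched as follows. This is a minimal sketch, not the model or algorithms proposed in the paper: the class name `BloomFilter`, the default sizes, and the salted SHA-256 hashing scheme are all assumptions chosen for readability.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter; sizes and hashing scheme are assumptions."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One bit per slot, packed eight to a byte.
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        # Set every bit position derived from the item.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means definitely absent; True means possibly present.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )
```

An item that was added always tests positive, while an item that was never added may occasionally test positive as well. That false-positive rate is the price paid for storing only a fixed number of bits regardless of how large the data set is.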
Year: 2015
DOI: 10.1109/CCBD.2015.47
Venue: CCBD
Keywords: Big data, Bloom Filter, Membership testing, Evaluative Criteria
Field: Data mining, Bloom filter, Data structure, Data set, Computer science, Membership testing, Bitmap, Big data
DocType: Conference
ISSN: 2378-3680
Citations: 0
PageRank: 0.34
References: 0
Authors: 4
Name            Order  Citations  PageRank
Gansen Zhao     1      334        49.55
Aiping Li       2      0          0.34
Zijing Li       3      9          1.05
Chuanghui Liu   4      0          0.34