Title
Supporting Data Uncertainty in Array Databases
Abstract
Uncertain data management has become crucial to scientific applications. Recently, array databases have gained popularity for scientific data processing due to performance benefits. In this paper, we address uncertain data management in array databases, which may involve both value uncertainty within individual tuples and position uncertainty regarding where a tuple should belong in an array given uncertain dimension attributes. Our work defines the formal semantics of array operations under both value and position uncertainty. To address the new challenge raised by position uncertainty, we propose a suite of storage and evaluation strategies for array operations, with a focus on a new scheme that bounds the overhead of querying by strategically treating tuples with large variances via replication in storage. Results from real datasets show that for common workloads, our best-performing techniques outperform alternative methods based on state-of-the-art indexes by 1.7x to 4.3x for the Subarray operation and 1 to 2 orders of magnitude for Structure-Join, at only a small storage cost.
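To make the idea of position uncertainty concrete, the sketch below computes the probability that a tuple with an uncertain dimension attribute falls inside a one-dimensional Subarray range. It is an illustration only, not the paper's algorithm: the Gaussian model for the dimension attribute, the probability threshold, and the names `prob_in_range` and `subarray` are all assumptions introduced here.

```python
import math

def normal_cdf(x, mean, std):
    # Cumulative distribution function of N(mean, std^2).
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def prob_in_range(mean, std, lo, hi):
    # Probability that an uncertain dimension attribute (assumed Gaussian)
    # takes a value in [lo, hi].
    return normal_cdf(hi, mean, std) - normal_cdf(lo, mean, std)

def subarray(tuples, lo, hi, threshold):
    # Hypothetical probabilistic Subarray: keep each tuple whose probability
    # of lying in [lo, hi] is at least `threshold`.
    results = []
    for tid, mean, std in tuples:
        p = prob_in_range(mean, std, lo, hi)
        if p >= threshold:
            results.append((tid, p))
    return results

# Each tuple: (id, mean position, standard deviation of position).
tuples = [("t1", 5.0, 0.5), ("t2", 9.5, 3.0), ("t3", 20.0, 1.0)]
hits = subarray(tuples, 4.0, 10.0, 0.1)
for tid, p in hits:
    print(tid, round(p, 3))
```

Under this assumed model, a tuple with a large variance such as `t2` has probability mass spread over many array cells. That is the kind of tuple the abstract describes handling through replication in storage.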
Year
2015
DOI
10.1145/2723372.2723738
Venue
ACM SIGMOD Conference
Keywords
array databases, uncertainty, store-multiple
Field
Data mining, Data processing, Suite, Tuple, Computer science, Popularity, Uncertain data, Theoretical computer science, Database, Semantics of logic
DocType
Conference
Citations
2
PageRank
0.37
References
39
Authors (2)
Name          Order  Citations  PageRank
Liping Peng   1      4          0.74
Yanlei Diao   2      2234       108.95