Abstract |
---|
Uncertain data management has become crucial in many sensing and scientific applications. As user-defined functions (UDFs) become widely used in these applications, an important task is to capture result uncertainty for queries that evaluate UDFs on uncertain data. In this work, we provide a general framework for supporting UDFs on uncertain data. Specifically, we propose a learning approach based on Gaussian processes (GPs) to compute approximate output distributions of a UDF when evaluated on uncertain input, with guaranteed error bounds. We also devise an online algorithm to compute such output distributions, which employs a suite of optimizations to improve accuracy and performance. Our evaluation using both real-world and synthetic functions shows that our proposed GP approach can outperform the state-of-the-art sampling approach with up to two orders of magnitude improvement for a variety of UDFs. |
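
The core idea in the abstract, learning a cheap GP model of the UDF and pushing the uncertain input through it, can be illustrated with a minimal sketch. This is not the authors' algorithm: it omits the error bounds and the online optimizations, and everything named here (the `udf` stand-in, the squared-exponential kernel, the fixed training grid, the jitter value) is an assumption chosen for illustration. It fits a one-dimensional GP posterior mean to nine UDF evaluations, then estimates the output distribution for a Gaussian input by sampling the surrogate instead of the UDF.

```python
import math
import random
import statistics

def udf(x):
    """Hypothetical stand-in for an expensive black-box UDF."""
    return math.sin(x) + 0.5 * x

def rbf(a, b, length=1.0):
    """Squared-exponential kernel on scalars (an assumed kernel choice)."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def chol_solve(L, b):
    """Solve (L L^T) x = b by forward, then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def fit_gp_mean(xs, ys, length=1.0, jitter=1e-6):
    """Return the GP posterior mean function for training pairs (xs, ys)."""
    K = [[rbf(a, b, length) + (jitter if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    alpha = chol_solve(cholesky(K), ys)
    def predict(x):
        return sum(rbf(x, xi, length) * ai for xi, ai in zip(xs, alpha))
    return predict

random.seed(0)
mu, sigma = 1.0, 0.5                                  # uncertain input X ~ N(mu, sigma^2)
train_x = [mu + sigma * (i - 4) for i in range(9)]    # grid covering mu +/- 4 sigma
train_y = [udf(x) for x in train_x]                   # the only UDF calls the surrogate needs
surrogate = fit_gp_mean(train_x, train_y)

samples = [random.gauss(mu, sigma) for _ in range(2000)]
out_gp = [surrogate(x) for x in samples]              # approximate output distribution
out_udf = [udf(x) for x in samples]                   # direct sampling, for comparison only

print("GP surrogate:", statistics.mean(out_gp), statistics.stdev(out_gp))
print("direct UDF:  ", statistics.mean(out_udf), statistics.stdev(out_udf))
```

In this toy setting the surrogate needs nine UDF evaluations while direct sampling needs one per sample. Any saving in practice depends on how expensive the UDF is relative to GP inference.
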

Year | DOI | Venue |
---|---|---|
2013 | 10.14778/2536336.2536347 | PVLDB |

Keywords | Field | DocType |
---|---|---|
important task, uncertain input, uncertain data, user-defined function, state-of-the-art sampling approach, general framework, guaranteed error bound, uncertain data management, output distribution, approximate output distribution, proposed GP approach | Online algorithm, Data mining, Suite, Computer science, Uncertain data, User-defined function, Gaussian process, Global Positioning System, Sampling (statistics) | Journal |

Volume | Issue | ISSN |
---|---|---|
6 | 6 | 2150-8097 |

Citations | PageRank | References |
---|---|---|
1 | 0.37 | 12 |

Authors |
---|
4 |

Name | Order | Citations | PageRank |
---|---|---|---|
Thanh T. L. Tran | 1 | 206 | 8.09 |
Yanlei Diao | 2 | 2234 | 108.95 |
Charles Sutton | 3 | 1723 | 107.23 |
Anna Liu | 4 | 441 | 34.75 |