Title
DataProf: Semantic Profiling for Iterative Data Cleansing and Business Rule Acquisition
Abstract
We showcase the first semantic data profiler, DataProf. For the constraint class of interest, current profilers compute all constraints that hold on the given data set. DataProf also computes perfect sample records that together satisfy the same constraints as the given data set. Such perfect samples make it easier to spot violations of business rules, which experts can cleanse. This novel iterative process of discovery and sampling facilitates the interaction of experts with the data, and provides the key to improving data quality and business rule acquisition. The demonstration will exemplify the process on a real-world data set and the novel class of embedded uniqueness constraints. The audience will experience how DataProf guides them in cleansing data and discovering business rules.
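To make the idea of a perfect sample concrete, here is a minimal sketch under simplifying assumptions. It uses plain uniqueness constraints on a table without missing values, not the embedded uniqueness constraints of the demonstration, and a brute-force search that only suits tiny tables. The function names (`holds`, `unique_column_sets`, `is_perfect_sample`) and the example records are hypothetical and are not taken from DataProf.

```python
from itertools import combinations


def holds(rows, cols):
    """Return True if no two rows agree on every column in cols."""
    seen = set()
    for row in rows:
        key = tuple(row[c] for c in cols)
        if key in seen:
            return False
        seen.add(key)
    return True


def unique_column_sets(rows, columns):
    """Return every non-empty column subset that is unique on rows.

    Brute force over all subsets: exponential in the number of columns,
    so this is for illustration only.
    """
    return {
        frozenset(cols)
        for size in range(1, len(columns) + 1)
        for cols in combinations(columns, size)
        if holds(rows, cols)
    }


def is_perfect_sample(sample, rows, columns):
    """A sample is 'perfect' here if it satisfies exactly the same
    uniqueness constraints as the full data set (simplified notion)."""
    return unique_column_sets(sample, columns) == unique_column_sets(rows, columns)


# Hypothetical example data, invented for illustration.
columns = ["name", "city", "phone"]
rows = [
    {"name": "Ann", "city": "Houston", "phone": "111"},
    {"name": "Ann", "city": "Austin", "phone": "222"},
    {"name": "Bob", "city": "Houston", "phone": "333"},
    {"name": "Cat", "city": "Dallas", "phone": "444"},
]

# The first three records repeat a name and a city, so they violate the
# same constraints as the full table and satisfy the same ones.
good_sample = rows[:3]
# The last three records happen to have distinct names, so this sample
# wrongly suggests that "name" alone is unique.
bad_sample = rows[1:]
```

In this sketch, `is_perfect_sample(good_sample, rows, columns)` holds while `is_perfect_sample(bad_sample, rows, columns)` does not. A domain expert who inspects the small good sample and sees two records sharing a name can then judge whether that reflects a real business rule or dirty data.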
Year
2018
DOI
10.1145/3183713.3193544
Venue
SIGMOD/PODS '18: International Conference on Management of Data, Houston, TX, USA, June 2018
Field
SQL, Data mining, Data cleansing, Data quality, Iterative and incremental development, Computer science, Data profiling, Missing data, Business rule, Semantic data model
DocType
Conference
ISSN
0730-8078
ISBN
978-1-4503-4703-7
Citations
1
PageRank
0.35
References
6
Authors
2
Name
Order
Citations
PageRank
Ziheng Wei186.92
Sebastian Link246239.59