Abstract |
---|
The inability to construct data supply chains effectively in distributed environments is becoming one of the top concerns in the big data area. To address this problem, a novel method of constructing data supply chains based on layered PROV is proposed. First, to describe abstractly how data is transferred from creation to distribution, the data provenance specification published by the W3C is used to standardize the records of data activities within and across data platforms. Then, a distributed PROV data generation algorithm for multi-platform environments is designed. Further, we propose a tiered storage management scheme for provenance based on summarization technology, which reduces the number of provenance records by compressing intermediate versions and thereby enables multi-level management of PROV. In particular, we propose a hierarchical visualization technique based on a layered query mechanism, which allows users to view the data supply chain from a general overview down to the details. The experimental results show that the proposed approach can effectively improve the construction performance of data supply chains. |
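
The abstract does not spell out the summarization algorithm, so the following is only a hypothetical Python sketch of what "compressing intermediate versions" could look like for a simple linear version chain. The names `Derivation`, `detail_chain`, `summarize_chain`, and the `hops` field are assumptions made for illustration and are not taken from the paper; the only PROV concept borrowed is the `wasDerivedFrom` relation between entities.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Derivation:
    """One PROV-style wasDerivedFrom edge: `derived` came from `source`."""
    derived: str
    source: str
    hops: int = 1  # how many original derivation steps this edge stands for


def detail_chain(versions):
    """Detailed layer: one edge per consecutive pair (oldest to newest)."""
    return [Derivation(derived=new, source=old)
            for old, new in zip(versions, versions[1:])]


def summarize_chain(versions):
    """Summary layer (illustrative assumption): keep only the first and
    last versions and record how many steps were collapsed."""
    if len(versions) < 2:
        return []
    return [Derivation(derived=versions[-1], source=versions[0],
                       hops=len(versions) - 1)]


if __name__ == "__main__":
    chain = ["report:v1", "report:v2", "report:v3", "report:v4"]
    print(len(detail_chain(chain)), "detailed edges")
    print(summarize_chain(chain))
```

In this sketch a coarse query would read the summary layer and a drill-down query would read the detailed layer, which loosely mirrors the "general to detail" view the abstract mentions. Real provenance graphs branch and involve activities and agents, which this example ignores.
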
Year | DOI | Venue |
---|---|---|
2017 | 10.1007/s11227-016-1838-0 | The Journal of Supercomputing |
Keywords | Field | DocType |
Data supply chain, Data platform, Provenance, PROV, Distributed environment | Automatic summarization, Distributed Computing Environment, Data transmission, Computer science, Storage management, Supply chain, Big data, Database, Test data generation, Distributed computing | Journal |
Volume | Issue | ISSN |
73 | 4 | 0920-8542 |
Citations | PageRank | References |
1 | 0.35 | 11 |
Authors |
---|
5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Peng Li | 1 | 5 | 2.09 |
Tin Yu Wu | 2 | 52 | 12.06 |
Xinming Li | 3 | 1 | 1.37 |
Hong Luo | 4 | 335 | 31.84 |
Mohammad S. Obaidat | 5 | 99 | 17.27 |