Abstract | ||
---|---|---|
Given the ever-increasing volume of data generated by the Internet of Things, data compression plays an essential role in reducing the cost of data transmission and storage. However, it also introduces a barrier, namely decompression, between users and the data-driven insights they require. We propose methods for direct analytics of compressed data based on the Generalized Deduplication compression algorithm. When applied to data clustering, the accuracy of the best-performing method differs by merely 1-5% from that of analytics performed on the uncompressed data. At the same time, it runs four times faster, accesses only 14% as much data, and requires significantly less storage, since the data is always compressed. These results show that it is possible to simultaneously reap the benefits of compression and accurate, high-speed analytics in many applications. |
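
The idea of analysing compressed data without decompressing it can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simplified form of Generalized Deduplication in which each unsigned integer sample is split into a base (its high-order bits, stored once per distinct value) and a deviation (its low-order bits, stored per sample). Clustering then reads only the deduplicated bases and their counts. The function names, the `deviation_bits` parameter, the sample values, and the weighted one-dimensional k-means are all hypothetical choices made for illustration.

```python
from collections import Counter

def gd_compress(samples, deviation_bits):
    """Simplified GD (assumption): deduplicate the high-order bits of each sample."""
    base_ids = {}   # base value -> id of that base
    records = []    # one (base_id, deviation) pair per sample
    mask = (1 << deviation_bits) - 1
    for value in samples:
        base, deviation = value >> deviation_bits, value & mask
        base_id = base_ids.setdefault(base, len(base_ids))
        records.append((base_id, deviation))
    # dicts preserve insertion order, so list index == base id
    return list(base_ids), records

def gd_decompress(bases, records, deviation_bits):
    """Lossless reconstruction: recombine each base with its deviation."""
    return [(bases[b] << deviation_bits) | d for b, d in records]

def weighted_kmeans_1d(points, weights, centroids, iterations=10):
    """Plain Lloyd iterations on weighted 1-D points (illustrative only)."""
    centroids = list(centroids)
    for _ in range(iterations):
        sums = [0.0] * len(centroids)
        totals = [0] * len(centroids)
        for p, w in zip(points, weights):
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            sums[nearest] += p * w
            totals[nearest] += w
        # keep the old centroid if a cluster received no points
        centroids = [s / t if t else c for s, t, c in zip(sums, totals, centroids)]
    return centroids

# Hypothetical sensor readings forming two groups.
samples = [100, 101, 103, 98, 900, 905, 897, 902]
deviation_bits = 4
bases, records = gd_compress(samples, deviation_bits)

# Analytics touch only the bases and how often each one occurs.
counts = Counter(base_id for base_id, _ in records)
weights = [counts[i] for i in range(len(bases))]
# Represent each base by the midpoint of the value range it covers.
half = (1 << deviation_bits) >> 1
representatives = [(b << deviation_bits) + half for b in bases]
centroids = weighted_kmeans_1d(representatives, weights, [0, 1000])
```

In this toy run the eight samples collapse to two bases, so the clustering step reads two weighted points instead of eight values. The resulting centroids approximate the true group means to within the range covered by the deviation bits.
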
Year | DOI | Venue |
---|---|---|
2021 | 10.1109/GLOBECOM46510.2021.9685589 | 2021 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM) |

Keywords | DocType | ISSN |
---|---|---|
data mining, data compression, Internet of Things, clustering methods, explainable AI | Conference | 2334-0983 |

Citations | PageRank | References |
---|---|---|
0 | 0.34 | 0 |

Authors | ||
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Aaron Hurst | 1 | 0 | 0.34 |
Qi Zhang | 2 | 13 | 7.05 |
Daniel E. Lucani | 3 | 236 | 42.29 |
Ira Assent | 4 | 1204 | 66.42 |