Title
Efficient Compression Algorithm for Multimedia Data
Abstract
In this work, we consider the problem of cosine-similarity-preserving dimensionality reduction (compression) for sparse binary datasets. The authors of [18] proposed a compression algorithm for high-dimensional, sparse, binary data that preserves inner product and Hamming distance. We show that their algorithm also works well for cosine similarity. We present a theoretical analysis of the dimension reduction bound and complement it with rigorous experimentation on real-world datasets. We compare our results with the state of the art for this problem, namely SimHash [8], MinHash [21], Circulant Binary Embedding [25], and Densified One Permutation Hashing [20], and show that our approach offers significant savings in compression time and in the number of random bits required for compression, while providing comparable performance.
Year
2020
DOI
10.1109/BigMM50055.2020.00042
Venue
2020 IEEE Sixth International Conference on Multimedia Big Data (BigMM)
Keywords
Cosine Similarity, SimHash, MinHash, Jaccard Similarity
DocType
Conference
ISBN
978-1-7281-9326-7
Citations
0
PageRank
0.34
References
12
Authors
4
Name              Order  Citations  PageRank
Rameshwar Pratap  1      6          4.50
Karthik Revanuru  2      0          0.34
Anirudh Ravi      3      0          0.34
Raghav Kulkarni   4      172        19.48