Title
Indexing internal memory with minimal perfect hash functions
Abstract
A perfect hash function (PHF) is an injective function that maps the keys of a set S to unique values, which are in turn used to index a hash table. Since no collisions occur, each key can be retrieved from the table with a single probe. A minimal perfect hash function (MPHF) is a PHF with the smallest possible range, that is, the hash table size is exactly the number of keys in S. MPHFs are widely used for memory-efficient storage and fast retrieval of items from static sets. Unlike other hashing schemes, MPHFs completely avoid the space and time wasted in dealing with collisions. In the past, storing an MPHF description required O(log n) bits per key, a space overhead similar to that of other hashing schemes. Recent results on MPHFs by [Botelho et al. 2007] changed this scenario: in their work the space overhead of an MPHF is approximately 2.6 bits per key. The objective of this paper is to show that MPHFs are a good option for indexing internal memory when static key sets are involved and both successful and unsuccessful searches are allowed. We show that MPHFs provide the best tradeoff between space usage and lookup time when compared with linear hashing, quadratic hashing, double hashing, dense hashing, cuckoo hashing and sparse hashing. For example, MPHFs outperform linear hashing, quadratic hashing and double hashing when these methods have a hash table occupancy of 75% or higher (if the MPHF fits in the CPU cache, the same holds for hash table occupancies greater than or equal to 55%). Furthermore, MPHFs also perform better in all measured aspects than sparse hashing, which was designed specifically for efficient memory usage.
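To make the definitions in the abstract concrete, the following is a minimal sketch of an MPHF over a small static key set, with lookups that also reject keys outside the set. It uses a simple "hash, then displace by bucket seed" construction, which is not the algorithm of [Botelho et al. 2007] and does not reach its 2.6 bits per key (the seeds here are plain Python integers). The names `_h`, `build_mphf`, and `lookup` are hypothetical and chosen only for illustration.

```python
import hashlib


def _h(key: str, seed: int, m: int) -> int:
    """Seeded hash of `key` into range(m). Illustrative: any seeded hash would do."""
    digest = hashlib.blake2b(
        key.encode("utf-8"), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little") % m


def build_mphf(keys):
    """Return (seeds, table) such that each key owns exactly one slot of `table`.

    Assumes `keys` are distinct strings; the table has exactly len(keys) slots,
    which is what makes the function minimal.
    """
    n = len(keys)
    assert len(set(keys)) == n, "keys must be distinct"
    # First-level hash (seed 0) splits the keys into n buckets.
    buckets = [[] for _ in range(n)]
    for k in keys:
        buckets[_h(k, 0, n)].append(k)
    seeds = [0] * n
    table = [None] * n
    # Place the largest buckets first, while most slots are still free.
    for b in sorted(range(n), key=lambda i: -len(buckets[i])):
        bucket = buckets[b]
        if not bucket:
            break  # remaining buckets are empty
        seed = 1
        while True:
            slots = [_h(k, seed, n) for k in bucket]
            no_internal_clash = len(set(slots)) == len(slots)
            if no_internal_clash and all(table[s] is None for s in slots):
                break
            seed += 1  # try the next seed for this bucket
        seeds[b] = seed
        for k, s in zip(bucket, slots):
            table[s] = k
    return seeds, table


def lookup(seeds, table, key):
    """Return the slot of `key` with a single table probe, or None if absent."""
    n = len(table)
    if n == 0:
        return None
    slot = _h(key, seeds[_h(key, 0, n)], n)
    # Comparing the stored key is what supports unsuccessful searches.
    return slot if table[slot] == key else None


keys = ["linear", "quadratic", "double", "dense", "cuckoo", "sparse", "mphf"]
seeds, table = build_mphf(keys)
```

Under these assumptions, `lookup(seeds, table, k)` returns a distinct slot in `range(len(keys))` for every key in the set, and `None` for any other key, because the key stored in the probed slot is compared against the query.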
Year
2008
Venue
SBBD
Keywords
space usage, hash table size, indexing internal memory, s. mphfs, hash table, perfect hash function, hash table occupancy, maps key, mphf description, minimal perfect hash function, space overhead, indexation, hash function
Field
Hopscotch hashing, Double hashing, Computer science, Universal hashing, Consistent hashing, Dynamic perfect hashing, Database, Hash table, Cuckoo hashing, Linear hashing
DocType
Conference
Citations
2
PageRank
0.40
References
12
Authors
4
Name                       Order  Citations  PageRank
Fabiano C. Botelho         1      174        11.06
Hendrickson R. Langbehn    2      2          0.40
Guilherme Vale Menezes     3      28         2.00
Nivio Ziviani              4      1598       154.65