Title
Multiple choice tries and distributed hash tables
Abstract
Tries were introduced in 1960 by Fredkin as an efficient methodfor searching and sorting digital data. Recent years have seen a resurgence of interest in tries that find applications in dynamic hashing, conflict resolution algorithms, leader election algorithms, IP addresses lookup, Lempel-Ziv compression schemes, and distributed hash tables. In some of these applications, most notably in distributed hash tables one needs to design a well balanced trie, that is, a trie with the height as close as possible to its fillup level. In this paper we consider tries built from n strings such that each string can be chosen from a pool of k strings, each of them generated by a discrete i.i.d. source. Three cases are considered: k = 2, k is large but fixed, and k ∼ clog n. The goal in each case is to obtain tries as balanced as possible. Various parameters such as height and fill-up level are analyzed. It is shown that for two-choice tries a 50% reduction in height is achieved when compared to ordinary tries. In a greedy on-line construction when the string that minimizes the depth of insertion for every pair is inserted, the height is only reduced by 25%. In order to further reduce the height by another 25%, we design a more refined on-line algorithm. The total computation time of the algorithm is O(nlog n). Furthermore, when we choose the best among k ≥ 2 strings, then for large but fixed k the height is asymptotically equal to the typical depth in a trie, a result that cannot be improved. Finally, we show that further improvement can be achieved if the number of choices for each string is proportional to log n. In this case for unbiased memoryless sources highly balanced trees can be constructed by a simple greedy algorithm for which the difference between the height and the fill-up level is bounded by a constant with high probability. This, in turn, has implications for distributed hash tables, leading to a randomized ID management algorithm in peer-to- peer networks such that, with high probability, the ratio between the maximum and the minimum load of a processor is O(1).
Year
DOI
Venue
2009
10.1002/rsa.v34:3
Symposium on Discrete Algorithms
Keywords
Field
DocType
multiple choice,conflict resolution,leader election,distributed hash table,greedy algorithm
Discrete mathematics,Online algorithm,Combinatorics,struct,Greedy algorithm,Rabin–Karp algorithm,Trie,Mathematics,Bounded function,Hash table,Computation
Journal
Volume
Issue
ISSN
34
3
1042-9832
ISBN
Citations 
PageRank 
978-0-89871-624-5
1
0.42
References 
Authors
24
4
Name
Order
Citations
PageRank
Luc Devroye19313.75
GáBor Lugosi21092195.02
Gahyun Park3194.58
Wojciech Szpankowski41557192.33