Abstract | ||
---|---|---|
Motivated by load balance issues in parallel calculations of the phylogenetic likelihood function we address the problem of distributing divisible items to a given number of bins. The task is to balance the overall sum of (fractional) item sizes per bin, while keeping the maximum number of unique elements in any bin to a minimum. We show that this problem is NP-hard and give a polynomial time approximation algorithm that yields a solution where the sums of (possibly fractional) item sizes are balanced across bins. Moreover, the maximum number of unique elements in the bins is guaranteed to exceed the optimal solution by at most one element. We implement the algorithm in two production-level parallel codes for large-scale likelihood-based phylogenetic inference: ExaML and ExaBayes. For ExaML, we observe best-case runtime improvements of up to a factor of 5.9 compared to the previously implemented data distribution algorithms. |
Year | DOI | Venue |
---|---|---|
2014 | 10.1007/978-3-662-44753-6_16 | ALGORITHMS IN BIOINFORMATICS |
Field | DocType | Volume |
Combinatorics,Likelihood function,Phylogenetic inference,Phylogenetic tree,Bin,Load balancing (computing),Computer science,Algorithm,Distributed computing,Polynomial time approximation algorithm | Conference | 8701 |
ISSN | Citations | PageRank |
0302-9743 | 3 | 0.51 |
References | Authors | |
7 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Kassian Kobert | 1 | 38 | 4.43 |
Tomás Flouri | 2 | 59 | 6.89 |
Andre Aberer | 3 | 40 | 6.09 |
Alexandros Stamatakis | 4 | 995 | 96.27 |