Title
The weighted words collector
Abstract
Motivated by applications in bioinformatics, we consider the word collector problem, i.e. the expected number of calls to a random weighted generator of words of length $n$ before the full collection is obtained. The originality of this instance of the non-uniform coupon collector lies in the, potentially large, multiplicity of the words/coupons of a given probability/composition. We obtain a general theorem that gives an asymptotic equivalent for the expected waiting time of a general version of the Coupon Collector. This theorem is especially well-suited for classes of coupons featuring high multiplicities. Its application to a given language essentially necessitates some knowledge on the number of words of a given composition/probability. We illustrate the application of our theorem, in a step-by-step fashion, on three exemplary languages, revealing asymptotic regimes in $\Theta(\mu(n)\cdot n)$ and $\Theta(\mu(n)\cdot \log n)$, where $\mu(n)$ is the sum of weights over words of length $n$.
Year
Venue
Keywords
2012
CoRR
coupon collector problem
Field
DocType
Volume
Discrete mathematics,Combinatorics,Positive real numbers,Multiplicity (mathematics),Expected value,Coupon collector's problem,Mathematics
Journal
abs/1202.0920
ISSN
Citations 
PageRank 
AOFA - 23rd International Meeting on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms - 2012 (2012) TBA
1
0.43
References 
Authors
5
3
Name
Order
Citations
PageRank
Jérémie Du Boisberranger110.77
Danièle Gardy241176.32
Yann Ponty324925.67