Abstract | ||
---|---|---|
Composition of weighted transducers is a fundamental algorithm used in many applications, including for computing complex edit-distances between automata, or string kernels in machine learning, or to combine different components of a speech recognition, speech synthesis, or information extraction system. We present a generalization of the composition of weighted transducers, n-w ay composition, which is dramatically faster in practice than the standard composition algorithm when combining more than two transducers. The worst-case complexity of our algorithm for composing three transducers T(1), T(2), and T(3) resulting in T, is O(vertical bar T vertical bar(Q) min(d(T(1))d(T(3)), d(T(2))) + vertical bar T vertical bar(E)), where vertical bar.vertical bar(Q) denotes the number of states, vertical bar.vertical bar(E) the number of transitions, and d(.) the maximum out-degree. As in regular composition, the use of perfect hashing requires a pre-processing step with linear-time expected complexity in the size of the input transducers. In many cases, this approach significantly improves on the complexity of standard composition. Our algorithm also leads to a dramatically faster composition in practice. Furthermore, standard composition can be obtained as a special case of our algorithm. We report the results of several experiments demonstrating this improvement. These theoretical and empirical improvements significantly enhance performance in the applications already mentioned. |
Year | DOI | Venue |
---|---|---|
2009 | 10.1142/S0129054109006772 | INTERNATIONAL JOURNAL OF FOUNDATIONS OF COMPUTER SCIENCE |
Keywords | Field | DocType |
speech recognition,edit distance,machine learning,information extraction,computational complexity,string kernel | Transducer,Discrete mathematics,Speech synthesis,Combinatorics,Automaton,Algorithm,Finite state,Information extraction,Perfect hash function,Mathematics,Special case | Journal |
Volume | Issue | ISSN |
20 | 4 | 0129-0541 |
Citations | PageRank | References |
6 | 0.52 | 6 |
Authors | ||
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Cyril Allauzen | 1 | 690 | 47.64 |
Mehryar Mohri | 2 | 4502 | 448.21 |