Abstract | ||
---|---|---|
This paper describes a low-power VLSI chip for speaker-independent 60-kWord continuous speech recognition. We implement a parallel and pipelined architecture for GMM computation and Viterbi processing. The design includes an 8-path Viterbi transition architecture to maximize processing speed and adopts a trigram language model to improve recognition accuracy. A two-level cache architecture is implemented for the demo system. Measured results show that our implementation achieves a 25% reduction in required frequency (62.5 MHz) and a 26% reduction in power consumption (54.8 mW) for 60-kWord real-time continuous speech recognition compared with the previous work. The chip can process speech up to 3.02x and 2.25x faster than real time at 200 MHz using the bigram and trigram language models, respectively. |
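
The figures in the abstract can be related with a little arithmetic. The sketch below is illustrative only: it assumes that 62.5 MHz and 54.8 mW are the measured values after the reduction and that the percentages are relative to the previous work, neither of which the abstract states explicitly. The function names are hypothetical and do not come from the paper.

```python
def implied_baseline(measured: float, reduction: float) -> float:
    """Baseline value implied by a fractional reduction.

    Assumes measured = baseline * (1 - reduction).
    """
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return measured / (1.0 - reduction)


def real_time_factor(speedup: float) -> float:
    """Processing time per second of audio, given a speedup over real time."""
    if speedup <= 0.0:
        raise ValueError("speedup must be positive")
    return 1.0 / speedup


if __name__ == "__main__":
    # Assumed reading: the quoted values are the post-reduction measurements.
    print(f"implied previous frequency: {implied_baseline(62.5, 0.25):.1f} MHz")
    print(f"implied previous power:     {implied_baseline(54.8, 0.26):.1f} mW")
    # Speedups quoted at 200 MHz for the bigram and trigram language models.
    print(f"bigram real-time factor:    {real_time_factor(3.02):.3f}")
    print(f"trigram real-time factor:   {real_time_factor(2.25):.3f}")
```

Under these assumptions the previous work would have needed roughly 83 MHz and 74 mW, and a real-time factor below 1 means the chip finishes each second of audio in less than one second.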
Year | DOI | Venue |
---|---|---|
2014 | 10.1587/elex.10.20130787 | IEICE ELECTRONICS EXPRESS |
Keywords | Field | DocType |
speech recognition, VLSI, low-power | Computer science, Speech recognition, Very-large-scale integration | Journal |
Volume | Issue | ISSN |
11 | 2 | 1349-2543 |
Citations | PageRank | References |
0 | 0.34 | 0 |
Authors | ||
6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Guangji He | 1 | 18 | 3.02 |
Yuki Miyamoto | 2 | 0 | 0.34 |
Kumpei Matsuda | 3 | 3 | 0.78 |
Shintaro Izumi | 4 | 82 | 31.56 |
Hiroshi Kawaguchi | 5 | 395 | 91.51 |
Masahiko Yoshimoto | 6 | 117 | 34.06 |