Abstract |
---|
The development of a continuous visual speech recognizer for a silent speech interface has been investigated using a visual speech corpus of ultrasound and video images of the tongue and lips. By using high-speed visual data and tied-state cross-word triphone HMMs, and including syntactic information via domain-specific language models, word-level recognition accuracy as high as 72% was achieved on visual speech. Using the Julius system, it was also found that recognition should be possible in near real time. |
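The recognizer described in the abstract is HMM-based. As an illustrative sketch only (the paper's actual triphone models and Julius decoder are far richer, and none of the names or numbers below come from the source), the core Viterbi decoding step that finds the most likely HMM state path for a sequence of observation frames can be outlined as:

```python
import numpy as np

def viterbi(obs_logprobs, log_trans, log_init):
    """Most likely HMM state path (Viterbi) for one observation sequence.

    obs_logprobs: (T, N) log-likelihood of each of T frames under each of N states
    log_trans:    (N, N) log transition probabilities (row = from-state)
    log_init:     (N,)   log initial-state probabilities

    Hypothetical minimal example, not the recognizer used in the paper.
    """
    T, N = obs_logprobs.shape
    delta = np.empty((T, N))          # best log-score ending in each state
    back = np.zeros((T, N), dtype=int)  # backpointers for path recovery
    delta[0] = log_init + obs_logprobs[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_trans  # (from, to) combined scores
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + obs_logprobs[t]
    # Trace back from the best final state
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```

In a full recognizer, the states would belong to tied-state cross-word triphone HMMs and the language model would rescore word hypotheses; this sketch shows only the frame-level dynamic-programming core.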
Year | Venue | Field
---|---|---
2011 | ICPhS | Speech corpus, Triphone, Speech processing, Computer science, Speech recognition, Silent speech interface, Syntax, Speech production, Language model
DocType | Citations | PageRank
---|---|---
Conference | 0 | 0.34
References | Authors
---|---
0 | 8
Name | Order | Citations | PageRank |
---|---|---|---
Jun Cai | 1 | 373 | 39.29 |
Thomas Hueber | 2 | 0 | 0.68 |
B. Denby | 3 | 268 | 26.69 |
Elie-Laurent Benaroya | 4 | 0 | 0.34 |
Gérard Chollet | 5 | 725 | 129.74 |
Pierre Roussel | 6 | 2 | 2.06
Gérard Dreyfus | 7 | 475 | 58.97 |
Lise Crevier-Buchman | 8 | 12 | 6.08 |