Abstract | ||
---|---|---|
This paper presents a vision-based approach to recognizing speech without evaluating acoustic signals. The proposed technique combines motion features and support vector machines (SVMs) to classify utterances. Because segmentation of utterances is important in a visual speech recognition system, the paper also proposes a video segmentation method that detects the start and end frames of isolated utterances in an image sequence. Frames that correspond to 'speaking' and 'silence' phases are identified from mouth movement information. The experimental results demonstrate that the proposed visual speech recognition technique yields high accuracy in a phoneme classification task. Potential applications of such a system include human-computer interfaces (HCI) for mobility-impaired users, lip-reading mobile phones, in-vehicle systems, and improved speech-based computer control in noisy environments. |
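
The abstract does not say how mouth movement is measured or how the 'speaking' and 'silence' phases are separated, so the following is only a minimal sketch of the general idea, assuming plain frame differencing as a stand-in for the paper's motion features. The function names `motion_energy` and `segment_utterances`, the `threshold` and `min_len` parameters, and the synthetic input are all hypothetical and are not taken from the paper.

```python
import numpy as np


def motion_energy(frames):
    """Mean absolute difference between consecutive grayscale frames.

    frames: array-like of shape (T, H, W). Returns an array of length T - 1,
    where entry i measures the change between frame i and frame i + 1.
    """
    frames = np.asarray(frames, dtype=np.float64)
    diffs = np.abs(np.diff(frames, axis=0))
    return diffs.reshape(diffs.shape[0], -1).mean(axis=1)


def segment_utterances(frames, threshold, min_len=2):
    """Return (start, end) frame indices of candidate 'speaking' segments.

    A segment is a run of at least `min_len` consecutive frame transitions
    whose motion energy exceeds `threshold` (an assumed, hand-set value).
    """
    active = motion_energy(frames) > threshold
    segments = []
    start = None
    for i, is_active in enumerate(active):
        if is_active and start is None:
            start = i  # motion begins between frame i and frame i + 1
        elif not is_active and start is not None:
            if i - start >= min_len:
                segments.append((start, i))  # frame i is the first still frame
            start = None
    # Close a segment that is still open at the end of the sequence.
    if start is not None and len(active) - start >= min_len:
        segments.append((start, len(active)))
    return segments


# Synthetic mouth-region clip: still, then alternating frames, then still.
clip = np.zeros((10, 4, 4))
clip[3] = 10.0
clip[5] = 10.0
print(segment_utterances(clip, threshold=1.0))
```

In a full pipeline, each detected segment would then be turned into a feature vector and passed to an SVM classifier, which this sketch leaves out.
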
Year | DOI | Venue |
---|---|---|
2007 | 10.1109/DICTA.2007.4426769 | DICTA |
Keywords | Field | DocType |
in-vehicle system,mouth movement,technique yields high accuracy,utterance segmentation,end frame,visual speech recognition system,proposed technique,video segmentation method,speech-based computer control,visual speech recognition,proposed visual speech recognition,acoustic signal,human computer interface,speech recognition,image segmentation,support vector machine,support vector machines,application software | Speech processing,Computer science,Utterance,Image segmentation,Artificial intelligence,Application software,Image sequence,Computer vision,Pattern recognition,Segmentation,Voice activity detection,Support vector machine,Speech recognition | Conference |
ISBN | Citations | PageRank |
0-7695-3067-2 | 5 | 0.46 |
References | Authors | |
16 | 3 | |
Name | Order | Citations | PageRank |
---|---|---|---|
Wai Chee Yau | 1 | 40 | 4.87 |
Hans Weghorn | 2 | 203 | 56.24 |
Dinesh Kant Kumar | 3 | 168 | 28.34 |