Title
Learning Fuzzy Rules for Visual Speech Recognition
Abstract
We outline a method for learning fuzzy rules for visual speech recognition. Such a system could be used to annotate video sequences automatically, aiding subsequent retrieval; it could also improve the recognition of voice commands on systems that have no keyboard. In the implemented system, features were extracted automatically from short video sequences by identifying regions of the face and tracking the movement of various points around the mouth from frame to frame. The words in the video sequences were segmented manually on phoneme boundaries, and a rule base was constructed using two-dimensional fuzzy sets on feature and time parameters. The method was applied to the Tulips1 database, and the results were slightly better than those obtained with techniques based on neural networks and Hidden Markov Models. This suggests that the learned rules are speaker independent. A medium-sized vocabulary of around 300 words, representative of the phonemes of the English language, was created and used for training and testing. Reasonable accuracy for phoneme classification was achieved. Because of the ambiguity and similarity of various speech sounds, a scheme was developed to select a group of candidate words when a test word was presented to the system. The accuracy achieved was 21-33%, comparable to that of expert human lip-readers, whose accuracy on nonsense words is about 30%.
Year
2003
DOI
10.1007/978-3-540-25981-7_11
Venue
ADAPTIVE MULTIMEDIA RETRIEVAL
Keywords
fuzzy set,english language,hidden markov model,rule based,neural network
Field
Similitude,Computer science,Expert system,Fuzzy logic,Phonetics,Speech recognition,Fuzzy set,Artificial neural network,Hidden Markov model,Vocabulary
DocType
Conference
Volume
3094
ISSN
0302-9743
Citations
1
PageRank
0.37
References
13
Authors
3
Name               Order  Citations  PageRank
M. A. Anwar        1      1          0.37
James F. Baldwin   2      4          0.93
Trevor P. Martin   3      134        26.98