Paper Info

Title
Visual units and confusion modelling for automatic lip-reading.

Abstract
Automatic lip-reading (ALR) is a challenging task because the visual speech signal is known to be missing some important information, such as voicing. We propose an approach to ALR that acknowledges that this information is missing but assumes that it is substituted or deleted in a systematic way that can be modelled. We describe a system that learns such a model and then incorporates it into decoding, which is realised as a cascade of weighted finite-state transducers. Our results show a small but statistically significant improvement in recognition accuracy. We also investigate the issue of suitable visual units for ALR, and show that visemes are sub-optimal, not because they introduce lexical ambiguity, but because the reduction in modelling units entailed by their use reduces accuracy.

A novel technique for automatic lip-reading is proposed. A weighted finite-state transducer cascade incorporating a confusion model is used. Performance was slightly better than a standard HMM system. The issue of suitable units for automatic lip-reading was also studied. It was found that visemes are sub-optimal because of reduced contextual modelling.
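
The abstract does not say how the confusion model is estimated or weighted. As a rough illustration only, the sketch below assumes the model is a table of conditional probabilities P(recognised | spoken), estimated from aligned unit pairs with add-one smoothing and stored as negative log probabilities, a common choice of arc weight in weighted finite-state transducers. The function name `confusion_weights`, the `<eps>` deletion symbol, and the toy data are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict

def confusion_weights(aligned_pairs, smoothing=1.0):
    """Return {(spoken, recognised): -log P(recognised | spoken)}.

    aligned_pairs: iterable of (spoken, recognised) unit pairs.
    A recognised unit of "<eps>" stands for a deletion (an assumption
    made for this sketch, not a convention taken from the paper).
    """
    pairs = list(aligned_pairs)
    spoken_units = sorted({s for s, _ in pairs})
    recognised_units = sorted({r for _, r in pairs})

    # Count how often each spoken unit was recognised as each output unit.
    counts = defaultdict(Counter)
    for spoken, recognised in pairs:
        counts[spoken][recognised] += 1

    weights = {}
    for spoken in spoken_units:
        # Additive smoothing keeps unseen confusions at a finite cost.
        total = sum(counts[spoken].values()) + smoothing * len(recognised_units)
        for recognised in recognised_units:
            prob = (counts[spoken][recognised] + smoothing) / total
            weights[(spoken, recognised)] = -math.log(prob)
    return weights

# Toy alignment: /b/ is often seen as /p/ (voicing is not visible),
# and /m/ is deleted once.
toy_pairs = [("p", "p"), ("b", "p"), ("b", "p"), ("b", "b"), ("m", "<eps>")]
toy_weights = confusion_weights(toy_pairs)
```

Each entry of `toy_weights` could serve as the cost of one arc in a single-state confusion transducer that maps a spoken unit to a recognised unit. Composing that transducer with the lexicon and language model is left out of this sketch.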

Year	DOI	Venue
2016	10.1016/j.imavis.2016.03.003	Image Vision Comput.
Keywords	Field	DocType
Lip-reading,Speech recognition,Visemes,Weighted finite state transducers,Confusion matrices,Confusion modelling	Confusion,Pattern recognition,Viseme,Computer science,Weighted finite state transducer,Speech recognition,Cascade,Voice,Artificial intelligence,Decoding methods,Hidden Markov model,Ambiguity	Journal
Volume	Issue	ISSN
51	C	0262-8856
Citations	PageRank	References
1	0.37	16
Authors
3

Authors (3 rows)

Name	Order	Citations	PageRank
Dominic Howell	1	1	0.37
Stephen J. Cox	2	148	21.98
Barry-John Theobald	3	332	25.39
