Regression based landmark estimation and multi-feature fusion for visual speech recognition - Citegraph

Paper Info

Title
Regression based landmark estimation and multi-feature fusion for visual speech recognition

Abstract
Visual speech recognition, also known as lipreading, can improve the robustness of automatic acoustic speech recognition, especially in noisy environments. However, it remains a challenging topic given the variety of speaking characteristics and the confusion between visual speech features. In this paper, we propose an automatic lipreading method that tackles the problem with a new lip tracking method and the fusion of multiple kinds of visual information. First, a regression-based face landmark estimation method is employed for lip detection, on top of which a geometry-based shape invariant feature (SIF) is put forward. Moreover, it can also be applied to the removal of non-speaking utterances. Then motion interchange patterns and spatial-temporal descriptors are also adopted to describe the lip information, and a Bayes combination strategy is applied to fuse them. The proposed method is evaluated on three benchmark data sets: AVLetters2, OuluVS and PKUVS. Experimental results are promising and show the effectiveness of the proposed approach. © 2015 IEEE.
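
The abstract names a Bayes combination strategy but does not spell out the rule. As a rough illustration only, the sketch below shows one common form of Bayesian fusion: the product rule over per-feature class posteriors, assuming the features are conditionally independent given the class. The function name `bayes_fuse`, the three example posterior vectors, and the uniform prior are all hypothetical and are not taken from the paper.

```python
import numpy as np


def bayes_fuse(posteriors, prior=None):
    """Fuse per-feature class posteriors with the product rule.

    Assumes (hypothetically) that the features are conditionally
    independent given the class, so that
        P(c | x_1..x_K)  is proportional to  prod_k P(c | x_k) / P(c)^(K-1).
    """
    posteriors = np.asarray(posteriors, dtype=float)  # shape (K, n_classes)
    n_features, n_classes = posteriors.shape
    if prior is None:
        # Uniform class prior when none is supplied (an assumption).
        prior = np.full(n_classes, 1.0 / n_classes)
    prior = np.asarray(prior, dtype=float)

    eps = 1e-12  # guards against log(0)
    log_score = (np.log(posteriors + eps).sum(axis=0)
                 - (n_features - 1) * np.log(prior + eps))
    log_score -= log_score.max()  # numerical stability before exp
    fused = np.exp(log_score)
    return fused / fused.sum()


# Made-up posteriors over three classes from three feature streams
# (labelled after the feature types in the abstract; values are invented).
p_sif = [0.5, 0.3, 0.2]   # shape invariant feature classifier
p_mip = [0.4, 0.4, 0.2]   # motion interchange pattern classifier
p_st = [0.6, 0.2, 0.2]    # spatial-temporal descriptor classifier

fused = bayes_fuse([p_sif, p_mip, p_st])
predicted_class = int(np.argmax(fused))
```

Working in log space keeps the product stable when many streams or very small probabilities are involved. Other Bayesian fusion variants, such as weighting each stream by its confusion matrix, are equally plausible readings of the abstract.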

Year: 2015
DOI: 10.1109/ICIP.2015.7350911
Venue: Proceedings - International Conference on Image Processing, ICIP
Keywords: Visual Speech Recognition, Shape Invariant Features, Motion Interchange Patterns, Bayes Combination
Field: Computer vision, Feature fusion, Regression, Pattern recognition, Computer science, Speech recognition, Artificial intelligence, Landmark
DocType: Conference
Volume: 2015-December
ISSN: 1522-4880
Citations: 0
PageRank: 0.34
References: 10
Authors: 3

Authors (3 rows)

Name	Order	Citations	PageRank
Hong Liu	1	747	82.65
Xue-Wu Zhang	2	43	11.98
Wu Pingping	3	32	4.36
