Title
Towards View-Independent Viseme Recognition Based on CNNs and Synthetic Data
Abstract
Visual Speech Recognition is the task of interpreting spoken text from video information alone. To address this task automatically, recent works have employed Deep Learning and obtained high accuracy on the recognition of words and sentences uttered in controlled environments with limited head-pose variation. However, accuracy drops for multi-view datasets, and when it comes to interpreting isolated mouth shapes, such as visemes, the reported values are considerably lower, as shorter segments of speech lack temporal and contextual information. In this work, we evaluate the applicability of synthetic datasets for assisting viseme recognition in real-world data acquired under controlled and uncontrolled environments, using the GRID and AVICAR datasets, respectively. We create two large-scale synthetic 2D datasets based on realistic 3D facial models, one with near-frontal and one with multi-view mouth images. Our experiments indicate that a transfer learning approach using synthetic data can achieve higher accuracy than training from scratch on real data only, in both scenarios.
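A minimal sketch of the synthetic-to-real transfer learning setup the abstract describes: pre-train a CNN viseme classifier on a large synthetic mouth-image dataset, then fine-tune it on real data instead of training from scratch. PyTorch, the ResNet-18 backbone, the dataset paths, the 64x64 crop size, and the 14-class viseme inventory are all assumptions for illustration; the abstract does not specify the framework, architecture, or hyperparameters.

```python
# Hedged sketch of synthetic-pretraining + real-data fine-tuning.
# All names, paths, and hyperparameters below are hypothetical.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms, models

def make_loader(root, batch_size=64):
    """ImageFolder layout assumed: one sub-directory per viseme class."""
    tfm = transforms.Compose([
        transforms.Resize((64, 64)),  # assumed mouth-crop resolution
        transforms.ToTensor(),
    ])
    return DataLoader(datasets.ImageFolder(root, tfm),
                      batch_size=batch_size, shuffle=True)

def train(model, loader, epochs, lr):
    """Plain supervised training loop with cross-entropy loss."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

NUM_VISEMES = 14  # placeholder; viseme inventories vary by phoneme-to-viseme map
model = models.resnet18(num_classes=NUM_VISEMES)  # stand-in CNN

# 1) Pre-train on the large synthetic 2D mouth-image dataset.
train(model, make_loader("synthetic_visemes/"), epochs=10, lr=1e-3)

# 2) Fine-tune on real mouth crops (e.g. from GRID or AVICAR) at a lower
#    learning rate, rather than training from scratch on real data alone.
train(model, make_loader("real_visemes/"), epochs=5, lr=1e-4)
```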
Year
2018
Venue
2018 25th IEEE International Conference on Image Processing (ICIP)
Keywords
Image recognition, Speech recognition, Computer graphics, Machine learning
Field
Pattern recognition, Task analysis, Viseme, Computer science, Transfer of learning, Synthetic data, Solid modeling, Artificial intelligence, Deep learning, Hidden Markov model, Grid
DocType
Conference
ISSN
1522-4880
Citations
0
PageRank
0.34
References
0
Authors
3