Synthesizing Obama: learning lip sync from audio - Citegraph

Paper Info

Title
Synthesizing Obama: learning lip sync from audio

Abstract
Given audio of President Barack Obama, we synthesize a high quality video of him speaking with accurate lip sync, composited into a target video clip. Trained on many hours of his weekly address footage, a recurrent neural network learns the mapping from raw audio features to mouth shapes. Given the mouth shape at each time instant, we synthesize high quality mouth texture, and composite it with proper 3D pose matching to change what he appears to be saying in a target video to match the input audio track. Our approach produces photorealistic results.
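
The audio-to-mouth-shape step in the abstract can be illustrated with a minimal sketch of a single-layer LSTM forward pass that turns a sequence of per-frame audio feature vectors into per-frame mouth-shape coefficient vectors. Everything concrete here is an assumption for illustration and is not taken from the paper: the class name TinyLSTM, the feature and output dimensions, the random untrained weights, and the linear readout. It shows only the shape of the mapping, not the authors' network or training procedure.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def init_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; a trained model would load learned values."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class TinyLSTM:
    """Hypothetical single-layer LSTM with a linear readout (illustration only)."""

    GATES = ("input", "forget", "output", "candidate")

    def __init__(self, n_audio, n_hidden, n_mouth, seed=0):
        rng = random.Random(seed)
        self.n_hidden = n_hidden
        # Each gate has its own weights acting on the concatenation [x_t ; h_{t-1}].
        self.W = {g: init_matrix(n_hidden, n_audio + n_hidden, rng) for g in self.GATES}
        self.b = {g: [0.0] * n_hidden for g in self.GATES}
        # Linear readout from the hidden state to mouth-shape coefficients.
        self.W_out = init_matrix(n_mouth, n_hidden, rng)

    def forward(self, audio_frames):
        """Map a list of audio feature vectors to a list of mouth-shape vectors."""
        h = [0.0] * self.n_hidden  # hidden state
        c = [0.0] * self.n_hidden  # cell state
        mouth_shapes = []
        for x in audio_frames:
            z = list(x) + h
            pre = {
                g: [a + b for a, b in zip(matvec(self.W[g], z), self.b[g])]
                for g in self.GATES
            }
            i_gate = [sigmoid(v) for v in pre["input"]]
            f_gate = [sigmoid(v) for v in pre["forget"]]
            o_gate = [sigmoid(v) for v in pre["output"]]
            cand = [math.tanh(v) for v in pre["candidate"]]
            # Standard LSTM update: forget part of the old cell, add gated candidate.
            c = [f * c_old + i * g for f, c_old, i, g in zip(f_gate, c, i_gate, cand)]
            h = [o * math.tanh(c_new) for o, c_new in zip(o_gate, c)]
            mouth_shapes.append(matvec(self.W_out, h))
        return mouth_shapes


# Hypothetical sizes: 13 audio features per frame, 4 mouth-shape coefficients.
model = TinyLSTM(n_audio=13, n_hidden=8, n_mouth=4, seed=0)
frames = [[0.1] * 13 for _ in range(5)]
shapes = model.forward(frames)
```

Because the hidden and cell states carry over between frames, the predicted mouth shape at each instant depends on the preceding audio as well as the current frame, which is the property that makes a recurrent model a natural fit for this kind of mapping.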

Year	DOI	Venue
2017	10.1145/3072959.3073640	ACM Trans. Graph.
Keywords	Field	DocType
Audio,Face Synthesis,LSTM,RNN,Big Data,Videos,Audiovisual Speech,Uncanny Valley,Lip Sync	Computer vision,Face synthesis,Recurrent neural network,Speech recognition,Raw audio format,Obama,Artificial intelligence,Lip sync,Mathematics,Mouth shape	Journal
Volume	Issue	ISSN
36	4	0730-0301
Citations	PageRank	References
88	2.80	37
Authors
3

Authors (3 rows)

Name	Order	Citations	PageRank
Supasorn Suwajanakorn	1	266	11.20
Steven M. Seitz	2	8729	495.13
Ira Kemelmacher-Shlizerman	3	710	28.03
