Abstract |
---|
We propose a real-time system for synthesizing gestures directly from speech. Our data-driven approach is based on Generative Adversarial Networks (GANs) to model the speech-gesture relationship. We utilize the large amount of speaker video data available online to train our 3D gesture model. Our model generates speaker-specific gestures from consecutive two-second chunks of audio input. We animate the predicted gestures on a virtual avatar, achieving a delay of under three seconds between audio input and gesture animation. |
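The abstract describes a chunked, real-time pipeline: audio is consumed in consecutive two-second chunks, each chunk is mapped to a gesture sequence by the trained generator, and the result drives a virtual avatar with under three seconds of latency. Below is a minimal sketch of such a streaming loop. Only the two-second chunk length comes from the abstract; the sample rate, function names, and model interface are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

SAMPLE_RATE = 16_000                       # assumed audio sample rate (not stated in the paper)
CHUNK_SECONDS = 2.0                        # chunk length stated in the abstract
CHUNK_SAMPLES = int(SAMPLE_RATE * CHUNK_SECONDS)


def stream_gestures(audio_source, generator, animate):
    """Feed consecutive two-second audio chunks to a speech-to-gesture model.

    audio_source: iterable yielding raw audio sample arrays (e.g. from a microphone)
    generator:    trained speech-to-gesture model (e.g. a GAN generator), hypothetical interface
    animate:      callback that applies the predicted 3D pose sequence to the avatar
    """
    buffer = np.empty(0, dtype=np.float32)
    for samples in audio_source:
        # Accumulate incoming audio until at least one full chunk is available.
        buffer = np.concatenate([buffer, np.asarray(samples, dtype=np.float32)])
        while len(buffer) >= CHUNK_SAMPLES:
            chunk, buffer = buffer[:CHUNK_SAMPLES], buffer[CHUNK_SAMPLES:]
            gesture_frames = generator(chunk)   # predicted pose sequence for this chunk
            animate(gesture_frames)             # drive the virtual avatar
```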
Year | DOI | Venue |
---|---|---|
2021 | 10.1145/3411763.3451554 | Conference on Human Factors in Computing Systems |
Keywords | DocType | Published In
---|---|---|
Gestures, Animation, NUI | Conference | CHI EA 2021, ACM, New York, NY, USA, Article 197, 1-4
Citations | PageRank | References
---|---|---|
1 | 0.40 | 0
Authors |
---|
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Manuel Rebol | 1 | 3 | 1.86 |
Christian Gütl | 2 | 228 | 34.68 |
Krzysztof Pietroszek | 3 | 250 | 22.24 |