Samsung AI Talk | Generating Realistic Speech-Driven Facial Animation

Our talk by Dr Stavros Petridis of Samsung AI on Generating Realistic Speech-Driven Facial Animation took place at 4pm on Wednesday, 26 Feb, at the Department of Engineering, LT6.

Our next talk is on Thursday, 5 March, by FiveAI, titled Machines that see: what AI can and can’t do. Hope to see you there!

Talk description:

CUMIN are delighted to host Dr Stavros Petridis, a research scientist at the Samsung AI centre in Cambridge and a research fellow at Imperial College London.

*** ABSTRACT ***
Speech-driven facial animation is the process of using speech signals to automatically synthesize a talking character. However, the majority of previous works have focused on generating accurate lip movements and have neglected the importance of generating facial expressions. This talk will show how realistic talking heads can be generated using a still image of a person and an audio clip containing speech. The proposed method, based on generative adversarial networks, can generate videos whose lip movements are in sync with the audio and which contain natural facial expressions such as blinks and eyebrow movements. Several applications of the proposed approach will be presented. Finally, it will be shown how this method can be used for self-supervised learning and for solving the inverse problem, i.e., generating speech from video.
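To make the setup concrete, here is a minimal toy sketch of the generator side of such a pipeline: a function conditioned on a single identity image and a sliding window of audio features, emitting one video frame per window. All names, dimensions, and weights below are illustrative stand-ins, not the speaker's actual model; in the real GAN-based method the generator is a deep network trained against discriminators for frame realism and audio-visual sync.

```python
import numpy as np

rng = np.random.default_rng(0)

IMG_DIM = 64 * 64      # flattened still image of the target identity (assumed size)
AUDIO_DIM = 12         # e.g. spectral features for one audio window (assumed size)
HIDDEN = 32            # hypothetical hidden width

# Hypothetical generator weights (randomly initialised stand-ins for a trained network).
W_img = rng.standard_normal((HIDDEN, IMG_DIM)) * 0.01
W_aud = rng.standard_normal((HIDDEN, AUDIO_DIM)) * 0.01
W_out = rng.standard_normal((IMG_DIM, HIDDEN)) * 0.01

def generate_frame(identity_img, audio_window):
    """One generator pass: condition on the still image and the current audio window."""
    h = np.tanh(W_img @ identity_img + W_aud @ audio_window)
    return np.tanh(W_out @ h)  # one synthesised (flattened) frame

def generate_video(identity_img, audio_features):
    """Slide over the audio features, producing one frame per window."""
    return np.stack([generate_frame(identity_img, a) for a in audio_features])

still = rng.standard_normal(IMG_DIM)
audio = rng.standard_normal((25, AUDIO_DIM))  # 25 audio windows -> 25 frames
video = generate_video(still, audio)
print(video.shape)  # (25, 4096)
```

The key structural point the sketch captures is that every frame shares the same identity conditioning while the audio window varies over time, which is why a single still image plus a speech clip suffices as input.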

Stavros is a research scientist at the Samsung AI centre in Cambridge and a research fellow at the intelligent behaviour understanding group (iBUG) at Imperial College London. He studied electrical and computer engineering at the Aristotle University of Thessaloniki, Greece, and completed an MSc degree in Advanced Computing at Imperial College London, where he also did his Ph.D. in Computer Science. His main research interest is in audio-visual understanding of human behaviour, and he has worked on a wide range of applications such as audio-visual speech recognition, emotion recognition, speech-driven facial animation, age/gender recognition, face re-identification, and nonlinguistic vocalisation (e.g. laughter) recognition.
