MultiTalk: Enhancing 3D Talking Head Generation
Across Languages with Multilingual Video Dataset

INTERSPEECH 2024
Kim Sung-Bin¹*, Lee Chae-Yeon¹*, Gihun Son¹*, Oh Hyun-Bin¹,
JangHoon Ju², Suekyeong Nam², Tae-Hyun Oh¹
* denotes equal contribution
¹POSTECH, ²KRAFTON

MultiTalk generates 3D talking heads with enhanced multilingual performance.

Abstract

Recent studies in speech-driven 3D talking head generation have achieved convincing results in verbal articulation. However, lip-sync accuracy degrades when these models are applied to input speech in other languages, possibly due to the lack of datasets covering a broad spectrum of facial movements across languages. In this work, we introduce a novel task: generating 3D talking heads from speech in diverse languages. We collect a new multilingual 2D video dataset comprising 423 hours of talking videos in 20 languages. Using this dataset, we present a baseline model that incorporates language-specific style embeddings, enabling it to capture the unique mouth movements associated with each language. Additionally, we present a metric for assessing lip-sync accuracy in multilingual settings. We demonstrate that training a 3D talking head model on our proposed dataset significantly enhances its multilingual performance.
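
To make the idea of language-specific style embeddings more concrete, the following is a minimal illustrative sketch, not the architecture used in MultiTalk. It assumes one style vector per language that is simply added to every frame of audio features. The language codes, the feature size, and the function names `make_style_table` and `condition_on_language` are all hypothetical, and a real model would learn the vectors during training rather than draw them at random.

```python
import random

# Hypothetical subset of languages; the dataset itself covers 20 languages.
LANGUAGES = ["en", "ko", "fr"]
# Hypothetical feature size, kept tiny so the values are easy to inspect.
FEATURE_DIM = 4


def make_style_table(languages, dim, seed=0):
    """Create one style vector per language.

    Assumption: in a trained model these vectors would be learnable
    parameters; here they are small random values for illustration.
    """
    rng = random.Random(seed)
    return {
        lang: [rng.uniform(-0.1, 0.1) for _ in range(dim)]
        for lang in languages
    }


def condition_on_language(audio_features, style_table, lang):
    """Add the chosen language's style vector to every audio frame."""
    style = style_table[lang]
    return [[a + s for a, s in zip(frame, style)] for frame in audio_features]


style_table = make_style_table(LANGUAGES, FEATURE_DIM)
frames = [[0.0] * FEATURE_DIM for _ in range(3)]  # three dummy audio frames
conditioned = condition_on_language(frames, style_table, "ko")
```

Additive conditioning is only one possible choice; concatenating the style vector to each frame or feeding it to the decoder separately would be equally plausible under the description in the abstract.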

BibTeX

@article{sung2024Multitalk,
  title={MultiTalk: Enhancing 3D Talking Head Generation Across Languages with Multilingual Video Dataset},
  author={Sung-Bin, Kim and Chae-Yeon, Lee and Son, Gihun and Hyun-Bin, Oh and Ju, Janghoon and Nam, Suekyeong and Oh, Tae-Hyun},
  journal={arXiv preprint arXiv:2406.14272},
  year={2024}
}

Acknowledgment

This research was supported by a grant from KRAFTON AI, and also partially supported by Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (RS-2022-II220124, Development of Artificial Intelligence Technology for Self-Improving Competency-Aware Learning Capabilities; RS-2021-II212068, Artificial Intelligence Innovation Hub; RS-2019-II191906, Artificial Intelligence Graduate School Program (POSTECH)).