Singing Head Generation

SINGER: Vivid Audio-driven Singing Video Generation with Multi-scale Spectral Diffusion Model

The Hong Kong University of Science and Technology

[Paper] [Code] [HuggingFace]

Cross-Subject Showcases

Real-World Showcases

Longer Video Generation

Multiple Style Showcases

Sketch Showcases
Cartoon Showcases
Painting Showcases
Sculpture Showcases

BibTeX

If you find this project useful, please cite our paper:
@misc{li2024singervividaudiodrivensinging,
  title={SINGER: Vivid Audio-driven Singing Video Generation with Multi-scale Spectral Diffusion Model},
  author={Yan Li and Ziya Zhou and Zhiqiang Wang and Wei Xue and Wenhan Luo and Yike Guo},
  year={2024},
  eprint={2412.03430},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2412.03430},
}