Prime (Shengqu) Cai 「蔡盛曲」
You may wonder about "Prime", which is a strange nickname, so here is the story: I have had a deep voice since middle school, and that is when I earned the nickname "Prime" (obviously after the famous character who is also known for his deep voice). So yes, this nickname was earned, not something that randomly popped into my head.
I'm a CS PhD student at Stanford University,
advised by Prof. Gordon Wetzstein and Prof. Leonidas Guibas,
and affiliated with the Computational Imaging Lab and the Geometric Computing Lab.
I am partly supported by a Stanford School of Engineering Fellowship.
Before Stanford, I was a CS master's student at ETH Zürich,
supervised by Prof. Luc Van Gool.
I obtained my Bachelor's degree in Computer Science with first-class honours from King's College London in the United Kingdom, where I spent some time working on information theory.
In 2022, I spent a few wonderful months working on diffusion models with Eric Chan and Songyou Peng.
I started my research career back in 2021, working on NeRFs and GANs with Anton Obukhov. I consider them my mentors coming into research and people I try to learn from.
I am interested in solving tasks that are fundamentally ill-posed for traditional methods: slaying the unslayable.
I have been working on generative models, primarily video generation these days, hoping they can one day forward-simulate the future and reverse-engineer the past.
I like making cool theories, videos, demos and applications.
Email / CV / Google Scholar / Semantic Scholar / Github / Twitter / Linkedin
* This is me pre-COVID. Since then I have gained >40 pounds and lost my cool ;(
News
2026-02: BulletTime is accepted to CVPR 2026!
2026-02: Generative Reality is accepted to CVPR 2026 Findings!
2026-01: MoC and Captain Cinema are accepted to ICLR 2026!
2025-09: I wrote a short blog on long-context memory of video world models.
2025-09: FramePack is accepted to NeurIPS 2025 as a spotlight!
2025-06: Joining NVIDIA Research this summer!
2025-03: CL-Splats is accepted to ICCV 2025!
2025-03: We received an NVIDIA Grant for our research!
2025-02: DSD and X-Dyna are accepted to CVPR 2025!
2024-09: CVD is accepted to NeurIPS 2024!
2024-02: Generative Rendering is accepted to CVPR 2024!
2023-09: I joined Stanford University for my PhD in Computer Science!
2023-07: DiffDreamer is accepted to ICCV 2023!
2023-05: I graduated from ETH Zürich!
2023-01: I will be working as a research intern at Adobe this summer!
2022-03: Pix2NeRF is accepted to CVPR 2022. First submission, first accept!
Publications
* and † indicate equal contribution
Mode Seeking meets Mean Seeking for Fast Long Video Generation
Shengqu Cai,
Weili Nie*,
Chao Liu*,
Julius Berner,
Lvmin Zhang,
Nanye Ma,
Hansheng Chen,
Maneesh Agrawala,
Leonidas Guibas,
Gordon Wetzstein,
Arash Vahdat
In arXiv 2026
[Project Page] [Paper]
Decouple local realism and long-range coherence for fast long-video generation by combining mode-seeking teacher distillation with mean-seeking long-video supervision.
Mixture of Contexts for Long Video Generation
Shengqu Cai,
Ceyuan Yang,
Lvmin Zhang,
Yuwei Guo,
Junfei Xiao,
Ziyan Yang,
Yinghao Xu,
Zhenheng Yang,
Alan Yuille,
Leonidas Guibas,
Maneesh Agrawala,
Lu Jiang,
Gordon Wetzstein
In ICLR 2026
[Project Page] [Paper]
Learnable sparse attention routing enables minute-long context at short-video cost, pruning most token pairs while preserving long-range coherence.
FramePack: Frame Context Packing and Drift Prevention in Next-Frame-Prediction Video Diffusion Models
Lvmin Zhang,
Shengqu Cai,
Muyang Li,
Gordon Wetzstein,
Maneesh Agrawala
In NeurIPS 2025 (Spotlight)
[Project Page]
[Paper]
[Code]
Frame context packing for next-frame prediction enables longer contexts within fixed sequence budgets, with drift prevention to reduce error accumulation.
BulletTime: Decoupled Control of Time and Camera Pose for Video Generation
Yiming Wang,
Qihang Zhang*,
Shengqu Cai*,
Tong Wu†,
Jan Ackermann†,
Zhengfei Kuang†,
Yang Zheng†,
Frano Rajič†,
Siyu Tang,
Gordon Wetzstein
In CVPR 2026
[Project Page] [Paper]
Purely implicit time- and camera-controlled 4D video diffusion that enables decoupled control over world time and camera pose (bullet-time effects).
Generated Reality: Human-centric World Simulation using Interactive Video Generation with Hand and Camera Control
Linxi Xie*,
Lisong C. Sun*,
Ashley Neall,
Tong Wu,
Shengqu Cai,
Gordon Wetzstein
In CVPR 2026 Findings
[Project Page]
[Paper]
Interactive human-centric video world simulation conditioned on tracked head pose and joint-level hand poses.
Diffusion Self-Distillation for Zero-Shot Customized Image Generation
Shengqu Cai,
Eric Ryan Chan,
Yunzhi Zhang,
Leonidas Guibas,
Jiajun Wu,
Gordon Wetzstein
In CVPR 2025
[Project Page] [Paper] [Code] [Demo]
Zero-shot customized image generation that scales to any instance and any context, without per-instance fine-tuning.
Captain Cinema: Towards Short Movie Generation
Junfei Xiao,
Ceyuan Yang,
Lvmin Zhang,
Shengqu Cai,
Yang Zhao,
Yuwei Guo,
Gordon Wetzstein,
Maneesh Agrawala,
Alan Yuille,
Lu Jiang
In ICLR 2026
[Project Page]
[Paper]
Top-down keyframe planning with bottom-up long-context video synthesis, using interleaved training to adapt MM-DiT for stable and efficient multi-scene generation.
CL-Splats: Continual Learning of Gaussian Splatting with Local Optimization
Jan Ackermann,
Jonas Kulhanek,
Shengqu Cai,
Haofei Xu,
Marc Pollefeys,
Gordon Wetzstein,
Leonidas Guibas,
Songyou Peng
In ICCV 2025
[Project Page] [Paper] [Code]
Efficiently updates Gaussian splatting-based 3D scene reconstructions from incremental images via change detection and local optimization.
ByteMorph: Benchmarking Instruction-Guided Image Editing with Non-Rigid Motions
Di Chang*,
Mingdeng Cao*,
Yichun Shi,
Bo Liu,
Shengqu Cai,
Shijie Zhou,
Weilin Huang,
Gordon Wetzstein,
Mohammad Soleymani,
Peng Wang
In arXiv 2025
[Project Page]
[Paper]
[Code]
Large-scale benchmark and baseline for instruction-guided image editing with complex non-rigid motions such as viewpoint changes, articulations, and deformations.
ReStyle3D: Scene-level Appearance Transfer with Semantic Correspondences
Liyuan Zhu,
Shengqu Cai*,
Shengyu Huang*,
Gordon Wetzstein,
Naji Khosravan,
Iro Armeni
In SIGGRAPH 2025
[Project Page] [Paper] [Code]
Scene-level appearance transfer from a single style image to multi-view real-world scenes with semantic correspondences.
X-Dyna: Expressive Dynamic Human Image Animation
Di Chang,
Hongyi Xu*,
You Xie*,
Yipeng Gao*,
Zhengfei Kuang*,
Shengqu Cai*,
Chenxu Zhang*,
Guoxian Song,
Chao Wang,
Yichun Shi,
Zeyuan Chen,
Shijie Zhou,
Linjie Luo,
Gordon Wetzstein,
Mohammad Soleymani
In CVPR 2025 (Highlight)
[Project Page] [Paper] [Code]
Human image animation using facial expressions and body movements derived from a driving video.
Collaborative Video Diffusion: Consistent Multi-video Generation with Camera Control
Zhengfei Kuang*,
Shengqu Cai*,
Hao He,
Yinghao Xu,
Hongsheng Li,
Leonidas Guibas,
Gordon Wetzstein
In NeurIPS 2024
[Project Page] [Paper] [Code]
Multi-view/multi-trajectory generation of videos sharing the same underlying content and dynamics.
Robust Symmetry Detection via Riemannian Langevin Dynamics
Jihyeon Je*,
Jiayi Liu*,
Guandao Yang*,
Boyang Deng*,
Shengqu Cai,
Gordon Wetzstein,
Or Litany,
Leonidas Guibas
In SIGGRAPH Asia 2024
[Project Page] [Paper]
Robust detection of symmetries in noisy real-world data by running Langevin dynamics on a Riemannian space of symmetry parameters.
Generative Rendering: Controllable 4D-Guided Video Generation with 2D Diffusion Models
Shengqu Cai,
Duygu Ceylan*,
Matheus Gadelha*,
Chun-Hao Paul Huang,
Tuanfeng Y. Wang,
Gordon Wetzstein
In CVPR 2024
[Project Page] [Paper]
Renders low-fidelity animated meshes directly into stylized animations using pre-trained 2D diffusion models, without any further training or distillation.
DiffDreamer: Towards Consistent Unsupervised Single-view Scene Extrapolation with Conditional Diffusion Models
Shengqu Cai,
Eric Ryan Chan,
Songyou Peng,
Mohamad Shahbazi,
Anton Obukhov,
Luc Van Gool,
Gordon Wetzstein
In ICCV 2023
[Project Page] [Paper] [Code]
A diffusion-based unsupervised framework that synthesizes consistent novel views along a long camera trajectory flying into an input image.
Pix2NeRF: Unsupervised Conditional π-GAN for Single Image to Neural Radiance Fields Translation
Shengqu Cai,
Anton Obukhov,
Dengxin Dai,
Luc Van Gool
In CVPR 2022
[Paper] [Code]
Unsupervised, 3D-free single-image novel view synthesis via conditional NeRF-GAN training and inversion.
Misc
Conference Review: CVPR, ICCV, ECCV, NeurIPS, ICLR, ICML, Eurographics, SIGGRAPH
Journal Review: IJCV, Computing Surveys