Peng Chen

I am a second-year master's student at the Institute of Software, Chinese Academy of Sciences (ISCAS), supervised by Prof. Hui Chen from ISCAS and Ming Lu from Intel Labs China. I received my B.S. in Computer Science from the University of Science and Technology Beijing in 2023 and was awarded the Beijing Distinguished Graduate Award and the Beijing Outstanding Graduation Thesis award.

I serve as a reviewer for international conferences including ICLR, ICME and ISMAR.

My research focuses on the following areas:

MLLM (Qwen-VL/LLaVA/SFT/RL): general agents, R1-style reasoning, and image/video understanding;
3D Vision (3DGS/2DGS): digital humans, rendering, and reconstruction;
AIGC (Diffusion): image/animation generation.

Email / GitHub / Google Scholar

Research

	[arXiv Preprint, 2025] CombatVLA: An Efficient Vision-Language-Action Model for Combat Tasks in 3D Action Role-Playing Games Peng Chen, Pi Bu, Yingyao Wang, Xinyi Wang, Ziming Wang, Jie Guo, Yingxiu Zhao, Qi Zhu, Jun Song, Siran Yang, Jiamang Wang, Bo Zheng Paper / Project / Code We propose CombatVLA, the first efficient vision-language-action (VLA) model designed for combat tasks in 3D action role-playing games. For efficient decision making, CombatVLA is a 3B model that processes visual inputs and outputs a sequence of actions (keyboard and mouse operations) to control the game.
	[arXiv Preprint, 2024] MixedGaussianAvatar: Realistically and Geometrically Accurate Head Avatar via Mixed 2D-3D Gaussians Peng Chen, Xiaobao Wei, Qingpo Wuwu, Xinyi Wang, Xingyu Xiao, Ming Lu Paper / Project / Code We use 2DGS to maintain the surface geometry and employ 3DGS for color correction in areas where the rendering quality of 2DGS is insufficient, reconstructing a realistically and geometrically accurate 3D head avatar.
	[arXiv Preprint, 2024] GazeGaussian: High-Fidelity Gaze Redirection with 3D Gaussian Splatting Xiaobao Wei, Peng Chen, Guangyu Li, Ming Lu, Hui Chen, Feng Tian Paper / Project / Code We propose GazeGaussian, a high-fidelity gaze redirection method that uses a two-stream 3DGS model to represent the face and eye regions separately.
	[AAAI, 2025] GraphAvatar: Compact Head Avatars with GNN-Generated 3D Gaussians Xiaobao Wei, Peng Chen, Ming Lu, Hui Chen, Feng Tian Paper / Project / Code We propose GraphAvatar, a compact method using Graph Neural Networks (GNN) to generate 3D Gaussians for head avatar animation, offering superior rendering performance and minimal storage requirements.
	[NeurIPS Workshop, 2024] Can VLMs Play Action Role-Playing Games? Take Black Myth Wukong as a Study Case Peng Chen, Pi Bu, Jun Song, Yuan Gao, Bo Zheng Paper / Project We propose a novel framework named the VARP agent, which directly takes game screenshots as input and generates keyboard and mouse operations to play the ARPG.
	[IEEE ICME, 2025] DiffusionTalker: Efficient and Compact Speech-Driven 3D Talking Head via Personalizer-Guided Distillation Peng Chen, Xiaobao Wei, Ming Lu, Hui Chen, Feng Tian Paper / Project / Code We propose DiffusionTalker, a diffusion-based method that uses a contrastive personalizer to generate personalized 3D facial animation and personalizer-guided distillation for acceleration and compression.
	[IEEE VR, 2024] Bring Your Own Character: A Holistic Solution for Automatic Facial Animation Generation of Customized Characters Zechen Bai, Peng Chen, Xiaolan Peng, Lu Liu, Naiming Yao, Hui Chen, Feng Tian Paper / Code Given a target facial video as reference, our solution, which is integrated with Unity3D, automatically generates facial animation for a user-supplied virtual character.

Internships

[04/2024 - 03/2025] Alibaba, Taotian
Research intern for MLLM, focusing on MLLM-based VLA agents, including enhancing the Qwen-VL series and developing advanced applications.
[11/2023 - 04/2024] AMD, Xilinx AI
Research intern for diffusion-based AIGC, focusing on improving ControlNet and Stable Diffusion for image generation.
[07/2023 - 08/2023] Baidu, ACG
Research intern for LLM evaluation, focusing on the automated evaluation of text-based question-answering tasks for the Wenxin large language model and reward model.

News

[06/2023] Beijing Outstanding Graduation Design (Thesis), 2023.
[06/2023] Beijing Distinguished Graduate Award, 2023.

Last updated: Nov. 2024
Web page design credit to Jon Barron