Lei Zhou

PhD Student

National University of Singapore

Research Interests

Robotic Manipulation
Vision-Language-Action Models
3D Human Sensing

About

I am a Ph.D. student at the National University of Singapore (NUS), advised by Prof. Marcelo H. Ang Jr. I received my B.E. in Mechanical Engineering from Huazhong University of Science and Technology (HUST).

My research lies at the intersection of Embodied AI, Computer Vision, and Robotics. Specifically, I am interested in leveraging large-scale human video data to train generalist robot policies, developing Vision-Language-Action (VLA) models, and exploring World Models for robotic manipulation.

Currently, I am a Research Intern at Xiaomi Technology (Embodied Intelligence Team), working closely with Dr. Long Chen. Prior to this, I was a Research Intern at Microsoft Research Asia (MSRA), where I worked closely with Dr. Jiaolong Yang and Dr. Yu Deng.

News

  • Nov 2025: Released MiMo-Embodied, a foundation model achieving state-of-the-art results on 17 embodied AI benchmarks.
  • May 2025: Joined Xiaomi Technology (Embodied Intelligence Team) as a Research Intern.
  • May 2025: Received the Microsoft “Stars of Tomorrow” Internship honor from MSRA.
  • Feb 2025: One paper (UniGraspTransformer) accepted to CVPR 2025.
  • Dec 2024: One paper (DexGrasp-Diffusion) accepted to ISRR 2024.
  • Jul 2024: Two papers (on Robotic Packing and 3D Affordance Keypoints) accepted to IROS 2024.
  • May 2024: Joined Microsoft Research Asia (MSRA) as a Research Intern.
  • Jan 2024: One paper (YOSO) accepted to ICRA 2024.
  • Jul 2023: One paper (DR-Pose) accepted to IROS 2023.

Selected Publications

MiMo-Embodied: X-Embodied Foundation Model Technical Report

Xiaomi Embodied Intelligence Team

Technical Report

The first open-source foundation model unifying embodied AI and autonomous driving.

UniGraspTransformer: Simplified Policy Distillation for Scalable Dexterous Robotic Grasping

Wenbo Wang, Fangyun Wei, Lei Zhou, Xi Chen, Lin Luo, Xiaohan Yi, Yizhong Zhang, Yaobo Liang, Chang Xu, Yan Lu, Jiaolong Yang, Baining Guo

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

A Transformer-based network for dexterous robotic grasping using policy distillation.

You Only Scan Once: A Dynamic Scene Reconstruction Pipeline for 6-DoF Robotic Grasping of Novel Objects

Lei Zhou, Haozhe Wang, Zhengshen Zhang, Zhiyang Liu, Francis EH Tay, Marcelo H. Ang Jr.

IEEE International Conference on Robotics and Automation (ICRA)

Dynamic scene reconstruction using NeRF for 6-DoF robotic grasping.