I am a Ph.D. student at the Advanced Robotics Centre, National University of Singapore, advised by Professor Marcelo H. Ang, Jr. I received my B.E. in Mechanical Engineering from Huazhong University of Science and Technology. Currently, I am an intern at Microsoft Research Asia, where I am exploring Embodied AI topics under the mentorship of Dr. Jiaolong Yang.
I am interested in computer vision, generative AI, and multimodal learning. My research has focused on 3D human sensing, digital human modeling, and robotic perception, where I have developed state-of-the-art solutions for egocentric hand mesh reconstruction, dynamic scene understanding, and dexterous robotic grasping.
UniGraspTransformer is a Transformer-based network for dexterous robotic grasping that streamlines training by distilling grasp trajectories from individually trained policy networks into a single universal model. It scales effectively with up to 12 self-attention blocks, generalizes well to diverse objects and real-world settings, and outperforms UniDexGrasp++ with higher success rates across seen and unseen objects.
We present an end-to-end pipeline for synthesizing functional grasps for diverse dexterous robotic hands, integrating a diffusion model for grasp estimation with a discriminator that validates grasps based on object affordances.
This work dynamically reconstructs the scene point cloud by transforming NeRF-generated object meshes back into the workspace using tracked object poses. The scene reconstruction module runs at 9.2 FPS, and the whole pipeline (including grasp estimation) runs at 2.8 FPS.
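The core operation of placing a reconstructed mesh back into the workspace is a rigid-body transform of its vertices by the tracked object pose. A minimal sketch, assuming the tracker provides a 4x4 homogeneous pose matrix (all names here are illustrative, not from the actual codebase):

```python
import numpy as np

def transform_points(points: np.ndarray, pose: np.ndarray) -> np.ndarray:
    """Apply a 4x4 homogeneous rigid-body pose to an (N, 3) array of points.

    points: mesh vertices in the object's canonical (NeRF) frame.
    pose:   tracked object pose mapping the canonical frame to the workspace.
    """
    homo = np.hstack([points, np.ones((points.shape[0], 1))])  # (N, 4)
    return (pose @ homo.T).T[:, :3]

# Example pose: rotate 90 degrees about z, then translate by (0.5, 0, 0.2).
theta = np.pi / 2
pose = np.array([
    [np.cos(theta), -np.sin(theta), 0.0, 0.5],
    [np.sin(theta),  np.cos(theta), 0.0, 0.0],
    [0.0,            0.0,           1.0, 0.2],
    [0.0,            0.0,           0.0, 1.0],
])

object_vertices = np.array([[1.0, 0.0, 0.0]])   # one canonical-frame vertex
workspace_points = transform_points(object_vertices, pose)
```

Batching all mesh vertices through a single matrix multiply like this keeps the per-frame cost low enough to sustain the reported reconstruction rate.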
DR-Pose introduces a two-stage pipeline enhancing category-level 6D object pose estimation by first completing unseen parts of objects to guide shape prior deformation, followed by scaled registration for precise pose prediction. This method significantly improves pose estimation accuracy over existing techniques, as demonstrated on benchmark datasets.