Zhutian (Skye) Yang
/ ju tin-yen young /
ztyang {at} mit {dot} edu
Hello! My name is Zhutian and I'm a final-year PhD candidate in robotics at MIT. I develop algorithms for solving long-horizon manipulation problems in geometrically complex environments, using a combination of deep learning and planning methods. I'm co-advised by Leslie Pack Kaelbling and Tomás Lozano-Pérez in the Learning and Intelligent Systems group. I'm currently a part-time research intern on the Large Behavior Models team at Toyota Research Institute (TRI). I was previously an intern at the NVIDIA Seattle Robotics Lab. I obtained my bachelor's degree in information engineering and media at NTU, Singapore.
News:
Guiding Long-Horizon Task and Motion Planning with Vision Language Models
Zhutian Yang, Caelan Reed Garrett, Leslie Pack Kaelbling, Tomás Lozano-Pérez, and Dieter Fox
ICRA 2025; CoRL 2024 LangRob Workshop (Spotlight)
TLDR: Pretrained VLMs make mistakes in predicting robot actions when prompted with open language goals, so we use VLMs to break down long-horizon goals into subgoals, which are then solved by TAMP in an iterative replanning system. The system solves problems that involve interactions with 20+ objects and require 30-50 actions to complete.
Paper | Project Page | Code | Bibtex | Talk | Poster
Combining Planning and Diffusion for Mobility with Unknown Dynamics
Yajvan Ravan, Zhutian Yang, Tao Chen, Leslie Pack Kaelbling, and Tomás Lozano-Pérez
In Submission
TLDR: Rearranging large objects with unpredictable dynamics is hard because the relative pose between the robot and the object keeps changing. Diffusion policies that output global robot configurations struggle to generalize to new initial and goal conditions, or to new environments and objects. So we use motion planning to generate waypoints that guide a local diffusion policy, which is trained to achieve relative movements of the chair.
Paper | Project Page | Bibtex
Compositional Diffusion-Based Continuous Constraint Solvers
Zhutian Yang, Jiayuan Mao, Yilun Du, Jiajun Wu, Joshua Brett Tenenbaum, Tomás Lozano-Pérez, and Leslie Pack Kaelbling
CoRL 2023
TLDR: Multi-step manipulation problems involve many collision-free, physical-stability, and culture-defined spatial constraints. Conventional methods usually solve them by sampling and rejection, which is too slow. We instead find global solutions by diffusion-based optimization, using diffusion models trained for each constraint type.
Paper | Project Page | Code | Bibtex | Talk | MIT News | Poster
Sequence-Based Plan Feasibility Prediction for Efficient Task and Motion Planning
Zhutian Yang, Caelan Reed Garrett, Leslie Pack Kaelbling, Tomás Lozano-Pérez, and Dieter Fox
RSS 2023
TLDR: In long-horizon mobile manipulation problems in complex environments with many articulated and movable obstacles, task and motion planners spend most of their computation on motion planning problems that are infeasible. So we train a plan feasibility prediction model that quickly sorts candidate plans by their likelihood of success using visual and language features of the problem, which cuts planning time by 50-80%.
🔥 We won Best Paper Runner-Up at the CoRL 2022 Workshop on Learning, Perception, and Abstraction for Long-Horizon Planning
Paper | Project Page | Code | Bibtex | Talk | MIT News | TechCrunch | Poster
Let’s Handle It: Generalizable Manipulation of Articulated Objects
Zhutian Yang and Aidan Curtis
ICLR 2022 Workshop on Generalizable Policy Learning in the Physical World (Spotlight)
🔥 We won 2nd place in the ManiSkill Challenge 2022 Robotics Track
Paper | Poster
Zhutian Yang, Patrick Henry Winston, and David Hsu
Undergraduate thesis work; also appeared in Advances in Cognitive Systems 2019 and DSpace@MIT
Paper | Video Demo | Poster