Foundation Control Model for General Embodied Intelligence
Abstract
With the growing accessibility of humanoid hardware and rapid advances in foundation models, we are entering an era in which general embodied intelligence is within reach: humanoid robots that can perform a wide range of tasks in human-centric environments. Despite significant progress in language and vision foundation models, controlling humanoids with high degrees of freedom to perform agile, dexterous, and versatile tasks remains a challenge. In this thesis, we explore pathways toward that goal. We first design a task representation framework aimed at scalable and general humanoid whole-body control. Building on this foundation, we propose a systematic real-to-sim-to-real method for achieving agile humanoid control. Finally, we introduce a universal dynamics learning framework for general-purpose robots and demonstrate its effectiveness on various wheeled platforms, validating the feasibility of learning a world model for humanoids in the future.
BibTeX
@mastersthesis{Xiao-2025-146534,
author = {Wenli Xiao},
title = {Foundation Control Model for General Embodied Intelligence},
year = {2025},
month = {April},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-25-37},
keywords = {Robot Learning, Humanoid Robot},
}