This thesis addresses this challenge through deep abstraction learning, where robots discover relational abstractions grounded by deep neural networks to construct neuro-symbolic world models from high-dimensional and noisy observations. By ignoring task-irrelevant details, abstractions can be learned efficiently from limited experience while enabling abstract planning for long-horizon decision-making problems involving many interacting objects. The central hypothesis is that abstractions and world models should co-evolve: abstractions enable efficient reasoning and planning, while planning structure and execution failures drive the discovery of richer abstractions.
The proposed framework consists of three key components. Deep state abstractions, represented as relational predicates, map high-dimensional observations into symbolic concepts that support reasoning and planning under noisy sensory inputs. Deep action abstractions, represented as relational option policies, capture reusable behaviors that enable robots to recover from failures and progressively acquire new abstractions from interaction. Neuro-symbolic world models describe how action abstractions transform state abstractions, enabling abstract planning that generalizes to unseen long-horizon tasks. To study these challenges, this thesis introduces benchmark suites that reveal the limitations of purely neural approaches and motivate abstraction-based world modeling.
Building on these foundations, this thesis proposes two future directions. The first studies how pre-trained coding agents can synthesize expressive programmatic world models that support planning with recomposable tools such as abstractions and perception models. The second, RoboSymphony, studies multi-agent decision-making in which robots learn abstractions over the intentions and behaviors of other agents and humans, enabling coordination in collaborative tasks.
Together, these contributions advance a unified neuro-symbolic approach for robots that learn and plan with deep abstractions, enabling efficient specialization and decision-making in open-ended real-world environments.
Thesis Committee:
Sebastian Scherer (Chair)
Maxim Likhachev
Katia Sycara
Tom Silver, Princeton University
Leslie P. Kaelbling, Massachusetts Institute of Technology
