Integrating Reinforcement Learning and Model Predictive Control for Autonomous Off-road Driving - Robotics Institute Carnegie Mellon University

PhD Speaking Qualifier

Anoushka Alavilli, PhD Student, Robotics Institute, Carnegie Mellon University
Wednesday, April 29
12:00 pm to 1:30 pm
Gates Hillman Center 6115
Integrating Reinforcement Learning and Model Predictive Control for Autonomous Off-road Driving
Abstract: Safe and effective autonomous traversal of off-road terrain is challenging because of both terrain properties, such as low traction in sand or the deformability of mud, and terrain geometries, including steep slopes, ditches, and uneven surfaces that can induce unsafe vehicle behaviors like excessive pitch and roll. Model Predictive Path Integral (MPPI) control provides a powerful framework for solving Model Predictive Control (MPC) problems and has demonstrated strong performance in off-road and agile locomotion tasks. Key to MPPI’s success are its parallelizable open-loop dynamics rollouts, whose costs are generally computed from a learned or predefined cost map of the terrain. While MPPI can be implemented in a purely physics-based formulation, there is growing interest in integrating data-driven methods to improve performance and adaptability.
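For readers unfamiliar with MPPI, the sampling-based update described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the speaker's implementation: the function names (`mppi_update`, `toy_dynamics`, `toy_cost`), the point-mass dynamics, the squared-distance cost, and all parameter values are hypothetical stand-ins for a real vehicle model and terrain cost map.

```python
import numpy as np

def mppi_update(x0, nominal, dynamics, step_cost,
                num_samples=1024, sigma=0.5, temperature=1.0, seed=0):
    """One MPPI iteration: sample, roll out open-loop, reweight, average."""
    rng = np.random.default_rng(seed)
    horizon, control_dim = nominal.shape

    # Perturb the nominal control sequence with Gaussian noise.
    noise = rng.normal(0.0, sigma, size=(num_samples, horizon, control_dim))
    controls = nominal[None, :, :] + noise

    # Open-loop rollouts, vectorized over samples (the parallelizable part).
    states = np.repeat(x0[None, :], num_samples, axis=0)
    costs = np.zeros(num_samples)
    for t in range(horizon):
        states = dynamics(states, controls[:, t, :])
        costs += step_cost(states, controls[:, t, :])

    # Exponentially weight low-cost rollouts; subtracting the minimum
    # keeps the exponentials numerically stable.
    weights = np.exp(-(costs - costs.min()) / temperature)
    weights /= weights.sum()

    # New nominal = old nominal + cost-weighted average perturbation.
    return nominal + np.einsum("k,ktu->tu", weights, noise)

# Toy problem (assumption): a 2D point mass driven toward a fixed goal.
GOAL = np.array([1.0, 0.0])

def toy_dynamics(states, controls):
    return states + 0.1 * controls

def toy_cost(states, controls):
    # Stand-in for a cost-map lookup: squared distance to the goal.
    return np.sum((states - GOAL) ** 2, axis=-1)

def sequence_cost(x0, controls):
    """Total cost of executing one control sequence from x0."""
    state, total = x0[None, :], 0.0
    for u in controls:
        state = toy_dynamics(state, u[None, :])
        total += float(toy_cost(state, u[None, :])[0])
    return total

x0 = np.zeros(2)
nominal = np.zeros((10, 2))
updated = mppi_update(x0, nominal, toy_dynamics, toy_cost)
```

In practice the rollouts typically run on a GPU and the update is repeated at every control step in receding-horizon fashion; the sketch shows a single update only.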

In this talk, we present Value Function-Guided MPPI, a hierarchical framework that integrates reinforcement learning (RL) with MPPI. Rather than relying on an explicit cost map, the planner uses a learned RL value function as an implicit, execution-aware objective, allowing trajectories to be evaluated based on the expected performance of the controller. MPPI performs trajectory exploration, while a pretrained RL policy executes the selected plans, creating a bidirectional feedback loop between planning and control. This removes the need for manual cost design and better aligns planning with execution dynamics. We evaluate the method in simulation and real-world field tests on a full-size Yamaha ATV, showing improved safety and goal-reaching performance in challenging terrain, and discuss sim-to-real challenges encountered.
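The idea of scoring candidate plans with a learned value function instead of a cost map can be illustrated as follows. This is a hedged sketch only: the abstract does not specify the value function's inputs or how its outputs enter the MPPI weighting, so `toy_value_fn` (a hand-written stand-in for a trained critic), `value_guided_weights`, the random-walk plans, and the goal location are all assumptions made for illustration.

```python
import numpy as np

def value_guided_weights(values, temperature=1.0):
    """Turn per-plan value estimates into normalized sampling weights.

    Higher value means higher weight, mirroring MPPI's exponential
    weighting with value in place of negative cost.
    """
    values = np.asarray(values, dtype=float)
    shifted = (values - values.max()) / temperature  # stable softmax
    weights = np.exp(shifted)
    return weights / weights.sum()

def toy_value_fn(plans, goal):
    # Hypothetical stand-in for a learned critic: plans that end closer
    # to the goal are assumed to be executed more successfully.
    return -np.linalg.norm(plans[:, -1, :] - goal, axis=1)

rng = np.random.default_rng(0)
goal = np.array([5.0, 0.0])

# 64 candidate plans, each a sequence of 8 two-dimensional waypoints.
plans = np.cumsum(rng.normal(0.0, 1.0, size=(64, 8, 2)), axis=1)

values = toy_value_fn(plans, goal)
weights = value_guided_weights(values)

# Value-weighted blend of the candidates, analogous to the MPPI update.
blended_plan = np.einsum("k,ktd->td", weights, plans)
```

With a real critic, `toy_value_fn` would be replaced by a forward pass of the pretrained RL value network on the current observation and the candidate plan.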

Finally, we outline ongoing and future work that draws upon a growing body of literature reframing MPC itself as a learning problem, using RL to optimize parameters of the MPC formulation. Across the approaches discussed in this talk, we highlight key tradeoffs and design choices when integrating RL and MPC, such as sim-to-real transfer, robustness, and adaptability, thereby informing how algorithms should allocate responsibility between learning and control in real-world autonomous systems.

 
Committee:
Jeff Schneider (co-chair)
Guanya Shi (co-chair)
Wenshan Wang
Sam Triest