Off-road Autonomous Driving via Guided Reinforcement Learning
Abstract
Off-road autonomous driving presents a complex set of challenges, including navigation through unmapped environments, variable terrain geometries, and uncertain, non-stationary dynamics. These conditions demand planning and control strategies that are both long-horizon and adaptable. Traditional Model Predictive Control (MPC) methods rely on dense sampling and precise dynamics modeling, which limits their feasibility for real-time planning in unstructured terrains. In contrast, Reinforcement Learning (RL) approaches offer fast execution but suffer from poor exploration efficiency, particularly in obstacle-dense and dynamically diverse settings.
This thesis proposes a hierarchical autonomy framework that integrates a low-frequency, long-horizon planner with a high-frequency, reactive RL-based controller. To overcome the exploration limitations of RL, the thesis introduces a novel teacher-student training paradigm. A teacher policy, trained off-policy using expert trajectories or heuristics, guides the learning process of a student policy trained on-policy. The thesis further extends the Proximal Policy Optimization (PPO) algorithm with a new hybrid policy gradient formulation that effectively leverages off-policy guidance alongside stable on-policy updates.
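The abstract does not give the exact form of the hybrid policy gradient, so the following is only a minimal sketch of one common way to combine on-policy PPO updates with teacher guidance, not the thesis's formulation. It pairs the standard PPO clipped surrogate with an imitation penalty on the teacher's actions. The function names, the batch keys, and `guidance_weight` are hypothetical and chosen for illustration.

```python
import math


def ppo_clip_term(logp_new, logp_old, advantage, clip_eps=0.2):
    """Standard PPO clipped surrogate for a single sample (to be maximized)."""
    ratio = math.exp(logp_new - logp_old)  # importance ratio pi_new / pi_old
    clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    return min(ratio * advantage, clipped * advantage)


def guided_loss(batch, guidance_weight=0.5, clip_eps=0.2):
    """Hypothetical hybrid loss (to be minimized).

    Assumed batch item keys (not taken from the thesis):
      logp_new, logp_old, advantage : on-policy quantities for the student
      logp_teacher_action           : student's log-probability of the
                                      action the teacher would have taken
    """
    n = len(batch)
    # On-policy part: mean clipped surrogate over the batch.
    surrogate = sum(
        ppo_clip_term(b["logp_new"], b["logp_old"], b["advantage"], clip_eps)
        for b in batch
    ) / n
    # Guidance part: negative log-likelihood of the teacher's actions.
    imitation = -sum(b["logp_teacher_action"] for b in batch) / n
    return -surrogate + guidance_weight * imitation
```

In a sketch like this, `guidance_weight` would typically be annealed toward zero so that the student relies on the teacher early in training and on its own on-policy experience later; whether the thesis does this is not stated in the abstract.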
The proposed approach is validated in a realistic off-road simulation environment and benchmarked against standard RL and imitation learning baselines, showing improved terrain traversal and obstacle avoidance. Additionally, the trained policy is deployed on Sabrecat, a full-scale autonomous off-road ground vehicle. Experimental results demonstrate successful real-time execution, robust obstacle avoidance, and generalization to novel, complex terrains. This thesis contributes a practical and scalable solution to long-horizon off-road autonomy by combining hierarchical planning and guided reinforcement learning.
BibTeX
@mastersthesis{Mundheda-2025-148198,
author = {Vedant Mundheda},
title = {Off-road Autonomous Driving via Guided Reinforcement Learning},
year = {2025},
month = {July},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-25-78},
}