10:00 am to 12:00 pm
Newell-Simon Hall 3305
Time: 10:00 a.m. (ET)
Location: NSH 3305
Zoom: Link
Type: Ph.D. Thesis Defense
Who: Brian Yang
Title: A Language-Guided Generative Data Engine for Autonomous Driving
This thesis proposes a paradigm for scaling driving policies that complements learning from logged demonstrations by using language supervision and generative models as a unified data engine. Instead of relying on demonstrations alone, we use vision-language models to infer structured constraints and reward functions from high-level specifications of desired behavior. These programmatic rewards then guide trajectory diffusion models, enabling the generation of diverse, physically consistent, and behaviorally targeted driving trajectories.
We first show that combining pretrained diffusion policies with reward guidance enables zero-shot instruction following, without requiring language-annotated driving data. We then demonstrate how guided diffusion can be used as a scalable offline data generator, producing counterfactual trajectories that expand coverage of rare and safety-critical scenarios. Policies distilled from this synthetic data outperform those trained purely on logged datasets and no longer require test-time guidance. Finally, we close the loop by learning reward programs automatically via vision-language feedback, enabling iterative improvement of both the data generator and the policy.
Together, this work presents a scalable framework in which language specifies what should happen, and generative models produce how it happens, yielding a practical path toward robust driving policies without exhaustive real-world data collection or high-fidelity simulation.
