Getting Optimization layers to play well with Deep Networks: Numerical methods and Architectures

PhD Thesis Defense

Swaminathan Gurumurthy, PhD Student, Robotics Institute, Carnegie Mellon University
Tuesday, September 16
2:00 pm to 4:00 pm
Newell-Simon Hall 4305
Abstract
Many real-world challenges, from robotic control to resource management, can be formulated as optimization problems. Recent work has focused on incorporating these optimization problems as layers within deep learning pipelines, which allows auxiliary constraints or cost functions to be included explicitly. This is crucial for applications such as enforcing physical laws, ensuring safety constraints, and optimizing complex objectives. However, these layers introduce several challenges, including inefficient inference, unstable training dynamics, modeling inaccuracies, and representational inefficiencies, which must be addressed to realize their full potential.

We systematically investigate these challenges and propose novel numerical methods and architectural solutions that mitigate them, making optimization layers more efficient and effective within deep learning pipelines. Our contributions include methods for: improving computational efficiency by exploiting the iterative nature of optimization problems; reducing gradient bias and variance in high-dimensional problems by exploiting parallelism and network-learned priors about the system; improving sample efficiency in reinforcement learning using approximate simulators; and mitigating the representational problems that arise with complicated constrained optimization layers by creating a tight feedback loop between the optimizer state and the network outputs, in domains such as robotic control and mechanism design with LLMs. We demonstrate these contributions across applications ranging from input-optimization problems and 3D pose estimation and reconstruction to differentiable model predictive control and reinforcement learning. We also present a new approach for visual-inertial navigation in nanosatellites, highlighting the practical benefits of integrating optimization layers in challenging real-world scenarios.

Together, these contributions advance our understanding of the complexities and opportunities in integrating optimization layers within deep learning models, offering new frameworks and insights that improve efficiency, stability, and generalizability across a wide range of complex tasks.

Thesis Committee Members:
Zico Kolter, Co-chair
Zac Manchester, Co-chair
Geoffrey Gordon
Max Simchowitz
Vladlen Koltun, Apple