Algorithms and Architectures for Improving Learning with Optimization Layers

SWAMINATHAN GURUMURTHY

PhD Thesis, Tech. Report CMU-RI-TR-25-90, September 2025


Abstract

Many real-world problems, from robotic control to resource management, can be effectively formulated as optimization problems. Recent advances have focused on incorporating these optimization problems as layers within deep learning pipelines, enabling the explicit inclusion of auxiliary constraints or cost functions. This is crucial for applications that require enforcing physical laws, ensuring safety constraints, and optimizing complex objectives. However, these layers introduce several challenges, including inference inefficiencies, unstable training dynamics, modeling inaccuracies, and representational inefficiencies, which must be addressed to fully harness their potential. We systematically investigate these challenges and propose novel numerical methods and architectural solutions that mitigate them, making optimization layers more efficient and effective within deep learning pipelines. Our contributions include methods for enhancing computational efficiency by exploiting the iterative nature of optimization problems; tackling gradient bias and variance in high-dimensional problems by exploiting parallelism and network-learned priors about the system; improving sample efficiency in reinforcement learning using approximate simulators; and mitigating the representational problems that arise with complicated constrained optimization layers by creating a tight feedback loop between the optimizer state and the network outputs. We demonstrate these contributions across a range of applications, including input-optimization problems, 3D pose estimation and reconstruction, differentiable model predictive control, and reinforcement learning. We also present a new approach for visual-inertial navigation in nanosatellites, highlighting the practical benefits of integrating optimization layers in challenging real-world scenarios.
Together, these contributions advance our understanding of the complexities and opportunities in integrating optimization layers within deep learning models, offering new frameworks and insights that improve efficiency, stability, and generalizability across a wide range of complex tasks.

BibTeX

@phdthesis{GURUMURTHY-2025-148956,
author = {SWAMINATHAN GURUMURTHY},
title = {Algorithms and Architectures for Improving Learning with Optimization Layers},
year = {2025},
month = {September},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-25-90},
keywords = {Equilibrium Models, Optimization, control, SLAM, visual odometry},
}

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.