Abstract:
Robots are increasingly deployed to automate tasks that are dangerous or mundane for humans, such as search and rescue, mapping, and inspection in difficult environments. They rely on their perception stack, typically composed of complementary sensing modalities, to estimate their own state and the state of the environment, which in turn enables informed decision-making.
This thesis proposes algorithms that address various aspects of multimodal measurement fusion within differentiable rendering frameworks and probabilistic inference-based state estimation. The methods are evaluated across diverse applications, including robotic manipulation, state estimation, and 3D reconstruction.
In the first part of the thesis, we develop InCOpt, a constrained inference algorithm that solves nonlinear least-squares problems online and incrementally while integrating both hard and soft constraints. We demonstrate that our solver improves accuracy without sacrificing real-time performance across different robotics applications. Next, we address the problem of learning relative error covariances in a multi-sensory state estimation setup. Specifically, we introduce a gradient-based method that estimates well-conditioned covariance matrices by casting the learning process as a constrained bilevel optimization problem, eliminating the need for manual sensor calibration and further improving tracking accuracy.
The second part of the thesis shifts focus towards differentiable rendering. First, we introduce NeuSIS, the first physics-based volumetric neural renderer suitable for dense 3D acoustic reconstruction, and then AONeuS, a neural rendering framework for multimodal acoustic-optical sensor fusion focused on 3D reconstruction from small-baseline sensor trajectories. Finally, we contribute to Z-Splat, which uses a Gaussian representation to significantly speed up acoustic-optical 3D reconstruction.
In our proposed work, we continue our focus on differentiable rendering. First, we revisit adaptive control in Gaussian splatting through the lens of Markov chain Monte Carlo (MCMC) to develop improved control strategies for both general Gaussian splatting and Z-splatting. Second, we explore the method of Fast Dipole Sums, a novel point- and ray-tracing-based 3D reconstruction algorithm, and its applicability to reconstruction problems involving wave-based sensors.
Thesis Committee Members:
Michael Kaess (Chair)
Ioannis Gkioulekas
Shubham Tulsiani
Nikolay Atanasov (University of California – San Diego)
