Underwater 3D Visual Perception and Generation - Robotics Institute Carnegie Mellon University

Underwater 3D Visual Perception and Generation

PhD Thesis, Tech. Report, CMU-RI-TR-25-82, August 2025

Abstract

With modern robotic technologies, seafloor imagery has become more accessible to researchers and the public. This dissertation leverages deep learning and 3D vision techniques to deliver valuable information from seafloor image observations collected by robotic platforms.

Despite the widespread use of deep learning and 3D vision algorithms across many fields, underwater imaging presents unique challenges, such as a lack of annotations, color distortion, and inconsistent illumination, which limit the effectiveness of off-the-shelf algorithms. This dissertation tackles the fundamental problem of building 3D representations from raw underwater images that are heavily affected by light sources and medium interference. The following algorithms are developed to achieve seafloor 3D reconstruction with photorealistic quality: (i) unsupervised underwater caustic removal with recurrent 3D Gaussian Splatting; (ii) deep-water true color restoration with neural reflectance fields; (iii) camera-light source calibration for robotic platforms; and (iv) dark environment relighting with 3D Gaussian Splatting. Using the large amount of seafloor data collected by robots as training data, this dissertation further investigates the use of deep generative models to generate large-scale underwater terrains with natural spatial variance in appearance. The synthesized terrain can be integrated with the learned underwater lighting effects to produce realistic novel-view renderings.

This dissertation shows examples of how 3D computer vision and deep generative models can be combined with physical laws, statistical principles, and foundation models to address the unique challenges in underwater robotic perception. Collectively, these contributions lay the groundwork for reconstructing high-fidelity underwater scenes that help people better understand benthic ecosystems, and for generating simulation environments that help close the sim-to-real gap in underwater robot perception.

BibTeX

@phdthesis{Zhang-2025-148289,
author = {Tianyi Zhang},
title = {Underwater 3D Visual Perception and Generation},
year = {2025},
month = {August},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-25-82},
keywords = {Marine Robots; Computer Vision; 3D Reconstruction},
}