Nonrigid deformation modeling and estimation from images is a technically challenging task due to its nonlinear, nonconvex and high-dimensional nature. Traditional optimization procedures often rely on good initializations and give locally optimal solutions. On the other hand, learning-based methods that directly model the relationship between deformed images and their parameters either cannot handle complicated forms of mapping, or suffer from the Nyquist Limit and the curse of dimensionality due to high degrees of freedom in the deformation space. In particular, to achieve a worst-case guarantee of ε error for a deformation with d degrees of freedom, the sample complexity required is O(1/ε^d).
In this thesis, a generative model for deformation is established and analyzed using a unified theoretical framework. Based on the framework, three algorithms, Data-Driven Descent, Top-down Hierarchical Prediction and Bottom-up Hierarchical Prediction, are designed and constructed to solve the generative model. Under Lipschitz conditions that rule out unsolvable cases (e.g., deformation of a blank image), all algorithms achieve globally optimal solutions to the specific generative model. The sample complexity of these methods is substantially lower than that of learning-based approaches, which are agnostic to deformation modeling.
To achieve global optimality guarantees with lower sample complexity, the structure embedded in the deformation model is exploited. In particular, Data-Driven Descent relates two deformed images that are far apart in the parameter space through the compositional structure of deformation and reduces the sample complexity to O(C^d log(1/ε)). Top-down Hierarchical Prediction factorizes the local deformation into patches once the global deformation has been estimated approximately and further reduces the sample complexity to O(C_1^d + C_2 log(1/ε)). Finally, Bottom-up Hierarchical Prediction builds representations that are invariant to local deformation. With these representations, the global deformation can be estimated independently of local deformation, reducing the sample complexity to O((C/ε)^d_0) with d_0 << d. From the analysis, this thesis shows the connections between approaches that are traditionally considered to be of very different nature. New theoretical conjectures on approaches like Deep Learning are also provided.
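As a rough illustration of how these bounds compare, the following Python sketch evaluates each expression numerically. It is not part of the thesis: the function names are hypothetical, the hidden big-O constants are ignored, and the values chosen for ε, d, d_0, C, C_1 and C_2 are arbitrary assumptions, so only the relative growth is meaningful.

```python
import math


def agnostic_learning(eps, d):
    """O(1/eps^d): worst-case bound for deformation-agnostic learning."""
    return (1.0 / eps) ** d


def data_driven_descent(eps, d, C):
    """O(C^d log(1/eps)): bound quoted for Data-Driven Descent."""
    return C ** d * math.log(1.0 / eps)


def top_down(eps, d, C1, C2):
    """O(C_1^d + C_2 log(1/eps)): bound quoted for Top-down Hierarchical Prediction."""
    return C1 ** d + C2 * math.log(1.0 / eps)


def bottom_up(eps, d0, C):
    """O((C/eps)^d_0): bound quoted for Bottom-up Hierarchical Prediction."""
    return (C / eps) ** d0


if __name__ == "__main__":
    # All values below are illustrative assumptions, not taken from the thesis.
    eps, d, d0 = 0.01, 6, 2
    C = C1 = C2 = 4.0

    rows = [
        ("agnostic learning", agnostic_learning(eps, d)),
        ("Data-Driven Descent", data_driven_descent(eps, d, C)),
        ("Top-down prediction", top_down(eps, d, C1, C2)),
        ("Bottom-up prediction", bottom_up(eps, d0, C)),
    ]
    for name, value in rows:
        print(f"{name:<22s} {value:.3e}")
```

With constants of this kind, the three structured bounds fall several orders of magnitude below the agnostic bound. The ordering among the three depends entirely on the assumed constants, so it should not be read as a claim about the algorithms themselves.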
In practice, the proposed approaches have been applied to a broad range of problems, including the estimation of water distortion, air turbulence, cloth deformation and human pose, with state-of-the-art results. Some approaches even achieve near real-time performance. Finally, application-dependent physics-based models are built and show good performance in document rectification and scene depth recovery in turbulent media.