Spatially-Adaptive Multilayer Selection for GAN Inversion and Editing

Master's Thesis, Tech. Report, CMU-RI-TR-22-26, Robotics Institute, Carnegie Mellon University, May, 2022

Abstract

Existing GAN inversion and editing methods are well suited only for
images that contain aligned objects with a clean background, such as portraits and animal faces, but often struggle on more difficult categories with
complex scene layouts and object occlusions, such as cars, animals, and outdoor images. We propose a new method to invert and edit such complex
images in the latent space of GANs, such as StyleGAN2. Our key idea is to
explore inversion with a collection of layers, spatially adapting the inversion
process to the difficulty of the image. We learn to predict the “invertibility”
of different image segments and project each segment into a latent layer.
Easier regions can be inverted into an earlier layer in the generator’s latent
space, while more challenging regions can be inverted into a later feature
space. Experiments show that our method obtains better inversion results
than recent approaches on complex categories, while maintaining downstream editability.
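
The per-segment layer selection described above can be illustrated with a minimal sketch. It is not the thesis implementation: the function name `select_layers`, the `pred_error` array (a hypothetical per-layer prediction of how poorly each pixel would be inverted), the `segments` map, and the fixed `threshold` are all assumptions made for illustration. Each segment is assigned the earliest layer whose mean predicted error falls under the threshold, and falls back to the last layer otherwise.

```python
import numpy as np


def select_layers(pred_error, segments, threshold):
    """Pick one latent layer index per segment (illustrative sketch only).

    pred_error: (num_layers, H, W) floats; hypothetical predicted inversion
                error for each candidate layer, ordered early to late.
    segments:   (H, W) ints; segment id of every pixel.
    threshold:  assumed error budget a segment must meet to use a layer.
    """
    num_layers = pred_error.shape[0]
    layer_map = np.empty(segments.shape, dtype=np.int64)
    for seg_id in np.unique(segments):
        mask = segments == seg_id
        # Mean predicted error of this segment under each layer.
        mean_err = pred_error[:, mask].mean(axis=1)
        ok = np.flatnonzero(mean_err <= threshold)
        # Earliest acceptable layer, else fall back to the last layer.
        layer_map[mask] = ok[0] if ok.size else num_layers - 1
    return layer_map


# Toy 2x2 image with two segments and two candidate layers.
pred_error = np.array([
    [[0.10, 0.90], [0.10, 0.90]],  # early layer: left easy, right hard
    [[0.05, 0.20], [0.05, 0.20]],  # later layer: both regions fit well
])
segments = np.array([[0, 1], [0, 1]])
layer_map = select_layers(pred_error, segments, threshold=0.3)
print(layer_map.tolist())  # [[0, 1], [0, 1]]
```

In this toy case the easy left segment stays in the earlier layer, while the harder right segment is moved to the later one, mirroring the easy-versus-challenging split in the abstract.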

BibTeX

@mastersthesis{Parmar-2022-131720,
author = {Gaurav Parmar},
title = {Spatially-Adaptive Multilayer Selection for GAN Inversion and Editing},
year = {2022},
month = {May},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-22-26},
}

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.