245 — Progressive Growing of GANs for Improved Quality, Stability, and Variation

Karras et al. (arXiv:1710.10196)

Read on 22 April 2018
#machine-learning  #deep-learning  #neural-network  #GAN  #adversarialism  #CelebA  #generative  #tensorflow  #super-resolution  #computer-vision  #resolution  #image-synthesis  #group:nvidia 

I’ve spoken before about the idea of improving resolution gradually over the course of image synthesis (#208, #190, #193), both to make the process more reproducible and to stabilize the generative model (GAN training is notoriously unstable, and generator nets often struggle to settle on a final “answer”).

In this paper, researchers at NVIDIA propose a progressive-growing technique for improving the quality and stability of GAN-based image generation. Training starts with a generator and a discriminator that operate only at very low resolution (4×4 pixels), so the pair first learns the coarse, large-scale structure of the images. New layers are then added to both networks, doubling the resolution each time and filling in the finer detail required to fool the discriminator again. Each new layer is faded in smoothly rather than switched on all at once, so it does not disrupt what the lower-resolution layers have already learned. This process is repeated until the desired resolution is reached (1024×1024 for the CelebA-HQ faces in the paper); in principle it can be scaled further, limited only by the resolution of the dataset.
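To make the growing step concrete, here is a minimal NumPy sketch of the fade-in the paper describes: while a new, higher-resolution stage is being introduced, its output is blended with an upsampled copy of the previous stage’s output, and the blend weight alpha ramps from 0 to 1. The function names (`upsample_nearest`, `fade_in`, `resolution_schedule`) are my own, and the arrays stand in for the RGB outputs of real network layers; this only illustrates the blending arithmetic, not the authors’ implementation.

```python
import numpy as np


def upsample_nearest(img, factor=2):
    """Nearest-neighbour upsampling of an (H, W, C) array."""
    return img.repeat(factor, axis=0).repeat(factor, axis=1)


def fade_in(low_res_rgb, high_res_rgb, alpha):
    """Blend the old stage's upsampled output with the new stage's output.

    alpha = 0 -> only the old, lower-resolution stage contributes
    alpha = 1 -> only the newly added stage contributes
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return (1.0 - alpha) * upsample_nearest(low_res_rgb) + alpha * high_res_rgb


def resolution_schedule(start=4, final=1024):
    """Resolutions visited while growing: each stage doubles the last."""
    resolutions = []
    res = start
    while res <= final:
        resolutions.append(res)
        res *= 2
    return resolutions


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    old_stage = rng.random((4, 4, 3))   # stand-in for the 4x4 stage's RGB output
    new_stage = rng.random((8, 8, 3))   # stand-in for the new 8x8 stage's RGB output
    for alpha in np.linspace(0.0, 1.0, 5):
        blended = fade_in(old_stage, new_stage, float(alpha))
        print(f"alpha={alpha:.2f} -> blended shape {blended.shape}")
    print(resolution_schedule())
```

In a real training loop, alpha would be driven by how many images have been shown since the new layers were added, and the discriminator would get the mirror-image treatment (downsampling instead of upsampling).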

This work is important because it demonstrates that an image built up through successive stages of adversarialism, from coarse to fine, is superior to one produced by a network trained at full resolution in a single “pass.”

The researchers also propose a technique for evaluating GAN results (conventional methods are inadequate to describe the results, which is why we made a GAN in the first place). This technique uses the sliced Wasserstein distance between local patches drawn from the Laplacian pyramids of generated images and those of the original dataset. Because each pyramid level isolates a different band of spatial detail, the metric reacts to small variations in color and texture as well as to larger-scale shortcomings of the generated imagery.
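A rough sketch of how such a metric can be computed, assuming grayscale images with even dimensions. It is simplified in ways I should flag: the pyramid here uses 2×2 average pooling instead of a proper Gaussian filter, the Wasserstein distance is approximated by sorting random one-dimensional projections (the “sliced” variant), and every function name and default parameter is hypothetical rather than taken from the authors’ code.

```python
import numpy as np


def downsample(img):
    """2x2 average pooling of an (H, W) array with even H and W."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def upsample(img):
    """Nearest-neighbour 2x upsampling of an (H, W) array."""
    return img.repeat(2, axis=0).repeat(2, axis=1)


def laplacian_pyramid(img, levels):
    """Return [finest band, ..., coarsest residual] for one image."""
    pyramid = []
    current = img
    for _ in range(levels - 1):
        smaller = downsample(current)
        # Detail lost by downsampling: this is one band of the pyramid.
        pyramid.append(current - upsample(smaller))
        current = smaller
    pyramid.append(current)  # low-frequency residual
    return pyramid


def random_patches(level, n, size, rng):
    """Sample n flattened size x size patches from one pyramid level."""
    h, w = level.shape
    ys = rng.integers(0, h - size + 1, size=n)
    xs = rng.integers(0, w - size + 1, size=n)
    return np.stack([level[y:y + size, x:x + size].ravel()
                     for y, x in zip(ys, xs)])


def sliced_wasserstein(a, b, n_projections=64, seed=0):
    """Approximate Wasserstein-1 distance between two (N, D) point sets.

    Both sets must contain the same number of points N.
    """
    rng = np.random.default_rng(seed)
    directions = rng.normal(size=(a.shape[1], n_projections))
    directions /= np.linalg.norm(directions, axis=0, keepdims=True)
    # In 1-D, optimal transport simply matches sorted samples to each other.
    proj_a = np.sort(a @ directions, axis=0)
    proj_b = np.sort(b @ directions, axis=0)
    return float(np.mean(np.abs(proj_a - proj_b)))


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    real = rng.random((64, 64))            # stand-in for a training image
    fake = rng.random((64, 64)) * 0.5      # stand-in for a generated image
    for lvl, (r, f) in enumerate(zip(laplacian_pyramid(real, 3),
                                     laplacian_pyramid(fake, 3))):
        d = sliced_wasserstein(random_patches(r, 128, 7, rng),
                               random_patches(f, 128, 7, rng))
        print(f"pyramid level {lvl}: distance {d:.4f}")
```

Reporting one distance per pyramid level is what lets the metric separate fine-texture mismatches (finest level) from structural ones (coarsest level).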

The code is available on GitHub here.