Controllable Generative Adversarial Network

arXiv:1708.00598v2 [cs.LG] 12 Sep 2017

Minhyeok Lee and Junhee Seok
School of Electrical Engineering, Korea University, 145 Anam-ro, Seongbuk-gu, Seoul 02841, South Korea
jseok14@korea.ac.kr

September 13, 2017

Abstract

Although the generative adversarial network (GAN) was introduced only recently, it has shown many promising results for generating realistic samples. However, generated samples are hard to control because the inputs to the generator are drawn from a random distribution. Several attempts have been made to control the samples generated by a GAN, but they have not performed well on difficult problems. Furthermore, none of them allows the generator to be steered toward one of its two objectives, reality or distinctness. For example, with existing models, a generator for face images cannot be set to concentrate on either of its two goals: generating realistic faces, or generating faces that differ according to the input labels. In this paper, we propose the controllable GAN (CGAN). CGAN controls generated samples effectively; in addition, it can steer the generator toward reality or distinctness. We evaluate CGAN on the CelebA dataset. We believe that CGAN can contribute to research on generative neural network models.

1 Introduction

The generative adversarial network (GAN) is a neural network architecture introduced for generating realistic samples. Although it was introduced only recently, GAN has shown many promising results, not only for generating realistic samples [1, 2, 3] but also for machine translation [4] and image super-resolution [5]. However, generated samples are hard to control because a random distribution serves as the input to the sample generator. While the samples generated by the vanilla GAN are realistic and varied, the relationship between the random input to the generator and the features of the generated samples is not obvious.

In the last few years, there have been several attempts to control the samples generated by a GAN [6, 7, 8, 9, 10]. One of the most popular methods is the conditional GAN [10], which feeds labels into both the generator and the discriminator so that they operate under those conditions. However, the conditional GAN provides no way to make the generator focus on one of its two objectives, i.e., generating realistic samples and making samples differ according to the input labels. For example, the CelebA dataset [11] consists of 202,599 celebrity face images labeled with 40 different attributes, such as 'Wearing Hat' or 'Young'. While the conditional GAN generates realistic samples, it works only with the easy labels, such as 'Smiling' or 'Bangs'.

To make the generator work with difficult labels, such as 'Pointy Nose' or 'Arched Eyebrows', we must steer it to focus more on generating samples that differ according to the input labels.

In this paper, we propose a novel generative architecture for controlling generated samples, called the controllable generative adversarial network (CGAN). CGAN is composed of three players: a generator/decoder, a discriminator, and a classifier/encoder. The generator plays games with the discriminator and the classifier simultaneously; it aims to deceive the discriminator while being classified correctly by the classifier.

CGAN has two main advantages over existing models. First, CGAN can focus on one of the two main objectives of a conditional generative model, reality or distinctness between conditions, by weighting the loss functions against each other. Therefore, by sacrificing reality for distinctness, CGAN can generate face images with the difficult labels. Second, CGAN uses an independent network for mapping features onto the corresponding input labels, whereas in the conditional GAN the discriminator performs this task. Consequently, the discriminator can concentrate on its own objective, discriminating fake samples from original samples, so the reality of the generated samples is enhanced.

We apply CGAN to the CelebA dataset and demonstrate that it can effectively generate face images according to input labels.

2 Materials and Methods

CGAN is composed of three neural networks: a generator/decoder, a discriminator, and a classifier/encoder. Figure 1 illustrates the architecture of CGAN. CGAN conducts a three-player game in which the generator tries to deceive the discriminator, as in the vanilla GAN, while also aiming to be classified into the corresponding class by the classifier. The generator and the classifier can be interpreted as a decoder-encoder structure because the labels serve as inputs to the generator and as outputs of the classifier.

Figure 1: The architecture of CGAN
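To make the decoder-encoder pairing concrete, the following is a minimal PyTorch-style sketch of the generator/decoder and the classifier/encoder. Only the layer counts come from this section (four 5×5 deconvolutional layers in the generator; four convolutional layers plus one fully connected layer in the classifier); the channel widths, ELU activations, noise dimension, and 64×64 image size are our own assumptions, and the BEGAN-style autoencoder discriminator is omitted for brevity.

```python
# A hedged sketch of the CGAN generator (decoder) and classifier
# (encoder). Layer counts follow the text; widths, activations, the
# noise dimension, and the image size are assumptions.
import torch
import torch.nn as nn

Z_DIM, N_LABELS = 64, 40  # 40 binary attributes in CelebA

class Generator(nn.Module):
    """Decoder: maps (z, l) to an image."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(Z_DIM + N_LABELS, 128 * 4 * 4)
        self.deconv = nn.Sequential(  # four 5x5 deconvolutional layers
            nn.ConvTranspose2d(128, 128, 5, stride=2, padding=2, output_padding=1), nn.ELU(),
            nn.ConvTranspose2d(128, 64, 5, stride=2, padding=2, output_padding=1), nn.ELU(),
            nn.ConvTranspose2d(64, 32, 5, stride=2, padding=2, output_padding=1), nn.ELU(),
            nn.ConvTranspose2d(32, 3, 5, stride=2, padding=2, output_padding=1), nn.Tanh(),
        )

    def forward(self, z, l):
        h = self.fc(torch.cat([z, l], dim=1)).view(-1, 128, 4, 4)
        return self.deconv(h)  # (batch, 3, 64, 64)

class Classifier(nn.Module):
    """Encoder: maps an image back to label probabilities."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(  # four convolutional layers
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ELU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ELU(),
            nn.Conv2d(64, 128, 5, stride=2, padding=2), nn.ELU(),
            nn.Conv2d(128, 128, 5, stride=2, padding=2), nn.ELU(),
        )
        self.fc = nn.Linear(128 * 4 * 4, N_LABELS)  # one fully connected layer

    def forward(self, x):
        return torch.sigmoid(self.fc(self.conv(x).flatten(1)))
```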

CGAN minimizes the following objectives:

$$
\begin{aligned}
\theta_D &= \arg\min_{\theta_D}\Big\{\alpha\,L_D\big(t_D,\,D(x;\theta_D)\big)+(1-\alpha)\,L_D\big(1-t_D,\,D(G(z,l;\theta_G);\theta_D)\big)\Big\},\\
\theta_C &= \arg\min_{\theta_C}\Big\{\beta\,L_C\big(l,\,C(x;\theta_C)\big)+(1-\beta)\,L_C\big(1-l,\,C(G(z,l;\theta_G);\theta_C)\big)\Big\},\\
\theta_G &= \arg\min_{\theta_G}\Big\{L_C\big(l,\,C(G(z,l;\theta_G);\theta_C)\big)+\gamma\,L_D\big(t_D,\,D(G(z,l;\theta_G);\theta_D)\big)\Big\},
\end{aligned}
\tag{1}
$$

where l is the binary label vector of sample x, which also serves as input to the generator, and θ_D and θ_C denote the parameters of the discriminator and the classifier, respectively. CGAN forces features to be mapped onto the corresponding label vector l fed into the generator. The parameter γ decides how much the generator focuses on the reality of generated samples.

For the applications in this paper, we used the Boundary Equilibrium GAN (BEGAN) architecture [3], an up-to-date GAN architecture for generating image samples. The generator consists of four deconvolutional layers, with 5×5 filters in each layer. The discriminator consists of four convolutional layers and four deconvolutional layers. The classifier consists of four convolutional layers and one fully connected layer. To demonstrate the effectiveness of the proposed method even under a simple setting, neither dropout nor max-pooling is used. We set α = 0.5, β = 1, and γ = 5.
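The following sketch shows how one training step under Eq. (1) could look. The modules and optimizers are hypothetical placeholders, and L_D and L_C are written as binary cross-entropy for concreteness; the BEGAN discriminator used in the paper actually employs an autoencoder reconstruction loss, which is not reproduced here.

```python
# A hedged sketch of one CGAN training step following Eq. (1).
# G, D, C, and the optimizers are placeholders; D is assumed to output
# a probability of shape (batch, 1), and C label probabilities.
import torch
import torch.nn.functional as F

ALPHA, BETA, GAMMA = 0.5, 1.0, 5.0  # values reported in the paper

def train_step(G, D, C, opt_G, opt_D, opt_C, x, l, z):
    # x: real images, l: binary label vectors (length 40 for CelebA),
    # z: random noise drawn from N(0, 1).
    t_real = torch.ones(x.size(0), 1)   # discriminator target t_D
    t_fake = torch.zeros(x.size(0), 1)  # target 1 - t_D

    # Discriminator: weighted real/fake terms of Eq. (1).
    fake = G(z, l).detach()
    loss_D = ALPHA * F.binary_cross_entropy(D(x), t_real) \
        + (1 - ALPHA) * F.binary_cross_entropy(D(fake), t_fake)
    opt_D.zero_grad(); loss_D.backward(); opt_D.step()

    # Classifier: with beta = 1, only the real-sample term survives.
    loss_C = BETA * F.binary_cross_entropy(C(x), l) \
        + (1 - BETA) * F.binary_cross_entropy(C(fake), 1 - l)
    opt_C.zero_grad(); loss_C.backward(); opt_C.step()

    # Generator: satisfy the classifier and fool the discriminator,
    # with gamma weighting the reality (discriminator) term.
    fake = G(z, l)
    loss_G = F.binary_cross_entropy(C(fake), l) \
        + GAMMA * F.binary_cross_entropy(D(fake), t_real)
    opt_G.zero_grad(); loss_G.backward(); opt_G.step()
    return loss_D.item(), loss_C.item(), loss_G.item()
```

Note that with β = 1, as used in the paper, the classifier trains on real images only, and a lower γ shifts the generator's emphasis from reality toward distinctness, matching the trade-off described above.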

3 Results and Discussion

3.1 Generating multi-label celebrity face images with the CelebA dataset

CGAN can generate multi-label samples by feeding multiple labels into the generator. The CelebA dataset contains celebrity face images with multiple labels per image; for example, a sample can carry the labels 'Attractive', 'Blond Hair', 'Mouth Slightly Open', and 'Smiling'.

First, we generate image samples with only one label; the results are shown in Figure 2. As described above, CGAN can generate multi-label samples by feeding multiple labels into the generator, and Figure 3 shows examples produced by the generator trained on CelebA. In Figure 3, the 'No Label' image is generated from the random input z ~ N(0, 1) alone, with the input label set to a zero vector of length 40, i.e., l = [0, 0, ..., 0]. The '+ Attractive' image is generated from the same z and the binary representation of the 'Attractive' label. In this manner, the last image is generated from the same z and seven different labels.

As described in the previous section, CGAN has an advantage over the conditional GAN for generating label-focused samples. By selecting a low value of γ, we can make the generator focus more on the input labels. Figure 4 compares CGAN with γ = 5 to the conditional GAN.
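As an illustration of the label arithmetic behind Figure 3, the sketch below builds the input vectors step by step. The attribute-to-index mapping and the noise dimension are assumptions for illustration, not taken from the paper.

```python
# A hypothetical sketch of the multi-label inputs in Figure 3: start
# from the zero vector ("No Label") and switch attributes on one at a
# time while reusing the same noise z. Indices are assumed.
import numpy as np

ATTR = {"Attractive": 2, "Blond_Hair": 9, "Smiling": 31}  # assumed indices

z = np.random.normal(0.0, 1.0, size=(1, 64))  # fixed z ~ N(0, 1)
l = np.zeros((1, 40))                         # "No Label": l = [0, ..., 0]
# image_0 = G(z, l)                           # unconditioned face

l[0, ATTR["Attractive"]] = 1                  # "+ Attractive"
# image_1 = G(z, l)                           # same face, new attribute

l[0, ATTR["Blond_Hair"]] = 1                  # "+ Blond Hair", and so on,
# image_2 = G(z, l)                           # up to seven labels at once
```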

Figure 2: Image samples generated from the CelebA dataset using a single label

Figure 3: Face images generated with multiple labels by CGAN

As shown in Figure 4, the face images generated by CGAN follow the input labels more strongly than those generated by the conditional GAN. For example, with the 'Arched Eyebrows' label, all samples generated by CGAN clearly follow the label, whereas some faces generated by the conditional GAN do not.

Figure 4: Face images generated by CGAN with different input labels

4 Conclusion

In this paper, we proposed a generative model, called CGAN, which can control generated samples. CGAN consists of three modules: a generator/decoder, a discriminator, and a classifier/encoder. By mapping features onto the corresponding input labels, generated samples can be controlled effectively. Although CGAN is a simple architecture, a combination of the vanilla GAN and a decoder-encoder structure, we demonstrated that it can generate face images according to labels. Since the proposed architecture controls generated samples effectively, we expect that CGAN can significantly advance research on generative models.

References

[1] H. Zhang, T. Xu, H. Li, S. Zhang, X. Huang, X. Wang, and D. Metaxas, "StackGAN: Text to photo-realistic image synthesis with stacked generative adversarial networks," arXiv preprint arXiv:1612.03242, 2016.

[2] J. Zhao, M. Mathieu, and Y. LeCun, "Energy-based generative adversarial network," arXiv preprint arXiv:1609.03126, 2016.

[3] D. Berthelot, T. Schumm, and L. Metz, "BEGAN: Boundary equilibrium generative adversarial networks," arXiv preprint arXiv:1703.10717, 2017.

[4] Z. Yang, W. Chen, F. Wang, and B. Xu, "Improving neural machine translation with conditional sequence generative adversarial nets," arXiv preprint arXiv:1703.04887, 2017.

[5] C. Ledig, L. Theis, F. Huszár, J. Caballero, A. Cunningham, A. Acosta, A. Aitken, A. Tejani, J. Totz, and Z. Wang, "Photo-realistic single image super-resolution using a generative adversarial network," arXiv preprint arXiv:1609.04802, 2016.

[6] X. Chen, Y. Duan, R. Houthooft, J. Schulman, I. Sutskever, and P. Abbeel, "InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets," in Advances in Neural Information Processing Systems, pp. 2172-2180, 2016.

[7] A. van den Oord, N. Kalchbrenner, L. Espeholt, O. Vinyals, and A. Graves, "Conditional image generation with PixelCNN decoders," in Advances in Neural Information Processing Systems, pp. 4790-4798, 2016.

[8] X. Yan, J. Yang, K. Sohn, and H. Lee, "Attribute2Image: Conditional image generation from visual attributes," in European Conference on Computer Vision, pp. 776-791, Springer, 2016.

[9] G. Antipov, M. Baccouche, and J.-L. Dugelay, "Face aging with conditional generative adversarial networks," arXiv preprint arXiv:1702.01983, 2017.

[10] M. Mirza and S. Osindero, "Conditional generative adversarial nets," arXiv preprint arXiv:1411.1784, 2014.

[11] Z. Liu, P. Luo, X. Wang, and X. Tang, "Deep learning face attributes in the wild," in Proceedings of the IEEE International Conference on Computer Vision, pp. 3730-3738, 2015.