Lab meeting (Paper review session) Stacked Generative Adversarial Networks

Lab meeting (paper review session): Stacked Generative Adversarial Networks. 2017-02-01. Saehoon Kim (Ph.D. candidate), Machine Learning Group

Papers to be covered:
Stacked Generative Adversarial Networks. X. Huang (Cornell), Y. Li (Cornell), O. Poursaeed (Cornell), J. Hopcroft (Cornell), S. Belongie (Cornell). arXiv:1612.04357v1 [cs.CV], 13 Dec 2016.
StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks. H. Zhang (Rutgers Univ.) et al. arXiv:1612.03242v1 [cs.CV], 10 Dec 2016.

Generative Adversarial Networks. The generator and discriminator play the following two-player minimax game with value function V(D, G):

min_G max_D V(D, G) = E_{x~p_data(x)}[log D(x)] + E_{z~p_z(z)}[log(1 - D(G(z)))]

The generator G maps a latent space to the data space; the discriminator D(x) represents the probability that x came from the data rather than from the generator.
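As a toy illustration, the value function can be estimated on finite samples. Here `sigmoid` stands in for a scalar discriminator and the sample lists are made up, not data from any trained model:

```python
import math

def value_function(d, real_samples, fake_samples):
    """Monte-Carlo estimate of V(D, G) = E_x[log D(x)] + E_z[log(1 - D(G(z)))].
    `fake_samples` stand in for generator outputs G(z); `d` maps a
    sample into (0, 1)."""
    v_real = sum(math.log(d(x)) for x in real_samples) / len(real_samples)
    v_fake = sum(math.log(1.0 - d(x)) for x in fake_samples) / len(fake_samples)
    return v_real + v_fake

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

real = [1.0, 1.2, 0.8]     # toy "data" samples centered at +1
fake = [-1.0, -1.2, -0.8]  # toy "generator outputs" centered at -1

# A discriminator that separates the two clusters scores higher than
# -log 4 ~= -1.386, the value reached when D outputs 1/2 everywhere
# (i.e. when the generator has fooled the discriminator completely).
print(value_function(sigmoid, real, fake))
print(value_function(lambda x: 0.5, real, fake))
```

The second value, -log 4, is exactly the game's value at the theoretical optimum where the generator matches the data distribution.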

A Theoretical Analysis of GANs [1]. [1] Generative Adversarial Nets, I. J. Goodfellow et al., NIPS 2014.

Practical Learning Techniques [1]. To train the generator network, the objective function is slightly modified (no theoretical guarantee applies to the modified objective). [1] Improved Techniques for Training GANs, T. Salimans et al., NIPS 2016.

Inception Score. We apply the Inception model (GoogLeNet) to obtain the conditional label distribution p(y|x). A well-generated image should have a conditional label distribution with low entropy, while a model that generates varied images should have a marginal distribution p(y) with high entropy. The following score is therefore a natural way to assess the quality of generative models:

IS = exp( E_x [ KL( p(y|x) || p(y) ) ] )
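The score above can be computed directly once the per-image conditional distributions are in hand. A minimal sketch (in practice p(y|x) would come from the Inception model, which is omitted here):

```python
import math

def inception_score(cond_probs):
    """exp(E_x[KL(p(y|x) || p(y))]) computed from a list of per-image
    conditional label distributions (each a list summing to 1)."""
    n = len(cond_probs)
    k = len(cond_probs[0])
    # Marginal label distribution p(y): average of the conditionals.
    marginal = [sum(p[j] for p in cond_probs) / n for j in range(k)]
    # Average KL divergence between each conditional and the marginal.
    kl = 0.0
    for p in cond_probs:
        kl += sum(pj * math.log(pj / marginal[j])
                  for j, pj in enumerate(p) if pj > 0)
    return math.exp(kl / n)

# Sharp and diverse conditionals give a high score (up to the number
# of classes); identical flat conditionals give a score of 1.
sharp = [[0.98, 0.01, 0.01], [0.01, 0.98, 0.01], [0.01, 0.01, 0.98]]
flat = [[1 / 3, 1 / 3, 1 / 3]] * 3
print(inception_score(sharp))  # high: low-entropy p(y|x), high-entropy p(y)
print(inception_score(flat))   # 1.0 up to rounding: p(y|x) = p(y)
```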

Deep Convolutional GANs (DCGANs) [1]. A 100-dimensional noise vector z is projected to a 4x4x1024 feature map, which is then upsampled by transposed convolutions (a.k.a. deconvolutions). [1] Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, A. Radford et al., ICLR 2016.

Transposed convolution [1]. [1] https://github.com/vdumoulin/conv_arithmetic

Stacked Generative Adversarial Networks
X. Huang, Y. Li, O. Poursaeed, J. Hopcroft, S. Belongie (Cornell University)
In this paper we aim to leverage the powerful bottom-up discriminative representations to guide a top-down generative model. We propose a novel generative model named Stacked Generative Adversarial Networks (SGAN), which is trained to invert the hierarchical representations of a discriminative bottom-up deep network. Our model consists of a top-down stack of GANs, each trained to generate plausible lower-level representations, conditioned on higher-level representations. A representation discriminator is introduced at each feature hierarchy to encourage the representation manifold of the generator to align with that of the bottom-up discriminative network, providing intermediate supervision. In addition, we introduce a conditional loss that encourages the use of conditional information from the layer above, and a novel entropy loss that maximizes a variational lower bound on the conditional entropy of generator outputs. To the best of our knowledge, the entropy loss is the first attempt to tackle the conditional model collapse problem that is common in conditional GANs. We first train each GAN of the stack independently, and then we train the stack end-to-end. Unlike the original GAN that uses a single noise vector to represent all the variations, our SGAN decomposes variations into multiple levels and gradually resolves uncertainties in the top-down generative process.

Hierarchical image generation: each lower-level representation is generated conditioned on the higher-level representation above it.

Stacked Generative Adversarial Network (SGAN) [Pre-trained Encoder] The encoder is a stack of convolution, pooling, and fully-connected layers: input -> conv1 -> pool1 -> conv2 -> pool2 -> fc3 -> fc4.

Stacked Generative Adversarial Network (SGAN) [Stacked Generators] The goal is to train a top-down generator G that inverts the encoder E. G consists of a top-down stack of generators G_i, each trained to invert a bottom-up mapping E_i. Each generator takes the representation from the level above together with its own noise vector: h^_i = G_i(h^_{i+1}, z_i).

An overview of image generation. New images can be sampled from SGAN by feeding random noise to every generator in the stack. This differs from DCGAN, where a single noise vector drives generation; here multiple noise variables jointly generate a single image. Each generator can be built from transposed convolution operators.
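The top-down sampling pass described above can be sketched as a simple loop. The `widen` callables below are hypothetical placeholders for the paper's generator networks, used only to show how each level consumes fresh noise:

```python
import random

def sample_sgan(generators, top_input, noise_dim=4):
    """Top-down sampling: each generator G_i consumes the representation
    from the level above plus its own freshly drawn noise vector, unlike
    a plain DCGAN, which draws a single noise vector at the top.
    `generators` is ordered top (G_N) to bottom (G_0)."""
    h = top_input
    for g in generators:
        z = [random.gauss(0.0, 1.0) for _ in range(noise_dim)]
        h = g(h, z)  # lower-level representation; the last one is the image
    return h

# Stand-in "generators" that just widen the representation with the noise.
widen = lambda h, z: h + z
gens = [widen, widen]  # G_1 (top), G_0 (bottom)
out = sample_sgan(gens, [0.0] * 2, noise_dim=4)
print(len(out))  # 2 + 4 + 4 = 10
```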

An overview of SGAN

Training Discriminator [Standard Loss] A discriminator D_i distinguishes generated representations h^_i from real representations h_i. The loss for the discriminator is the standard adversarial loss:

L_{D_i} = E_{h_i}[ -log D_i(h_i) ] + E_{z_i, h^_{i+1}}[ -log(1 - D_i(G_i(h^_{i+1}, z_i))) ]

Training Generator (1/3) [Adversarial Loss] Each GAN is first trained independently using L_{G_i}^{adv,indep}; the stack is then trained jointly, in an end-to-end manner, using L_{G_i}^{adv,joint}.

Training Generator (2/3) [Conditional Loss] The generator is regularized by feeding the generated lower-level representations back into the encoder and enforcing that the recovered representations are similar to the original conditioning representations, i.e. E_i(h^_i) should be close to h^_{i+1}.

Training Generator (3/3) [Entropy Loss] The generated representation h^_i is encouraged to be sufficiently diverse when conditioned on h^_{i+1}; that is, the conditional entropy H(h^_i | h^_{i+1}) should be as high as possible. Since this entropy is intractable, they propose to maximize a variational lower bound on it.
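The shape of the bound can be sketched as follows (an InfoGAN-style argument; the auxiliary model Q_i, trained to recover z_i from h^_i, is my gloss on the paper's construction):

```latex
% The conditional entropy is intractable, but it is bounded below by the
% mutual information between the noise and the generated representation,
% which in turn admits a variational lower bound via an auxiliary Q_i:
H(\hat{h}_i \mid \hat{h}_{i+1})
  \;\ge\; I(z_i;\, \hat{h}_i \mid \hat{h}_{i+1})
  \;\ge\; \mathbb{E}_{z_i,\,\hat{h}_i}\!\left[\log Q_i(z_i \mid \hat{h}_i)\right] + H(z_i).
```

Maximizing the rightmost expectation with respect to G_i and Q_i therefore pushes the conditional entropy up; H(z_i) is a constant fixed by the noise prior.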

Experiments [Encoder] A small CNN is used as the encoder: conv1-pool1-conv2-pool2-fc3-fc4-softmax. [Generator] The top GAN G_1 generates fc3 features from random noise z_1, conditioned on the label y. The bottom GAN G_0 generates images from random noise z_0, conditioned on the fc3 features produced by G_1.

SVHN Results

CIFAR Results

Inception Scores (CIFAR-10)

StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks
H. Zhang (Rutgers), T. Xu (Lehigh Univ.), H. Li (CUHK), S. Zhang (UNC), X. Huang (Lehigh Univ.), X. Wang (CUHK), D. Metaxas (Rutgers)
In this paper, we propose Stacked Generative Adversarial Networks (StackGAN) to generate photo-realistic images conditioned on text descriptions. The Stage-I GAN sketches the primitive shape and basic colors of the object based on the given text description, yielding Stage-I low-resolution images. The Stage-II GAN takes Stage-I results and text descriptions as inputs, and generates high-resolution images with photo-realistic details. The Stage-II GAN is able to rectify defects and add compelling details through the refinement process. Samples generated by StackGAN are more plausible than those generated by existing approaches. Importantly, our StackGAN for the first time generates realistic 256x256 images conditioned only on text descriptions, while state-of-the-art methods can generate at most 128x128 images. To demonstrate the effectiveness of the proposed StackGAN, extensive experiments are conducted on the CUB and Oxford-102 datasets.

Motivating Examples

The architecture of StackGAN

Stage-I GAN [Model Architecture] (figure labels: text encoder -- an LSTM or CNN with word embeddings; dimension reduction and reshaping; transposed convolutions; feed-forward NN)

Stage-II GAN [Model Architecture] (figure labels: generator input branch same as the Stage-I generator; a standard CNN encoder; transposed convolutions; discriminator same as the Stage-I discriminator)

Examples

Comparison between Stage I and Stage II