arxiv: v1 [cs.cv] 16 Nov 2017
Two Birds with One Stone: Iteratively Learn Facial Attributes with GANs

Dan Ma, Bin Liu, Zhao Kang (University of Electronic Science and Technology of China)
Zenglin Xu (University of Electronic Science and Technology of China), zenglin@gmail.com
Jianke Zhu (Zhejiang University), jkzhu@zju.edu.cn

Abstract

Generating high-fidelity, identity-preserving faces has a wide range of applications. Although a number of generative models have been developed to tackle this problem, the results are still far from satisfactory. Recently, generative adversarial networks (GANs) have shown great potential for generating or transforming images with exceptional visual fidelity. In this paper, we propose to train a GAN iteratively by regularizing the minmax process with an integrated loss, which includes not only the per-pixel loss but also the perceptual loss. We argue that the perceptual information helps produce a high-quality image while preserving the identity information. In contrast to existing methods, which deal with either image generation or image transformation, our proposed iterative architecture can achieve both. Experiments on the multi-label facial dataset CelebA demonstrate that the proposed model performs well at recognizing multiple attributes, generating high-quality images, and transforming images with controllable attributes.

1. Introduction

Learning controllable attributes from images is fundamental to a number of real-world applications in computer vision, such as image generation [7, 24, 21, 17] and image transformation [8, 27, 25, 28]. Specifically, image generation aims to learn a complex mapping from a latent vector to a realistic generated image, and image transformation translates an input image into an output image with the desired attributes changed. Both have wide applications in practice. For example, a facial composite, which is a graphical reconstruction of an eyewitness's memory of a face [15], can assist police in identifying a suspect. In most situations, police need to search for a suspect with only one picture of the front view. To improve the success rate, it is necessary to generate more pictures of the target person with different poses or expressions. Therefore, face generation and transformation have been extensively studied.

Benefiting from the flourishing of deep learning methods, face generation and transformation have made significant advances in recent years [3, 1]. With deep architectures, image generation or transformation can be modeled in more flexible ways. For example, the conditional PixelCNN [1] was developed to generate an image based on the PixelCNN. The generation process of that model can be conditioned on visible tags or on latent codes from other networks. However, the quality of the generated images and the convergence speed are unsatisfactory. In [7] and [24], Variational Auto-encoders (VAE) [11] were used to generate natural images. Another well-known deep generative model, the Generative Adversarial Network (GAN) [6], was also adopted to generate natural images within a Laplacian pyramid framework [2]. The most popular way of generating or transforming images is through models based on the conditional GAN [16], which allow researchers to manipulate images in controllable settings. This has benefited much work on image transformation [8, 25, 27] and on generation with visual attributes [28, 17].

Although these existing models can deal with face generation or transformation in conditional settings, they fail to pay enough attention to face attribute learning, which is crucial to producing high-quality facial images. Specifically, there are several drawbacks in current approaches.
First, most existing conditional deep models cannot preserve the facial identity when transforming the input face image [8] or generating new faces [24]. Second, most current models cannot deal well with multiple attributes, which may degrade the quality of the facial images.

To this end, we propose an iterative GAN with an auxiliary classifier. It can generate high-fidelity face images with controlled input attributes and, more importantly, it handles both image generation and image transformation with the desired attributes changed. Furthermore, it can deal with both single and multiple labels (attributes). To achieve this goal, we train the model in two steps, as shown in Figure 1. In step one, we train the Discriminator D, Generator G, and Classifier C with the adversarial losses [6], the data source losses, and the label losses as in [17]. In step two, G and D/C are iteratively trained with an integrated loss, which includes the perceptual loss [9] between D's hidden layers in step 1 and step 2, the latent code loss between the input noise z and the output noise z', and the pixel loss between the input real face image and its rebuilt version.

With the proposed model, the Generator G can not only generate a high-quality natural image according to the input attributes (single or multiple), but also translate an input face image with the desired attributes substituted. The fidelity of the output images is well preserved thanks to the iterative training with the integrated loss. In addition, the auxiliary classifier C can recognize the attributes of an input face image with high accuracy, again owing to the integrated loss.

We evaluate our model on the CelebA dataset [14] with two groups of experiments. First, we calculate the accuracy of face attribute recognition and compare it with the state-of-the-art methods [16, 5, 17]. Second, we evaluate the quality of the natural face images generated with specified attributes.

In summary, the contributions of this paper are threefold. We propose a GAN variant trained with not only the minmax loss but also an integrated reconstruction loss that includes both the per-pixel loss and the perceptual loss. Regularizing the traditional minmax process with the integrated loss preserves both facial identity and face quality.
Experiments demonstrate that the proposed model can not only generate a high-quality face image, but also transform a given face while preserving its identity. Multiple face attributes can be manipulated during both processes. Our proposed method also shows superior performance on facial image attribute recognition.

2. Related Work

2.1. Conditioned Image Generation

Image generation is a very popular topic in computer vision, and the community has taken significant steps in this direction with the flourishing of deep learning. In [1], the authors presented an image generation model based on PixelCNN under conditional control. However, it struggles with image quality and with the efficiency of convergence. Conditional models [16, 10] make the sample generation process controllable.

In the last three years, image generation with Variational Autoencoders (VAE) [11] and GAN [6] has been investigated. A recurrent VAE [7] was proposed to model every stage of picture generation. Yan et al. [24] used two attribute-conditioned VAEs to capture the foreground and background of a facial image, respectively. In addition, sequential models [7, 2] have attracted much attention recently. The recurrent VAE in [7] mimics the process of human drawing, but it can only be applied to low-resolution images; in [2], a cascaded Laplacian pyramid model generates an image gradually from low resolution to the final full resolution.

Much work has applied GAN [6] to image generation in the conditional setting since Mirza and Osindero [16] and Gauthier [5]. In [2], a Laplacian pyramid framework was adopted to generate a natural image, with each level of the pyramid trained with a GAN. In [23], the S²-GAN divides the image generation process into structure and style generation steps, corresponding to two dependent GANs.
The Auxiliary Classifier GAN (ACGAN) in [17] regularizes the traditional conditional GAN with label consistency. To make the conditional tag easier to control, ACGAN extends the original conditional GAN with an auxiliary classifier C.

2.2. Image Transformation with GAN

Image transformation is a well-established area in computer vision. In this subsection, we summarize prior work on image transformation with GAN. The existing models fall into two groups.

The first group manipulates images over the natural image manifold with a conditional GAN. In [27], the authors defined user-controlled visual image editing operations, such as changing color or shape, based on a low-dimensional image attribute manifold. In a similar way, Zhang et al. [26] assume that the age attribute of face images lies on a high-dimensional manifold. By stepping along the manifold, this model can obtain face images with different age attributes.

The remaining work forms the second group. In [8], the transformation model was built on the conditional setting by regularizing the traditional conditional GAN [16] with an image mapping loss. Two GANs, one for the input domain and one for the output domain, are also used [25, 28]. In [25], a set of aligned image pairs is required to transfer the source image of a dressed person to a product photo. In contrast, the cycle GAN [28] can learn to translate an image from a source domain to a target domain in the absence of paired samples.
3. Proposed Model

We first describe the proposed model with the integrated loss, then explain each component of the integrated loss.

3.1. Problem Overview

Figure 1 illustrates the pipeline of our model. It is a variant of GAN that is regularized by several losses. The framework handles both face generation and transformation. Given a sampled random vector z and a desired facial attribute description, a new face can be generated by G. For face transformation, we feed the Discriminator D a real image $x_{real}$ together with its attribute labels c, and we obtain a noise representation z' and a label vector c'. The original image can be reconstructed from z' and c' by feeding them back to the Generator G. The new image generated by G is named $x_{rebuild}$. Alternatively, we can modify the content of c' before passing it to G. The attributes of the reconstructed image then change according to the modified labels.

In the proposed model, the Discriminator D and Generator G are re-tuned with three kinds of specialized losses. First, the perceptual losses captured by the hidden layers of the Discriminator are back-propagated to feed the network with semantic information. Second, the difference between the noise code extracted by the Classifier C and the original noise further tunes the networks. Last but not least, we let the Generator G rebuild the image $x_{rebuild}$, and the error between $x_{rebuild}$ and the original image $x_{real}$ is also fed back. Experiments demonstrate that these three loss functions play indispensable roles in generating natural images.

The goal of our model is to obtain the optimal G, D, and C by solving the following optimization problem,

$$\arg\min_{G}\,\max_{D,C}\; L_{ACGAN}(G, D, C) + L_{inte}(G, D, C) \tag{1}$$

where $L_{ACGAN}$ is the ACGAN [17] loss term and $L_{inte}$ is the integrated loss term.
Here, there is no balance parameter between these two losses because $L_{inte}$ is a conic combination of three other losses (introduced later). We now describe in detail how these terms are defined.

3.2. ACGAN Loss

The ACGAN loss [17] can be separated into an adversarial loss [6] and a label consistency loss as follows,

$$L_{ACGAN}(G, D, C) = L_{adv} + L_{label}$$

where $L_{adv}$ is the minmax loss defined in the original GAN [6, 16] and $L_{label}$ is the loss [17] of the classifier C.

Adversarial Loss

The Generator G takes a random noise variable z as input and outputs a fake image $x_{fake} = G(z)$. In contrast, the Discriminator D takes both the synthesized and the genuine images as input and outputs the data sources. During training, the Discriminator D is forced to maximize the likelihood it assigns to the correct data source label [16, 17],

$$L_{adv}(G, D) = \mathbb{E}_{x,c \sim p(x,c)}[\log D(x \mid c)] + \mathbb{E}_{z \sim p(z)}[\log(1 - D(G(z \mid c)))]$$

The Consistency of Data Labels

In our model, real samples and generated samples have multiple class labels c. We train the classifier with real images $x_{real}$ as well as fake images $x_{fake} = G(z, c)$, together with their labels c. To train the classifier, we back-propagate the loss between c and the predicted c' output by the classifier [17], which gives the label loss,

$$L_{label} = \mathbb{E}_{x \in \{x_{real},\, x_{fake}\}}[\log C(D(x))]$$

3.3. Integrated Loss

In this paper, we combine the per-pixel loss, the perceptual loss, and the latent code loss with parameters $\lambda = (\lambda_1, \lambda_2, \lambda_3)$ to enhance the training process,

$$L_{inte} = \lambda_1 L_{per} + \lambda_2 L_{pix} + \lambda_3 L_{z}, \qquad \lambda_i \ge 0.$$

Because the coefficients $\lambda_i$ are conic, we do not need to set a separate trade-off parameter in Equation (1).

Per-pixel Loss

The per-pixel loss [11, 20, 3] is a straightforward way to measure the pixel-wise difference between two images. Though it may fail to capture the semantic information of an image, we still consider it an important measure for image reconstruction that should not be ignored.
We also back-propagate the per-pixel loss to balance the training errors. Our per-pixel loss is

$$L_{pix} = \mathbb{E}\big[\lVert x_{real} - G(C(D(x_{real})),\, c') \rVert\big]$$

where $D(x_{real})$ is a 1024-dimensional vector extracted from the input image $x_{real}$ by D, and the Classifier network C receives $D(x_{real})$ and outputs a low-dimensional noise vector $z' = C(D(x_{real}))$. Therefore, the per-pixel loss captures the pixel-wise difference between the rebuilt image $x_{rebuild} = G(C(D(x_{real})), c')$ and the real image $x_{real}$.
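As a rough numerical illustration of the per-pixel term and of the conic combination in Section 3.3, the following NumPy sketch may help. It is only a sketch under stated assumptions: the norm is taken to be a mean absolute difference (the transcription does not show which norm the paper uses), the function names `per_pixel_loss` and `integrated_loss` are hypothetical, and the perceptual and latent code terms are passed in as precomputed scalars rather than derived from a network.

```python
import numpy as np


def per_pixel_loss(x_real, x_rebuild):
    """Mean absolute pixel difference between real and rebuilt image batches.

    Assumption: an L1-style distance averaged over all pixels; the source
    does not state the norm explicitly.
    """
    x_real = np.asarray(x_real, dtype=float)
    x_rebuild = np.asarray(x_rebuild, dtype=float)
    if x_real.shape != x_rebuild.shape:
        raise ValueError("x_real and x_rebuild must have the same shape")
    return float(np.mean(np.abs(x_real - x_rebuild)))


def integrated_loss(l_per, l_pix, l_z, lambdas=(1.0, 1.0, 1.0)):
    """Conic combination lambda_1*L_per + lambda_2*L_pix + lambda_3*L_z."""
    lam = np.asarray(lambdas, dtype=float)
    if lam.shape != (3,) or np.any(lam < 0):
        raise ValueError("lambdas must be three non-negative weights")
    return float(lam[0] * l_per + lam[1] * l_pix + lam[2] * l_z)


# Toy batch of two 80x80 RGB images; the rebuilt images are off by 0.5 everywhere.
x_real = np.zeros((2, 80, 80, 3))
x_rebuild = np.full((2, 80, 80, 3), 0.5)

l_pix = per_pixel_loss(x_real, x_rebuild)
# The perceptual and latent code values below are placeholders, not measured results.
total = integrated_loss(l_per=0.2, l_pix=l_pix, l_z=0.1, lambdas=(1.0, 2.0, 0.0))
```

Setting a weight to zero simply switches a term off, which is one way to mimic the ablations that compare different loss combinations.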
[Figure 1. The pipeline of our model.]

Perceptual Loss

Traditionally, the per-pixel loss [11, 20, 3] has been extensively adopted to measure the difference between two images. It is efficient and popular for reconstructing images from ground-truth images. However, the pixel-based loss is not a robust measure, since it cannot capture the semantic difference between two images [9]. For example, noticeable defects such as blurred results (lack of high-frequency information) and artifacts (lack of perceptual information) [22] often exist in output images reconstructed via the per-pixel loss. To balance the per-pixel loss, we feed back into the training process the perceptual loss [9, 12, 4, 22] between $x_{real}$ and $x_{rebuild}$. The proposed perceptual loss is defined over the hidden layers of the Discriminator network. Let $h^{i}$ and $h^{i}_{rebuild}$ be the activations of the i-th layer of the Discriminator network and the rebuild network, respectively. The perceptual loss $L_{per}$ is then

$$L_{per} = \mathbb{E}\big[\lVert h^{i}_{rebuild} - h^{i} \rVert\big].$$

We minimize $L_{per}$ to make the perceptual information in $h^{i}_{rebuild}$ consistent with $h^{i}$.

Latent Code Loss

During training, not only is the input image $x_{real}$ rebuilt as $x_{rebuild}$; for the randomly sampled noise variable z, we also let the Discriminator D extract a rebuilt noise vector z'. The loss between the random noise z and z' is also used to regularize the process of image generation, i.e.,

$$L_{z} = \mathbb{E}\big[\lVert z - z' \rVert\big]$$

4. Experiment

4.1. Dataset

We run the iterative GAN model on the CelebA dataset [14], which is based on CelebFaces+ [19]. CelebA is a large-scale face image dataset that contains more than 200k samples of celebrities. Each face is annotated with 40 familiar attributes, such as Bags Under Eyes, Bald, Bangs, Blond Hair, etc. Owing to the rich annotations per image, CelebA has been widely applied to face-related vision tasks such as face attribute recognition, face detection, and landmark (or facial part) localization.
We take advantage of the rich attribute annotations and train each label in a supervised manner. We split the whole dataset into two subsets: one set of randomly selected images serves as training data, and the remaining samples form the test set used to evaluate the results of the experiment.

4.2. Model Architecture

Iterative-GAN includes three convolutional or deconvolutional neural networks. The generator consists of a fully-connected layer with 8584 neurons and four deconvolutional layers with 256, 128, 64, and 3 channels, each with filter size 5×5. All filter strides of the Generator G are set to 2×2. After being processed by the first fully-connected layer, the input noise z with 100 dimensions is projected and reshaped to [Batch size, 5, 5, 512]. The following 4 deconvolutional layers then transform the tensor to [Batch size, 10, 10, 256], [Batch size, 20, 20, 128], [Batch size, 40, 40, 64], and [Batch size, 80, 80, 3], respectively. The tensor output by the last layer is activated by the tanh function to normalize the data to the range (-1, 1).

The discriminator is organized almost in the reverse way of the Generator. It includes 4 convolutional layers with filter size 5×5 and stride 2×2. The training images with shape [Batch size, 80, 80, 3] pass through the 4 convolutional layers and yield a tensor with shape [Batch size, 5, 5, 512]. Since the discriminator needs to output both a shared layer for the classifier and a probability that checks whether the input image is real, we flatten the tensor for the former purpose and reshape it to size [Batch size, 1] (with a sigmoid activation function) for the latter.

The classifier receives the shared layer from the discriminator, which contains enough feature information. We build two fully-connected layers in the classifier, one for the output noise z' and the other for the predicted labels. Each one uses the sigmoid function to squeeze the value into (0, 1). The detailed parameters of the model are shown in Table 1.

4.3. Results

Recognition of facial attributes (multi-labels)

Learning face attributes is fundamental to face generation and transformation. Previous work can learn to control a single attribute [13] or multi-category attributes [17] that are output by a softmax function for a given input image. However, natural face images are always associated with multiple labels. To the best of our knowledge, there is no good solution for recognizing and controlling the multi-label attributes of a given facial image. In our framework, the Classifier C accepts the 1024-dimensional shared vector output by the Discriminator D, then squashes it into 128 dimensions with a fully-connected layer. To output the multiple labels, the Classifier C compresses the 128-dimensional intermediate vector into d dimensions. By linking the d-dimensional vector to d sigmoid mappings, the Classifier C finally outputs the attribute labels.

We feed the images of the test set to the Classifier C and calculate the Hamming loss for the multi-label prediction. The Hamming loss statistic is taken over all 40 labels associated with each test image. The native DCGAN with an L2-SVM classifier is selected as the baseline [18]. The DCGAN extracts the features and feeds them to a linear SVM with Euclidean distance. Figure 2 illustrates the Hamming loss of the two algorithms when predicting the 40 labels of the test data. Our model significantly outperforms DCGAN. Rows 2, 3, and 4 in Table 2 illustrate three examples from the test set. Rows 2 and 3 are successful cases, while row 4 shows the failure to predict HeavyMakeUp and Sex.

[Figure 2. The Hamming loss of attribute prediction.]

Reconstruction

Given the noise vector z and c, the Generator G can reconstruct the original input image while preserving its attributes ($x_{rebuild}$ in Figure 1). In this part, we evaluate the contribution of the integrated loss to face reconstruction. In detail, we run three experiments that regularize the ACGAN loss with: only the noise loss; the noise loss + per-pixel loss; and the noise + per-pixel + perceptual loss. The comparison among the results of the three experiments is shown in Figure 3. The first column displays the original images to be rebuilt. The remaining columns correspond to the three groups of experiments. From column 2 to column 4, we make three observations: i) the noise loss tends to preserve image quality (images reconstructed with noise + per-pixel loss in column 3 are blurrier than images reconstructed with only the noise loss in column 2); ii) the per-pixel loss tends to hold the facial identity (images reconstructed with noise + per-pixel loss in column 3 are closer to the original images than those in column 2); iii) the integrated loss balances these effects and reconstructs the original faces with both high quality and identity preservation.

Face Transformation with Controllable Attributes

Based on our framework, we feed the Discriminator D a real image without its attribute labels c, and we obtain a noise representation z' and a label vector c' from D as output.
We can reconstruct the original image with z' and c' as we did in the reconstruction experiment. Alternatively, we can control the process of reconstruction. By modifying some or even all labels of c', the corresponding attributes of the reconstructed image are transformed. We first control a single attribute, then extend this to the multi-label scenario.

We try to modify one of the attributes of the images in the test set. Figure 4 shows part of the transformation results on the test set. The four rows of Figure 4 illustrate changes to four different attributes: Male, Eye Glasses, Bangs, and Bald. The odd columns display the original images, and the even columns show the transformed images. We observe that the transformed images preserve high fidelity and their remaining attributes.

By modifying more than one label in c', we can translate target images into a transformed version with the corresponding facial attributes changed. Figure 5 exhibits the results of manipulating multiple attributes.
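The label editing step described above amounts to flipping selected entries of the attribute vector before it is passed back to the generator. The sketch below shows only that vector manipulation; it assumes labels are encoded as 1 and -1 (the encoding mentioned for the ground-truth tags), uses an illustrative subset of attribute names, and the helper name `reverse_attributes` is hypothetical. No generator is called here.

```python
import numpy as np


def reverse_attributes(c, attr_names, to_reverse):
    """Return a copy of label vector(s) c with the named attributes sign-flipped.

    c: array of shape (d,) or (batch, d) with entries in {-1, 1} (assumed encoding).
    attr_names: list of d attribute names, in the same order as the columns of c.
    to_reverse: names of the attributes to flip.
    """
    c = np.array(c, dtype=int)  # np.array copies, so the caller's array is untouched
    index = {name: i for i, name in enumerate(attr_names)}
    for name in to_reverse:
        if name not in index:
            raise KeyError(f"unknown attribute: {name}")
        c[..., index[name]] *= -1
    return c


# Illustrative subset of attributes; a real label vector would have 40 entries.
names = ["Bald", "Bangs", "Eye Glasses", "Male", "Smiling"]
c_pred = np.array([[-1, 1, -1, 1, 1]])  # one face: bangs, male, smiling

c_single = reverse_attributes(c_pred, names, ["Bangs"])
c_multi = reverse_attributes(c_pred, names, ["Bangs", "Eye Glasses", "Smiling"])
```

Here `c_single` removes the bangs only, while `c_multi` removes bangs, adds glasses, and removes the smile in one step; either vector would then be fed to the generator together with the rebuilt noise.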
Table 1. Architecture of models

| Network | Operation | Kernel Size | Stride | Batch Normal | Nonlinearity | Explanations |
| G (input: z) | Full-connected | \ | \ | \ | ReLu | |
| | Transposed Convolution | 5*5 | 2*2 | Yes | ReLu | |
| | Transposed Convolution | 5*5 | 2*2 | Yes | ReLu | |
| | Transposed Convolution | 5*5 | 2*2 | Yes | ReLu | |
| | Transposed Convolution | 5*5 | 2*2 | Yes | Tanh | |
| D (input: x) | Convolution | 5*5 | 2*2 | No | Leaky ReLu | |
| | Convolution | 5*6 | 2*3 | Yes | Leaky ReLu | |
| | Convolution | 5*7 | 2*4 | Yes | Leaky ReLu | |
| | Convolution | 5*8 | 2*5 | Yes | Leaky ReLu | |
| | Linear | \ | \ | Yes | Leaky ReLu | Shared Layer |
| | Linear | \ | \ | Yes | Sigmoid | Veracity Probability |
| C (input: shared layer) | Linear | \ | \ | No | Sigmoid | Rebuild z Noise |
| | Linear | \ | \ | Yes | Leaky ReLu | |
| | Linear | \ | \ | No | Sigmoid | Rebuild Label |

Table 2. Examples of classification results on CelebA. Columns: Target Image, Attribute, Truth, Prediction. The ground-truth tags of each target image are expressed by the two integers 1 and -1. Rows 1 and 2 show exactly correct predictions. Row 3 shows a misclassification: the classifier failed to determine the attributes Heavy makeup and Male of the face, which are marked in black font.
[Attributes listed for the three target images: Bald, Bangs, BlackHair, BlondeHair, EyeGlass, Male, Nobeard, Smiling, WaveHair, Young; Attractive, Bald, Male, Smiling, HeavyMakeUp, Wearinghat, Young; Attractive, Bald, EyeGlass, HeavyMakeUp, Male.]

Figure 3. Comparison of rebuilding images through different losses. Columns: Target; z Loss; z Loss + Pixel Loss; Integrated Loss (Ours). The first column shows the original natural image. The second column shows images rebuilt with only the z loss (latent code). Column 3 shows the effect of z loss + pixel loss. The last column shows the final effect of the integrated loss.

Face Generation with Controllable Attributes

Unlike the experiments above, we can also generate a new face from a random z sampled from a given distribution and an artificial attribute description c (labels). The Generator G accepts z and c as input and fabricates a fictitious facial image $x_{fake}$ as output.
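To make the two generator inputs concrete, here is a minimal sketch of how a noise batch and an attribute description might be assembled. It assumes the 100-dimensional noise from the architecture description, uniform sampling on [0, 1) as used in the paper's generation experiments, and labels encoded as 1 and -1. The helper `make_generator_input` and the short attribute list are hypothetical, and the generator itself is not modeled.

```python
import numpy as np


def make_generator_input(attr_names, description, batch_size=1, z_dim=100, seed=None):
    """Build (z, c) for a conditional generator.

    attr_names: ordered list of attribute names known to the model.
    description: dict mapping attribute name -> bool; missing names default to absent.
    Returns z of shape (batch_size, z_dim) and c of shape (batch_size, len(attr_names)).
    """
    unknown = set(description) - set(attr_names)
    if unknown:
        raise KeyError(f"unknown attributes: {sorted(unknown)}")
    rng = np.random.default_rng(seed)
    z = rng.uniform(0.0, 1.0, size=(batch_size, z_dim))  # z ~ U(0, 1)
    # Assumed encoding: 1 = attribute present, -1 = attribute absent.
    c_row = np.array([1 if description.get(name, False) else -1 for name in attr_names])
    c = np.tile(c_row, (batch_size, 1))
    return z, c


# Illustrative subset of attributes; the full label vector would have 40 entries.
names = ["Bald", "Bangs", "Smiling"]
z, c = make_generator_input(names, {"Bald": True, "Smiling": True}, batch_size=4, seed=0)
```

Keeping `z` fixed while editing `c` is what lets a single generated face be re-rendered with different attributes.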
Of course, we can customize an image by modifying the corresponding attribute descriptions in c. For example, the police may want to obtain a suspect's portrait from a witness's description: "He is around 40 years old, a bald man with arched eyebrows and a big nose." Figure 6 illustrates the results of generating fictitious facial images from random noise and descriptions. We sample z from the uniform distribution, $z \sim U(0, 1)$. The first column displays the images generated with z and the initial descriptions. The remaining columns show the facial images generated with modified attributes (including single and multiple modifications).

Figure 4. Four examples (sex, glasses, bangs, and bald, as shown in the four rows) of face transformation with a single attribute changed. For each example (row), we display five illustrations. For instance, the first row shows the results of controlling the sex label. The odd columns of row 1 are the given faces, and the corresponding even columns display the faces with sex reversed. From the 1st and 5th instances (columns 1, 2 and 9, 10), we clearly see that the moustache disappears when changing from male to female.

Figure 5. Rebuilding the target image and simultaneously reversing (that is, modifying) some attributes of the original face. Columns: Target; Rebuild; Reversal: Bangs; Reversal: Glasses; Reversal: Sex; Reversal: Smiling; Reversal: Age; Reversal: Bangs, Glasses; Reversal: Bangs, Glasses, Smiling; Reversal: Bangs, Glasses, Smiling, Age. The first column shows the target face. The rebuilt faces are shown in the 2nd column with all attributes unchanged. We then reverse 5 single labels successively from the 3rd column to the 7th column. For example, if the target face has the attribute Bangs, we reverse this value to 0 to eliminate the attribute while leaving the other attributes unchanged. The last 3 columns show combinations of attribute modifications.

Compare with Existing Method of Face Generation

To examine whether the proposed model has advantages in generating realistic facial images, we compare its face generation results with those of CGAN [16] and ACGAN [17]. All three models can generate images from a conditioned attribute description. For each of them, we first generate 4 random facial images, shown in the 1st, 3rd, and 5th columns of Figure 7, respectively. Columns 2, 4, and 6 display the regenerated images with four attributes (Eye Glasses, Smiling, Bangs, and Wave Hair) modified for CGAN, ACGAN, and our model. The face quality of ACGAN and our model is clearly much better than that of CGAN. Most importantly, in contrast with ACGAN's failure to preserve face identity (see the intersections between columns 3, 4 and rows 1, 2), our model consistently preserves face identity.

In summary, extensive experimental results indicate that our method is capable of: 1) recognizing facial attributes; 2) generating high-quality face images with multiple controllable attributes; 3) transforming an input face into an output face with the desired attributes changed; 4) preserving the facial identity during face generation and transformation. For more results and code, see

Figure 6. Building faces from random noise z, sampled from the uniform distribution $U(0, 1)$, together with label c. The first column shows the standard faces created with the noise vector z. The following columns display fake images based on the standard faces in the first column. The 2nd, 3rd, and 5th columns show faces with a single attribute (bangs, bushy eyebrows, eye glasses) modified. The other columns are examples of manipulating multiple attributes. For example, from column 1 to the last column, we have modified 5 attributes (bangs, bushy eyebrows, eye glasses, smiling, and wave hair).

Figure 7. Comparison with CGAN and AC-GAN on face generation. Rows: Glasses, Smiling, Bangs, Wave Hair. Columns: CGAN, AC-GAN, Our Model. The result shows that our model can produce the desired image with comparable or even better quality than AC-GAN, and far better quality than the native CGAN. More importantly, the proposed iterative architecture has the advantage of preserving facial identity.

5. Conclusion

We propose an iterative GAN trained with not only the traditional adversarial loss but also an integrated loss that includes both the per-pixel loss and the perceptual loss.
The perceptual loss, defined over the semantic difference between the original and reconstructed images, helps to preserve the facial identity. More importantly, in contrast with existing methods that deal with only image generation or only transformation, the proposed iterative architecture can achieve both. Experiments on a real-world face dataset demonstrate the advantages of the proposed model on face generation and face transformation. The model can be extended to other kinds of richly labeled image sets. Its strong attribute recognition performance also makes it appealing for image captioning. We leave this as future work.

References

[1] A. van den Oord, N. Kalchbrenner, O. Vinyals, L. Espeholt, A. Graves, and K. Kavukcuoglu. Conditional image generation with PixelCNN decoders. Neural Information Processing Systems.
[2] E. L. Denton, S. Chintala, A. Szlam, and R. D. Fergus. Deep generative image models using a Laplacian pyramid of adversarial networks. Neural Information Processing Systems.
[3] C. Dong, C. C. Loy, K. He, and X. Tang. Learning a deep convolutional network for image super-resolution.
[4] A. Dosovitskiy and T. Brox. Generating images with perceptual similarity metrics based on deep networks. Neural Information Processing Systems.
[5] J. Gauthier. Conditional generative adversarial nets for convolutional face generation. Class Project for Stanford CS231N: Convolutional Neural Networks for Visual Recognition, Winter semester.
[6] I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. C. Courville, and Y. Bengio. Generative adversarial nets.
[7] K. Gregor, I. Danihelka, A. Graves, D. J. Rezende, and D. Wierstra. DRAW: A recurrent neural network for image generation. International Conference on Machine Learning.
[8] P. Isola, J. Y. Zhu, T. Zhou, and A. A. Efros. Image-to-image translation with conditional adversarial networks.
[9] J. Johnson, A. Alahi, and L. Fei-Fei. Perceptual losses for real-time style transfer and super-resolution. European Conference on Computer Vision.
[10] D. P. Kingma, D. J. Rezende, S. Mohamed, and M. Welling. Semi-supervised learning with deep generative models. Advances in Neural Information Processing Systems, 4.
[11] D. P. Kingma and M. Welling. Auto-encoding variational Bayes. International Conference on Learning Representations.
[12] A. B. L. Larsen, S. K. Sønderby, H. Larochelle, and O. Winther. Autoencoding beyond pixels using a learned similarity metric.
[13] Z. Li and Y. Luo. Generate identity-preserving faces by generative adversarial networks. arXiv preprint.
[14] Z. Liu, P. Luo, X. Wang, and X. Tang. Deep learning face attributes in the wild. In Proceedings of International Conference on Computer Vision (ICCV).
[15] D. McQuiston-Surrett, L. D. Topp, and R. S. Malpass. Use of facial composite systems in US law enforcement agencies. Psychology, Crime & Law, 12(5).
[16] M. Mirza and S. Osindero. Conditional generative adversarial nets. Computer Science.
[17] A. Odena, C. Olah, and J. Shlens. Conditional image synthesis with auxiliary classifier GANs.
[18] A. Radford, L. Metz, and S. Chintala. Unsupervised representation learning with deep convolutional generative adversarial networks. International Conference on Learning Representations.
[19] Y. Sun, Y. Chen, X. Wang, and X. Tang. Deep learning face representation by joint identification-verification. Neural Information Processing Systems.
[20] M. Tatarchenko, A. Dosovitskiy, and T. Brox. Multi-view 3D models from single images with a convolutional network. European Conference on Computer Vision.
[21] C. Wang, C. Wang, C. Xu, and D. Tao. Tag disentangled generative adversarial network for object image re-rendering. In Twenty-Sixth International Joint Conference on Artificial Intelligence.
[22] C. Wang, C. Xu, C. Wang, and D. Tao. Perceptual adversarial networks for image-to-image transformation.
[23] X. Wang and A. Gupta. Generative image modeling using style and structure adversarial networks. European Conference on Computer Vision.
[24] X. Yan, J. Yang, K. Sohn, and H. Lee. Attribute2Image: Conditional image generation from visual attributes. European Conference on Computer Vision.
[25] D. Yoo, N. Kim, S. Park, A. S. Paek, and I. S. Kweon. Pixel-level domain transfer. European Conference on Computer Vision.
[26] Z. Zhang, Y. Song, and H. Qi. Age progression/regression by conditional adversarial autoencoder. arXiv preprint.
[27] J. Zhu, P. Krähenbühl, E. Shechtman, and A. A. Efros. Generative visual manipulation on the natural image manifold. European Conference on Computer Vision.
[28] J. Y. Zhu, T. Park, P. Isola, and A. A. Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks.