Improving Semantic Style Transfer Using Guided Gram Matrices

Chung Nicolas 1,2, Rong Xie 1,2, Li Song 1,2, and Wenjun Zhang 1,2

1 Institute of Image Communication and Network Engineering, Shanghai Jiao Tong University, Shanghai, China
2 Cooperative Medianet Innovation Center, Shanghai Jiao Tong University, Shanghai, China

nicolas.chung@insa-lyon.fr, {xierong, song li, zhangwenjun}@sjtu.edu.cn

Abstract. Style transfer is a computer vision task that attempts to transfer the style of an artistic image to a content image. Thanks to advances in Deep Convolutional Neural Networks, exciting style transfer results have been achieved, but traditional algorithms do not fully understand semantic information. Those algorithms are not aware of which regions in the style image have to be transferred to which regions in the content image. A common failure case is style transfer involving landscape images. After stylization, the textures and colors of the land are often found in incoherent places such as the river or the sky. In this work, we investigate semantic style transfer for content images with more than 2 semantic regions. We combine guided Gram matrices with gradient capping and multi-scale representations. Our approach simplifies the parameter tuning problem, improves the style transfer results and is faster than current semantic methods.

Keywords: semantic style transfer, guided Gram matrices, gradient capping, multi-scale representation, deep learning.

1 Introduction

Much work in the past three years has focused on style transfer using Deep Convolutional Neural Networks. Style transfer on images aims to transfer the style of an artistic image to a content image. It has been widely investigated in computer vision and graphics for problems such as headshot portraits, weather transfer, texture synthesis and object transfiguration.
In order to produce good results, style transfer algorithms must preserve the content features of the content image while changing important style features such as texture, colors and line strokes. One way to address that problem is to solve an image optimization problem. Gatys et al. achieve astonishing results by considering a 19-layer VGG network pre-trained on image classification [7]. The objective to minimize is composed of a content loss and a style loss. The stylized images are obtained by performing gradient descent on white noise images. Image optimization methods are flexible but computationally expensive. To overcome that problem, Johnson et al. introduce a style transfer network based on model optimization [6]. Stylized images are generated 3 times faster, but the model has lost flexibility: a model has to be trained for each style image.

Fig. 1. An example of poor style transfer (style: Rain Princess, Leonid Afremov). Traditional methods consider the content image as a whole. These algorithms cannot distinguish the background from the object of interest (the cat here). In this case, too much content information is lost. A better style/content weight ratio needs to be used. The process can be long and tedious.

Although traditional models (whether based on image optimization or model optimization) can achieve exciting results, more work is needed to exploit semantic information. Traditional models apply the style to the whole image. If the content and style weights are not well chosen, the results are poor (see Fig. 1). Semantic style transfer aims to spatially control neural style transfer. With semantic style transfer, the user selects which style to apply to each region in the content image. In [13], the authors attempt to mask the gradient using a simple threshold, but they were not completely successful. In [4], the authors try to combine capped gradients with Gram matrix manipulation, but the results are far from perfect. Recently, the authors of [9] introduced guided Gram matrices and showed good results for simple content images.

In this study, we show how simple modifications of the original guided Gram matrices [9] improve the results of semantic style transfer. Our work concentrates on content images with more than 2 semantic regions. We simplify the parameter tuning problem by considering the pixel ratio of the segmentation masks. By doing so, our approach can represent the different styles uniformly.
As a second contribution, we demonstrate that spatial control can be reinforced by combining Gram matrices with gradient capping. During backpropagation, the idea is to update each region in the stylized image by stopping the gradient of the other regions. We also introduce a multi-scale semantic style transfer algorithm. Our model produces qualitatively better results and is faster than current semantic style transfer methods.

The rest of this paper is organized as follows. We summarize existing work in section 2. We briefly introduce the neural style transfer algorithm in section 3. We present our proposed model in section 4. We detail the experiments in section 5. We show and analyze the results in section 6. We draw conclusions in section 7.

2 Related Work

Style transfer. Traditional style transfer methods solve an image optimization problem. Gatys et al. demonstrate that content and style features can be extracted directly from the hidden activation layers of the VGG network [7]. The extracted style features are then used to compute feature correlations known as Gram matrices. Alternative methods replace Gram matrices with a Markov random field (MRF) regularizer [3,10]. MRFs are patch-based approaches that improve the precision of photo-realistic style transfer. In some cases, the colors of the content image should be preserved. The authors of [9] accomplish color preservation by implementing two simple linear methods. In [13], a better per-layer content and style weighting scheme is investigated. The same authors demonstrate that more style layers lead to better results. In [14], a histogram loss is used to improve the stability of style transfer. These methods solve an image optimization problem: the quality of the resulting images is high but generating them is time-consuming. Moreover, these methods do not exploit semantic information.

Fast style transfer. Image optimization models are slow. To speed up style transfer, Johnson et al. propose a model optimization problem [6]. They formulate the style transfer problem as an image transformation task. A feed-forward transformation network is trained. The per-pixel loss functions are replaced by perceptual losses. Compared to image optimization problems, stylized images are generated three times faster. It was found that generated images can be qualitatively improved using instance normalization [15]. Other methods such as [5,12] develop a single network that can represent several styles. Even though these methods are faster than image optimization models, model optimization models cannot perform semantic style transfer.
Those models lack flexibility: a model would need to be trained for each content image and each combination of style images.

Semantic style transfer. The aim of semantic style transfer is to spatially control style transfer. In [9], the authors introduce guided Gram matrices. Unlike full style transfer, a Gram matrix is computed for each region in the content image. The authors obtain good results for content images with 2 semantic regions. In [11], the authors address the computational bottleneck of back-propagation in semantic style transfer with a decoder network. Other works attempt to perform semantic style transfer using patch-based methods [3,17]. All these methods have in common that they consider 1 content image and 1 style image with similar semantic regions. Those methods perform well for portrait style transfer. In this work, we consider 1 content image and n style images. This is harder since each style image does not necessarily have the same semantics as the content image. We demonstrate the efficiency of our method for content images with more than 2 semantic regions.

3 Neural style transfer

In this section, we briefly review the style transfer method proposed by Gatys et al. [7]. Our work solves the same image optimization problem but combines several reconstruction losses. We review them below.

Content and style losses. Gatys et al. generate stylized images by minimizing a content and a style loss [7]. We denote by C the content image, S the style image, I the generated image and $F^l$ the activation map of convolutional layer l. The content of an image is represented by the activation of one convolutional layer $l_c$. The content loss compares the content representations of C and I:

$$L_{content} = \left\| F^{l_c}(C) - F^{l_c}(I) \right\|^2 \tag{1}$$

Statistical features known as Gram matrices represent the style of I [7]. When those statistics are computed for each region r in I, they are known as guided Gram matrices [9]. Guided Gram matrices require semantic segmentation masks. Contrary to content features, multiple layers represent the style of an image. Let $G_r^l$ be the guided Gram matrix of the r-th region at layer l and $\beta_r$ its associated weighting factor. The style loss is:

$$L_{style} = \sum_r \sum_l \beta_r \left\| G_r^l(S) - G_r^l(I) \right\|^2 \tag{2}$$

Regularizer loss. Johnson et al. [6] show that a regularizer loss improves results for fast style transfer methods. It is a per-pixel loss function that encourages spatial smoothness by reducing distortions and visual artifacts. It is defined as [16]:

$$L_{regularizer} = \sum_{i,j} \left( (x_{i,j+1} - x_{i,j})^2 + (x_{i+1,j} - x_{i,j})^2 \right)^{1/2} \tag{3}$$

4 Proposed Model

Our algorithm takes as input a content image (an ordinary photograph), n segmentation masks (each mask is associated with a region in the content image) and n style images. Fig. 2 illustrates our method.

4.1 Loss function

We generate stylized images by jointly minimizing a content, a style and a regularizer loss. Refer to the previous section for the details of each loss. We control the trade-off between content, style and regularizer with the weighting factors α, β and γ.
As in [9], we use a 19-layer VGG network pre-trained on image classification to extract the high-level features. The total loss is:

$$L_{total} = \alpha L_{content} + \beta L_{style} + \gamma L_{regularizer} \tag{4}$$
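As a minimal sketch of Eqs. 1 to 4, the following dependency-free Python evaluates the content loss, an unguided Gram-matrix style loss (a single region with β_r = 1), the regularizer and their weighted sum on tiny hand-made inputs. The function names, the toy values and the channels × positions list layout are assumptions made for illustration; they are not taken from the authors' implementation, which extracts features with VGG-19. The weights in the last step are the ones quoted in Section 5.

```python
import math

def gram(features):
    """Gram matrix of a feature map stored as a list of channels,
    each channel being a flat list of activations (one per position)."""
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in features]
            for ci in features]

def sq_dist(a, b):
    """Sum of squared differences between two equally shaped 2-D lists."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def content_loss(feat_c, feat_i):
    # Eq. 1: squared distance between content-layer activations of C and I.
    return sq_dist(feat_c, feat_i)

def style_loss(style_feats, image_feats):
    # Eq. 2 with a single region and beta_r = 1: sum over style layers of
    # the squared distance between the Gram matrices of S and I.
    return sum(sq_dist(gram(fs), gram(fi))
               for fs, fi in zip(style_feats, image_feats))

def regularizer_loss(img):
    # Eq. 3: sum over pixels of the magnitude of the forward differences.
    # Border pixels without a right or lower neighbour are skipped.
    total = 0.0
    for i in range(len(img) - 1):
        for j in range(len(img[0]) - 1):
            dx = img[i][j + 1] - img[i][j]
            dy = img[i + 1][j] - img[i][j]
            total += math.sqrt(dx * dx + dy * dy)
    return total

def total_loss(l_content, l_style, l_reg, alpha, beta, gamma):
    # Eq. 4: weighted sum of the three losses.
    return alpha * l_content + beta * l_style + gamma * l_reg

# Toy inputs (hypothetical values): 2 channels x 2 positions per layer.
feat_c = [[1.0, 2.0], [0.0, 1.0]]   # content features of C
feat_i = [[1.0, 1.0], [0.0, 1.0]]   # features of the generated image I
feat_s = [[1.0, 0.0], [0.0, 1.0]]   # style features of S
image = [[0.0, 1.0], [0.0, 1.0]]    # 2 x 2 grayscale image

l_content = content_loss(feat_c, feat_i)
l_style = style_loss([feat_s], [feat_i])   # one style layer
l_reg = regularizer_loss(image)
# Content, style and regularizer weights: 5e+0, 5e+2 and 1e+3.
l_total = total_loss(l_content, l_style, l_reg, 5.0, 500.0, 1000.0)
print(l_content, l_style, l_reg, l_total)
```

Because the three terms live on very different scales, the weights matter as much as the losses themselves: in this toy case the style term dominates the total even though its raw value is only three times the content term.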

Fig. 2. System overview. (a) Step 1: Feature extraction. We input the styles, masks and content, and initialize the optimization with the content. We extract content and style features and compute their associated losses. (b) Step 2: Image update. We cap the gradient and update the stylized image. We repeat steps (a) and (b) N times. (c) Step 3: Next scale. We upscale the current stylized image and set it as the initialization for the next scale. We repeat the whole process until the final resolution is reached.

4.2 Improved Guided Gram Matrices

Auto-tuning of weights. To understand our auto-tuning approach, we have to understand how guided Gram matrices are computed. Consider an image X with R regions. Let $G_r^l$ and $T_r^l$ be, respectively, the guided Gram matrix and the guidance channel at layer l for the r-th region, $r \in [0, R]$. Let $F_r^l$ be the activation maps masked by $T_r^l$. We denote by $\circ$ the element-wise multiplication operator. $G_r^l$ is the inner product between the $F_r^l$:

$$F_r^l(X) = T_r^l \circ F^l(X) \tag{5a}$$

$$G_r^l(X) = F_r^l(X)^T F_r^l(X) \tag{5b}$$

The main difficulty in obtaining good results comes from parameter tuning. A content image composed of R semantic regions introduces R more parameters. We simplify that problem by tuning $\beta_r$ (defined in Eq. 2) automatically. For a region r in C, we compute $\beta_r$ as the ratio between the number of pixels in that region and the total number of pixels:

$$\beta_r = \frac{p_r}{\sum_i p_i} \tag{6}$$

where $p_i$ is the number of pixels of region i. Because of masking (Eq. 5a), bigger regions contribute more to the style loss than smaller regions (more pixels).
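The masking of Eq. 5a, the Gram computation of Eq. 5b and the pixel-ratio weights of Eq. 6 can be sketched as below. This is an illustrative toy, not the authors' code: the helper names (`guided_gram`, `auto_beta`, `guided_style_loss`), the binary masks and the feature values are assumptions, a single layer is used, and the same masks are applied to the style and generated features. In the paper the masks are additionally downsampled to each layer's resolution.

```python
def guided_gram(features, mask):
    """Eqs. 5a-5b: multiply every channel element-wise by the guidance
    channel, then take inner products between the masked channels."""
    masked = [[t * f for t, f in zip(mask, ch)] for ch in features]
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in masked]
            for ci in masked]

def auto_beta(masks):
    """Eq. 6: beta_r = p_r / sum_i p_i, with p_r the pixel count of region r."""
    counts = [sum(1 for v in m if v > 0) for m in masks]
    total = sum(counts)
    return [c / total for c in counts]

def guided_style_loss(feat_s, feat_i, masks, betas):
    """Eq. 2 restricted to one layer: beta-weighted squared distance
    between the guided Gram matrices of the style and generated features."""
    loss = 0.0
    for mask, beta in zip(masks, betas):
        gs = guided_gram(feat_s, mask)
        gi = guided_gram(feat_i, mask)
        loss += beta * sum((a - b) ** 2
                           for ra, rb in zip(gs, gi) for a, b in zip(ra, rb))
    return loss

# Toy example (hypothetical values): 2 channels over 4 flattened positions,
# split into a 3-pixel region and a 1-pixel region.
features = [[1.0, 2.0, 3.0, 4.0],
            [1.0, 0.0, 1.0, 0.0]]
masks = [[1, 1, 1, 0],
         [0, 0, 0, 1]]
zeros = [[0.0] * 4, [0.0] * 4]

betas = auto_beta(masks)                       # larger region, larger weight
grams = [guided_gram(features, m) for m in masks]
loss = guided_style_loss(features, zeros, masks, betas)
print(betas, grams, loss)
```

With these weights the 3-pixel region receives β = 0.75 and the 1-pixel region β = 0.25, so small regions are deliberately down-weighted rather than normalized up.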

To address the larger contribution of bigger regions, we tried to normalize the guidance channels such that $\sum_i (T_r^l)_i^2 = 1$ [9], but by doing so, we found that the style of the smaller regions was more marked than that of the bigger ones. Our weighting factor counterbalances that phenomenon by assigning smaller weights to smaller regions.

Erosion/Dilation. Recall that style features are represented with several layers. Because neurons in deeper layers have bigger receptive fields, the styles of adjacent regions overlap at the boundaries. The authors of [8] attempt to increase spatial control by using erosion and dilation on the guidance channels. The results are good but the number and the shapes of the considered regions are simple: only two large regions. We found that results for small regions are poor. For small regions, only a few neurons capture style features (see Eq. 5a). After erosion, the remaining neurons are not enough to represent the style well. We propose a method that could improve blending but that we have not implemented yet. Our idea is to define a zone of pixels at the boundaries using morphological filtering. The weights of two adjacent styles are set to 0.5 and 0.5. For n overlapping stylized regions, the weights are set to 1/n. The aim is to smooth the transition between the different stylized regions.

4.3 Gradient Capping

In the previous subsection, we saw that the styles overlap at the boundaries. To address that problem, we combine guided Gram matrices with gradient capping. We ensure that only the desired regions are updated by stopping the propagation of the gradient through the unwanted regions. Consider $X_r^t$ the r-th region of the stylized image X at iteration t, $T_r$ the associated segmentation mask, λ the learning rate, $\nabla$ the gradient and $L_{total}$ the loss function defined in Eq. 4. At each iteration we have:

$$X_r^{t+1} = X_r^t - \lambda \, T_r \circ \nabla_{X^t} L_{total} \tag{7}$$

Note that guided Gram matrices mask style features. To match the dimension of those features (see Eq. 5a), the masks $T_r$ have to be downsampled with pooling. Gradient capping, in contrast, masks regions in the stylized image itself, so the input masks $T_r$ are enough. With gradient capping, the transition between objects is sharp at the boundaries. We solve this problem by increasing the regularization weight γ (see Eq. 4).

As suggested in [4], we tried to independently minimize the content, style and regularizer losses, but the results were bad. The reason is that the minimum of a sum of functions is not equal to the sum of the minima of the functions. For example, $f(x) = x^2$ and $g(x) = (x - 1)^2$ each have minimum 0 (at x = 0 and x = 1 respectively), yet $f + g$ is minimized at x = 1/2 with value 1/2. This property can be easily verified by considering functions with positive and negative values such as the gradient. Thus, we have to jointly minimize the three losses.

4.4 Multi-scale representation

In order to generate high resolution stylized images, multi-scale representations have been used in traditional style transfer methods [7,14]. The first step is to generate a low resolution image. This stylized image is then upscaled and set as the initialization image for the next scale. The process is then repeated until the final resolution is reached. By doing so, details are added between each successive scale and fewer iterations are required at high resolution. We extend this multi-scale representation to semantic style transfer. In addition to content and style images, our algorithm requires the segmentation masks. We upscale images with bilinear interpolation. Refer to Alg. 1 for the details. Even though a multi-scale representation requires fewer iterations at high resolution, it is not always faster than a one-scale representation. There is a preprocessing phase that is not negligible. We empirically found that using 2 scales is a good trade-off between speed and quality of the generated images.

Algorithm 1: Multi-scale semantic style transfer. $X^0$ indicates image X at resolution $\alpha_0$, N the number of scales, ↑ upscaling, ↓ downscaling.

    input : Content image C, style images S, content masks T_c, style masks T_s
    output: Stylized image I^k at scale α_k
    begin
        /* Initialization with the smallest scale */
        C^0, S^0, T_c^0, T_s^0 ← (C, S, T_c, T_s) ↓ α_0
        I^0 ← C^0
        /* Multi-scale representation */
        for k = 0, ..., N − 1 do
            /* Image optimization */
            I^k ← semantic-style-transfer(C^k, S^k, T_c^k, T_s^k, I^k)
            /* Initialization next scale */
            I^{k+1} ← I^k ↑ α_{k+1}
            /* Upscale content image, style images and masks */
            C^{k+1}, S^{k+1}, T_c^{k+1}, T_s^{k+1} ← (C^k, S^k, T_c^k, T_s^k) ↑ α_{k+1}

5 Experiment details

The number of iterations was fixed to N = 700. Content features are extracted from layer ReLU4_2. Style features are extracted from layers ReLU1_1, ReLU2_1, ReLU3_1, ReLU4_1 and ReLU5_1. The optimization process is initialized from the content image (cleaner results than white noise initialization). We use the content masks as style masks. The Adam optimizer is used with learning rate 1e+1. The content weight is 5e+0.
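To make the optimization loop concrete, the capped update of Eq. 7 can be sketched in a few lines of dependency-free Python. This is only an illustration under stated assumptions: it uses plain gradient descent rather than the Adam optimizer mentioned above, a flattened 4-pixel image with two regions, and a hypothetical `toy_region_grads` function standing in for backpropagation through VGG. The names `capped_step` and `toy_region_grads` are not from the authors' code.

```python
def capped_step(x, masks, grads, lr):
    """One update in the spirit of Eq. 7: the gradient attributed to each
    region is multiplied element-wise by that region's mask, so it cannot
    modify pixels outside the region."""
    out = list(x)
    for mask, grad in zip(masks, grads):
        out = [v - lr * t * g for v, t, g in zip(out, mask, grad)]
    return out

def toy_region_grads(x, targets):
    # Hypothetical stand-in for backpropagation: region r pulls *every*
    # pixel toward targets[r], mimicking styles that leak across boundaries.
    return [[v - target for v in x] for target in targets]

masks = [[1, 1, 0, 0],      # region 0 covers the first two pixels
         [0, 0, 1, 1]]      # region 1 covers the last two pixels
targets = [1.0, -1.0]       # one toy "style" value per region

# With capping: every region converges to its own target.
x = [0.0, 0.0, 0.0, 0.0]
for _ in range(50):
    x = capped_step(x, masks, toy_region_grads(x, targets), lr=0.5)

# Without capping (all-ones masks): the two pulls cancel everywhere.
ones = [[1, 1, 1, 1], [1, 1, 1, 1]]
uncapped = [0.0, 0.0, 0.0, 0.0]
for _ in range(50):
    uncapped = capped_step(uncapped, ones,
                           toy_region_grads(uncapped, targets), lr=0.5)

print(x, uncapped)
```

In the toy run the capped image settles at each region's own target, while the uncapped one stays at the compromise value 0 for all pixels, which is the kind of cross-region mixing that capping is meant to prevent.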
The style weight is 5e+2. When not mentioned, we use the geometric weighting scheme defined in [13]. The strength of the total variation loss is 1e+3. Our implementation is based on the work of [1]. We plan to make our code publicly available upon acceptance for future research. On two NVIDIA K20c GPUs, it takes around 1 minute to process one frame of size

Table 1. Running time comparison in seconds (s) of different semantic methods. Time is averaged over 20 samples. N = 700 iterations, r: number of semantic regions. A content image was used. Columns: number of regions (r); Castillo et al. [2]; Gatys et al. [9]; ours; speed-up vs. [2]; speed-up vs. [9].

6 Results and discussion

6.1 Proposed vs other semantic methods

In this section, we qualitatively compare targeted style transfer [2], guided Gram matrices [9] and our method. Two semantic regions are considered and the style is applied only to the background. The results are presented in Fig. 3. With [2], the foreground is the same as in the content image. This method applies style transfer to the whole content image and then segments the stylized regions. This method is simple but it cannot control which regions in the style images have to be transferred to which regions in the content image. Our method and [9] produce a close imitation of the foreground, but ours blends regions better. In Fig. 3 (a), the background between the arms and the lower body of the person riding the bike should be orange but is somewhat greyish with [9]. With ours, the background stays orange. In Fig. 3 (b), with [9] the semantic style transfer fails around the head and the tail of the horse (because of erosion). With ours it does not. For [2], stylized regions blend naturally but another optimization problem is solved to avoid crude results.

6.2 Multiple semantic regions

One of the main drawbacks of targeted transfer [2] is that the running time increases linearly with the number of semantic regions. When R styles are applied, [2] solves the style transfer problem R times. Our method and [9] solve the problem only once, whatever the value of R. A running time comparison is shown in Tab. 1.
Method [2] takes 92s to produce a stylized image with 2 semantic regions and 185s with 4 semantic regions: its running time grows linearly with the number of regions. With ours and [9], the running time increases by approximately 10 seconds for each additional region. Ours is the fastest and beats [9] by seconds. We show some images generated with our method in Fig. 4.
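As a quick sanity check of the linear-scaling claim for [2], the two timings quoted above give nearly the same cost per region:

```latex
\[
\frac{92\ \text{s}}{2\ \text{regions}} = 46\ \text{s/region},
\qquad
\frac{185\ \text{s}}{4\ \text{regions}} \approx 46.3\ \text{s/region}.
\]
```

A roughly constant per-region cost is what one expects when the whole style transfer problem is solved once per region, whereas the marginal cost quoted for ours and [9] is about 10 s per additional region.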

6.3 One-scale vs Multi-scale

We compare here one-scale and multi-scale methods. We used 700 iterations for both methods. We used the weighting scheme in [1] (better results in this case). Results with two scales are presented in Fig. 5 and Fig. 6. We also report quantitative results with more scales in Tab. 2. Our two-scale algorithm (Fig. 5h) produces visually more pleasing results than the one-scale method (Fig. 5g). Even after 700 iterations, the one-scale method does not capture the style of the soil well: this problem is known as ghosting. Our two-scale method does not show those instabilities.

Loss curves are shown in Fig. 7. Each optimization begins with a preprocessing phase. During this phase, we extract the content features from the content image and the style features from the style images. This phase is longer for higher resolution images (10s for image, 20s for image). Increasing the number of iterations is not an efficient way to reduce the style loss. With one scale, the style loss remains almost constant after 330 iterations (50s). After optimization, both methods reach roughly the same content loss (2e+6) and style loss (1e+6). However, care has to be taken when analyzing those values. As can be seen in Fig. 5, the style loss fails to take ghosting instabilities into account. The two-scale method generates images faster than the one-scale one (68s compared to 83s). Using more scales is slightly faster but the quality of the generated images is worse (see Tab. 2).

6.4 Ablation study

To study which parts of our model are effective, we perform an ablation study. Results are shown in Fig. 8. Fig. 8a demonstrates that not using regularization results in a noisy image. It can be seen in Fig. 8b that with guided Gram matrices alone, the semantics are not well respected (especially around the shoes). Auto-tuning nearly halves the style loss (from 2.43e+6 in Fig. 8b to 1.32e+6 in Fig. 8c). It also reduces the content loss (from 2.79e+6 to 2.24e+6).
With gradient capping the semantics are respected (smaller content loss) but the contours are visible, which explains why the style loss is higher (see around the body and ear of the cat in Fig. 8d). Combining Gram matrices, auto-tuning and gradient capping offers a good trade-off between content and style loss (Fig. 8e). We can observe in Fig. 8f that blending at the boundaries can be improved by increasing the regularization weight.

7 Conclusion

In this work we proposed a semantic style transfer method that can process content images with more than 2 semantic regions. We have shown how simple modifications of the original guided Gram matrices can simplify the parameter tuning problem and improve the style transfer results. This study indicates that gradient capping is an efficient solution for spatial control of neural style transfer. We further proposed a multi-scale algorithm that generates high quality images faster than single-scale methods. In the current implementation, our method requires the segmentation masks of both the style and content images. We are working on a better solution that could perform both segmentation and style transfer at the same time.

Acknowledgment

This work was supported by NSFC ( and ) and the Shanghai Key Laboratory of Digital Media Processing and Transmissions.

References

1. Athalye, A.: Neural style. github.com/anishathalye/neural-style (2015)
2. Castillo, C., De, S., Han, X., Singh, B., Yadav, A., Goldstein, T.: Son of Zorn's lemma: Targeted style transfer using instance-aware semantic segmentation. arXiv preprint (2017)
3. Champandard, A.J.: Semantic style transfer and turning two-bit doodles into fine artwork. arXiv preprint (2016)
4. Chan, E., Bhargava, R.: Show, divide and neural: Weighted style transfer. CS231n: Convolutional Neural Networks for Visual Recognition, Project Reports (2016)
5. Dumoulin, V., Shlens, J., Kudlur, M.: A learned representation for artistic style. arXiv preprint (2017)
6. Johnson, J., Alahi, A., Fei-Fei, L.: Perceptual losses for real-time style transfer and super-resolution. arXiv preprint (2016)
7. Gatys, L.A., Ecker, A.S., Bethge, M.: Image style transfer using convolutional neural networks. arXiv preprint (2015)
8. Gatys, L.A., Ecker, A.S., Bethge, M., Hertzmann, A., Shechtman, E.: Supplementary material: Controlling perceptual factors in neural style transfer. bethgelab.org (2016)
9. Gatys, L.A., Ecker, A.S., Bethge, M., Hertzmann, A., Shechtman, E.: Controlling perceptual factors in neural style transfer. arXiv preprint (2017)
10. Li, C., Wand, M.: Combining Markov random fields and convolutional neural networks for image synthesis. arXiv preprint (2016)
11. Lu, M., Zhao, H., Yao, A., Xu, F., Chen, Y., Zhang, L.: Decoder network over lightweight reconstructed feature for fast semantic style transfer. The IEEE International Conference on Computer Vision (ICCV) (2017)
12. Zhang, H., Dana, K.: Multi-style generative network for real-time transfer. arXiv preprint (2017)
13. Novak, R., Nikulin, Y.: Improving the neural algorithm of artistic style. arXiv preprint (2016)
14. Risser, E., Wilmot, P., Barnes, C.: Stable and controllable neural texture synthesis and style transfer using histogram losses. arXiv preprint (2017)
15. Ulyanov, D., Vedaldi, A., Lempitsky, V.: Instance normalization: The missing ingredient for fast stylization. arXiv preprint (2017)
16. Mahendran, A., Vedaldi, A.: Understanding deep image representations by inverting them. arXiv preprint (2017)
17. Zhao, H., Rosin, P.L., Lai, Y.: Automatic semantic style transfer using deep convolutional neural networks and soft masks. arXiv preprint (2017)

Fig. 3. Targeted transfer [2] vs guided Gram matrices [9] vs ours, for (a) Motorbike and (b) Horse. Results obtained with [2] are pleasing, but ours offers more spatial control. Compared to [9], our method gives better blending results. With [9], the background is not consistent in (a) around the motorbike and in (b) around the head and the tail of the horse. Content images were extracted from the DAVIS dataset and were resized to pixels.

Fig. 4. Our results for (f) 3 and (m) 4 semantic regions. First example: (a) Mask, (b) Style 1, (c) Style 2, (d) Style 3, (e) Content, (f) 3 styles combination. Second example: (g) Mask, (h) Style 1, (i) Style 2, (j) Style 3, (k) Style 4, (l) Content, (m) 4 styles combination. Numbers in the masks correspond to the number of the style which is applied. Mixing styles with the same predominant colors looks good ((m) yellow/orange). Content images and masks are from the COCO dataset.

Fig. 5. (g) One-scale transfer, (h) Two-scale transfer. Panels: (a) Content, (b) Mask, (c) Style 1, (d) Style 2, (e) Style 3, (f) Style 4, (g) One-scale, (h) Two-scale. We used 700 iterations for both methods. We detail the 2-scale process in Fig. 6. Multi-scaling generates a more pleasing image: with (g), ghosting instabilities can be seen in the soil. The perceptual loss curves are shown in Fig. 7.

Fig. 6. Multi-scale transfer: 500 iterations (it) are performed at low resolution ( ). The resulting image is then upscaled and set as initialization for the next scale ( ). We performed 200 more iterations. We report quantitative results for different scales in Tab. 2.
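The two-scale schedule described in the Fig. 6 caption (500 iterations at low resolution, bilinear upscaling, then 200 more iterations) can be sketched as follows. This is a toy under stated assumptions: `optimize` is a hypothetical stand-in for the semantic-style-transfer step of Alg. 1 (here plain gradient descent toward a fixed target image), `upscale_bilinear` uses a corner-aligned sampling grid, and the per-scale targets replace the content image, style images and masks that the paper rescales at every scale.

```python
def upscale_bilinear(img, new_h, new_w):
    """Resize a 2-D list of floats with bilinear interpolation
    (corner-aligned sampling grid)."""
    h, w = len(img), len(img[0])
    out = []
    for i in range(new_h):
        y = i * (h - 1) / (new_h - 1) if new_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, h - 1)
        fy = y - y0
        row = []
        for j in range(new_w):
            x = j * (w - 1) / (new_w - 1) if new_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, w - 1)
            fx = x - x0
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out

def optimize(img, target, iterations, lr=0.1):
    # Hypothetical stand-in for one scale of image optimization:
    # gradient descent on the squared error to a fixed target image.
    for _ in range(iterations):
        img = [[v - lr * (v - t) for v, t in zip(row, trow)]
               for row, trow in zip(img, target)]
    return img

def two_scale(init_lo, target_lo, target_hi, iterations=(500, 200)):
    img = optimize(init_lo, target_lo, iterations[0])       # coarse scale
    img = upscale_bilinear(img, len(target_hi), len(target_hi[0]))
    return optimize(img, target_hi, iterations[1])          # fine scale

# Toy data (hypothetical): a 2 x 2 coarse target and its 3 x 3 refinement.
target_lo = [[0.0, 2.0], [4.0, 6.0]]
target_hi = upscale_bilinear(target_lo, 3, 3)
init_lo = [[0.0, 0.0], [0.0, 0.0]]

result = two_scale(init_lo, target_lo, target_hi)
print(result)
```

The design point the sketch illustrates is that the fine scale starts from an already converged coarse solution, so most of the iteration budget can be spent where each iteration is cheapest.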

Table 2. Comparison of content loss, style loss and running time for different numbers of scales. The image used is the one in Fig. 5. The total number of iterations is fixed to N = 700. Values are averaged over 20 samples. For each successive scale, we approximately halve the number of iterations. Using 2 scales offers a good compromise between speed and quality of the generated images. Columns: number of scales; iterations per scale; content loss (1e+6); style loss (1e+6); time (s).

Fig. 7. Perceptual loss curves for 1-scale vs 2-scale representation: (a) content loss and (b) style loss, both plotted against time (s) for the one-scale and two-scale methods. PP = preprocessing; the one-scale curve is annotated PP, 700 it and the two-scale curve PP, 500 it, PP, 200 it. Visual results are shown in Fig. 5. With the same number of iterations (700 it), our 2-scale algorithm generates the stylized image faster (68s compared to 83s). The final content and style losses are roughly the same for both methods but multi-scaling reduces ghosting instabilities (see Fig. 5).

Fig. 8. Ablation study. Content / style loss (1e+6) is reported for each panel: (a) Gram matrices (γ = 0): 2.79 / 2.42; (b) Gram matrices (γ = 1e+2): 2.79 / 2.43; (c) Auto-tuning + Fig. 8b: 2.24 / 1.32; (d) Gradient capping + Fig. 8b: 2.19 / 2.81; (e) Final (γ = 1e+2): 2.07 / 1.42; (f) Final (γ = 1e+3): 1.92 / 1.39. The image used is the same one as in Fig. 4e. γ is the regularization weight defined in Eq. 4. Zoom in for details, especially around the right side of the shoes.

Arbitrary Style Transfer in Real-Time with Adaptive Instance Normalization. Presented by: Karen Lucknavalai and Alexandr Kuznetsov

Arbitrary Style Transfer in Real-Time with Adaptive Instance Normalization. Presented by: Karen Lucknavalai and Alexandr Kuznetsov Arbitrary Style Transfer in Real-Time with Adaptive Instance Normalization Presented by: Karen Lucknavalai and Alexandr Kuznetsov Example Style Content Result Motivation Transforming content of an image

More information

Exploring Style Transfer: Extensions to Neural Style Transfer

Exploring Style Transfer: Extensions to Neural Style Transfer Exploring Style Transfer: Extensions to Neural Style Transfer Noah Makow Stanford University nmakow@stanford.edu Pablo Hernandez Stanford University pabloh2@stanford.edu Abstract Recent work by Gatys et

More information

SON OF ZORN S LEMMA: TARGETED STYLE TRANSFER USING INSTANCE-AWARE SEMANTIC SEGMENTATION

SON OF ZORN S LEMMA: TARGETED STYLE TRANSFER USING INSTANCE-AWARE SEMANTIC SEGMENTATION SON OF ZORN S LEMMA: TARGETED STYLE TRANSFER USING INSTANCE-AWARE SEMANTIC SEGMENTATION Carlos Castillo, Soham De, Xintong Han, Bharat Singh, Abhay Kumar Yadav, and Tom Goldstein Department of Computer

More information

Neural style transfer

Neural style transfer 1/32 Neural style transfer Victor Kitov v.v.kitov@yandex.ru 2/32 Neural style transfer Input: content image, style image. Style transfer - application of artistic style from style image to content image.

More information

CS 229 Final Report: Artistic Style Transfer for Face Portraits
