Deep Learning Approaches to 3D Shape Completion

Prafull Sharma, Stanford University
Jarrod Cingel, Stanford University

Abstract

This project explores various methods rooted in deep learning to address the problem of 3D inpainting. We investigate shallow autoencoders, deep convolutional autoencoders, and generative adversarial networks, and their applications to shape completion in three dimensions. We find that deep convolutional autoencoders serve as robust tools for completing 3D shapes belonging to a distinct taxonomical object group, training a model that completes predetermined known cut regions with over 97% accuracy and random cut regions with over 94% accuracy. We also assess potential shortcomings of shallow and convolutional autoencoders for this purpose, training one of each to restore both deterministic and random cuts in three-dimensional voxelized models. Finally, we address GANs and their potential use in solving this problem.

1. Introduction

Deep learning is influencing many fields, including computer vision, robotics, and natural language processing. As the volume of visual data grows, it is important to have algorithms and systems that can restore corrupted or partial images. Inpainting is a technique used to reconstruct damaged or missing parts of an image, with applications such as recovering corrupt data in image files. In 2D image inpainting, the input is a 2D image and a 2D mask marking the area of the image that needs to be painted. The challenging part of the task is to reconstruct the masked region in a visually believable way. In this paper, we apply the same idea to 3D reconstruction. Shape completion is a well-researched area in computer graphics and vision, especially in processing partial 3D CAD models [5] [8]. We work with 3D models and try to recover a missing part of each model. We present two methods to perform this task: autoencoders and deep convolutional generative adversarial networks (DCGANs).

Our paper is organized as follows. In Section 2, we discuss previous work in image inpainting and 3D shape completion. We then describe our dataset in Section 3, followed by our methodology in Section 4. In Section 5, we present and discuss our results. The paper concludes in Section 6 with a brief summary of our work and the future work we would pursue with the CVGL lab under the supervision of Prof. Savarese.

2. Previous Work

One of the seminal works in image inpainting was presented by Bertalmio et al. [2]. This paper presented an algorithm for digital inpainting that replicated the techniques used by professional artists. The main idea is to smoothly propagate information from the areas surrounding the mask, in the direction of the isophotes, to reconstruct the missing region of the image. The results in the paper were sharp and free of color artifacts. One major drawback of the algorithm was its inability to reconstruct large textured regions. The authors followed up their work with a paper on simultaneous structure and texture image inpainting [3]. There, they presented an algorithm for simultaneously filling texture and structure in the missing part of the image. The basic idea is to decompose the image into the sum of two functions with different basic characteristics, and then to reconstruct each function with structure- and texture-filling algorithms, respectively.
The functions are decomposed so that the first function represents the underlying image structure while the second represents the texture and noise. The image is then recovered by adding the two decomposed sub-images back together. Similar ideas were followed in the 3D domain to complete the missing regions of a 3D scan. An example-based 3D shape completion method was presented by Pauly et al. [8]. They discuss a new method to obtain a complete 3D model from incomplete surface scans, using a dataset of 3D shapes as a prior for the regions of missing data. Their method chooses a few models and warps them to conform with the input data; these models are then consistently blended to obtain the resulting 3D shape. They use a penalty function for shape matching and a corresponding optimization scheme to compute the non-rigid alignment of the context models with the input data.

Their method achieved an efficient reconstruction from highly incomplete scans, allowing easy 3D content acquisition with simple 3D scans.

In Shape Completion using 3D-Encoder-Predictor CNNs and Shape Synthesis, Dai et al. present a deep learning approach to shape completion using a volumetric deep neural network and 3D shape synthesis [5]. They introduce a 3D-Encoder-Predictor Network (3D-EPN), which is trained to predict and fill in the missing data in 3D models. They encode both known and unknown space, which allows them to predict the global structure of the unknown area with high accuracy. Using a patch-based 3D shape synthesis method, they are able to reconstruct fine details on the surface and generate high-resolution outputs.

3. Dataset

Our dataset is comprised of voxelized models from ShapeNet, a large, information-rich repository of 3D models [4]. ShapeNet models are sorted and grouped by taxonomy, making it convenient to train on only a single class of objects. We arbitrarily chose the chair class, consisting of 6778 unique chair models, and divided these models into two sets: the first 6278 models were designated as the training set, and the remaining 500 were set aside as the validation set.

Once the training and validation models were separated, we used two different methods of voxel cutting for the study. Both methods result in the removal of a cube of voxels from somewhere in the model. In the first method, this cube was cut away from a fixed location, the back-bottom-right corner of each model. This was designed with the intention of cutting out the back leg of every chair (of course, variations in chair type would sometimes mean that more or less than a single leg ultimately got cut away). The second cutting method removed a cube of voxels from a randomly generated location in the model; these cuts were made independently for each model and were not consistent between models. The only stipulation was that cuts must be centered over a region that contains a positive voxel, unless this is not physically possible given the model. This avoids cutting away empty regions of a model, which would be essentially useless for training. It is important to note that for both the deterministic and nondeterministic cutting methods, the location of the cut cube was stored as a mask and passed on to the autoencoder. This way, the models we discuss in subsequent sections had information about which parts of the original voxelized model were obfuscated, in order to aid the training process.

4. Methodology

4.1. Autoencoder

An autoencoder is comprised of two models working in tandem, an encoder model and a decoder model [1]. The encoder portion takes in some input and maps it to a representation in an alternate dimensional space. The decoder portion accepts some encoded data as input and then attempts to reconstruct the original input to the encoder as closely as possible. We can formalize this general autoencoder model as follows:

y = s(Wx + b)
z = s(W'y + b')

Here, x signifies the initial input, y is the output of the encoder, z is the output of the decoder, s is an activation function, and W, b, W', b' are learned parameters. The ultimate objective is to minimize the difference between z and x. There are several viable objective functions for this task, the most common being the squared-error loss and the cross-entropy loss.
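To make the two mappings concrete, the following minimal NumPy sketch runs a single forward pass of such an autoencoder on a flattened voxel grid and evaluates a cross-entropy loss. The random initialization and the 32-unit code size are illustrative choices, not the trained parameters of our models.

    import numpy as np

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    rng = np.random.default_rng(0)
    n_in, n_hidden = 32 * 32 * 32, 32  # flattened voxel grid -> 32-d code

    # Learned parameters (randomly initialized here for illustration).
    W, b = rng.normal(0, 0.01, (n_hidden, n_in)), np.zeros(n_hidden)
    W2, b2 = rng.normal(0, 0.01, (n_in, n_hidden)), np.zeros(n_in)

    x = rng.integers(0, 2, n_in).astype(float)  # binary voxel occupancies
    y = sigmoid(W @ x + b)    # encoder: y = s(Wx + b)
    z = sigmoid(W2 @ y + b2)  # decoder: z = s(W'y + b')

    # Binary cross-entropy between reconstruction z and input x.
    eps = 1e-9
    loss = -np.mean(x * np.log(z + eps) + (1 - x) * np.log(1 - z + eps))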
To make autoencoders better suited to learning robust representations, we can incorporate deep learning into the encoder and decoder layers. Specifically, we can use convolutional neural networks (CNNs) as the encoder and decoder models. This allows more sophisticated features to be learned from the data, since CNNs are much better able to interpret spatially structured data like images and 3D models than traditional learning approaches. For this project, we experimented with shallow and convolutional autoencoders on both the deterministically-cut and randomly-cut voxelized model datasets.

Shallow Autoencoder

We implemented the shallow autoencoder using Keras with a TensorFlow backend. The original and cut voxelized models are loaded and flattened into 32,768-dimensional arrays (32 x 32 x 32). We also add a mask indicating the cut area (in the mask, entries corresponding to the cut cube are given a value of 1, and untouched areas a value of 0); the mask is flattened as well. The full structure of the shallow autoencoder is depicted in Figure 1. The shallow autoencoder has two input layers, the flattened cut model and the flattened mask. These are concatenated and encoded into a 32-dimensional dense feature space with a rectified linear unit (ReLU) activation function. The final layer uses a sigmoid activation function to output a flattened 32,768-dimensional array, which can be reshaped into a voxelized model. Since each entry here corresponds to the probability that a voxel is present in that space, a threshold can be used to convert the output to a binary voxelized grid.

[Figure 1. Shallow Autoencoder Graphical Representation]
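A minimal Keras sketch of this two-input shallow autoencoder might look like the following. The concatenated mask input, 32-unit bottleneck, and ReLU/sigmoid activations follow the description above; the Adam optimizer is an assumption on our part, and binary cross-entropy is assumed to match the loss reported for the convolutional variant.

    from tensorflow.keras.layers import Input, Concatenate, Dense
    from tensorflow.keras.models import Model

    N = 32 * 32 * 32  # flattened 32x32x32 voxel grid

    cut_voxels = Input(shape=(N,), name="cut_model")  # flattened cut model
    cut_mask = Input(shape=(N,), name="cut_mask")     # 1 inside the cut cube, else 0

    merged = Concatenate()([cut_voxels, cut_mask])
    code = Dense(32, activation="relu")(merged)   # 32-d dense feature space
    recon = Dense(N, activation="sigmoid")(code)  # per-voxel occupancy probability

    autoencoder = Model(inputs=[cut_voxels, cut_mask], outputs=recon)
    autoencoder.compile(optimizer="adam", loss="binary_crossentropy")
    # autoencoder.fit([x_cut, masks], x_full, epochs=..., batch_size=...)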

Deep Convolutional Autoencoder

The deep convolutional autoencoder has several key differences from the shallow version. Its structure can be seen in Figure 2. The input and mask are defined the same as before, but this time they are not flattened. Instead, they are concatenated and then passed through the encoder, which is a series of three units of 3D convolution layers and 3D max pooling layers. The 3D convolutions are used to learn more complex representations at each level, and the max pooling layers are used to downsample and prevent overfitting. Each convolution layer uses ReLU activation and a unit stride, with double the number of filters of the previous level (the first level has 64, so level 2 has 128 and level 3 has 256). The increasing filter count was selected to help take some of the burden off of the decoder. This gives our encoder feature space a dimensionality of (4, 4, 4, 256).

Our decoder is a series of three units of 3D convolution layers and 3D upsampling layers. These are used to return our encoded input to the proper dimensionality of the output feature space (the same as the input feature space). These three units use progressively decreasing filter counts with ReLU activation, mimicking the inverse of the encoder's structure, with upsampling of size 2 on each axis. Before our final 3D convolution, we add dropout with probability 0.4 in order to help reduce overfitting. Finally, our last 3D convolution restores the original desired dimensionality, and its sigmoidal activation outputs a probability for the presence of every voxel.

In the convolutional autoencoder, we make one final modification. Since we do not want our loss function to be concerned with the known areas (sigmoidal activation will never output exactly 0 or exactly 1 in practice), we use our mask to replace probability values with binary values for the known portions that remain untouched. This functionality is accomplished by the final multiplication and addition layers in the model diagram. We use the binary cross-entropy loss function for training.

Both models were trained on each cutting dataset: once on the deterministic cuts, and once on the random cuts. Training was completed on Google Cloud virtual machine instances with NVIDIA Tesla GPUs in order to fit the models in a reasonable timeframe.
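Given the stated filter counts (64/128/256), pooling and upsampling factors of 2, dropout of 0.4, and the mask-based substitution of known voxels, one possible Keras realization is sketched below. The 3x3x3 kernel size, "same" padding, and Adam optimizer are our assumptions, since those details are not stated above.

    from tensorflow.keras.layers import (Input, Concatenate, Conv3D, MaxPooling3D,
                                         UpSampling3D, Dropout, Lambda, Multiply, Add)
    from tensorflow.keras.models import Model

    vox = Input(shape=(32, 32, 32, 1))   # cut voxel model
    mask = Input(shape=(32, 32, 32, 1))  # 1 inside the cut cube, else 0

    x = Concatenate()([vox, mask])
    # Encoder: three Conv3D + MaxPooling3D units, 64 -> 128 -> 256 filters.
    for filters in (64, 128, 256):
        x = Conv3D(filters, 3, strides=1, padding="same", activation="relu")(x)
        x = MaxPooling3D(2)(x)  # 32 -> 16 -> 8 -> 4; encoded space is (4, 4, 4, 256)
    # Decoder: three Conv3D + UpSampling3D units with decreasing filter counts.
    for filters in (256, 128, 64):
        x = Conv3D(filters, 3, padding="same", activation="relu")(x)
        x = UpSampling3D(2)(x)
    x = Dropout(0.4)(x)  # dropout before the final convolution
    prob = Conv3D(1, 3, padding="same", activation="sigmoid")(x)

    # Replace predictions in the known region with the input voxels:
    # output = mask * prob + (1 - mask) * vox.
    inv_mask = Lambda(lambda m: 1.0 - m)(mask)
    out = Add()([Multiply()([mask, prob]), Multiply()([inv_mask, vox])])

    model = Model(inputs=[vox, mask], outputs=out)
    model.compile(optimizer="adam", loss="binary_crossentropy")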
4.2. Generative Adversarial Network

Another method that we tried was the generative adversarial network (GAN). GANs were introduced in a paper by Goodfellow et al. as a training method for generative models [6]. A GAN is comprised of two networks, a discriminator and a generator. The discriminator is a classifier that takes in data and classifies it as real or fake. The generator is responsible for generating the input to the discriminator, and improves by generating data that is closer to the real data. This setup can be thought of as a minimax game, formulated in the equation below:

min_G max_D E_{x~p_data}[log D(x)] + E_{z~p(z)}[log(1 - D(G(z)))]

We explored the 3D-GAN model of Wu et al. on our dataset [10]. This method generates 3D objects from a probabilistic space using volumetric convolutional networks and GANs.

Deep Convolutional Generative Adversarial Networks

Deep convolutional generative adversarial networks (DCGANs) were introduced by Radford et al. [9]. In this model of GANs, both the generator and the discriminator are convolutional neural networks (CNNs). This method was also used by Yeh et al. for semantic image inpainting [11]. We used a similar approach, but adapted it for our 3D models.

Our discriminator had two 3D convolutional layers with leaky ReLU activations. Each convolution layer is followed by a 3D max pooling layer to reduce the dimensionality of the input. Then, we have two fully connected layers which output a score for each of the samples in the batch. The generator uses fully connected layers followed by a transposed-convolution (ConvTranspose) layer to upscale the input and generate a 32 x 32 x 32 volume.

We use two different loss functions for the discriminator and generator, respectively. Both loss functions are based on least squares, adapted from the paper on least squares generative adversarial networks by Mao et al. [7]. The generator loss is:

l_G = (1/2) E_{z~p(z)}[(D(G(z)) - 1)^2]

and the discriminator loss is:

l_D = (1/2) E_{x~p_data}[(D(x) - 1)^2] + (1/2) E_{z~p(z)}[(D(G(z)))^2]

[Figure 2. Deep Autoencoder Graphical Representation]

4.3. Evaluation Metric

In addition to the cross-entropy validation loss of the trained models, we devised a more concrete metric to evaluate accuracy. For each trained model and each cut type, we make cuts in a test set of voxelized models, feed these cut versions into the trained autoencoder/GAN, and obtain probability models as output. For each probability p_i in this output, we define a threshold α such that the corresponding entry in the voxelized output model is:

o_i = 0 if p_i <= α, and 1 otherwise.

Once the output model O is produced as a voxelized model consisting of all the o_i's, O can be compared with the original model M. Treating O and M as arrays of shape (32, 32, 32, 1), we can compute the total number of erroneous voxels as follows:

d(O, M) = Σ_{i=1}^{32} Σ_{j=1}^{32} Σ_{k=1}^{32} |O_{ijk} - M_{ijk}|

This counts the number of mistakes in a single reconstruction (i.e., the number of nonzero entries in the absolute difference between O and M). We then compute the average number of errors over all attempted reconstructions:

err_avg = (1 / |test set|) Σ_{i in test set} d_i(O, M)

Finally, we compute and report the average accuracy in reconstructing each cut region specifically, letting l be the side length of the cubic cut region:

Accuracy = 1 - err_avg / l^3

In our case, we chose α = 0.5 and l = 12, denoting the threshold and cut length values, respectively. Our test set consisted of 500 chair models, and we performed this process for both the deterministic and random cuts.
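This metric is a direct transcription into NumPy; the sketch below assumes binary ground-truth grids of shape (32, 32, 32, 1) and counts errors exactly as defined above.

    import numpy as np

    ALPHA, L = 0.5, 12  # threshold and cubic cut side length from our experiments

    def voxel_errors(prob, original, alpha=ALPHA):
        """Threshold a probability grid and count erroneous voxels, d(O, M)."""
        O = (prob > alpha).astype(np.int32)  # o_i = 0 if p_i <= alpha, else 1
        return int(np.abs(O - original).sum())

    def cut_region_accuracy(probs, originals, l=L, alpha=ALPHA):
        """Average accuracy over a test set: 1 - err_avg / l^3."""
        errs = [voxel_errors(p, m, alpha) for p, m in zip(probs, originals)]
        return 1.0 - np.mean(errs) / l**3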

5. Results

5.1. Autoencoder

We trained a total of four models, listed as follows:

1. Shallow autoencoder, deterministic cuts
2. Shallow autoencoder, random cuts
3. Deep convolutional autoencoder, deterministic cuts
4. Deep convolutional autoencoder, random cuts

Once each model was trained, we tested it on a test set consisting of 500 3D voxelized models. We recorded key statistics about each model in Table 1, and also computed the average accuracy within the reconstructed region using the metric described in Section 4.3.

[Figure 3. Shallow Autoencoder Deterministic Reconstruction Example: (a) original model, (b) cut model, (c) shallow reconstruction.]

[Figure 4. Shallow Autoencoder Random Reconstruction Example: (a) original model, (b) model with a cut, (c) reconstructed model.]

Model | Cross-entropy Validation Loss | Number of Epochs | Metric Accuracy
Shallow AE, Deterministic Cuts | | | %
Shallow AE, Random Cuts | | | %
Deep AE, Deterministic Cuts | | | %
Deep AE, Random Cuts | | | %

Table 1. Results of autoencoder performance.

[Figure 5. Deep Autoencoder Deterministic Reconstruction Example: (a) original model, (b) model with a cut, (c) reconstructed model.]

[Figure 6. Deep Autoencoder Random Reconstruction Example: (a) original model, (b) model with a cut, (c) reconstructed model.]

We can see that the deep convolutional autoencoders generally performed better than their shallow counterparts, although the deterministic cut results are very close. We can also see that the reconstruction accuracy on randomly cut regions is lower than the accuracy on the deterministically chosen cuts. This is most likely a result of the numerous viable random cut positions combined with the many different types of chairs, all of which look distinct. It is a lot to ask of the decoder to reconstruct an arbitrary region of an arbitrary shape in the random case, in contrast to the much simpler task of reconstructing what is almost always part of the chair's back leg.

Let us first examine the shallow autoencoder operating on a deterministic cut in Figure 3. We can see that the reconstruction, even with the shallow autoencoder, is almost perfect, with only slight discrepancies in the upper decorative edge of the chair. This makes sense, since the accuracies for the deep and shallow autoencoders were almost identical.

Next, examine the shallow autoencoder operating on a random cut in Figure 4. We can see that this reconstruction is quite flawed: the shallow autoencoder is simply not robust enough to represent the complexities associated with cuts in random positions. We can see significant artifacts in the reconstruction of the chair back, consistent with the fact that our accuracy for this category was significantly lower than the other accuracies.

In Figure 5 we have a case where the deep convolutional autoencoder reconstructed the back leg portion of a chair model (deterministic cuts). As we can see, it performed fairly well, and this was the trend for most of the data. In the example, the area was reconstructed almost perfectly, with the exception of being one voxel too wide toward the back and one voxel too high in the front. Areas like this, which the model sees as borderline between having and omitting a voxel, are naturally the toughest for the model to decide, since they carry probability values very close to 0.5; a probability near 0.5 is typically returned for voxels adjacent to regions that should be positive. This performance could potentially be improved by modifying the threshold described in Section 4.3, but it seems reasonable to expect a single voxel's worth of error in questionable regions. Overall, this reconstruction is very promising, since it correctly reconstructs over 97% of the region that was cut away.

Next, we have a case where the deep convolutional autoencoder reconstructed a random cut from a chair model; refer to Figure 6. Although performance was weaker than on the deterministic cut data, it was still acceptable overall. In this example, we can see that the portion cut away included a large chunk of the chair's seat as well as the top of its front leg. The seat portion was reconstructed very nicely, but the area where the leg connects to the chair seat is missing. This is probably a result of the autoencoder not seeing enough examples of a chair leg connecting to a chair seat during the training process.

5.2. GAN

We did achieve reconstruction of a chair from scratch, as shown in Figure 7. We tried to run our DCGAN on our dataset with several different parameters and learning rates, but could not achieve any publishable results; due to a lack of compute power, we were unable to experiment with the DCGAN enough to achieve good results. In terms of architecture, in retrospect we think we should have used the same convolutional model architecture for the DCGAN as in our autoencoder.

[Figure 7. Reconstruction of a chair from scratch.]

6. Conclusion

Deep learning methods appear to be very useful in their capacity to solve the 3D shape completion problem. We have trained both shallow and deep autoencoders suited for inpainting 3D models of chairs from ShapeNet, with both deterministic, constant cuts as well as random cuts anywhere in the model. We managed to achieve upwards of 97% and 94% accuracy on the respective categories, which is a significant benchmark and yields visually similar reconstructions. We also explored the use of GANs for this task.

In the future, we hope to continue our work with Professor Savarese in the CVGL lab. We plan to explore different autoencoder designs, as well as optimal reconstruction threshold values. Specifically, an interesting problem would be to learn the optimal reconstruction threshold α over a significantly larger test dataset. We also hope to further our work with GANs to attain more consistent results.
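As a simple illustration of the threshold-learning idea above, the hypothetical sketch below grid-searches α on held-out data using the accuracy metric of Section 4.3; the candidate grid and helper names are our own, not part of our trained pipeline.

    import numpy as np

    def cut_region_accuracy(probs, originals, alpha, l=12):
        """Accuracy = 1 - err_avg / l^3 for a given threshold alpha (Section 4.3)."""
        errs = [np.abs((p > alpha).astype(int) - m).sum()
                for p, m in zip(probs, originals)]
        return 1.0 - np.mean(errs) / l**3

    def best_threshold(probs, originals, candidates=np.linspace(0.05, 0.95, 19)):
        """Pick the alpha that maximizes reconstruction accuracy on held-out models."""
        return max(candidates, key=lambda a: cut_region_accuracy(probs, originals, a))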
Acknowledgments

Thanks to Professor Savarese, Trevor Standley, Chris Choy, and Lynne Tchapmi for their guidance and mentorship during the course of the project.

References

[1] P. Baldi. Autoencoders, unsupervised learning, and deep architectures. In ICML Workshop on Unsupervised and Transfer Learning, JMLR W&CP 27:37-50, 2012.
[2] M. Bertalmio, G. Sapiro, V. Caselles, and C. Ballester. Image inpainting. In Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques, SIGGRAPH '00, pages 417-424, New York, NY, USA, 2000. ACM Press/Addison-Wesley Publishing Co.
[3] M. Bertalmio, L. Vese, G. Sapiro, and S. Osher. Simultaneous structure and texture image inpainting. IEEE Transactions on Image Processing, 12(8):882-889, 2003.
[4] A. X. Chang, T. A. Funkhouser, L. J. Guibas, P. Hanrahan, Q. Huang, Z. Li, S. Savarese, M. Savva, S. Song, H. Su, J. Xiao, L. Yi, and F. Yu. ShapeNet: An information-rich 3D model repository. CoRR, abs/1512.03012, 2015.
[5] A. Dai, C. R. Qi, and M. Nießner. Shape completion using 3D-encoder-predictor CNNs and shape synthesis. arXiv preprint arXiv:1612.00101, 2016.
[6] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. Generative adversarial nets. In Advances in Neural Information Processing Systems, pages 2672-2680, 2014.
[7] X. Mao, Q. Li, H. Xie, R. Y. K. Lau, and Z. Wang. Multi-class generative adversarial networks with the L2 loss function. CoRR, abs/1611.04076, 2016.
[8] M. Pauly, N. J. Mitra, J. Giesen, M. H. Gross, and L. J. Guibas. Example-based 3D scan completion. In Symposium on Geometry Processing, pages 23-32, 2005.
[9] A. Radford, L. Metz, and S. Chintala. Unsupervised representation learning with deep convolutional generative adversarial networks. CoRR, abs/1511.06434, 2015.
[10] J. Wu, C. Zhang, T. Xue, W. T. Freeman, and J. B. Tenenbaum. Learning a probabilistic latent space of object shapes via 3D generative-adversarial modeling. In Advances in Neural Information Processing Systems, pages 82-90, 2016.
[11] R. Yeh, C. Chen, T. Lim, M. Hasegawa-Johnson, and M. N. Do. Semantic image inpainting with perceptual and contextual losses. CoRR, abs/1607.07539, 2016.
