Lecture 10: Other Applications of CNNs

Applications of Convolutional Neural Networks
CSED703R: Deep Learning for Visual Recognition (2017F)
Lecture 10: Other Applications of CNNs
Bohyung Han, Computer Vision Lab.

- Face recognition and verification
- Person re-identification
- Text region detection
- Style transfer
- Object generation
- Visual attention and saliency
- Visual analogy
- Autonomous driving
- Many others

Neural Artistic Style: Method
Main goal
- Synthesizing two images representing both content and style
- Exploiting a CNN pretrained for image classification
CNN
- VGG 19-layer net without fully connected layers
- No fine-tuning
- Average pooling: improves gradient flow and gives more appealing results
[Figure: the content input and the style input are passed through the CNN; the final output is obtained from a loss on feature maps (content) plus a loss on feature-map correlations (style).]
(Reference: CVPR 2016)

Optimization
Loss
- $L_{total}(p, a, x) = \alpha L_{content}(p, x, l) + \beta L_{style}(a, x)$
- $L_{content}(p, x, l) = \frac{1}{2} \sum_{i,j} (F^l_{ij} - P^l_{ij})^2$
- $L_{style}(a, x) = \sum_l w_l E_l$, with $E_l = \frac{1}{4 N_l^2 M_l^2} \sum_{i,j} (G^l_{ij} - A^l_{ij})^2$
Notation
- $p$ and $P^l$: original content image and its feature map in the $l$-th layer
- $x$ and $F^l$: generated image and its feature map in the $l$-th layer
- $a$ and $S^l$: original style image and its feature map in the $l$-th layer
- $F^l, P^l, S^l \in \mathbb{R}^{N_l \times M_l}$, where $N_l$ is the number of feature maps and $M_l$ is the size of a feature map ($M_l = \text{width}_l \times \text{height}_l$)
- $G^l \in \mathbb{R}^{N_l \times N_l}$: correlation of feature maps of the generated image in the $l$-th layer, $G^l_{ij} = \sum_k F^l_{ik} F^l_{jk}$
- $A^l \in \mathbb{R}^{N_l \times N_l}$: correlation of feature maps of the style image in the $l$-th layer, $A^l_{ij} = \sum_k S^l_{ik} S^l_{jk}$

Optimization
Error back-propagation
Content: select a particular layer, such as conv4_2
- Update rule: $\frac{\partial L_{content}}{\partial F^l_{ij}} = (F^l - P^l)_{ij}$ if $F^l_{ij} > 0$, and $0$ if $F^l_{ij} < 0$
Style: use conv*_1 with equal weights ($w_l = 0.2$)
- Update rule: $\frac{\partial E_l}{\partial F^l_{ij}} = \frac{1}{N_l^2 M_l^2} \left( (F^l)^\top (G^l - A^l) \right)_{ji}$ if $F^l_{ij} > 0$, and $0$ if $F^l_{ij} < 0$

Generated Images / More Examples
[Figure: a source image rendered in Style 1: The Starry Night and Style 2: The Scream.]
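The content and style losses above can be checked on tiny feature matrices. The following is a minimal sketch in plain Python, assuming a single layer and feature maps already flattened into an N x M list of lists; the function names and the default weights `alpha` and `beta` are illustrative choices, not values from the slides.

```python
def gram(F):
    """Correlation (Gram) matrix: G[i][j] = sum_k F[i][k] * F[j][k]."""
    return [[sum(a * b for a, b in zip(fi, fj)) for fj in F] for fi in F]


def content_loss(F, P):
    """0.5 * sum_ij (F_ij - P_ij)^2 between generated and content features."""
    return 0.5 * sum((f - p) ** 2 for fr, pr in zip(F, P) for f, p in zip(fr, pr))


def style_layer_loss(F, S):
    """E_l = 1 / (4 N^2 M^2) * sum_ij (G_ij - A_ij)^2 for one layer."""
    n, m = len(F), len(F[0])
    G, A = gram(F), gram(S)
    sq = sum((g - a) ** 2 for gr, ar in zip(G, A) for g, a in zip(gr, ar))
    return sq / (4.0 * n * n * m * m)


def total_loss(F, P, S, alpha=1.0, beta=1.0):
    """alpha * content loss + beta * style loss (single layer, so w_l = 1)."""
    return alpha * content_loss(F, P) + beta * style_layer_loss(F, S)
```

In a full implementation the style term would be summed over several layers with weights w_l, and the gradient with respect to the generated image would come from back-propagation through the CNN rather than from these scalar functions.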

Balance between Content and Style
[Figure: generated images balancing content and style.]

Multiple Styles
[Figure: a source image rendered in Style 1: The Starry Night and Style 2: The Scream.]

Face Verification
Definition
- Given two faces, determine whether they are the same person or not.
- Binary decision by one-to-one matching
Related problems
- Face detection: finding faces
- Face recognition: multi-class classification problem
Standard pipeline
- Face Detection -> Face Alignment -> Feature Extraction -> Binary Classification

Siamese Network
Deep Discriminative Metric Learning (DDML)
- Learning a distance metric: $d_f(x_i, x_j) = \| f(x_i) - f(x_j) \|_2$
- Representation learning
- Two branches share weights.
Objective
- $\ell_{ij} \left( \tau - d_f^2(x_i, x_j) \right) > 1$, with $\tau > 1$
- $\ell_{ij} = 1$: same ID
- $\ell_{ij} = -1$: different ID
[Hu2014] J. Hu, J. Lu, Y.-P. Tan: Discriminative Deep Metric Learning for Face Verification in the Wild. CVPR 2014
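The DDML objective above can be read as a margin test on the squared embedding distance. Here is a minimal sketch, assuming the two Siamese branches have already produced feature vectors `fa` and `fb`; the threshold value `tau=2.0` and the function names are assumptions made for illustration.

```python
import math


def distance(fa, fb):
    """Euclidean distance d_f between two embedded faces."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(fa, fb)))


def satisfies_margin(fa, fb, same_id, tau=2.0):
    """Check the constraint l_ij * (tau - d_f^2) > 1 for one training pair."""
    label = 1 if same_id else -1
    return label * (tau - distance(fa, fb) ** 2) > 1


def verify(fa, fb, tau=2.0):
    """Test-time decision: declare 'same person' when d_f^2 is below tau."""
    return distance(fa, fb) ** 2 < tau
```

Pairs of the same identity are pushed to squared distance below tau - 1 and pairs of different identities above tau + 1, which leaves a band of width 2 around the decision threshold.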

DeepID I
Deep hidden IDentity features (DeepID)
- 97.45% verification accuracy: as good as human performance (97.53%)
CNN architecture
- Multiple scales: $y_j = \max\left(0, \sum_i x^1_i w^1_{i,j} + \sum_i x^2_i w^2_{i,j} + b_j\right)$
[Sun2014a] Y. Sun, X. Wang, X. Tang: Deep Learning Face Representation from Predicting 10,000 Classes. CVPR 2014

Joint Bayesian
Verification algorithm
- $x = \mu + \epsilon$, where $\mu$ is face identity and $\epsilon$ is intra-class variation
- $\mu \sim N(0, S_\mu)$ and $\epsilon \sim N(0, S_\epsilon)$
- Compute $r(x_1, x_2) = \log \frac{P(x_1, x_2 \mid H_I)}{P(x_1, x_2 \mid H_E)}$, which has a closed-form solution.

Comparison between Two Verifiers
- Joint Bayesian is better than neural network.
- Neural networks: highly correlated sub-features (640D), 60 groups
[Plot: test accuracy (%) versus number of classes for training.]
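The multi-scale hidden layer in the DeepID formula above sums contributions from two earlier layers before the rectifier. A minimal sketch in plain Python follows, assuming `x1` and `x2` are the flattened activations of those two layers and `w1`, `w2` are weight matrices indexed as `w[i][j]`; all names are illustrative.

```python
def deepid_layer(x1, x2, w1, w2, b):
    """y_j = max(0, sum_i x1_i * w1[i][j] + sum_i x2_i * w2[i][j] + b_j)."""
    out = []
    for j in range(len(b)):
        s = b[j]
        s += sum(x1[i] * w1[i][j] for i in range(len(x1)))
        s += sum(x2[i] * w2[i][j] for i in range(len(x2)))
        out.append(max(0.0, s))  # ReLU keeps only positive responses
    return out
```

Feeding two layers into the same hidden units is what the slide calls "multiple scales": the later layer is coarser, so the earlier layer supplies finer detail that would otherwise be lost.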

Results
Comparison of state-of-the-art face verification methods on LFW
Columns: Method | Accuracy (%) | No. of points | No. of (outside) images | Feature dimension
- Joint Bayesian [8] (o) 5 99,
- ConvNet-RBM [3] (o) 3 87,628 N/A
- CMD+SLBP [7] (u) 3 N/A 2302
- Fisher vector faces [29] (u) 9 N/A 28 2
- Tom-vs-Pete classifiers [2] (o+r) 95 20,
- High-dim LBP [9] 95.7 (o) 27 99,
- TL Joint Bayesian [6] (o+u) 27 99,
- DeepFace [32] (o+u) ,400, ,000,
- DeepID on CelebFaces (o) 5 87,
- DeepID on CelebFaces (o) 5 202,
- DeepID on CelebFaces+ & TL (o+u) 5 202,
Legend
- r: restricted training protocol, where 6000 face pairs given by LFW are used for 10-fold cross-validation
- u: unrestricted training protocol, where more training pairs can be generated from LFW using identity
- o: using outside training data, however, without using training data from LFW
- o+r: using both outside data and LFW data in the restricted protocol for training
- o+u: using both outside data and LFW data in the unrestricted protocol for training
- TL: Joint Bayesian transfer learning from CelebFaces+ to LFW
- Human-level performance: 97.53%

DeepID II
Joint identification-verification
- Face identification: increases the inter-personal variations by drawing DeepID2 features extracted from different identities apart
- Face verification: reduces the intra-personal variations by pulling DeepID2 features extracted from the same identity together
Feature extraction
- $f = \text{Conv}(x, \theta_c)$
[Sun2014b] Y. Sun, Y. Chen, X. Wang, X. Tang: Deep Learning Face Representation by Joint Identification-Verification. NIPS 2014

Training CNN
Two loss functions
- Identification loss (cross-entropy): $\text{Ident}(f, t, \theta_{id}) = -\sum_{i=1}^{n} p_i \log \hat{p}_i = -\log \hat{p}_t$
- Verification loss: $\text{Verif}(f_i, f_j, y_{ij}, \theta_{ve}) = \frac{1}{2} \| f_i - f_j \|_2^2$ if $y_{ij} = 1$, and $\frac{1}{2} \max(0, m - \| f_i - f_j \|_2)^2$ if $y_{ij} = -1$
- $f = \text{Conv}(x, \theta_c)$
- $\theta_{id}$: parameters of softmax layer
- $\theta_{ve} = \{m\}$

Verification Algorithm
Feature extraction
- Detect 21 facial landmarks by SDM algorithm and align faces globally
- Crop 400 face patches with variations in positions, scales, color channels, and horizontal flipping
ConvNet
- 200 CNNs: generate 400 DeepID2 feature vectors with horizontal flipping
- Feature vector: 160D
Feature dimensionality reduction
- Select 25 patches in a greedy manner
- PCA from 25x160D to 180D
Joint Bayesian for verification
[Figure: selected 25 face patches.]
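The two DeepID2 training losses above (cross-entropy for identification, a margin-based pairwise loss for verification) can be sketched as follows. This is a hedged illustration in plain Python: `logits` stands for the softmax-layer inputs, and the default `margin=1.0` is an assumed value, not one given on the slides.

```python
import math


def identification_loss(logits, target):
    """Cross-entropy -log p_hat[target] with a numerically stable log-softmax."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[target]


def verification_loss(fi, fj, same_id, margin=1.0):
    """0.5 * d^2 for same identity; 0.5 * max(0, margin - d)^2 otherwise."""
    d = math.sqrt(sum((a - b) ** 2 for a, b in zip(fi, fj)))
    if same_id:
        return 0.5 * d * d
    return 0.5 * max(0.0, margin - d) ** 2
```

The identification term spreads identities apart, while the verification term pulls same-identity features together and only penalizes different-identity pairs that fall inside the margin.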

Results on LFW
Columns: method | accuracy (%)
- High-dim LBP [4] 95.7 ±.3
- TL Joint Bayesian [2] ±.08
- DeepFace [2] ± 0.25
- DeepID [20] ± 0.26
- GaussianFace [3] ± 0.66
- DeepID ± 0.3
- Human-level performance: 97.53%

FaceNet
Architecture
- Direct mapping between face images and embedded points
Triplet loss
- Using large margin nearest neighbor (LMNN)
[Schroff15] F. Schroff, D. Kalenichenko, J. Philbin: FaceNet: A Unified Embedding for Face Recognition and Clustering. CVPR 2015

Discriminative vs. Generative CNN
- Discriminative CNN: image -> CNN -> object class, viewpoint, style
- Generative CNN: object class, viewpoint, style -> CNN -> image
Goal
- Generate an object based on high-level inputs such as
  - Class
  - Orientation with respect to camera
  - Additional parameters: rotation, translation, zoom; stretching horizontally or vertically; hue, saturation, brightness
Knowledge transfer
- Generative CNN learns the manifold of chairs.
- Interpolation between viewpoints and different objects
[Dosovitskiy15] A. Dosovitskiy, J. T. Springenberg, T. Brox: Learning to Generate Chairs with Convolutional Neural Networks. CVPR 2015
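The triplet loss listed under FaceNet above compares an anchor face with a positive (same identity) and a negative (different identity) in the embedding space. Below is a minimal sketch of the standard per-triplet hinge form; the default `margin=0.2` and the function names are assumptions for illustration.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two embeddings."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """max(0, ||a - p||^2 - ||a - n||^2 + margin) for one triplet."""
    return max(0.0, sq_dist(anchor, positive) - sq_dist(anchor, negative) + margin)
```

The loss is zero once the negative is farther from the anchor than the positive by at least the margin, so only violating triplets contribute gradient.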

Data
Using 3D chair model dataset [Aubry14]
- Original dataset: 1393 chair models, 62 viewpoints, 31 azimuth angles, 2 elevation angles
- Sanitized version: 809 models, tight cropping, resizing to 128x128
Notations
- $D = \{(c^1, v^1, \theta^1), \ldots, (c^N, v^N, \theta^N)\}$
  - $c$: class label
  - $v$: viewpoint
  - $\theta$: additional parameters
- $O = \{(x^1, s^1), \ldots, (x^N, s^N)\}$
  - $x$: target RGB output image
  - $s$: segmentation mask
[Aubry14] M. Aubry, D. Maturana, A. Efros, and J. Sivic: Seeing 3D Chairs: Exemplar Part-based 2D-3D Alignment using a Large Dataset of CAD Models. CVPR 2014

Network Architecture
- 32M parameters altogether
[Figure: network architecture, mapping the inputs through $h$ and then $u$, with an RGB stream and a segmentation stream.]

Operations
- Unpooling: 2x2, fixed location unpooling
- Deconvolution: 5x5
- ReLU

Training
Objective function
- Minimizing the Euclidean error in the reconstruction of the segmented-out chair image and of the segmentation mask $s$
- $\min_W \sum_{i=1}^{N} \lambda \left\| u_{RGB}(h(c^i, v^i, \theta^i)) - T_{\theta^i}(x^i \cdot s^i) \right\|_2^2 + \left\| u_{segm}(h(c^i, v^i, \theta^i)) - T_{\theta^i} s^i \right\|_2^2$
[Figure: visualization of uconv-3 layer filters in the 128x128 network, RGB stream and segmentation stream.]
[Saxe14] A. M. Saxe, J. L. McClelland, and S. Ganguli: Learning a Nonlinear Embedding by Preserving Class Neighbourhood. ICLR
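The training objective above combines a masked image reconstruction error with a mask reconstruction error. The sketch below evaluates it for a single training sample in plain Python, under simplifying assumptions stated here: images are flattened single-channel lists, the transformation T is taken to be the identity, and `lam` plays the role of the weight on the image term.

```python
def chair_loss(pred_rgb, pred_mask, target_rgb, target_mask, lam=1.0):
    """lam * ||pred_rgb - target_rgb * mask||^2 + ||pred_mask - mask||^2."""
    # Segment out the chair: background pixels of the target are zeroed.
    masked_target = [x * s for x, s in zip(target_rgb, target_mask)]
    rgb_err = sum((p - t) ** 2 for p, t in zip(pred_rgb, masked_target))
    mask_err = sum((p - s) ** 2 for p, s in zip(pred_mask, target_mask))
    return lam * rgb_err + mask_err
```

Multiplying the target by its mask means the network is never asked to reproduce background pixels, only the chair itself and its silhouette.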

Network Capacity
[Figure: generated chairs under translation, rotation, zoom, stretch, saturation, brightness, and color changes.]

Morphing Different Chairs
[Figure: morphing between chairs; viewpoints in training set.]

Autonomous Driving
Two previous approaches
- Mediated perception: parsing the entire scene to make a driving decision (e.g., Mobileye, Google)
- Behavior reflex: directly mapping an input image to a driving decision by a regressor (ALVINN, LeCun et al.)
[Figure: three routes from Input Image to Driving Control: Mediated Perception, Behavior Reflex, and Direct Perception (ours).]

Deep Driving
Direct perception
- Estimating the affordance for driving
- Simple input to model using a few key perception indicators
- Compact yet complete descriptions of the scene for vehicle control
Approach
- Built upon a deep convolutional neural network
- Trained and tested on TORCS (The Open Racing Car Simulator)
- Learning for estimating affordance related to autonomous driving
- Simpler than the mediated perception approach
- More interpretable than the typical behavior reflex approach
[Chen15] C. Chen, A. Seff, A. Kornhauser, J. Xiao: DeepDriving: Learning Affordance for Direct Perception in Autonomous Driving. ICCV 2015

Platform
System architecture
[Figure: TORCS writes Image & Speed to shared memory; the CNN reads the image and outputs affordance indicators (angle, tomarking..., dist...); the Driving Controller reads the indicators and speed and writes Driving Controls back to shared memory for TORCS.]
Environment
- Focusing on highway driving with multiple lanes
- Three configurations: a road of one lane, two lanes, or three lanes

Convolutional Neural Network
Prediction of affordance indicators
- always: angle
- in lane system: tomarking_ll, tomarking_ml, tomarking_mr, tomarking_rr, dist_ll, dist_mm, dist_rr
- on marking system: tomarking_l, tomarking_m, tomarking_r, dist_l, dist_r
[Figure: activation ranges of the in lane system and the on marking system, with an overlapping area.]
[Figure: (a) one-lane (b) two-lane, left (c) two-lane, right (d) three-lane (e) inner lane mark. (f) boundary lane mark.]
[Figure: (a) angle (b) in lane: tomarking (c) in lane: dist (d) on mark.: tomarking (e) on marking: dist (f) overlapping area]

Visualization of Learned Models
- Response map of KITTI-based ConvNet model
- Response map of TORCS-based ConvNet model

Deep Driving Demo
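To make the role of the Driving Controller concrete, here is a small sketch of how affordance indicators could be turned into a steering command. The proportional rule, the `dist_center` input (lateral offset from the lane centerline, which would be derived from the tomarking indicators), and the `gain` parameter are all assumptions for illustration; the slides do not specify the controller.

```python
def steering_command(angle, dist_center, road_width, gain=1.0):
    """Hypothetical proportional steering from two affordance values.

    angle: heading angle between the car and the road tangent (radians)
    dist_center: lateral offset from the lane centerline (meters)
    road_width: lane width used to normalize the offset (meters)
    """
    return gain * (angle - dist_center / road_width)
```

A controller of this kind needs only a handful of scalars per frame, which is the sense in which direct perception is both compact and more interpretable than mapping pixels straight to controls.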

Analogy
- PARIS : FRANCE :: BEIJING : CHINA
[Figure: Paris, France, Beijing, and China as points in an embedding space.]
Slide credit: Scott Reed

Visual Analogy Making
[Figure: image analogies a : b :: c : ? for changing color, changing shape, and changing size.]
Slide credit: Scott Reed

Visual Analogy Making
Concept
- Learns an encoder function $f$: mapping images into a space where analogies can be performed
- Learns a decoder $g$: mapping back to the image space
- Infer relationship, then transform query
- $d = \arg\max_w \cos\left(f(w), f(b) - f(a) + f(c)\right)$
Architecture
- $L_{add} = \sum_{a,b,c,d \in A} \left\| d - g\left(f(b) - f(a) + f(c)\right) \right\|_2^2$
- $L_{mul} = \sum_{a,b,c,d \in A} \left\| d - g\left(f(c) + W \times_1 [f(b) - f(a)] \times_2 f(c)\right) \right\|_2^2$
- $L_{deep} = \sum_{a,b,c,d \in A} \left\| d - g\left(f(c) + h([f(b) - f(a); f(c)])\right) \right\|_2^2$
[Reed15] S. Reed, Y. Zhang, Y. Zhang, H. Lee: Deep Visual Analogy Making. NIPS 2015
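The additive analogy rule above (infer the relationship as f(b) - f(a), add it to f(c), and pick the closest candidate by cosine similarity) can be sketched on toy embeddings. In this plain-Python illustration the embeddings are given directly as lists, and `candidates` is a hypothetical dictionary from labels to embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def analogy_query(fa, fb, fc):
    """Embedding predicted for d in a : b :: c : d, i.e. f(b) - f(a) + f(c)."""
    return [b - a + c for a, b, c in zip(fa, fb, fc)]


def complete_analogy(fa, fb, fc, candidates):
    """Return the candidate label whose embedding best matches the query."""
    q = analogy_query(fa, fb, fc)
    return max(candidates, key=lambda k: cosine(candidates[k], q))
```

For example, with axes meaning (is-city, France, China), `complete_analogy([1, 1, 0], [0, 1, 0], [1, 0, 1], ...)` encodes PARIS : FRANCE :: BEIJING : ? and selects the candidate aligned with the China axis. For images, the decoder g replaces the nearest-candidate lookup and renders the query embedding directly.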

Optimization
Regularization
- For accurate analogy completion by image manifold traversing
- Making the transformation match the difference of encoder embeddings
- $R = \sum_{a,b,c,d \in A} \left\| f(d) - f(c) - T\left(f(a), f(b), f(c)\right) \right\|_2^2$
- $T(x, y, z) = y - x$ for $L_{add}$; $W \times_1 [y - x] \times_2 z$ for $L_{mul}$; $\text{MLP}([y - x; z])$ for $L_{deep}$
Training
- With backpropagation using SGD
- Combined loss: $L + \alpha R$, $\alpha = 0.01$

Algorithm 1: Manifold traversal by analogy, with transformation function T (Eq. 5)
  Given images a, b, c, and N (# steps)
  z <- f(c)
  for i = 1 to N do
    z <- z + T(f(a), f(b), z)
    x_i <- g(z)
  return generated images x_i (i = 1, ..., N)

Shape Predictions: Additive Model
[Figure: rows rotate, scale, shift; columns ref, out, query, and predictions at t=1, t=2, t=3, t=4.]

Shape Predictions: Multiplicative Model
[Figure: rows rotate, scale, shift; columns ref, out, query, and predictions at t=1, t=2, t=3, t=4.]
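The manifold traversal algorithm above repeatedly applies the inferred transformation in embedding space and decodes after every step. Here is a minimal runnable sketch in plain Python, where the encoder `f`, decoder `g`, and transformation `T` are passed in as functions; `t_add` is the additive variant, and the identity encoder/decoder used in the usage note are stand-ins for trained networks.

```python
def t_add(fa, fb, z):
    """Additive transformation: T(x, y, z) = y - x (ignores z)."""
    return [y - x for x, y in zip(fa, fb)]


def traverse(f, g, T, a, b, c, n_steps):
    """Apply the a -> b transformation to c repeatedly, decoding each step."""
    fa, fb = f(a), f(b)
    z = f(c)
    images = []
    for _ in range(n_steps):
        step = T(fa, fb, z)
        z = [zi + si for zi, si in zip(z, step)]  # z <- z + T(f(a), f(b), z)
        images.append(g(z))                        # x_i <- g(z)
    return images
```

With identity stand-ins, `traverse(list, list, t_add, [0, 0], [1, 0], [5, 5], 3)` moves the query one unit along the first axis per step. The additive T is the same at every step, whereas the multiplicative and deep variants depend on the current z, which is what lets them follow curved trajectories such as rotations.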

Shape Predictions: Deep Model
[Figure: rows rotate, scale, shift; columns ref, out, query, and predictions at t=1, t=2, t=3, t=4.]

Results for Analogy Models
Transforming shapes
[Table: columns Model, Rotation steps, Scaling steps, Translation steps; rows $L_{add}$, $L_{mul}$, $L_{deep}$.]
[Figure: for rotation, scaling, and translation, columns ref, ground truth (gt), query, and four successive predictions.]

Learning Disentangled Features
Disentangling and Analogy Making
Objective function
- $L_{dis} = \sum_{a,b,c \in D} \left\| c - g\left(s \cdot f(a) + (1 - s) \cdot f(b)\right) \right\|_2^2$
[Figure: images a and b are encoded into identity and pose units; image c is reconstructed from the switched combination, and an increment function T produces d.]

Algorithm 2: Disentangling training update. The switches s determine which units from f(a) and f(b) are used to reconstruct image c.
  Given input images a, b and target c
  Given switches $s \in \{0, 1\}^K$
  z <- s · f(a) + (1 - s) · f(b)
  $\Delta\theta \propto \partial \left( \| g(z) - c \|_2^2 \right) / \partial\theta$
Slide credit: Scott Reed
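The switch-based mixing in the disentangling update above can be illustrated directly. In this minimal plain-Python sketch, `fa` and `fb` are the embeddings of images a and b, `s` is a 0/1 switch vector of the same length, and `g` is a decoder passed in as a function; the gradient step itself is left to a training framework.

```python
def mix_features(fa, fb, s):
    """z = s * f(a) + (1 - s) * f(b), applied unit by unit."""
    return [si * x + (1 - si) * y for si, x, y in zip(s, fa, fb)]


def reconstruction_loss(g, z, c):
    """Squared error ||g(z) - c||^2 between the decoded mix and target c."""
    return sum((p - t) ** 2 for p, t in zip(g(z), c))
```

For instance, with `s = [1, 1, 0, 0]` the first two units (say, identity) are taken from a and the last two (say, pose) from b, so the target c should show a's identity in b's pose.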

Classification and Analogy Making
- Disentangling pose from identity
- Pose transformations are modeled by deep additive interactions
- Separate classification for identity (attribute classifier)
[Figure: images a, b, and c are encoded into identity and pose units; an increment function T and an attribute classifier produce d.]
Slide credit: Scott Reed

Results for Disentangled Features
Transferring animation
[Table: columns Model, spellcast, thrust, walk, slash, shoot, average; rows $L_{add}$, $L_{dis}$, $L_{dis+cls}$.]

Results for Extrapolation
[Figure: ref, output, query, and predictions for the walk, thrust, and rotate animations.]

Summary
- Proposing novel deep architectures that can perform visual analogy making by simple operations in an embedding space
  - Convolutional encoder-decoder networks
  - Modeling transformations by vector addition in embedding space works for simple problems, but multi-layer neural networks are better.
- Combining analogy and disentangling training methods
  - Analogy representations can overcome limitations of disentangled representations by learning the transformation manifold.


More information

SHIV SHAKTI International Journal in Multidisciplinary and Academic Research (SSIJMAR) Vol. 7, No. 2, April 2018 (ISSN )

SHIV SHAKTI International Journal in Multidisciplinary and Academic Research (SSIJMAR) Vol. 7, No. 2, April 2018 (ISSN ) SHIV SHAKTI International Journal in Multidisciplinary and Academic Research (SSIJMAR) Vol. 7, No. 2, April 2018 (ISSN 2278 5973) Facial Recognition Using Deep Learning Rajeshwar M, Sanjit Singh Chouhan,

More information

VISION FOR AUTOMOTIVE DRIVING

VISION FOR AUTOMOTIVE DRIVING VISION FOR AUTOMOTIVE DRIVING French Japanese Workshop on Deep Learning & AI, Paris, October 25th, 2017 Quoc Cuong PHAM, PhD Vision and Content Engineering Lab AI & MACHINE LEARNING FOR ADAS AND SELF-DRIVING

More information

Deep Learning for Visual Manipulation and Synthesis

Deep Learning for Visual Manipulation and Synthesis Deep Learning for Visual Manipulation and Synthesis Jun-Yan Zhu 朱俊彦 UC Berkeley 2017/01/11 @ VALSE What is visual manipulation? Image Editing Program input photo User Input result Desired output: stay

More information

Introduction to Deep Learning for Facial Understanding Part III: Regional CNNs

Introduction to Deep Learning for Facial Understanding Part III: Regional CNNs Introduction to Deep Learning for Facial Understanding Part III: Regional CNNs Raymond Ptucha, Rochester Institute of Technology, USA Tutorial-9 May 19, 218 www.nvidia.com/dli R. Ptucha 18 1 Fair Use Agreement

More information

Lecture 5: Object Detection

Lecture 5: Object Detection Object Detection CSED703R: Deep Learning for Visual Recognition (2017F) Lecture 5: Object Detection Bohyung Han Computer Vision Lab. bhhan@postech.ac.kr 2 Traditional Object Detection Algorithms Region-based

More information

Face Recognition by Deep Learning - The Imbalance Problem

Face Recognition by Deep Learning - The Imbalance Problem Face Recognition by Deep Learning - The Imbalance Problem Chen-Change LOY MMLAB The Chinese University of Hong Kong Homepage: http://personal.ie.cuhk.edu.hk/~ccloy/ Twitter: https://twitter.com/ccloy CVPR

More information

Dynamic Routing Between Capsules

Dynamic Routing Between Capsules Report Explainable Machine Learning Dynamic Routing Between Capsules Author: Michael Dorkenwald Supervisor: Dr. Ullrich Köthe 28. Juni 2018 Inhaltsverzeichnis 1 Introduction 2 2 Motivation 2 3 CapusleNet

More information

Understanding Faces. Detection, Recognition, and. Transformation of Faces 12/5/17

Understanding Faces. Detection, Recognition, and. Transformation of Faces 12/5/17 Understanding Faces Detection, Recognition, and 12/5/17 Transformation of Faces Lucas by Chuck Close Chuck Close, self portrait Some slides from Amin Sadeghi, Lana Lazebnik, Silvio Savarese, Fei-Fei Li

More information

Object Detection. Part1. Presenter: Dae-Yong

Object Detection. Part1. Presenter: Dae-Yong Object Part1 Presenter: Dae-Yong Contents 1. What is an Object? 2. Traditional Object Detector 3. Deep Learning-based Object Detector What is an Object? Subset of Object Recognition What is an Object?

More information

RECURRENT NEURAL NETWORKS

RECURRENT NEURAL NETWORKS RECURRENT NEURAL NETWORKS Methods Traditional Deep-Learning based Non-machine Learning Machine-Learning based method Supervised SVM MLP CNN RNN (LSTM) Localizati on GPS, SLAM Self Driving Perception Pedestrian

More information

Lecture 12 Recognition

Lecture 12 Recognition Institute of Informatics Institute of Neuroinformatics Lecture 12 Recognition Davide Scaramuzza 1 Lab exercise today replaced by Deep Learning Tutorial Room ETH HG E 1.1 from 13:15 to 15:00 Optional lab

More information

Predicting ground-level scene Layout from Aerial imagery. Muhammad Hasan Maqbool

Predicting ground-level scene Layout from Aerial imagery. Muhammad Hasan Maqbool Predicting ground-level scene Layout from Aerial imagery Muhammad Hasan Maqbool Objective Given the overhead image predict its ground level semantic segmentation Predicted ground level labeling Overhead/Aerial

More information

Generative Adversarial Text to Image Synthesis

Generative Adversarial Text to Image Synthesis Generative Adversarial Text to Image Synthesis Scott Reed, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Bernt Schiele, Honglak Lee Presented by: Jingyao Zhan Contents Introduction Related Work Method

More information

Joint Object Detection and Viewpoint Estimation using CNN features

Joint Object Detection and Viewpoint Estimation using CNN features Joint Object Detection and Viewpoint Estimation using CNN features Carlos Guindel, David Martín and José M. Armingol cguindel@ing.uc3m.es Intelligent Systems Laboratory Universidad Carlos III de Madrid

More information

Real-time Object Detection CS 229 Course Project

Real-time Object Detection CS 229 Course Project Real-time Object Detection CS 229 Course Project Zibo Gong 1, Tianchang He 1, and Ziyi Yang 1 1 Department of Electrical Engineering, Stanford University December 17, 2016 Abstract Objection detection

More information

Deep Learning in Visual Recognition. Thanks Da Zhang for the slides

Deep Learning in Visual Recognition. Thanks Da Zhang for the slides Deep Learning in Visual Recognition Thanks Da Zhang for the slides Deep Learning is Everywhere 2 Roadmap Introduction Convolutional Neural Network Application Image Classification Object Detection Object

More information

FACIAL POINT DETECTION BASED ON A CONVOLUTIONAL NEURAL NETWORK WITH OPTIMAL MINI-BATCH PROCEDURE. Chubu University 1200, Matsumoto-cho, Kasugai, AICHI

FACIAL POINT DETECTION BASED ON A CONVOLUTIONAL NEURAL NETWORK WITH OPTIMAL MINI-BATCH PROCEDURE. Chubu University 1200, Matsumoto-cho, Kasugai, AICHI FACIAL POINT DETECTION BASED ON A CONVOLUTIONAL NEURAL NETWORK WITH OPTIMAL MINI-BATCH PROCEDURE Masatoshi Kimura Takayoshi Yamashita Yu Yamauchi Hironobu Fuyoshi* Chubu University 1200, Matsumoto-cho,

More information

Convolutional Neural Networks + Neural Style Transfer. Justin Johnson 2/1/2017

Convolutional Neural Networks + Neural Style Transfer. Justin Johnson 2/1/2017 Convolutional Neural Networks + Neural Style Transfer Justin Johnson 2/1/2017 Outline Convolutional Neural Networks Convolution Pooling Feature Visualization Neural Style Transfer Feature Inversion Texture

More information

Deep learning for object detection. Slides from Svetlana Lazebnik and many others

Deep learning for object detection. Slides from Svetlana Lazebnik and many others Deep learning for object detection Slides from Svetlana Lazebnik and many others Recent developments in object detection 80% PASCAL VOC mean0average0precision0(map) 70% 60% 50% 40% 30% 20% 10% Before deep

More information

Measuring Aristic Similarity of Paintings

Measuring Aristic Similarity of Paintings Measuring Aristic Similarity of Paintings Jay Whang Stanford SCPD jaywhang@stanford.edu Buhuang Liu Stanford SCPD buhuang@stanford.edu Yancheng Xiao Stanford SCPD ycxiao@stanford.edu Abstract In this project,

More information

Pair-wise Distance Metric Learning of Neural Network Model for Spoken Language Identification

Pair-wise Distance Metric Learning of Neural Network Model for Spoken Language Identification INTERSPEECH 2016 September 8 12, 2016, San Francisco, USA Pair-wise Distance Metric Learning of Neural Network Model for Spoken Language Identification 2 1 Xugang Lu 1, Peng Shen 1, Yu Tsao 2, Hisashi

More information

Disguised Face Identification (DFI) with Facial KeyPoints using Spatial Fusion Convolutional Network. Nathan Sun CIS601

Disguised Face Identification (DFI) with Facial KeyPoints using Spatial Fusion Convolutional Network. Nathan Sun CIS601 Disguised Face Identification (DFI) with Facial KeyPoints using Spatial Fusion Convolutional Network Nathan Sun CIS601 Introduction Face ID is complicated by alterations to an individual s appearance Beard,

More information

3D Deep Learning on Geometric Forms. Hao Su

3D Deep Learning on Geometric Forms. Hao Su 3D Deep Learning on Geometric Forms Hao Su Many 3D representations are available Candidates: multi-view images depth map volumetric polygonal mesh point cloud primitive-based CAD models 3D representation

More information

INF 5860 Machine learning for image classification. Lecture 11: Visualization Anne Solberg April 4, 2018

INF 5860 Machine learning for image classification. Lecture 11: Visualization Anne Solberg April 4, 2018 INF 5860 Machine learning for image classification Lecture 11: Visualization Anne Solberg April 4, 2018 Reading material The lecture is based on papers: Deep Dream: https://research.googleblog.com/2015/06/inceptionism-goingdeeper-into-neural.html

More information

arxiv: v1 [cs.cv] 3 Mar 2018

arxiv: v1 [cs.cv] 3 Mar 2018 Unsupervised Learning of Face Representations Samyak Datta, Gaurav Sharma, C.V. Jawahar Georgia Institute of Technology, CVIT, IIIT Hyderabad, IIT Kanpur arxiv:1803.01260v1 [cs.cv] 3 Mar 2018 Abstract

More information

CIS680: Vision & Learning Assignment 2.b: RPN, Faster R-CNN and Mask R-CNN Due: Nov. 21, 2018 at 11:59 pm

CIS680: Vision & Learning Assignment 2.b: RPN, Faster R-CNN and Mask R-CNN Due: Nov. 21, 2018 at 11:59 pm CIS680: Vision & Learning Assignment 2.b: RPN, Faster R-CNN and Mask R-CNN Due: Nov. 21, 2018 at 11:59 pm Instructions This is an individual assignment. Individual means each student must hand in their

More information

When Big Datasets are Not Enough: The need for visual virtual worlds.

When Big Datasets are Not Enough: The need for visual virtual worlds. When Big Datasets are Not Enough: The need for visual virtual worlds. Alan Yuille Bloomberg Distinguished Professor Departments of Cognitive Science and Computer Science Johns Hopkins University Computational

More information

One Network to Solve Them All Solving Linear Inverse Problems using Deep Projection Models

One Network to Solve Them All Solving Linear Inverse Problems using Deep Projection Models One Network to Solve Them All Solving Linear Inverse Problems using Deep Projection Models [Supplemental Materials] 1. Network Architecture b ref b ref +1 We now describe the architecture of the networks

More information

Cross-domain Deep Encoding for 3D Voxels and 2D Images

Cross-domain Deep Encoding for 3D Voxels and 2D Images Cross-domain Deep Encoding for 3D Voxels and 2D Images Jingwei Ji Stanford University jingweij@stanford.edu Danyang Wang Stanford University danyangw@stanford.edu 1. Introduction 3D reconstruction is one

More information

Deep Learning. Vladimir Golkov Technical University of Munich Computer Vision Group

Deep Learning. Vladimir Golkov Technical University of Munich Computer Vision Group Deep Learning Vladimir Golkov Technical University of Munich Computer Vision Group 1D Input, 1D Output target input 2 2D Input, 1D Output: Data Distribution Complexity Imagine many dimensions (data occupies

More information

Deep Learning for Object detection & localization

Deep Learning for Object detection & localization Deep Learning for Object detection & localization RCNN, Fast RCNN, Faster RCNN, YOLO, GAP, CAM, MSROI Aaditya Prakash Sep 25, 2018 Image classification Image classification Whole of image is classified

More information

Deep Convolutional Inverse Graphics Network

Deep Convolutional Inverse Graphics Network Deep Convolutional Inverse Graphics Network Tejas D. Kulkarni* 1, William F. Whitney* 2, Pushmeet Kohli 3, Joshua B. Tenenbaum 4 1,2,4 Massachusetts Institute of Technology, Cambridge, USA 3 Microsoft

More information

Object Localization, Segmentation, Classification, and Pose Estimation in 3D Images using Deep Learning

Object Localization, Segmentation, Classification, and Pose Estimation in 3D Images using Deep Learning Allan Zelener Dissertation Proposal December 12 th 2016 Object Localization, Segmentation, Classification, and Pose Estimation in 3D Images using Deep Learning Overview 1. Introduction to 3D Object Identification

More information

Su et al. Shape Descriptors - III

Su et al. Shape Descriptors - III Su et al. Shape Descriptors - III Siddhartha Chaudhuri http://www.cse.iitb.ac.in/~cs749 Funkhouser; Feng, Liu, Gong Recap Global A shape descriptor is a set of numbers that describes a shape in a way that

More information

Seeing the unseen. Data-driven 3D Understanding from Single Images. Hao Su

Seeing the unseen. Data-driven 3D Understanding from Single Images. Hao Su Seeing the unseen Data-driven 3D Understanding from Single Images Hao Su Image world Shape world 3D perception from a single image Monocular vision a typical prey a typical predator Cited from https://en.wikipedia.org/wiki/binocular_vision

More information

DeepIM: Deep Iterative Matching for 6D Pose Estimation - Supplementary Material

DeepIM: Deep Iterative Matching for 6D Pose Estimation - Supplementary Material DeepIM: Deep Iterative Matching for 6D Pose Estimation - Supplementary Material Yi Li 1, Gu Wang 1, Xiangyang Ji 1, Yu Xiang 2, and Dieter Fox 2 1 Tsinghua University, BNRist 2 University of Washington

More information

Geometric VLAD for Large Scale Image Search. Zixuan Wang 1, Wei Di 2, Anurag Bhardwaj 2, Vignesh Jagadesh 2, Robinson Piramuthu 2

Geometric VLAD for Large Scale Image Search. Zixuan Wang 1, Wei Di 2, Anurag Bhardwaj 2, Vignesh Jagadesh 2, Robinson Piramuthu 2 Geometric VLAD for Large Scale Image Search Zixuan Wang 1, Wei Di 2, Anurag Bhardwaj 2, Vignesh Jagadesh 2, Robinson Piramuthu 2 1 2 Our Goal 1) Robust to various imaging conditions 2) Small memory footprint

More information

CS468: 3D Deep Learning on Point Cloud Data. class label part label. Hao Su. image. May 10, 2017

CS468: 3D Deep Learning on Point Cloud Data. class label part label. Hao Su. image. May 10, 2017 CS468: 3D Deep Learning on Point Cloud Data class label part label Hao Su image. May 10, 2017 Agenda Point cloud generation Point cloud analysis CVPR 17, Point Set Generation Pipeline render CVPR 17, Point

More information

Improving Face Recognition by Exploring Local Features with Visual Attention

Improving Face Recognition by Exploring Local Features with Visual Attention Improving Face Recognition by Exploring Local Features with Visual Attention Yichun Shi and Anil K. Jain Michigan State University Difficulties of Face Recognition Large variations in unconstrained face

More information

Training models for road scene understanding with automated ground truth Dan Levi

Training models for road scene understanding with automated ground truth Dan Levi Training models for road scene understanding with automated ground truth Dan Levi With: Noa Garnett, Ethan Fetaya, Shai Silberstein, Rafi Cohen, Shaul Oron, Uri Verner, Ariel Ayash, Kobi Horn, Vlad Golder,

More information

DeepIndex for Accurate and Efficient Image Retrieval

DeepIndex for Accurate and Efficient Image Retrieval DeepIndex for Accurate and Efficient Image Retrieval Yu Liu, Yanming Guo, Song Wu, Michael S. Lew Media Lab, Leiden Institute of Advance Computer Science Outline Motivation Proposed Approach Results Conclusions

More information

Image Transformation via Neural Network Inversion

Image Transformation via Neural Network Inversion Image Transformation via Neural Network Inversion Asha Anoosheh Rishi Kapadia Jared Rulison Abstract While prior experiments have shown it is possible to approximately reconstruct inputs to a neural net

More information

An Exploration of Computer Vision Techniques for Bird Species Classification

An Exploration of Computer Vision Techniques for Bird Species Classification An Exploration of Computer Vision Techniques for Bird Species Classification Anne L. Alter, Karen M. Wang December 15, 2017 Abstract Bird classification, a fine-grained categorization task, is a complex

More information