Regression in Deep Learning: Siamese and Triplet Networks


1 Regression in Deep Learning: Siamese and Triplet Networks Tu Bui, John Collomosse Centre for Vision, Speech and Signal Processing (CVSSP) University of Surrey, United Kingdom Leonardo Ribeiro, Tiago Nazare, Moacir Ponti Institute of Mathematics and Computer Sciences (ICMC) University of Sao Paulo, Brazil

2 Content The regression problem Siamese network and contrastive loss Triplet network and triplet loss Training tricks Regression application: sketch-based image retrieval Limitations and future work 2

3 Revolution of deep learning in classification. [Chart: top-5 error (%) of the ImageNet ILSVRC winner by year, lower is better: shallow methods, AlexNet (15.3), ZFNet, GoogLeNet, ResNet, ensemble (2.99), SENet (2.25); human error is around 6%.]

4 Classification vs. Regression
Classification: discrete set of outputs; the output is a label/class/category.
Regression: continuous-valued output; the output is an embedding feature $(x_1, x_2, x_3, \ldots, x_n)$.

5 Regression example: intra-domain learning. Face identification (Schroff et al. CVPR 2015); tracking (Wang &amp; Gupta ICCV 2015).

6 Regression example: cross-domain learning. Multi-modality visual search ("duck"): a language model (skip-gram), a 3D model (VoxNet), a photo model (AlexNet) and a sketch model (Sketch-A-Net) all map into a common embedding space.

7 Conventional methods for cross-domain regression. Step 1: extract local features (SIFT, HoG, SURF) from both the source and the target data and aggregate them into global features (BoW, GMM). Step 2: learn a transform matrix M that maps the global features into a common embedding space. Problem: this assumes a linear transformation between the two domains.
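As a concrete illustration of Step 2, the sketch below fits such a linear transform M by least squares; the feature dimensions and the random placeholder data are assumptions for illustration only, not the descriptors used in the talk.

```python
import numpy as np

# Toy sketch of the conventional pipeline's Step 2: learn a single linear
# transform M mapping source-domain global features onto target-domain features.
# Dimensions and data are placeholders; real inputs would be BoW/GMM descriptors.
rng = np.random.default_rng(0)
source = rng.standard_normal((500, 256))   # e.g. global features of sketches
target = rng.standard_normal((500, 256))   # e.g. global features of matching photos

M, *_ = np.linalg.lstsq(source, target, rcond=None)  # least-squares fit: source @ M ~ target
projected = source @ M                                # source features in the common space
```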

8 End-to-end regression with deep learning. End-to-end learning with a multi-stream network: the source data passes through layers 1..n of one stream and the target data through layers 1..m of another, and both streams are trained jointly to map into a shared embedding space.

9 End-to-end regression with multi-stream networks. Open questions: which network design? Which loss function should be used?

10 Using the output of a classification model as a feature?
- Not intuitive: different objective function.
- Cross-domain learning: training a classification network (fc6, fc7, softmax loss) for each domain separately does not guarantee a common embedding.

11 Content The regression problem Siamese network and contrastive loss Triplet network and triplet loss Training tricks Regression application: sketch-based image retrieval Limitations and future work 11

12 Siamese network and contrastive loss
- Siamese (2-branch) network with inputs $x_1$, $x_2$.
- Given an input training pair $(x_1, x_2)$:
  o Label: $y = 0$ if $(x_1, x_2)$ is a similar pair, $y = 1$ if it is a dissimilar pair.
  o Network outputs: $a = f(W_1, x_1)$, $p = g(W_2, x_2)$.
  o Euclidean distance between the outputs: $D(W_1, W_2, x_1, x_2) = \|a - p\|_2 = \|f(W_1, x_1) - g(W_2, x_2)\|_2$.

13 Siamese network and contrastive loss
- Contrastive loss equation:
  $L(W_1, W_2, x_1, x_2) = \frac{1}{2}\left[(1-y)\,D^2 + y\,\{\max(0,\, m - D)\}^2\right]$
  where $D = \|a - p\|_2 = \|f(W_1, x_1) - g(W_2, x_2)\|_2$ and $y = 0$ for a similar pair, $y = 1$ for a dissimilar pair.
- The margin $m$ is the desired distance for a dissimilar pair $(x_1, x_2)$.
- Training: $\arg\min_{W_1, W_2} L$.

14 Siamese network and contrastive loss
Contrastive loss functions:
- Standard form*: $L(a, p) = \frac{1}{2}\left[(1-y)\,D^2 + y\,\{\max(0,\, m - D)\}^2\right]$
- Alternative form**: $L(a, p) = \frac{1}{2}\left[(1-y)\,D^2 + y\,\max(0,\, m - D^2)\right]$
[Plots: L as a function of D for y = 0 and y = 1 in each form.]
*Hadsell et al. CVPR 2006  **Chopra et al. CVPR 2005
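A minimal PyTorch-style sketch of the standard contrastive loss above; the margin value, the batching, and the assumption that the two branch outputs are already computed are illustrative, not the exact setup from the slides.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(a, p, y, margin=1.0):
    """Standard contrastive loss (Hadsell et al. 2006 form).

    a, p : (batch, d) embeddings from the two branches f(W1, x1) and g(W2, x2)
    y    : (batch,) labels, 0 = similar pair, 1 = dissimilar pair
    """
    y = y.float()
    d = F.pairwise_distance(a, p)                       # Euclidean distance D
    similar_term = (1 - y) * d.pow(2)                   # pull similar pairs together
    dissimilar_term = y * F.relu(margin - d).pow(2)     # push dissimilar pairs beyond the margin m
    return 0.5 * (similar_term + dissimilar_term).mean()
```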

15 Content The regression problem Siamese network and contrastive loss Triplet network and triplet loss Training tricks Regression application: sketch-based image retrieval Limitations and future work 15

16 Triplet network and triplet loss
- Triplet (3-branch) network with inputs $x_a$, $x_p$, $x_n$.
  o Given a training triplet $(x_a, x_p, x_n)$: $x_a$ is the anchor; $x_p$ is the positive (similar to $x_a$); $x_n$ is the negative (dissimilar to $x_a$).
  o The pos/neg branches always share weights.
  o The anchor branch can share weights (intra-domain learning) or not (cross-domain learning).
  o Network outputs: $a = f(W_1, x_a)$, $p = g(W_2, x_p)$, $n = g(W_2, x_n)$.
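The branch layout can be written down directly; the sketch below uses small placeholder MLPs (the layer sizes are assumptions), with a separate anchor branch f for the cross-domain case and a single g reused for the positive and negative branches.

```python
import torch
import torch.nn as nn

class TripletNet(nn.Module):
    """Sketch of the 3-branch layout: pos/neg branches always share weights (g);
    the anchor branch f is separate for cross-domain learning (for intra-domain
    learning, f and g would be the same module)."""

    def __init__(self, in_dim=512, emb_dim=100):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, emb_dim))
        self.g = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, emb_dim))

    def forward(self, x_a, x_p, x_n):
        a = self.f(x_a)   # anchor embedding
        p = self.g(x_p)   # positive embedding
        n = self.g(x_n)   # negative embedding (same weights as the positive branch)
        return a, p, n
```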

17 Triplet network and triplet loss
Triplet loss equation:
  $L(a, p, n) = \frac{1}{2}\max\!\left(0,\, m + D^2(a, p) - D^2(a, n)\right)$
  o Standard form*: $D(u, v) = \|u - v\|_2$
  o Alternative form**: $D(u, v) = 1 - \frac{u \cdot v}{\|u\|_2\,\|v\|_2}$
*Schroff et al. CVPR 2015  **Wang et al. ICCV 2015
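A hedged sketch of the triplet loss written out above, with the two distance choices; the margin value and batching are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def triplet_loss(a, p, n, margin=1.0, distance="euclidean"):
    """Triplet loss in the form on the slide: L = 0.5 * max(0, m + D^2(a,p) - D^2(a,n)).

    distance = "euclidean": D(u, v) = ||u - v||_2              (Schroff et al. 2015)
    distance = "cosine":    D(u, v) = 1 - u.v / (||u|| ||v||)  (Wang et al. 2015)
    """
    if distance == "euclidean":
        d_ap = F.pairwise_distance(a, p)
        d_an = F.pairwise_distance(a, n)
    else:
        d_ap = 1 - F.cosine_similarity(a, p)
        d_an = 1 - F.cosine_similarity(a, n)
    return 0.5 * F.relu(margin + d_ap.pow(2) - d_an.pow(2)).mean()
```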

18 Siamese vs. Triplet
[Diagram: positions of anchor a, positive p and negative n before training, after training with contrastive loss (p pulled within the margin m of a), and after training with triplet loss.]
Contrastive loss: $L(a, p) = \frac{1}{2}\left[(1-y)\,\|a-p\|_2^2 + y\,\{\max(0,\, m - \|a-p\|_2)\}^2\right]$
Triplet loss: $L(a, p, n) = \frac{1}{2}\max\!\left(0,\, m + \|a-p\|_2^2 - \|a-n\|_2^2\right)$

19 Siamese or triplet? It depends on the data, training strategy, network design and more:
- Siamese superior: Radenović et al. ECCV 2016.
- Triplet superior: Hoffer &amp; Ailon, SBPR 2015; Bui et al., arXiv 2016.

20 Content The regression problem Siamese network and contrastive loss Triplet network and triplet loss Training tricks Regression application: sketch-based image retrieval Limitations and future work 20

21 Training trick #1: solving the gradient collapsing problem
- The gradient collapsing problem, with loss
  $L = \frac{1}{2N}\sum_{i=1}^{N}\max\!\left(0,\, m + \|a_i - p_i\|_2^2 - \|a_i - n_i\|_2^2\right)$ and margin $m = 1.0$.
[Figure: expected embedding: p close to the anchor a and n pushed outside the margin m; in reality, a, p and n all collapse onto the same point.]

22 Training tricks #1
- Solutions for gradient collapsing:
  o Combine regression and classification losses for better regularisation.
  o Change the loss function:
    $L = \frac{1}{2N}\sum_{i=1}^{N}\max\!\left(0,\, m + \|k\,a_i - p_i\|_2^2 - \|k\,a_i - n_i\|_2^2\right)$
[Figure: loss surfaces L(a, p, n); scaling the anchor by k reshapes the surface around the saddle point at the collapsed solution.]
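A sketch of the modified loss above; the anchor-weighting factor k follows the equation on the slide, and the default value 2.0 (quoted later in the talk) is used here as an assumption.

```python
import torch
import torch.nn.functional as F

def weighted_anchor_triplet_loss(a, p, n, margin=1.0, k=2.0):
    """Triplet loss with the anchor scaled by k, as in the slide's modified
    equation; scaling the anchor reshapes the loss surface around the saddle
    point at the collapsed solution."""
    d_ap = (k * a - p).pow(2).sum(dim=1)   # ||k*a - p||^2
    d_an = (k * a - n).pow(2).sum(dim=1)   # ||k*a - n||^2
    return 0.5 * F.relu(margin + d_ap - d_an).mean()
```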

23 Training tricks #2: dimensionality reduction
- Conventional methods:
  o Redundancy analysis on a fixed set of features, e.g. Principal Component Analysis (PCA), product quantisation, etc.
- Dimensionality reduction in a CNN: part of the training process.
[Figure: a 4096×1×1 FC7 output is multiplied by a 128×4096×1×1 convolution (fully-connected) filter plus a 128×1×1 bias to give a 128×1×1 output.]
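In code, this kind of in-network reduction is simply a trainable linear layer appended to the feature extractor; the dimensions below mirror the 4096-to-128 example in the figure and are otherwise assumptions.

```python
import torch
import torch.nn as nn

# Learned dimensionality reduction as part of the network: a 4096-D FC7
# activation is projected to a 128-D descriptor by a fully-connected layer
# (equivalent to a 128x4096 1x1 convolution filter plus bias), trained jointly
# with the rest of the model instead of applying PCA afterwards.
reduce = nn.Linear(4096, 128)

fc7 = torch.randn(8, 4096)    # a batch of FC7 features (placeholder data)
descriptor = reduce(fc7)      # shape (8, 128): compact embeddings
```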

24 Training tricks #3: hard negative mining
- Random pairing: positive and negative samples are selected randomly.
- Hard negative mining: the negative example is the nearest irrelevant neighbour to the anchor.
- Hard positive mining: the positive example is the farthest relevant neighbour to the anchor.
[Figure: a duck-photo anchor with example positives (duck photo, duck 3D model) and negatives (swan photo, cat photo) under each strategy.]
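A small sketch of the mining step, assuming precomputed embeddings and class labels; hard positive mining is symmetric (take the argmax over same-class distances instead).

```python
import torch

def mine_hard_negatives(anchors, candidates, anchor_labels, candidate_labels):
    """For each anchor, return the index of the *nearest* candidate with a
    different label: the most confusing irrelevant neighbour."""
    dists = torch.cdist(anchors, candidates)                      # (A, C) pairwise distances
    same_class = anchor_labels[:, None] == candidate_labels[None, :]
    dists = dists.masked_fill(same_class, float("inf"))           # ignore relevant candidates
    return dists.argmin(dim=1)                                    # hard negative index per anchor
```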

25 Training tricks #4: layer sharing
- Consider sharing the anchor branch with the pos/neg branches: full-share, no-share, or partial-share.

26 Other training tricks
- Data augmentation: random crop, rotation, scaling, flip, whitening.
- Dropout: randomly disable neurons.
- Regularisation: add the parameter magnitude to the loss,
  $L_{total}(W, X) = L_{contrastive/triplet}(W, X) + \|W\|^2$
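A minimal sketch of how the last two tricks typically appear in practice: dropout inside the embedding head, and L2 regularisation applied through the optimiser's weight decay. Layer sizes and hyper-parameter values are assumptions.

```python
import torch
import torch.nn as nn

head = nn.Sequential(
    nn.Linear(4096, 1024),
    nn.ReLU(),
    nn.Dropout(p=0.5),        # randomly disable neurons during training
    nn.Linear(1024, 128),
)

# weight_decay adds a ||W||^2 penalty to the contrastive/triplet loss being optimised
optimizer = torch.optim.SGD(head.parameters(), lr=1e-3, weight_decay=1e-4)
```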

27 Content The regression problem Siamese network and contrastive loss Triplet network and triplet loss Training tricks Regression application: sketch-based image retrieval Limitations and future work 27

28 Regression application: sketch-based image retrieval (SBIR) Search for a particular image in your mind? 28

29 Text search? 29

30 Sketch-based Image Retrieval (SBIR) sketch retrieval 30

31 Existing applications: Google Emoji Search; Detexify: LaTeX symbol search.

32 Challenges: free-hand sketches are usually messy. [Examples: the horse category from the Flickr-330 dataset, Hu et al. 2013.]

33 Challenges: various levels of abstraction. [Examples: "house" and "crocodile" sketches from the TU-Berlin dataset, Eitz et al. 2012.]

34 Challenges: domain gap. A sketch does not always describe the real-life object accurately. [Examples: caricature (a cat's whiskers, a hedgehog's spines), anthropomorphism (a smiling spider?), simplification, and viewpoint (the "person walking" category, TU-Berlin).]

35 Challenges: limited number of sketch datasets.
o Flickr15K: 330 sketches + ~15k photos [Hu et al. 2013].
o TU-Berlin: 20k sketches across 250 classes [Eitz et al. 2012].
o Sketchy: ~75k sketches across 125 classes [Sangkloy et al. 2016].
o New: Google QuickDraw, 50M sketches.

36 SBIR evaluation metrics
o Mean Average Precision (mAP):
  $P(k) = \frac{\#\text{ relevant in top } k \text{ results}}{k}$,
  $AP = \frac{\sum_{k=1}^{N} P(k)\,\mathrm{rel}(k)}{\#\text{ relevant images}}$,
  $mAP = \frac{1}{|Q|}\sum_{q \in Q} AP(q)$
o Precision-recall (PR) curve
o Kendall rank correlation coefficient
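The metric can be computed directly from the binary relevance of each ranked list; a short sketch follows (the example relevance lists are made up):

```python
import numpy as np

def average_precision(relevance):
    """AP for one query: relevance[k] = 1 if the k-th ranked result is relevant.
    Implements AP = sum_k P(k) * rel(k) / (# relevant images)."""
    relevance = np.asarray(relevance, dtype=float)
    if relevance.sum() == 0:
        return 0.0
    precision_at_k = np.cumsum(relevance) / (np.arange(len(relevance)) + 1)
    return float((precision_at_k * relevance).sum() / relevance.sum())

def mean_average_precision(relevance_per_query):
    """mAP = mean of AP over all queries in Q."""
    return float(np.mean([average_precision(r) for r in relevance_per_query]))

# Two toy queries with the binary relevance of their ranked returns.
print(mean_average_precision([[1, 0, 1, 0], [0, 1, 1]]))
```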

37 Background: conventional shallow SBIR framework. [Pipeline: photos in the database are converted to edge maps (edge extraction), features are extracted and stored in an index file; a query sketch is matched against the index to produce a ranked list #1 ... #N.]

38 Background: hand-crafted features: Structure Tensor [Eitz, 2010].
Flickr15K benchmark (mAP %): Structure Tensor [Eitz, 2010]: 7.98.
[Figure: the structure tensor is built from image-gradient products ($I_x^2$, $I_x I_y$, $I_y^2$) within local windows and quantised against a dictionary.]

39 Background: hand-crafted features: Shape Context [Mori, 2005].
Flickr15K benchmark (mAP %): Structure Tensor [Eitz, 2010]: 7.98; Shape Context [Mori, 2005]: 8.14.

40 Background: hand-crafted features: Self-similarity (SSIM) [Shechtman, 2007].
Flickr15K benchmark (mAP %): Structure Tensor [Eitz, 2010]: 7.98; Shape Context [Mori, 2005]: 8.14; SSIM [Shechtman, 2007]: 9.57.

41 Background: hand-crafted features: SIFT [Lowe, 2004] and HoG [Dalal, 2005].
Flickr15K benchmark (mAP %): Structure Tensor [Eitz, 2010]: 7.98; Shape Context [Mori, 2005]: 8.14; SSIM [Shechtman, 2007]: 9.57; SIFT [Lowe, 2004]: 9.11; HoG [Dalal, 2005].

42 Background: hand-crafted features: GF-HoG [Hu et al. CVIU 2013] and Color GF-HoG [Bui et al. ICCV 2015].
Flickr15K benchmark (mAP %): Structure Tensor [Eitz, 2010]: 7.98; Shape Context [Mori, 2005]: 8.14; SSIM [Shechtman, 2007]: 9.57; SIFT [Lowe, 2004]: 9.11; HoG [Dalal, 2005]; GF-HoG [Hu, 2013]; Color GF-HoG [Bui, 2015].

43 Background: hand-crafted features: PerceptualEdge [Qi, 2015]. [Figure: gPb edges vs. perceptual edges.]
Flickr15K benchmark (mAP %): Structure Tensor [Eitz, 2010]: 7.98; Shape Context [Mori, 2005]: 8.14; SSIM [Shechtman, 2007]: 9.57; SIFT [Lowe, 2004]: 9.11; HoG [Dalal, 2005]; GF-HoG [Hu, 2013]; Color GF-HoG [Bui, 2015]; PerceptualEdge [Qi, 2015].

44 Background: deep features
- Siamese network with contrastive loss, Qi et al. ICIP 2016:
  o Sketch-edgemap matching.
  o Fully shared branches.
Flickr15K benchmark (mAP %): Structure Tensor [Eitz, 2010]: 7.98; Shape Context [Mori, 2005]: 8.14; SSIM [Shechtman, 2007]: 9.57; SIFT [Lowe, 2004]: 9.11; HoG [Dalal, 2005]; GF-HoG [Hu, 2013]; Color GF-HoG [Bui, 2015]; PerceptualEdge [Qi, 2015]; Siamese network [Qi, 2016].

45 Triplet network for SBIR
- Sketch-edgemap matching.
- CNN architecture: Sketch-A-Net [Yu, 2015].
- Output dimension: 100.
- Shared layers: conv4-5, fc6-8.
- Loss: $L = \frac{1}{2N}\sum_{i=1}^{N}\max\!\left(0,\, m + \|k\,a_i - p_i\|_2^2 - \|k\,a_i - n_i\|_2^2\right)$ with $k = 2.0$.
[Figure: the three branches a, p, n with layers C1-C5 and fc6-fc8.]

46 Training procedure
Images: 25k photos, 100 photos/class. Edge extraction: gPb [Arbelaez, 2011]. Mean subtraction, random crop/rotation/scaling/flip.
Sketches: 20k sketches, 20 for training and 60 for validation per class. Skeletonisation. Mean subtraction, random crop/rotation/scaling/flip. Random stroke removal.
Triplet formation: random selection of pos/neg samples.
Training: 10k epochs, multi-step decreasing learning rate.
[Figure: examples of crop, rotation, scaling, flip and stroke-removal augmentation.]
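A rough torchvision-style version of the photo-side augmentation listed above (parameter values are assumptions; skeletonisation and random stroke removal are sketch-specific steps that would need custom transforms and are omitted here):

```python
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomRotation(degrees=15),                 # random rotation
    transforms.RandomResizedCrop(225, scale=(0.8, 1.0)),   # random crop + rescaling
    transforms.RandomHorizontalFlip(),                     # random flip
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],       # mean subtraction
                         std=[1.0, 1.0, 1.0]),
])
```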

47 Results
Flickr15K benchmark (mAP %): Structure Tensor [Eitz, 2010]: 7.98; Shape Context [Mori, 2005]: 8.14; SSIM [Shechtman, 2007]: 9.57; SIFT [Lowe, 2004]: 9.11; HoG [Dalal, 2005]; GF-HoG [Hu, 2013]; Colour GF-HoG [Bui, 2015]; PerceptualEdge [Qi, 2015]; Single CNN; Siamese network [Qi, 2016]; Triplet full-share [Bui, 2016]; Triplet no-share [Bui, 2016]; Triplet half-share [Bui, 2016].

48 Sketch-photo direct matching. [Plot: training loss vs. epochs for a triplet network (branches a, p, n) trained to match sketches directly to photos; training fails to converge.]

49 Sketch-photo direct matching. [Architecture: a SketchANet/AlexNet hybrid sketch branch and AlexNet photo branches; each branch has its own softmax loss (loss weights ×1.0 and ×2.0) in addition to the shared triplet loss; special layers perform dimensional reduction and normalisation.]

50 Multi-stage training procedure
Stage 1: train unshared layers. Train the sketch branch from scratch; fine-tune the image branch from AlexNet (softmax loss).
Stage 2: train shared layers. Form a 2-branch network with the pretrained weights, freeze the unshared layers, and train the shared layers with contrastive loss + softmax loss.
Stage 3: regression with triplet loss. Form a triplet network, unfreeze all layers, and train the whole network with triplet loss + softmax loss.
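The freeze/unfreeze mechanics behind the three stages can be expressed with a small helper; the module names in the commented schedule are illustrative assumptions, not the exact modules from the talk.

```python
import torch.nn as nn

def set_trainable(module: nn.Module, trainable: bool) -> None:
    """Freeze or unfreeze a block of layers between training stages."""
    for param in module.parameters():
        param.requires_grad = trainable

# Stage 2 (sketch): keep the branch-specific layers fixed, train only the shared layers.
#   set_trainable(sketch_branch_unshared, False)
#   set_trainable(photo_branch_unshared, False)
#   set_trainable(shared_layers, True)
# Stage 3: unfreeze everything and fine-tune with triplet + softmax loss.
#   set_trainable(whole_network, True)
```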

51 Training results. [Plots: training curves for Phase 1 (sketch branch and image branch trained separately), Phase 2 (Siamese network) and Phase 3 (triplet network).]

52 Results
Flickr15K benchmark (mAP %): Structure Tensor [Eitz, 2010]: 7.98; Shape Context [Mori, 2005]: 8.14; SSIM [Shechtman, 2007]: 9.57; SIFT [Lowe, 2004]: 9.11; HoG [Dalal, 2005]; GF-HoG [Hu, 2013]; Colour GF-HoG [Bui, 2015]; PerceptualEdge [Qi, 2015]; Single CNN; Siamese network [Qi, 2016]; Sketch-edgemap triplet [Bui, 2016]; Sketch-photo triplet.

53 Layer visualisation: 64 15×15 filters in the conv1 layer of SketchANet; 96 11×11 filters in the conv1 layer of AlexNet.

54 SBIR example 54

55 Demo: SketchSearch (sketch-based image retrieval).

56 Content The regression problem Siamese network and contrastive loss Triplet network and triplet loss Training tricks Regression application: sketch-based image retrieval Limitations and future work 56

57 Limitations
o Hard to train a regression model.
o Need labelled datasets.
o Real-life sketches can be very complicated (e.g. Guernica by Pablo Picasso, 1937).

58 Future work
o Multi-domain regression, e.g. 3D, text, photo, sketch, depth map, cartoon [Castrejon, 2016; Siddiquie, 2014]: a language model, 3D model, photo model and sketch model all mapping into one embedding space.
o Toward unsupervised deep learning:
  - Labelled image set, unlabelled or no sketch set [Radenovic, 2017].
  - Completely unsupervised: auto-encoders, Generative Adversarial Networks (GANs).

59 Thank you for listening 59
