Stereo Matching, Optical Flow, Filling the Gaps and more
Transcription
1 Stereo Matching, Optical Flow, Filling the Gaps and more Prof. Lior Wolf The School of Computer Science, Tel-Aviv University ICRI-CI 2017 Retreat, May 9, 2017
2 Since last year, ICRI-CI supported projects of 8 students!
Shay Zweig, L. Wolf. InterpoNet, a Brain Inspired Neural Network for Optical Flow Dense Interpolation. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR).
Aviv Eisenschtat, L. Wolf. Linking Image and Text with 2-Way Nets. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR).
Amit Shaked, L. Wolf. Improved Stereo Matching with Constant Highway Networks and Reflective Confidence Learning. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR).
Tal Schuster, L. Wolf, David Gadot. Optical Flow Requires Multiple Strategies (But Only One Network). IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Dotan Kaufman, Gil Levi, Tal Hassner, L. Wolf. Temporal Tessellation: A Unified Approach for Video Analysis. In submission.
Ofir Press, L. Wolf. Using the Output Embedding to Improve Language Models. European Chapter of the Association for Computational Linguistics (EACL). Short paper, 2017.
3 Optical flow The problem: Estimating a dense correspondence field between two images - usually consecutive video frames.
4 Optical flow modern pipeline A two stage method: Sparse matching Dense interpolation
5 PatchBatch - Overall Pipeline
6 PatchBatch Minor Improvements Hinge loss instead of DRLIM Keeping the additional SD component
7 Optical Flow as a Multifaceted Problem. Methods keep failing on large displacements: MPI-Sintel top results; KITTI 2015 average error (foreground %, background %). Possible causes: matching algorithm, descriptor quality. Figure: PatchBatch on KITTI 2012, distance between true matches.
8 Distractors by displacement. #distractors: how many pixels within a 25px radius have descriptors more similar than the true match. The count increases with displacement range. Goal: improve results for large displacements without degrading results for other ranges.
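A minimal sketch of how such a distractor count could be computed. It rests on assumptions that the slide does not state: descriptors are compared with squared L2 distance, and the 25px radius is centred on the true match. The names `sq_dist`, `count_distractors` and the toy grid are hypothetical.

```python
def sq_dist(u, v):
    """Squared L2 distance between two descriptors (lists of floats)."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def count_distractors(src_desc, desc_b, true_xy, radius=25):
    """Count pixels around the true match whose descriptor is closer to
    src_desc than the true match's own descriptor is.

    desc_b is a 2D grid indexed as desc_b[y][x] -> descriptor.
    Assumption: the search disc is centred on the true match.
    """
    tx, ty = true_xy
    true_d = sq_dist(src_desc, desc_b[ty][tx])
    height, width = len(desc_b), len(desc_b[0])
    count = 0
    for y in range(max(0, ty - radius), min(height, ty + radius + 1)):
        for x in range(max(0, tx - radius), min(width, tx + radius + 1)):
            if (x, y) == (tx, ty):
                continue  # the true match is not its own distractor
            if (x - tx) ** 2 + (y - ty) ** 2 > radius ** 2:
                continue  # outside the disc
            if sq_dist(src_desc, desc_b[y][x]) < true_d:
                count += 1
    return count


# Toy 3x3 grid of 1-D descriptors; the true match is the centre pixel.
grid = [[[5.0], [1.1], [5.0]],
        [[5.0], [1.5], [5.0]],
        [[0.9], [5.0], [5.0]]]
n_wide = count_distractors([1.0], grid, true_xy=(1, 1), radius=25)
n_tight = count_distractors([1.0], grid, true_xy=(1, 1), radius=1)
```

With the wide radius both the pixel above the centre and the bottom-left pixel count as distractors; with radius 1 only the pixel above does.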
9 Distractors by displacement Goal: improve results for large displacements without reducing for other ranges. Is it possible? Expert models Training only on sub ranges Improving results for large displacements is possible Implies the need for different features for different descriptors
10 Need for variant extracting strategies Large motions are mostly correlated with more changes in appearance: 1. Background changes 2. View point changes -> occluded parts 3. Distance and angle to light source -> illumination 4. Scale (when moving along the Z-axis)
11 Learning for Multiple Strategies and Varying Difficulty Deal with varying difficulty Curriculum (Bengio et al.): Samples are pre-ordered Curriculum by displacement Curriculum by distance (of false sample) Self-Paced (Kumar et al.): No need to pre-order Sample hardness increases with time (by loss value) Hard Sample Mining (Simo-Serra et al.): Backpropagate only some ratio of harder samples Used for training local descriptors with triplets
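As an illustration of the hard sample mining idea listed above (backpropagating only some ratio of the harder samples), here is a small sketch. It only selects which samples would contribute to the gradient; the function name `hardest_indices` and the ratio value are hypothetical, not taken from the talk.

```python
def hardest_indices(losses, ratio):
    """Return the indices of the hardest `ratio` fraction of samples,
    ordered from highest loss to lowest. At least one sample is kept."""
    if not losses:
        raise ValueError("losses must not be empty")
    k = max(1, int(len(losses) * ratio))
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return order[:k]


# Per-sample losses for one mini-batch (made-up numbers).
batch_losses = [0.1, 0.9, 0.4, 0.7]
selected = hardest_indices(batch_losses, ratio=0.5)
```

In a training loop, only the losses at the selected indices would be summed before the backward pass.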
12 Interleaving Learning. Goal: deal with multiple sub-tasks. Classification: painting to artist. Motivated by psychological research (Kornell and Bjork): Blocking (Massing) vs. Spacing (Interleaving) <<Unintuitive! and goes against previous work>>. Experiments on classification tasks, sports, etc. Learning ML models: usually random; applying gradual methods can affect randomness. Learning Concepts and Categories, Kornell and Bjork (2008)
13 Interleaving Learning for Optical Flow Controlling the negative sample to balance difficulty
14 Interleaving Learning for Optical Flow
15 Self-Paced Curriculum Interleaving Learning (SPCI). $l_i$: validation loss on epoch $i$; $l_{init}$: initial loss value (epoch #5); $m$: total number of epochs
16 MNIST: L = 0..4, H = 5..9. Random noise on the top half of H and the bottom of L. Images from H were rotated by an angle in [0, 45], correlated with the noise amount. A general learning paradigm
17 KITTI2012 KITTI2015 MPI-Sintel SOTA on KITTI, what about MPI-Sintel?
18 Optical flow modern pipeline A two stage method: Sparse matching Dense interpolation Sparse to dense interpolation: EpicFlow Edge preserving interpolation (Revaud et al. 2015)
19 Research goal: construct a CNN-based solution for sparse-to-dense OF interpolation. Motivation: allows more flexibility, increases performance, and gives a faster runtime.
20 Interpolation in the brain an inspiration Perceptual filling-in: Neuronal filling-in: (Zweig et al. 2015) Spatial propagation top down and lateral connections. (Zurawel et al. 2014, Zweig et al. 2015, Huang et al. 2008, Poort et al. 2012) Edges as a barrier (von der Heydt et al. 2003) Multilayer process (Poort et al. 2012, Meng et al. 2005)
21 Interponet architecture overview A Fully convolutional network with no pooling.
22 Interponet Input: the edge input boosts the network's performance; the network uses the edges as a boundary for propagation.
23 Interponet main branch: ten 7x7 convolutions, no pooling, ELU non-linearity (Clevert et al. 2015)
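A short worked calculation of what a stack like this implies for spatial context. It assumes stride 1 and no dilation, neither of which is stated on the slide; the helper `receptive_field` is hypothetical.

```python
def receptive_field(kernel_sizes, strides=None):
    """Receptive field (in input pixels, along one axis) of a stack of
    convolution layers without dilation."""
    if strides is None:
        strides = [1] * len(kernel_sizes)
    rf, jump = 1, 1
    for k, s in zip(kernel_sizes, strides):
        rf += (k - 1) * jump  # each layer widens the field by (k - 1) * jump
        jump *= s             # stride scales the step between output pixels
    return rf


# Ten stacked 7x7 convolutions, stride 1, no pooling.
rf = receptive_field([7] * 10)
```

Under these assumptions each output pixel sees a 61x61 input window (1 + 10 * 6), which bounds how far sparse matches can propagate through the main branch alone.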
24 Loss function: standard EPE loss
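For reference, a minimal sketch of the standard average end-point error (EPE): the mean Euclidean distance between predicted and ground-truth flow vectors. The function name and the flat list-of-pairs layout are illustrative choices, not the talk's implementation.

```python
import math


def average_epe(pred, gt):
    """Mean Euclidean distance between predicted and ground-truth flow.

    pred, gt: equal-length lists of (u, v) flow vectors, one per pixel.
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and of equal length")
    total = sum(math.hypot(pu - gu, pv - gv)
                for (pu, pv), (gu, gv) in zip(pred, gt))
    return total / len(pred)


# Two pixels: one is off by the vector (3, 4), the other is exact.
epe = average_epe([(3.0, 4.0), (1.0, 1.0)], [(0.0, 0.0), (1.0, 1.0)])
```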
25 Lateral dependency loss. Main concept: include the local context in the training process. Encourages smoothness and edginess
26 Detour networks and multi-layer loss Supervision at each layer
27 Detour networks and multi-layer loss
28 Benchmark results SOTA on Sintel and KITTI Improving over EpicFlow for all the underlying matching algorithms we checked (4 of the leading algorithms)
29 The network learned to interpolate in a similar manner to the visual system
30 Stereo Matching: 3D scene reconstruction, robotics, autonomous cars, augmented reality. Major challenges: occlusions, highly reflective regions, sparse texture regions, repetitive patterns
31 Previous work [Zbontar and LeCun 15]: employ a CNN to compute the matching cost for each possible disparity
32 Previous work [Zbontar and LeCun 15]: employ a CNN to compute the matching cost for each possible disparity. Apply cost aggregation and smoothness constraints. Use the winner-takes-all rule to compute the disparity image. Refine the obtained image.
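The winner-takes-all step in this pipeline can be sketched as a per-pixel argmin over a cost volume. This assumes lower cost means a better match and uses a hypothetical `cost_volume[d][y][x]` layout; it is an illustration, not the cited authors' code.

```python
def winner_takes_all(cost_volume):
    """Pick, for every pixel, the disparity with the lowest matching cost.

    cost_volume[d][y][x] is the cost of assigning disparity d to pixel (y, x).
    Returns a disparity map indexed as disparity[y][x].
    """
    num_disp = len(cost_volume)
    height = len(cost_volume[0])
    width = len(cost_volume[0][0])
    return [[min(range(num_disp), key=lambda d: cost_volume[d][y][x])
             for x in range(width)]
            for y in range(height)]


# Three candidate disparities for a 1x2 image (made-up costs).
costs = [[[0.9, 0.2]],
         [[0.1, 0.8]],
         [[0.5, 0.1]]]
disparity = winner_takes_all(costs)
```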
33 Research questions
Motivation: Using color information does not improve the quality of the disparity maps. Adding more layers does not help. Stacking residual layers does not converge to a meaningful solution. Research question: Design a residual architecture that is more suitable for metric learning vs. multiclass classification. Our solution: Multilevel Constant Highway Network.
Motivation: Poor results on reflective and occluded regions. Research question: Providing a solution for reflective and occluded regions. Our solution: Apply a learned criterion and replace the WTA approach (Global Disparity Network).
Motivation: Occluded pixels and mismatch predictions are still common. They can mostly be assessed from their neighbors. Research question: How to measure certainty of classifiers? Our solution: Reflective learning.
34 Multilevel Constant Highway Network. Basic residual block; constant highway skip-connection; outer λ-residual block. Diagram: two inner blocks (Conv 112,112, ReLU, Conv 112,112, Add) followed by an outer Add, with labels f1, f2, y0, y1, y2, λ0, λ1, λ2.
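The formulas for these blocks did not survive transcription, so the following is only a sketch under an explicit assumption: that a constant highway skip-connection scales the identity path by a scalar λ, i.e. y_{i+1} = f_{i+1}(y_i) + λ_{i+1} * y_i, and that the outer block adds λ0 * y0 to the output of two such inner blocks. The function names and the toy layers are hypothetical.

```python
def highway_step(f, lam, y):
    """One assumed constant-highway step: f(y) + lam * y, elementwise."""
    return [fy + lam * yi for fy, yi in zip(f(y), y)]


def outer_block(f1, f2, lam0, lam1, lam2, y0):
    """Two inner steps wrapped by an outer skip scaled by lam0 (assumed form)."""
    y1 = highway_step(f1, lam1, y0)
    y2 = highway_step(f2, lam2, y1)
    return [a + lam0 * b for a, b in zip(y2, y0)]


# Toy stand-ins for the Conv-ReLU-Conv branches.
double = lambda v: [2.0 * x for x in v]
plus_one = lambda v: [x + 1.0 for x in v]
out = outer_block(double, plus_one, lam0=0.5, lam1=1.0, lam2=2.0, y0=[1.0, 2.0])
```

With λ fixed to 1 everywhere this reduces to ordinary residual connections, which is one way to read the "basic residual block" label on the slide.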
35 Multilevel Constant Highway Network. A: Outer λ-residual block: Conv 112,112, ReLU, Conv 112,112, Add; Conv 112,112, ReLU, Conv 112,112, Add; Add (labels f1, f2, y0, y1, y2, λ0, λ1, λ2). B: Description Network: Conv1 3 -> 112, ReLU, ReLU, Conv > 112, Conv > 112, ReLU, Outerblock1, Outerblock5; Input 11X11X3, 9X9X112, Descriptor 1X1X112. C: Full network in training: Concat 2X112 -> 224, FC 224 -> 384, ReLU, FC 384 -> 384, ReLU, FC 384 -> 384, ReLU, FC 384 -> 384, ReLU, FC 384 -> 1; Sigmoid, BCE; Dot Product, Hinge; Hybrid Loss.
36 Global Disparity Network. Diagram labels: Matching Cost Network, Global Disparity Network, height, width. Training.
37 Global Disparity Network. Training. Criterion [2]. [2] The criterion is similar to W. Luo, A. Schwing, and R. Urtasun: Efficient deep learning for stereo matching.
38 Global Disparity Network Architecture. Reflective confidence: $y^{GT}_{ref} = 1$ if $|\arg\max_i y_i - y^{GT}| < \lambda$, and $0$ otherwise. Loss: $\mathrm{loss}(y_{ref}, y^{GT}_{ref}) = -(1 - y^{GT}_{ref}) \ln(1 - y_{ref}) - y^{GT}_{ref} \ln(y_{ref})$
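A small sketch of the reflective confidence idea, reading the slide's formula as a binary cross-entropy against a correctness label: the label is 1 when the predicted disparity (the argmax of the scores) lies within λ of the ground truth. The absolute value, the minus signs and the clamping constant `eps` are assumptions made for this illustration, and the function names are hypothetical.

```python
import math


def reflective_target(scores, gt_disparity, lam):
    """1 if the argmax disparity is within lam of the ground truth, else 0."""
    predicted = max(range(len(scores)), key=lambda i: scores[i])
    return 1 if abs(predicted - gt_disparity) < lam else 0


def reflective_loss(y_ref, y_ref_gt, eps=1e-12):
    """Binary cross-entropy between predicted confidence and the label."""
    y_ref = min(max(y_ref, eps), 1.0 - eps)  # avoid log(0)
    return -(1 - y_ref_gt) * math.log(1 - y_ref) - y_ref_gt * math.log(y_ref)


scores = [0.1, 0.7, 0.2]            # per-disparity scores for one pixel
label_hit = reflective_target(scores, gt_disparity=1, lam=1)
label_miss = reflective_target(scores, gt_disparity=3, lam=1)
loss_value = reflective_loss(0.5, label_hit)
```

The network is thus trained to predict whether its own disparity estimate is correct, which is what the confidence score is later used for.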
39 Outlier detection and interpolation. Pixel labeling, where: $C_L(p)$ is the confidence score at position $p$ of the prediction $d = D_L(p)$; $C_L(pd)$ is the confidence score at position $pd$ of the prediction $d = D_L(pd)$. Pixel interpolation: Mismatch: the median of the nearest neighbors labeled as correct from 16 different directions. Occlusion: move left until the first correct pixel and use its value.
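The two interpolation rules can be sketched as follows. The labels are taken as given, since the labeling formula itself is not in the transcription. The ray-walking scheme (16 evenly spaced angles, coordinates rounded to the nearest pixel) and all names are assumptions made for illustration.

```python
import math
from statistics import median

CORRECT, MISMATCH, OCCLUSION = 0, 1, 2


def first_correct_along(disp, labels, y, x, dy, dx):
    """Walk from (y, x) along direction (dy, dx) and return the disparity of
    the first pixel labelled CORRECT, or None if the image border is hit."""
    height, width = len(disp), len(disp[0])
    step = 1
    while True:
        yy = y + round(step * dy)
        xx = x + round(step * dx)
        if not (0 <= yy < height and 0 <= xx < width):
            return None
        if labels[yy][xx] == CORRECT:
            return disp[yy][xx]
        step += 1


def interpolate(disp, labels, num_directions=16):
    """Fill occluded and mismatched pixels from pixels labelled CORRECT."""
    out = [row[:] for row in disp]
    for y, row in enumerate(labels):
        for x, label in enumerate(row):
            if label == OCCLUSION:
                # Move left until the first correct pixel and copy its value.
                value = first_correct_along(disp, labels, y, x, 0, -1)
                if value is not None:
                    out[y][x] = value
            elif label == MISMATCH:
                # Median of the nearest correct pixels over all directions.
                found = []
                for k in range(num_directions):
                    angle = 2 * math.pi * k / num_directions
                    value = first_correct_along(
                        disp, labels, y, x, math.sin(angle), math.cos(angle))
                    if value is not None:
                        found.append(value)
                if found:
                    out[y][x] = median(found)
    return out


disp = [[5, 0, 9, 0]]
labels = [[CORRECT, OCCLUSION, CORRECT, MISMATCH]]
filled = interpolate(disp, labels)
```

In this one-row example the occluded pixel copies the value to its left, and the mismatched pixel takes the median of the correct pixels it can reach.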
40 Results Benchmark results Fastest methods
41 Results Residual networks comparison
42 Results Confidence measures comparison
43 Theme II: Vision and Language. A camera is viewing a scene and outputs a textual description of the activity over time. The engine learns from pairs of the form: image + caption (weak supervision). "One child is opening a cabinet while the other kid is talking on the phone." "The kid is checking the refrigerator." "The girl is operating the microwave."
44 Goal
45 Goal
46 Model 2-Way Net. Loss: $L = \|H(x) - y\| + \|\hat{H}(y) - x\| + \|H_j(x) - \hat{H}_j(y)\|$, where $H(x)$ and $\hat{H}(y)$ are reconstruction outputs and $H_j(x)$ and $\hat{H}_j(y)$ are middle-network representations
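A minimal sketch of the three-term loss on this slide: two reconstruction terms (one per direction) plus a term that pulls the middle-network representations of the two channels together. The plain Euclidean norm and the absence of any weighting between terms are assumptions, since the transcription does not show them; the function names are hypothetical.

```python
import math


def l2(u, v):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def two_way_loss(h_x, y, h_hat_y, x, mid_x, mid_y):
    """||H(x) - y|| + ||H^(y) - x|| + ||H_j(x) - H^_j(y)||.

    h_x, h_hat_y: reconstruction outputs of the two channels.
    mid_x, mid_y: middle-network representations of x and y.
    """
    return l2(h_x, y) + l2(h_hat_y, x) + l2(mid_x, mid_y)


loss = two_way_loss(h_x=[3.0, 4.0], y=[0.0, 0.0],
                    h_hat_y=[1.0, 1.0], x=[1.0, 1.0],
                    mid_x=[0.0, 2.0], mid_y=[0.0, 0.0])
```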
47 Architecture Dense layer with shared weights, followed by Highly Leaky ReLU Batch Normalization layer with variance injection Tied Dropout Layer Locally Dense Layer for high-dimensional data
48 Results Image representation using VGG representation layer and GMM-HGLMM fisher vector pooling for sentence representations from Klein et al, 2014 Recall is measured on top-1 and top-5 ranked matches
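Recall at top-k, as used for these retrieval results, can be sketched like this. The sketch assumes a square similarity matrix in which the correct candidate for query i is candidate i; real benchmarks may have several correct captions per image, which this toy version ignores. The name `recall_at_k` and the numbers are illustrative.

```python
def recall_at_k(similarity, k):
    """Fraction of queries whose correct match is among the k best-scored
    candidates. similarity[i][j] scores query i against candidate j."""
    hits = 0
    for i, row in enumerate(similarity):
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return hits / len(similarity)


# Made-up similarity scores for three image/sentence pairs.
sim = [[0.9, 0.1, 0.0],
       [0.8, 0.5, 0.1],
       [0.2, 0.3, 0.1]]
r1 = recall_at_k(sim, 1)
r2 = recall_at_k(sim, 2)
```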
49 Examples - COCO
50 Examples - COCO
51 Task I Video Annotation Now at a restaurant a waitress serves some food
52 Task II Video Summary Input - Raw Video Output Video Summary
53 Task III Video Action Detection Detecting baseball pitch. Green: machine detection. Blue: ground truth.
54 Tessellation
55 Local Tessellation
56 Unsupervised Tessellation
57 Supervised Tessellation
58 Examples
59 Results
60 Results Video Captioning
61 Results Video Summary
62 Results - Action Detection
63 State of the Art Stereo Matching. Interponet: a network-based interpolation. Interleaving Learning for Optical Flow. Tessellation. Matching Text with Images (2-Way Net). Huge thank you to [logo not transcribed] for enabling our work
CS231N Section. Video Understanding 6/1/2018
CS231N Section Video Understanding 6/1/2018 Outline Background / Motivation / History Video Datasets Models Pre-deep learning CNN + RNN 3D convolution Two-stream What we ve seen in class so far... Image
More informationImproved Stereo Matching with Constant Highway Networks and Reflective Confidence Learning
Improved Stereo Matching with Constant Highway Networks and Reflective Confidence Learning Amit Shaked 1 and Lior Wolf 1,2 1 The Blavatnik School of Computer Science, Tel Aviv University, Israel 2 Facebook
More informationDeep Learning For Video Classification. Presented by Natalie Carlebach & Gil Sharon
Deep Learning For Video Classification Presented by Natalie Carlebach & Gil Sharon Overview Of Presentation Motivation Challenges of video classification Common datasets 4 different methods presented in
More informationMachine Learning. Deep Learning. Eric Xing (and Pengtao Xie) , Fall Lecture 8, October 6, Eric CMU,
Machine Learning 10-701, Fall 2015 Deep Learning Eric Xing (and Pengtao Xie) Lecture 8, October 6, 2015 Eric Xing @ CMU, 2015 1 A perennial challenge in computer vision: feature engineering SIFT Spin image
More informationOptical Flow Requires Multiple Strategies (but only one network)
Optical Flow Requires Multiple Strategies (but only one network) Tal Schuster 1 Lior Wolf 1,2 David Gadot 1 1 The Blavatnik School of Computer Science, Tel Aviv University, Israel 2 Facebook AI Research
More informationUnsupervised Learning of Spatiotemporally Coherent Metrics
Unsupervised Learning of Spatiotemporally Coherent Metrics Ross Goroshin, Joan Bruna, Jonathan Tompson, David Eigen, Yann LeCun arxiv 2015. Presented by Jackie Chu Contributions Insight between slow feature
More informationDepth from Stereo. Dominic Cheng February 7, 2018
Depth from Stereo Dominic Cheng February 7, 2018 Agenda 1. Introduction to stereo 2. Efficient Deep Learning for Stereo Matching (W. Luo, A. Schwing, and R. Urtasun. In CVPR 2016.) 3. Cascade Residual
More informationDeep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks
Deep Tracking: Biologically Inspired Tracking with Deep Convolutional Networks Si Chen The George Washington University sichen@gwmail.gwu.edu Meera Hahn Emory University mhahn7@emory.edu Mentor: Afshin
More informationComputing the Stereo Matching Cost with CNN
University at Austin Figure. The of lefttexas column displays the left input image, while the right column displays the output of our stereo method. Examples are sorted by difficulty, with easy examples
More informationActiveStereoNet: End-to-End Self-Supervised Learning for Active Stereo Systems (Supplementary Materials)
ActiveStereoNet: End-to-End Self-Supervised Learning for Active Stereo Systems (Supplementary Materials) Yinda Zhang 1,2, Sameh Khamis 1, Christoph Rhemann 1, Julien Valentin 1, Adarsh Kowdle 1, Vladimir
More informationStructured Prediction using Convolutional Neural Networks
Overview Structured Prediction using Convolutional Neural Networks Bohyung Han bhhan@postech.ac.kr Computer Vision Lab. Convolutional Neural Networks (CNNs) Structured predictions for low level computer
More informationPerceptual Loss for Convolutional Neural Network Based Optical Flow Estimation. Zong-qing LU, Xiang ZHU and Qing-min LIAO *
2017 2nd International Conference on Software, Multimedia and Communication Engineering (SMCE 2017) ISBN: 978-1-60595-458-5 Perceptual Loss for Convolutional Neural Network Based Optical Flow Estimation
More informationCENG 783. Special topics in. Deep Learning. AlchemyAPI. Week 11. Sinan Kalkan
CENG 783 Special topics in Deep Learning AlchemyAPI Week 11 Sinan Kalkan TRAINING A CNN Fig: http://www.robots.ox.ac.uk/~vgg/practicals/cnn/ Feed-forward pass Note that this is written in terms of the
More informationUsings CNNs to Estimate Depth from Stereo Imagery
1 Usings CNNs to Estimate Depth from Stereo Imagery Tyler S. Jordan, Skanda Shridhar, Jayant Thatte Abstract This paper explores the benefit of using Convolutional Neural Networks in generating a disparity
More informationMOTION ESTIMATION USING CONVOLUTIONAL NEURAL NETWORKS. Mustafa Ozan Tezcan
MOTION ESTIMATION USING CONVOLUTIONAL NEURAL NETWORKS Mustafa Ozan Tezcan Boston University Department of Electrical and Computer Engineering 8 Saint Mary s Street Boston, MA 2215 www.bu.edu/ece Dec. 19,
More informationFlow Estimation. Min Bai. February 8, University of Toronto. Min Bai (UofT) Flow Estimation February 8, / 47
Flow Estimation Min Bai University of Toronto February 8, 2016 Min Bai (UofT) Flow Estimation February 8, 2016 1 / 47 Outline Optical Flow - Continued Min Bai (UofT) Flow Estimation February 8, 2016 2
More informationOptical flow. Cordelia Schmid
Optical flow Cordelia Schmid Motion field The motion field is the projection of the 3D scene motion into the image Optical flow Definition: optical flow is the apparent motion of brightness patterns in
More informationAdaDepth: Unsupervised Content Congruent Adaptation for Depth Estimation
AdaDepth: Unsupervised Content Congruent Adaptation for Depth Estimation Introduction Supplementary material In the supplementary material, we present additional qualitative results of the proposed AdaDepth
More informationCOMP9444 Neural Networks and Deep Learning 7. Image Processing. COMP9444 c Alan Blair, 2017
COMP9444 Neural Networks and Deep Learning 7. Image Processing COMP9444 17s2 Image Processing 1 Outline Image Datasets and Tasks Convolution in Detail AlexNet Weight Initialization Batch Normalization
More informationPresented at the FIG Congress 2018, May 6-11, 2018 in Istanbul, Turkey
Presented at the FIG Congress 2018, May 6-11, 2018 in Istanbul, Turkey Evangelos MALTEZOS, Charalabos IOANNIDIS, Anastasios DOULAMIS and Nikolaos DOULAMIS Laboratory of Photogrammetry, School of Rural
More informationDeep neural networks II
Deep neural networks II May 31 st, 2018 Yong Jae Lee UC Davis Many slides from Rob Fergus, Svetlana Lazebnik, Jia-Bin Huang, Derek Hoiem, Adriana Kovashka, Why (convolutional) neural networks? State of
More informationYOLO9000: Better, Faster, Stronger
YOLO9000: Better, Faster, Stronger Date: January 24, 2018 Prepared by Haris Khan (University of Toronto) Haris Khan CSC2548: Machine Learning in Computer Vision 1 Overview 1. Motivation for one-shot object
More informationDeep Learning for Computer Vision II
IIIT Hyderabad Deep Learning for Computer Vision II C. V. Jawahar Paradigm Shift Feature Extraction (SIFT, HoG, ) Part Models / Encoding Classifier Sparrow Feature Learning Classifier Sparrow L 1 L 2 L
More informationLecture 7: Semantic Segmentation
Semantic Segmentation CSED703R: Deep Learning for Visual Recognition (207F) Segmenting images based on its semantic notion Lecture 7: Semantic Segmentation Bohyung Han Computer Vision Lab. bhhanpostech.ac.kr
More informationDeep Learning in Visual Recognition. Thanks Da Zhang for the slides
Deep Learning in Visual Recognition Thanks Da Zhang for the slides Deep Learning is Everywhere 2 Roadmap Introduction Convolutional Neural Network Application Image Classification Object Detection Object
More informationInception and Residual Networks. Hantao Zhang. Deep Learning with Python.
Inception and Residual Networks Hantao Zhang Deep Learning with Python https://en.wikipedia.org/wiki/residual_neural_network Deep Neural Network Progress from Large Scale Visual Recognition Challenge (ILSVRC)
More informationFast Guided Global Interpolation for Depth and. Yu Li, Dongbo Min, Minh N. Do, Jiangbo Lu
Fast Guided Global Interpolation for Depth and Yu Li, Dongbo Min, Minh N. Do, Jiangbo Lu Introduction Depth upsampling and motion interpolation are often required to generate a dense, high-quality, and
More informationDisguised Face Identification (DFI) with Facial KeyPoints using Spatial Fusion Convolutional Network. Nathan Sun CIS601
Disguised Face Identification (DFI) with Facial KeyPoints using Spatial Fusion Convolutional Network Nathan Sun CIS601 Introduction Face ID is complicated by alterations to an individual s appearance Beard,
More informationUnFlow: Unsupervised Learning of Optical Flow with a Bidirectional Census Loss
UnFlow: Unsupervised Learning of Optical Flow with a Bidirectional Census Loss AAAI 2018, New Orleans, USA Simon Meister, Junhwa Hur, and Stefan Roth Department of Computer Science, TU Darmstadt 2 Deep
More informationSSD: Single Shot MultiBox Detector. Author: Wei Liu et al. Presenter: Siyu Jiang
SSD: Single Shot MultiBox Detector Author: Wei Liu et al. Presenter: Siyu Jiang Outline 1. Motivations 2. Contributions 3. Methodology 4. Experiments 5. Conclusions 6. Extensions Motivation Motivation
More informationFully Convolutional Networks for Semantic Segmentation
Fully Convolutional Networks for Semantic Segmentation Jonathan Long* Evan Shelhamer* Trevor Darrell UC Berkeley Chaim Ginzburg for Deep Learning seminar 1 Semantic Segmentation Define a pixel-wise labeling
More informationConvolutional Neural Networks. Computer Vision Jia-Bin Huang, Virginia Tech
Convolutional Neural Networks Computer Vision Jia-Bin Huang, Virginia Tech Today s class Overview Convolutional Neural Network (CNN) Training CNN Understanding and Visualizing CNN Image Categorization:
More informationMachine Learning 13. week
Machine Learning 13. week Deep Learning Convolutional Neural Network Recurrent Neural Network 1 Why Deep Learning is so Popular? 1. Increase in the amount of data Thanks to the Internet, huge amount of
More informationDeep learning for dense per-pixel prediction. Chunhua Shen The University of Adelaide, Australia
Deep learning for dense per-pixel prediction Chunhua Shen The University of Adelaide, Australia Image understanding Classification error Convolution Neural Networks 0.3 0.2 0.1 Image Classification [Krizhevsky
More informationS7348: Deep Learning in Ford's Autonomous Vehicles. Bryan Goodman Argo AI 9 May 2017
S7348: Deep Learning in Ford's Autonomous Vehicles Bryan Goodman Argo AI 9 May 2017 1 Ford s 12 Year History in Autonomous Driving Today: examples from Stereo image processing Object detection Using RNN
More informationObject Recognition II
Object Recognition II Linda Shapiro EE/CSE 576 with CNN slides from Ross Girshick 1 Outline Object detection the task, evaluation, datasets Convolutional Neural Networks (CNNs) overview and history Region-based
More informationA Deep Learning Framework for Authorship Classification of Paintings
A Deep Learning Framework for Authorship Classification of Paintings Kai-Lung Hua ( 花凱龍 ) Dept. of Computer Science and Information Engineering National Taiwan University of Science and Technology Taipei,
More informationFinding Tiny Faces Supplementary Materials
Finding Tiny Faces Supplementary Materials Peiyun Hu, Deva Ramanan Robotics Institute Carnegie Mellon University {peiyunh,deva}@cs.cmu.edu 1. Error analysis Quantitative analysis We plot the distribution
More informationFlow-Based Video Recognition
Flow-Based Video Recognition Jifeng Dai Visual Computing Group, Microsoft Research Asia Joint work with Xizhou Zhu*, Yuwen Xiong*, Yujie Wang*, Lu Yuan and Yichen Wei (* interns) Talk pipeline Introduction
More informationTransfer Learning. Style Transfer in Deep Learning
Transfer Learning & Style Transfer in Deep Learning 4-DEC-2016 Gal Barzilai, Ram Machlev Deep Learning Seminar School of Electrical Engineering Tel Aviv University Part 1: Transfer Learning in Deep Learning
More informationFace Recognition A Deep Learning Approach
Face Recognition A Deep Learning Approach Lihi Shiloh Tal Perl Deep Learning Seminar 2 Outline What about Cat recognition? Classical face recognition Modern face recognition DeepFace FaceNet Comparison
More informationDeep Incremental Scene Understanding. Federico Tombari & Christian Rupprecht Technical University of Munich, Germany
Deep Incremental Scene Understanding Federico Tombari & Christian Rupprecht Technical University of Munich, Germany C. Couprie et al. "Toward Real-time Indoor Semantic Segmentation Using Depth Information"
More informationEVALUATION OF DEEP LEARNING BASED STEREO MATCHING METHODS: FROM GROUND TO AERIAL IMAGES
EVALUATION OF DEEP LEARNING BASED STEREO MATCHING METHODS: FROM GROUND TO AERIAL IMAGES J. Liu 1, S. Ji 1,*, C. Zhang 1, Z. Qin 1 1 School of Remote Sensing and Information Engineering, Wuhan University,
More informationEE795: Computer Vision and Intelligent Systems
EE795: Computer Vision and Intelligent Systems Spring 2012 TTh 17:30-18:45 FDH 204 Lecture 14 130307 http://www.ee.unlv.edu/~b1morris/ecg795/ 2 Outline Review Stereo Dense Motion Estimation Translational
More informationObject Detection. CS698N Final Project Presentation AKSHAT AGARWAL SIDDHARTH TANWAR
Object Detection CS698N Final Project Presentation AKSHAT AGARWAL SIDDHARTH TANWAR Problem Description Arguably the most important part of perception Long term goals for object recognition: Generalization
More informationCNN-based Patch Matching for Optical Flow with Thresholded Hinge Embedding Loss
CNN-based Patch Matching for Optical Flow with Thresholded Hinge Embedding Loss Christian Bailer 1 Kiran Varanasi 1 Didier Stricker 1,2 Christian.Bailer@dfki.de Kiran.Varanasi@dfki.de Didier.Stricker@dfki.de
More informationInstance-aware Semantic Segmentation via Multi-task Network Cascades
Instance-aware Semantic Segmentation via Multi-task Network Cascades Jifeng Dai, Kaiming He, Jian Sun Microsoft research 2016 Yotam Gil Amit Nativ Agenda Introduction Highlights Implementation Further
More informationUsing Machine Learning for Classification of Cancer Cells
Using Machine Learning for Classification of Cancer Cells Camille Biscarrat University of California, Berkeley I Introduction Cell screening is a commonly used technique in the development of new drugs.
More informationINTRODUCTION TO DEEP LEARNING
INTRODUCTION TO DEEP LEARNING CONTENTS Introduction to deep learning Contents 1. Examples 2. Machine learning 3. Neural networks 4. Deep learning 5. Convolutional neural networks 6. Conclusion 7. Additional
More informationGeneric Object Detection Using Improved Gentleboost Classifier
Available online at www.sciencedirect.com Physics Procedia 25 (2012 ) 1528 1535 2012 International Conference on Solid State Devices and Materials Science Generic Object Detection Using Improved Gentleboost
More informationDynamic Routing Between Capsules
Report Explainable Machine Learning Dynamic Routing Between Capsules Author: Michael Dorkenwald Supervisor: Dr. Ullrich Köthe 28. Juni 2018 Inhaltsverzeichnis 1 Introduction 2 2 Motivation 2 3 CapusleNet
More informationDeep Learning. Deep Learning provided breakthrough results in speech recognition and image classification. Why?
Data Mining Deep Learning Deep Learning provided breakthrough results in speech recognition and image classification. Why? Because Speech recognition and image classification are two basic examples of
More informationRegionlet Object Detector with Hand-crafted and CNN Feature
Regionlet Object Detector with Hand-crafted and CNN Feature Xiaoyu Wang Research Xiaoyu Wang Research Ming Yang Horizon Robotics Shenghuo Zhu Alibaba Group Yuanqing Lin Baidu Overview of this section Regionlet
More informationRyerson University CP8208. Soft Computing and Machine Intelligence. Naive Road-Detection using CNNS. Authors: Sarah Asiri - Domenic Curro
Ryerson University CP8208 Soft Computing and Machine Intelligence Naive Road-Detection using CNNS Authors: Sarah Asiri - Domenic Curro April 24 2016 Contents 1 Abstract 2 2 Introduction 2 3 Motivation
More informationFaceNet. Florian Schroff, Dmitry Kalenichenko, James Philbin Google Inc. Presentation by Ignacio Aranguren and Rahul Rana
FaceNet Florian Schroff, Dmitry Kalenichenko, James Philbin Google Inc. Presentation by Ignacio Aranguren and Rahul Rana Introduction FaceNet learns a mapping from face images to a compact Euclidean Space
More informationCS 1674: Intro to Computer Vision. Neural Networks. Prof. Adriana Kovashka University of Pittsburgh November 16, 2016
CS 1674: Intro to Computer Vision Neural Networks Prof. Adriana Kovashka University of Pittsburgh November 16, 2016 Announcements Please watch the videos I sent you, if you haven t yet (that s your reading)
More informationEE795: Computer Vision and Intelligent Systems
EE795: Computer Vision and Intelligent Systems Spring 2012 TTh 17:30-18:45 FDH 204 Lecture 11 140311 http://www.ee.unlv.edu/~b1morris/ecg795/ 2 Outline Motion Analysis Motivation Differential Motion Optical
More informationDeep Learning in Image Processing
Deep Learning in Image Processing Roland Memisevic University of Montreal & TwentyBN ICISP 2016 Roland Memisevic Deep Learning in Image Processing ICISP 2016 f 2? cathedral high-rise f 1 It s the features,
More informationArbitrary Style Transfer in Real-Time with Adaptive Instance Normalization. Presented by: Karen Lucknavalai and Alexandr Kuznetsov
Arbitrary Style Transfer in Real-Time with Adaptive Instance Normalization Presented by: Karen Lucknavalai and Alexandr Kuznetsov Example Style Content Result Motivation Transforming content of an image
More informationSegmentation and Tracking of Partial Planar Templates
Segmentation and Tracking of Partial Planar Templates Abdelsalam Masoud William Hoff Colorado School of Mines Colorado School of Mines Golden, CO 800 Golden, CO 800 amasoud@mines.edu whoff@mines.edu Abstract
More informationHuman Pose Estimation with Deep Learning. Wei Yang
Human Pose Estimation with Deep Learning Wei Yang Applications Understand Activities Family Robots American Heist (2014) - The Bank Robbery Scene 2 What do we need to know to recognize a crime scene? 3
More informationDeep Learning. Deep Learning. Practical Application Automatically Adding Sounds To Silent Movies
http://blog.csdn.net/zouxy09/article/details/8775360 Automatic Colorization of Black and White Images Automatically Adding Sounds To Silent Movies Traditionally this was done by hand with human effort
More informationObject detection using Region Proposals (RCNN) Ernest Cheung COMP Presentation
Object detection using Region Proposals (RCNN) Ernest Cheung COMP790-125 Presentation 1 2 Problem to solve Object detection Input: Image Output: Bounding box of the object 3 Object detection using CNN
More informationTwo-Stream Convolutional Networks for Action Recognition in Videos
Two-Stream Convolutional Networks for Action Recognition in Videos Karen Simonyan Andrew Zisserman Cemil Zalluhoğlu Introduction Aim Extend deep Convolution Networks to action recognition in video. Motivation
More informationCNNS FROM THE BASICS TO RECENT ADVANCES. Dmytro Mishkin Center for Machine Perception Czech Technical University in Prague
CNNS FROM THE BASICS TO RECENT ADVANCES Dmytro Mishkin Center for Machine Perception Czech Technical University in Prague ducha.aiki@gmail.com OUTLINE Short review of the CNN design Architecture progress
More information3D Shape Analysis with Multi-view Convolutional Networks. Evangelos Kalogerakis
3D Shape Analysis with Multi-view Convolutional Networks Evangelos Kalogerakis 3D model repositories [3D Warehouse - video] 3D geometry acquisition [KinectFusion - video] 3D shapes come in various flavors
More informationDeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution and Fully Connected CRFs
DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution and Fully Connected CRFs Zhipeng Yan, Moyuan Huang, Hao Jiang 5/1/2017 1 Outline Background semantic segmentation Objective,
More informationObject detection with CNNs
Object detection with CNNs 80% PASCAL VOC mean0average0precision0(map) 70% 60% 50% 40% 30% 20% 10% Before CNNs After CNNs 0% 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 year Region proposals
More informationClassification of objects from Video Data (Group 30)
Classification of objects from Video Data (Group 30) Sheallika Singh 12665 Vibhuti Mahajan 12792 Aahitagni Mukherjee 12001 M Arvind 12385 1 Motivation Video surveillance has been employed for a long time
More informationMotion Tracking and Event Understanding in Video Sequences
Motion Tracking and Event Understanding in Video Sequences Isaac Cohen Elaine Kang, Jinman Kang Institute for Robotics and Intelligent Systems University of Southern California Los Angeles, CA Objectives!
More informationDeep Learning with Tensorflow AlexNet
Machine Learning and Computer Vision Group Deep Learning with Tensorflow http://cvml.ist.ac.at/courses/dlwt_w17/ AlexNet Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton, "Imagenet classification
More informationLSTM and its variants for visual recognition. Xiaodan Liang Sun Yat-sen University
LSTM and its variants for visual recognition Xiaodan Liang xdliang328@gmail.com Sun Yat-sen University Outline Context Modelling with CNN LSTM and its Variants LSTM Architecture Variants Application in
More informationA THREE LAYERED MODEL TO PERFORM CHARACTER RECOGNITION FOR NOISY IMAGES
INTERNATIONAL JOURNAL OF RESEARCH IN COMPUTER APPLICATIONSAND ROBOTICS ISSN 2320-7345 A THREE LAYERED MODEL TO PERFORM CHARACTER RECOGNITION FOR NOISY IMAGES 1 Neha, 2 Anil Saroliya, 3 Varun Sharma 1,
More informationSingle Image Super Resolution of Textures via CNNs. Andrew Palmer
Single Image Super Resolution of Textures via CNNs Andrew Palmer What is Super Resolution (SR)? Simple: Obtain one or more high-resolution images from one or more low-resolution ones Many, many applications
Deep Learning for Vision
Deep Learning for Vision Presented by Kevin Matzen Quick Intro - DNN Feed-forward Sparse connectivity (layer to layer) Different layer types Recently popularized for vision [Krizhevsky et al., NIPS 2012]
Storyline Reconstruction for Unordered Images
Introduction: Storyline Reconstruction for Unordered Images Final Paper Sameedha Bairagi, Arpit Khandelwal, Venkatesh Raizaday Storyline reconstruction is a relatively new topic and has not been researched
Intro to Deep Learning. Slides Credit: Andrej Karpathy, Derek Hoiem, Marc Aurelio, Yann LeCun
Intro to Deep Learning Slides Credit: Andrej Karpathy, Derek Hoiem, Marc Aurelio, Yann LeCun Why this class? Deep Features Have been able to harness the big data in the most efficient and effective
INF 5860 Machine learning for image classification. Lecture 11: Visualization Anne Solberg April 4, 2018
INF 5860 Machine learning for image classification Lecture 11: Visualization Anne Solberg April 4, 2018 Reading material The lecture is based on papers: Deep Dream: https://research.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html
3D Convolutional Neural Networks for Landing Zone Detection from LiDAR
3D Convolutional Neural Networks for Landing Zone Detection from LiDAR Daniel Maturana and Sebastian Scherer Presented by: Sabin Kafle Outline Introduction Preliminaries Approach Volumetric Density Mapping
Object Localization, Segmentation, Classification, and Pose Estimation in 3D Images using Deep Learning
Allan Zelener Dissertation Proposal December 12th, 2016 Object Localization, Segmentation, Classification, and Pose Estimation in 3D Images using Deep Learning Overview 1. Introduction to 3D Object Identification
Multi-View 3D Object Detection Network for Autonomous Driving
Multi-View 3D Object Detection Network for Autonomous Driving Xiaozhi Chen, Huimin Ma, Ji Wan, Bo Li, Tian Xia CVPR 2017 (Spotlight) Presented By: Jason Ku Overview Motivation Dataset Network Architecture
CS 2750: Machine Learning. Neural Networks. Prof. Adriana Kovashka University of Pittsburgh April 13, 2016
CS 2750: Machine Learning Neural Networks Prof. Adriana Kovashka University of Pittsburgh April 13, 2016 Plan for today Neural network definition and examples Training neural networks (backprop) Convolutional
Artificial Intelligence Introduction Handwriting Recognition Kadir Eren Unal ( ), Jakob Heyder ( )
Structure: 1. Introduction 2. Problem 3. Neural network approach a. Architecture b. Phases of CNN c. Results 4. HTM approach a. Architecture b. Setup c. Results 5. Conclusion 1.) Introduction Artificial
LEARNING TO GENERATE CHAIRS WITH CONVOLUTIONAL NEURAL NETWORKS
LEARNING TO GENERATE CHAIRS WITH CONVOLUTIONAL NEURAL NETWORKS Alexey Dosovitskiy, Jost Tobias Springenberg and Thomas Brox University of Freiburg Presented by: Shreyansh Daftry Visual Learning and Recognition
DEEP LEARNING REVIEW. Yann LeCun, Yoshua Bengio & Geoffrey Hinton Nature Presented by Divya Chitimalla
DEEP LEARNING REVIEW Yann LeCun, Yoshua Bengio & Geoffrey Hinton Nature 2015 -Presented by Divya Chitimalla What is deep learning Deep learning allows computational models that are composed of multiple
Supplementary Material for Zoom and Learn: Generalizing Deep Stereo Matching to Novel Domains
Supplementary Material for Zoom and Learn: Generalizing Deep Stereo Matching to Novel Domains Jiahao Pang 1 Wenxiu Sun 1 Chengxi Yang 1 Jimmy Ren 1 Ruichao Xiao 1 Jin Zeng 1 Liang Lin 1,2 1 SenseTime Research
Is Bigger CNN Better? Samer Hijazi on behalf of IPG CTO Group Embedded Neural Networks Summit (enns2016) San Jose Feb. 9th
Is Bigger CNN Better? Samer Hijazi on behalf of IPG CTO Group Embedded Neural Networks Summit (enns2016) San Jose Feb. 9th Today's Story Why does CNN matter to the embedded world? How to enable CNN in
Mask R-CNN. presented by Jiageng Zhang, Jingyao Zhan, Yunhan Ma
Mask R-CNN presented by Jiageng Zhang, Jingyao Zhan, Yunhan Ma Mask R-CNN Background Related Work Architecture Experiment Background From left
How Deep Learning Works (Como funciona o Deep Learning)
How Deep Learning Works (Como funciona o Deep Learning) Moacir Ponti (with help from Gabriel Paranhos da Costa) ICMC, Universidade de São Paulo Contact: www.icmc.usp.br/~moacir moacir@icmc.usp.br Uberlandia-MG/Brazil October, 2017
Introduction to Deep Learning
ENEE698A : Machine Learning Seminar Introduction to Deep Learning Raviteja Vemulapalli Image credit: [LeCun 1998] Resources Unsupervised feature learning and deep learning (UFLDL) tutorial (http://ufldl.stanford.edu/wiki/index.php/ufldl_tutorial)
Computer Vision Lecture 16
Computer Vision Lecture 16 Deep Learning for Object Categorization 14.01.2016 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de leibe@vision.rwth-aachen.de Announcements Seminar registration period
Non-flat Road Detection Based on A Local Descriptor
Non-flat Road Detection Based on A Local Descriptor Kangru Wang, Lei Qu, Lili Chen, Yuzhang Gu, Xiaolin Zhang Abstract The detection of road surface and free space remains challenging for non-flat plane,
CAP 6412 Advanced Computer Vision
CAP 6412 Advanced Computer Vision http://www.cs.ucf.edu/~bgong/cap6412.html Boqing Gong April 21st, 2016 Today Administrivia Free parameters in an approach, model, or algorithm? Egocentric videos by Aisha
DeepIM: Deep Iterative Matching for 6D Pose Estimation - Supplementary Material
DeepIM: Deep Iterative Matching for 6D Pose Estimation - Supplementary Material Yi Li 1, Gu Wang 1, Xiangyang Ji 1, Yu Xiang 2, and Dieter Fox 2 1 Tsinghua University, BNRist 2 University of Washington
Supplementary: Cross-modal Deep Variational Hand Pose Estimation
Supplementary: Cross-modal Deep Variational Hand Pose Estimation Adrian Spurr, Jie Song, Seonwook Park, Otmar Hilliges ETH Zurich {spurra,jsong,spark,otmarh}@inf.ethz.ch Encoder/Decoder Linear(512) Table
Optical Flow and Deep Learning Based Approach to Visual Odometry
Rochester Institute of Technology RIT Scholar Works Theses Thesis/Dissertation Collections 11-2016 Optical Flow and Deep Learning Based Approach to Visual Odometry Peter M. Muller pmm5983@rit.edu Follow
arxiv: v2 [cs.cv] 20 Oct 2015
Computing the Stereo Matching Cost with a Convolutional Neural Network Jure Žbontar University of Ljubljana jure.zbontar@fri.uni-lj.si Yann LeCun New York University yann@cs.nyu.edu arxiv:1409.4326v2 [cs.cv]
Fashion Style in 128 Floats: Joint Ranking and Classification using Weak Data for Feature Extraction SUPPLEMENTAL MATERIAL
Fashion Style in 128 Floats: Joint Ranking and Classification using Weak Data for Feature Extraction SUPPLEMENTAL MATERIAL Edgar Simo-Serra Waseda University esimo@aoni.waseda.jp Hiroshi Ishikawa Waseda
Machine Learning. The Breadth of ML Neural Networks & Deep Learning. Marc Toussaint. Duy Nguyen-Tuong. University of Stuttgart
Machine Learning The Breadth of ML Neural Networks & Deep Learning Marc Toussaint University of Stuttgart Duy Nguyen-Tuong Bosch Center for Artificial Intelligence Summer 2017 Neural Networks Consider
Study of Residual Networks for Image Recognition
Study of Residual Networks for Image Recognition Mohammad Sadegh Ebrahimi Stanford University sadegh@stanford.edu Hossein Karkeh Abadi Stanford University hosseink@stanford.edu Abstract Deep neural networks