Semantic Segmentation

Semantic Segmentation UCLA:https://goo.gl/images/I0VTi2

OUTLINE
Semantic Segmentation
Why?
Papers to discuss:
Fully Convolutional Networks for Semantic Segmentation. J. Long, E. Shelhamer, and T. Darrell, CVPR 2015
Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs. L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, A. L. Yuille. ICLR 2015

What is Semantic Segmentation? [Figure: Lena and mirrored Lena]

What is Semantic Segmentation? Goal: partition the image into semantically meaningful parts and classify each part (patch-wise). This means recognizing and delineating the objects in an image, which amounts to classifying each pixel in the image (pixel-wise).
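As a minimal sketch of what "classifying each pixel" means in practice: given a stack of per-class score maps, the label map is the best-scoring class at every pixel. The scores and class meanings below are made up for illustration and do not come from the slides.

```python
import numpy as np

# Hypothetical class scores for a 2 x 3 image with 3 classes, shape (C, H, W).
scores = np.array([
    [[0.90, 0.2, 0.1], [0.8, 0.1, 0.3]],  # class 0 (e.g. background)
    [[0.05, 0.7, 0.2], [0.1, 0.6, 0.1]],  # class 1
    [[0.05, 0.1, 0.7], [0.1, 0.3, 0.6]],  # class 2
])

# Pixel-wise classification: pick the best-scoring class independently per pixel.
label_map = scores.argmax(axis=0)
print(label_map)
```

The result has the same height and width as the input, with one class index per pixel.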

Why Semantic Segmentation? To let robots segment objects so that they can grasp them https://goo.gl/images/6xaqam

Why Semantic Segmentation? Useful tool for editing images, visual effects CVFX Lecture1: https://www.youtube.com/watch?v=re-hvtytt-i

Why Semantic Segmentation? Autonomous driving, to distinguish pedestrians from the background. Citydataset

Fully Convolutional Networks for Semantic Segmentation. J. Long, E. Shelhamer, and T. Darrell, CVPR 2015

Usual convolutional networks vs. fully convolutional networks

To understand Fully Convolutional 2015, Berkeley Vision: http://tutorial.caffe.berkeleyvision.org/

To understand Fully Convolutional A typical CNN 2015, Berkeley Vision: http://tutorial.caffe.berkeleyvision.org/

To understand Fully Convolutional: a classification CNN vs. an FCN. 2015, Berkeley Vision: http://tutorial.caffe.berkeleyvision.org/
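The step from a classification CNN to an FCN can be illustrated with a small NumPy sketch: a fully connected layer applied to a K x K feature map is the same computation as a convolution with K x K kernels, so the same weights can slide over a larger input and produce a spatial map of scores. All sizes and the helper `conv_valid` are assumptions chosen for illustration, not the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)
C, K, n_out = 3, 4, 5  # channels, feature-map size seen by the FC layer, outputs
W_fc = rng.standard_normal((n_out, C * K * K))  # fully connected weights
W_conv = W_fc.reshape(n_out, C, K, K)           # same weights viewed as K x K kernels

def conv_valid(x, w):
    """Plain 'valid' cross-correlation: x is (C, H, W), w is (n_out, C, K, K)."""
    n, _, k, _ = w.shape
    H, W = x.shape[1] - k + 1, x.shape[2] - k + 1
    out = np.empty((n, H, W))
    for i in range(H):
        for j in range(W):
            patch = x[:, i:i + k, j:j + k]
            out[:, i, j] = np.tensordot(w, patch, axes=([1, 2, 3], [0, 1, 2]))
    return out

# On a K x K input the convolution yields a 1 x 1 map equal to the FC output.
x_small = rng.standard_normal((C, K, K))
fc_out = W_fc @ x_small.reshape(-1)
conv_out = conv_valid(x_small, W_conv)

# On a larger input the same weights produce a spatial map of scores.
x_large = rng.standard_normal((C, 6, 7))
score_map = conv_valid(x_large, W_conv)
print(conv_out.shape, score_map.shape)
```

Each position of `score_map` equals the fully connected output on the corresponding K x K crop, which is why a classifier converted this way gives coarse dense predictions without retraining.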

FCN: segmentation that combines layers of hierarchy and refines the spatial precision of the output.

Segmentation Architecture 1. Start from ILSVRC classifiers and add in-network up-sampling and a pixel-wise loss. 2. Add skips between layers to fuse coarse, semantic information with local, appearance information. 3. Produce dense, pixel-wise predictions.

Some Tricks skip layers

Some Tricks skip layers refinement
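A rough sketch of the skip-layer idea: scores from a coarse, deep layer are up-sampled by a factor of 2 and added to scores from a finer, shallower layer. Nearest-neighbour up-sampling is used here only as a simple stand-in for the in-network up-sampling of the paper, and the array sizes and random scores are assumptions.

```python
import numpy as np

def upsample2x_nearest(score):
    """Repeat each score 2x along height and width; score is (C, H, W)."""
    return score.repeat(2, axis=1).repeat(2, axis=2)

num_classes = 21  # PASCAL VOC: 20 object classes plus background
rng = np.random.default_rng(0)
coarse = rng.standard_normal((num_classes, 4, 4))  # e.g. stride-32 scores
finer = rng.standard_normal((num_classes, 8, 8))   # e.g. stride-16 scores

# Fuse coarse, semantic scores with finer, local scores by element-wise sum.
fused = upsample2x_nearest(coarse) + finer
print(fused.shape)
```

Repeating the same step with a still finer layer gives the progressively refined outputs that the skip layers are meant to provide.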

Some Tricks Interpolation 1. Up-sampling is performed in-network, so the whole model is learned end to end by back-propagation from the pixel-wise loss. 2. The deconvolution filter in such a layer can be learned.
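Since the deconvolution filter is learnable, it needs an initial value, and a common choice is a bilinear interpolation kernel. The sketch below builds such a kernel in NumPy; the function name `bilinear_kernel` is a placeholder, and treating this as the initialisation is an assumption rather than something the slides state.

```python
import numpy as np

def bilinear_kernel(size):
    """Return a (size, size) bilinear interpolation kernel for up-sampling."""
    factor = (size + 1) // 2
    # The kernel centre sits on a sample for odd sizes, between samples for even.
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    rows, cols = np.ogrid[:size, :size]
    return ((1 - np.abs(rows - center) / factor) *
            (1 - np.abs(cols - center) / factor))

# A 4 x 4 kernel is the usual size for 2x up-sampling (stride 2).
kernel = bilinear_kernel(4)
print(kernel)
```

With stride 2 this kernel reproduces bilinear interpolation; training can then adjust the weights away from that starting point.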

Some results: PASCAL VOC, NYUDv2

Conclusion 1. Fine-tuning from classification to segmentation gives reasonable predictions for each net. 2. Learning the up-sampling, combined with skip-layer fusion, proves more effective and efficient.

Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs. L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, A. L. Yuille. ICLR 2015

Paper's main idea 1. Use a CNN to generate a rough prediction of the segmentation (a smooth, blurry heat map). 2. Refine this prediction with a conditional random field (CRF).

Why are CNNs insufficient? They are good at high-level vision tasks like classification but weaker at low-level tasks like segmentation. Problem 1: subsampling. Problem 2: spatial invariance (shared kernel weights). Solution: fully connected CRF

Solutions: the "hole" (atrous) algorithm for the subsampling problem, and a fully connected CRF for the spatial invariance problem.
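The hole algorithm can be pictured in one dimension: the filter taps are spread `rate` samples apart, so the filter covers a wider span without extra weights and without subsampling the input. The function `atrous_conv1d` and the toy signal are illustrative assumptions, not code from the paper.

```python
import numpy as np

def atrous_conv1d(x, w, rate):
    """'Valid' 1-D correlation with holes: filter taps are `rate` samples apart."""
    span = (len(w) - 1) * rate + 1  # input samples covered by the dilated filter
    return np.array([
        sum(w[k] * x[i + k * rate] for k in range(len(w)))
        for i in range(len(x) - span + 1)
    ])

x = np.arange(10, dtype=float)
w = np.array([1.0, 0.0, -1.0])

dense = atrous_conv1d(x, w, rate=1)  # ordinary filter, covers 3 samples
holes = atrous_conv1d(x, w, rate=2)  # same 3 weights, covers 5 samples
print(dense, holes)
```

Both calls use the same three weights; only the spacing between taps changes.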

Solution: fully connected CRF. CRF: randomly choose points and give them an initial label.

CRF Energy Function
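For reference, the energy function of the fully connected CRF as given in the DeepLab paper can be written as follows, where $x_i$ is the label of pixel $i$, $P(x_i)$ is the label probability from the CNN, $p_i$ and $I_i$ are pixel position and colour, and $\mu(x_i, x_j) = 1$ if $x_i \neq x_j$ and $0$ otherwise. The slide itself shows only the title, so the notation here follows the paper rather than the transcript.

```latex
E(\mathbf{x}) = \sum_i \theta_i(x_i) + \sum_{i<j} \theta_{ij}(x_i, x_j),
\qquad \theta_i(x_i) = -\log P(x_i)

\theta_{ij}(x_i, x_j) = \mu(x_i, x_j) \left[
  w_1 \exp\!\left( -\frac{\lVert p_i - p_j \rVert^2}{2\sigma_\alpha^2}
                   -\frac{\lVert I_i - I_j \rVert^2}{2\sigma_\beta^2} \right)
  + w_2 \exp\!\left( -\frac{\lVert p_i - p_j \rVert^2}{2\sigma_\gamma^2} \right)
\right]
```

The first kernel encourages nearby pixels with similar colour to share a label, and the second enforces smoothness based on position alone.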

Global Map

Comparison to state-of-the-art

Successful Cases

Failure Cases

Conclusion 1. Modify the CNN architecture to become less spatially invariant. 2. Use the CNN to compute a rough score map. 3. Use a fully connected CRF to sharpen the score map.

Experiments: Intel Xeon E5-2670, NVIDIA GPU, Caffe, VOC_FCN_32s, Python, CUDA 8.0

Data preparation: load the image, switch to BGR, subtract the mean, and reorder the dimensions to C x H x W for Caffe.
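The preparation steps can be sketched in NumPy as below. The mean values are the per-channel BGR means commonly used with Caffe VOC models and are an assumption here, since the slide does not list them; the synthetic array stands in for an image loaded from disk, and `preprocess` is a placeholder name.

```python
import numpy as np

# Assumed per-channel BGR mean (not given on the slide).
MEAN_BGR = np.array([104.00698793, 116.66876762, 122.67891434], dtype=np.float32)

def preprocess(rgb_image):
    """Turn an H x W x 3 RGB array into a C x H x W, mean-subtracted BGR array."""
    x = np.asarray(rgb_image, dtype=np.float32)
    x = x[:, :, ::-1]            # switch RGB -> BGR
    x = x - MEAN_BGR             # subtract the per-channel mean
    return x.transpose(2, 0, 1)  # H x W x C -> C x H x W

# Stand-in for a loaded image: 4 x 6 pixels, pure red.
image = np.zeros((4, 6, 3), dtype=np.uint8)
image[..., 0] = 255
blob = preprocess(image)
print(blob.shape)
```

The resulting array would still need a leading batch dimension before being copied into a network's input blob.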

Experiment 26.862607 1.238836

Experiment 39.570141 1.738234

Experiment 32.238836 1.238836

Experiment 39.570141 1.5334832

Experiment 27.895173 1.239234
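The Experiment slides give two unlabeled numbers each. Reading the first as CPU run time and the second as GPU run time is an assumption, suggested only by the speed comparison in the conclusion; under that reading, the ratios can be checked with a few lines of Python.

```python
# (first number, second number) from each Experiment slide.
# Assumption: first = CPU time, second = GPU time, in the same unit.
timings = [
    (26.862607, 1.238836),
    (39.570141, 1.738234),
    (32.238836, 1.238836),
    (39.570141, 1.5334832),
    (27.895173, 1.239234),
]

speedups = [cpu / gpu for cpu, gpu in timings]
for (cpu, gpu), ratio in zip(timings, speedups):
    print(f"{cpu:10.6f} / {gpu:9.6f} = {ratio:5.1f}x")
```

Under this assumption every ratio is above 20.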

Conclusion 1. Their network is very fast even on high-resolution images, and the GPU is at least 20 times faster than the CPU. 2. The algorithm performs well on images whether the objects are well separated or overlap each other. 3. The image background, such as sky or grass, has a big influence on the segmentation. Better performance could be expected with their FCN_8s, and detailed performance on the validation dataset still needs to be checked.

Thanks