Places Challenge 2017

Similar documents
DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution and Fully Connected CRFs

Lecture 7: Semantic Segmentation

Encoder-Decoder Networks for Semantic Segmentation. Sachin Mehta

RGBd Image Semantic Labelling for Urban Driving Scenes via a DCNN

Computer Vision: Making machines see

SSD: Single Shot MultiBox Detector. Author: Wei Liu et al. Presenter: Siyu Jiang

Advanced Video Analysis & Imaging

Inception and Residual Networks. Hantao Zhang. Deep Learning with Python.

Fully Convolutional Networks for Semantic Segmentation

CSE 559A: Computer Vision

Deep Learning in Visual Recognition. Thanks Da Zhang for the slides

Deep Learning. Visualizing and Understanding Convolutional Networks. Christopher Funk. Pennsylvania State University.

Machine Learning. Deep Learning. Eric Xing (and Pengtao Xie) , Fall Lecture 8, October 6, Eric CMU,

Structured Prediction using Convolutional Neural Networks

Cascade Region Regression for Robust Object Detection

Predicting ground-level scene Layout from Aerial imagery. Muhammad Hasan Maqbool

Object Detection on Self-Driving Cars in China. Lingyun Li

Deep learning for dense per-pixel prediction. Chunhua Shen The University of Adelaide, Australia

Convolutional Neural Networks. Computer Vision Jia-Bin Huang, Virginia Tech

Mask R-CNN. Kaiming He, Georgia, Gkioxari, Piotr Dollar, Ross Girshick Presenters: Xiaokang Wang, Mengyao Shi Feb. 13, 2018

COMP9444 Neural Networks and Deep Learning 7. Image Processing. COMP9444 c Alan Blair, 2017

Spatial Localization and Detection. Lecture 8-1

Instance-aware Semantic Segmentation via Multi-task Network Cascades

Deep Learning with Tensorflow AlexNet

Deep Learning For Video Classification. Presented by Natalie Carlebach & Gil Sharon

Kaggle Data Science Bowl 2017 Technical Report

Elastic Neural Networks for Classification

arxiv: v1 [cs.cv] 2 Aug 2018

Generative Modeling with Convolutional Neural Networks. Denis Dus Data Scientist at InData Labs

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Deep Learning for Computer Vision with MATLAB By Jon Cherrie

Content-Based Image Recovery

CNN Basics. Chongruo Wu

Single Image Super Resolution of Textures via CNNs. Andrew Palmer

Bidirectional Recurrent Convolutional Networks for Video Super-Resolution

Fuzzy Set Theory in Computer Vision: Example 3

Multi-View 3D Object Detection Network for Autonomous Driving

Deep Residual Learning

C-Brain: A Deep Learning Accelerator

3 Object Detection. BVM 2018 Tutorial: Advanced Deep Learning Methods. Paul F. Jaeger, Division of Medical Image Computing

Arbitrary Style Transfer in Real-Time with Adaptive Instance Normalization. Presented by: Karen Lucknavalai and Alexandr Kuznetsov

Automatic Thoracic CT Image Segmentation using Deep Convolutional Neural Networks. Xiao Han, Ph.D.

CENG 783. Special topics in. Deep Learning. AlchemyAPI. Week 11. Sinan Kalkan

MULTI-SCALE OBJECT DETECTION WITH FEATURE FUSION AND REGION OBJECTNESS NETWORK. Wenjie Guan, YueXian Zou*, Xiaoqun Zhou

Learning Fully Dense Neural Networks for Image Semantic Segmentation

arxiv: v1 [cs.cv] 29 Sep 2016

CS 1674: Intro to Computer Vision. Object Recognition. Prof. Adriana Kovashka University of Pittsburgh April 3, 5, 2018

Fei-Fei Li & Justin Johnson & Serena Yeung

Know your data - many types of networks

A MULTI-RESOLUTION FUSION MODEL INCORPORATING COLOR AND ELEVATION FOR SEMANTIC SEGMENTATION

Mask R-CNN. presented by Jiageng Zhang, Jingyao Zhan, Yunhan Ma

YOLO9000: Better, Faster, Stronger

Martian lava field, NASA, Wikipedia

REGION AVERAGE POOLING FOR CONTEXT-AWARE OBJECT DETECTION

Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset. By Joa õ Carreira and Andrew Zisserman Presenter: Zhisheng Huang 03/02/2018

Fuzzy Set Theory in Computer Vision: Example 3, Part II

Object Detection. CS698N Final Project Presentation AKSHAT AGARWAL SIDDHARTH TANWAR

Tutorial on Keras CAP ADVANCED COMPUTER VISION SPRING 2018 KISHAN S ATHREY

Optimizing Object Detection:

A Neural Algorithm of Artistic Style. Leon A. Gatys, Alexander S. Ecker, Matthias Bethge

In-Place Activated BatchNorm for Memory- Optimized Training of DNNs

COMP 551 Applied Machine Learning Lecture 16: Deep Learning

Fully Convolutional Network for Depth Estimation and Semantic Segmentation

Predicting Depth, Surface Normals and Semantic Labels with a Common Multi-Scale Convolutional Architecture David Eigen, Rob Fergus

LSTM and its variants for visual recognition. Xiaodan Liang Sun Yat-sen University

Deep condolence to Professor Mark Everingham

INTRODUCTION TO DEEP LEARNING

HENet: A Highly Efficient Convolutional Neural. Networks Optimized for Accuracy, Speed and Storage

Using Machine Learning for Classification of Cancer Cells

Extend the shallow part of Single Shot MultiBox Detector via Convolutional Neural Network

AttentionNet for Accurate Localization and Detection of Objects. (To appear in ICCV 2015)

Object Detection. Part1. Presenter: Dae-Yong

One Network to Solve Them All Solving Linear Inverse Problems using Deep Projection Models

Direct Multi-Scale Dual-Stream Network for Pedestrian Detection Sang-Il Jung and Ki-Sang Hong Image Information Processing Lab.

Channel Locality Block: A Variant of Squeeze-and-Excitation

Feature-Fused SSD: Fast Detection for Small Objects

Deep Learning for Object detection & localization

Deep Bi-Dense Networks for Image Super-Resolution

Deep Learning. Deep Learning. Practical Application Automatically Adding Sounds To Silent Movies

Deep learning for object detection. Slides from Svetlana Lazebnik and many others

Deconvolutions in Convolutional Neural Networks

SEMANTIC SEGMENTATION AVIRAM BAR HAIM & IRIS TAL

Rotation Invariance Neural Network

Hybrid Learning Framework for Large-Scale Web Image Annotation and Localization

CMU Lecture 18: Deep learning and Vision: Convolutional neural networks. Teacher: Gianni A. Di Caro

Dynamic Routing Between Capsules

Rich feature hierarchies for accurate object detection and semant

Yield Estimation using faster R-CNN

Multi-Glance Attention Models For Image Classification

Object detection with CNNs

FishNet: A Versatile Backbone for Image, Region, and Pixel Level Prediction

Computer Vision Lecture 16

3D Densely Convolutional Networks for Volumetric Segmentation. Toan Duc Bui, Jitae Shin, and Taesup Moon

Deep Convolutional Neural Network using Triplet of Faces, Deep Ensemble, and Scorelevel Fusion for Face Recognition

arxiv: v1 [cs.cv] 11 Apr 2018

Detecting Bone Lesions in Multiple Myeloma Patients using Transfer Learning

RSRN: Rich Side-output Residual Network for Medial Axis Detection

CIS680: Vision & Learning Assignment 2.b: RPN, Faster R-CNN and Mask R-CNN Due: Nov. 21, 2018 at 11:59 pm

Supplementary Materials for Learning to Parse Wireframes in Images of Man-Made Environments

Dense Image Labeling Using Deep Convolutional Neural Networks

Transcription:

Places Challenge 2017 Scene Parsing Task CASIA_IVA_JD Jun Fu, Jing Liu, Longteng Guo, Haijie Tian, Fei Liu, Hanqing Lu Yong Li, Yongjun Bao, Weipeng Yan National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences Model Team, Business Growth BU, JingDong (JD)

2 Contents Data analysis Stacked Deconvolutional Network (SDN) Ensemble modeling

3 Data analysis Basic information of the data 20210 images for training, and 2000 images for validation, 3352 images for testing 150 labels including 35 stuff concepts and 115 discrete objects Further statistics For each label of training data: The number of labeled images: 42(escalator) ~ 11664(wall) For each image of training data: The number of labels: 0 ~ 31 average: 8.17 Image width and height: Training data: min size: 96 x 130 max size: 2100 x 2100 Validation data: min size: 200 x 200 max size: 1600 x 1600

4 Data analysis Challenge Diverse and complex scenes Contains various objects in some scenes Background clutter, light condition change, deformation,

5 Data analysis Challenge Diverse and complex images Similar semantic label Field/Earth/Grass, Desk/Table, Mountain/Hill, Grass Field

6 Data analysis Challenge Diverse and complex images Similar semantic label Multi-scale information Image size Stuff and objects in images

7 Contents Data analysis Stacked Deconvolutional Network (SDN) Ensemble modeling

8 Related Work Deconvolutional network Common network structure for pixel-level vision task Encoder module: capture context Decoder module: recover spatial information DeconvNet, SegNet, Light-DCNN Drawbacks Limited learning ability ( VGG ) Difficulty in training

9 Stacked Deconvolutional Network The architecture of stacked deconvolutional network DenseNet SDN unit Design an efficient shallow deconvolutional network (called as SDN unit), stack multiple SDN units one by one with dense connections Other designs: Intra-unit connections inter-unit connections hierarchical supervision SDN link: https://arxiv.org/pdf/1708.04943.pdf

10 Stacked Deconvolutional Network The architecture of SDN unit Convolutional layer Compression layer F n i 1 i F n 1 P n i Q n i F n i Max-pooling layer H 1 k b The i th downsampling block Deconvolutional layer F n i 1 O n i Q n i F n i Classification layer a SDN unit c The i th upsampling block Encoder module: two downsampling blocks Enlarge the receptive fields of the Network Decoder module: two upsampling blocks Achieve a more refined reconstruction of the feature maps SDN link: https://arxiv.org/pdf/1708.04943.pdf

11 Stacked Deconvolutional Network Intra-unit connections Inter-unit connection F n i 1 i F n 1 H 1 k P n i Q n i b The i th downsampling block F n i Intra-unit connection F n i 1 O n i Q n i F n i Concatenation a SDN unit c The ith upsampling block Intra-unit connections Dense connections in a downsampling/upsampling block Beneficial to the flow of information and gradient propagation throughout the network SDN link: https://arxiv.org/pdf/1708.04943.pdf

12 Stacked Deconvolutional Network Inter-unit connections Inter-unit connection DenseNet Intra-unit connection SDN unit Concatenation Inter-unit connections Reuse the multi-scale information across different units Two types of inter-unit skip connections Between any two adjacent SDN units The skip connections from the first SDN units to others SDN link: https://arxiv.org/pdf/1708.04943.pdf

13 Stacked Deconvolutional Network Hierarchical supervision Convolutional layer Compression layer Max-pooling layer Deconvolutional layer Classification layer Hierarchical supervision Assist training Guarantee the discrimination of the feature maps SDN link: https://arxiv.org/pdf/1708.04943.pdf

14 Stacked Deconvolutional Network Some Training settings: Data augmentation scale ratio augmentation (s=[0.5 0.75 1 1.25 1.5]) W = W s ; H = H s aspect ratio augmentation (a=[0.85 1 1.15]) W = W/a ; H = H a resize and random crop Large cropsize Proper learning rate 2.5e-4 and iteration number 100K Testing scheme: Resize the image and testing with sliding window crop Multi scale test

15 Stacked Deconvolutional Network SDN_M2 DenseNet SDN_M2 result by mean IoU / pixel accuracy Val Data: 44.57/81.22

16 Contents Data analysis Stacked Deconvolutional Network (SDN) Ensemble modeling

17 Ensemble modeling Deeplabv3+ Some improvements on deeplabv3 Upsampling module scale ResNet 101 Global pooling 1/8 ASPP+ conv Global pooling+conv+si gmod Upsampling module: similar to RefineNet Deeplabv3+ result Val Data: 44.25/81.02 1/8 scale 1X1X150 1/8 1/4

18 Ensemble modeling SDN: 44.57/81.22 Deeplabv3+: 44.25/81.02 ResNet38: 44.07/81.07 By averaging the results of these models, the score increased to 46.59/82.23 Deeplabv3+ and ResNet38 adopt training settings and test scheme as the same as SDN, and initialize networks with the models pretrained on ImageNet.

19 Ensemble modeling Some scene parsing results

20 Ensemble modeling Some scene parsing results

21 Ensemble modeling Some scene parsing results

22 Ensemble modeling Some scene parsing results

23 Contents Data analysis Stacked Deconvolutional Network (SDN) Ensemble modeling

24 Thanks Any questions, please contact the author of the work Email: jliu@nlpr.ia.ac.cn