VISION FOR AUTOMOTIVE DRIVING


VISION FOR AUTOMOTIVE DRIVING French Japanese Workshop on Deep Learning & AI, Paris, October 25th, 2017 Quoc Cuong PHAM, PhD Vision and Content Engineering Lab

AI & MACHINE LEARNING FOR ADAS AND SELF-DRIVING CARS Components: Sensor Fusion, Trajectory Planning, Localization, Control Strategy, Situation Understanding, Driver Model. Methods: supervised, unsupervised and reinforcement learning; SLAM, pattern recognition, classification, clustering, segmentation, motion planning.

LOCALIZATION IN LARGE ENVIRONMENTS

VISUAL SLAM Localization in an arbitrary coordinate system; unknown scale in monocular SLAM; error accumulation (drift). Pipeline: new image -> interest point detection -> matching -> pose computation -> keyframe? -> triangulation -> bundle adjustment.
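
A minimal monocular odometry-style sketch of the loop above, using OpenCV ORB features and essential-matrix pose recovery. This is illustrative only, not the CEA-LIST system: triangulation and bundle adjustment are left out, and the monocular translation scale stays unknown, as noted on the slide. The intrinsics K are an assumed KITTI-like calibration.

```python
import cv2
import numpy as np

# Assumed pinhole intrinsics (KITTI-like values), for illustration only.
K = np.array([[718.8, 0.0, 607.2],
              [0.0, 718.8, 185.2],
              [0.0, 0.0, 1.0]])

def relative_pose(img_prev, img_cur):
    """Up-to-scale relative camera motion between two 8-bit grayscale frames."""
    orb = cv2.ORB_create(2000)                        # interest point detection
    kp1, des1 = orb.detectAndCompute(img_prev, None)
    kp2, des2 = orb.detectAndCompute(img_cur, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)               # matching
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])
    E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)    # pose computation (t up to scale)
    return R, t, int((mask > 0).sum())

def needs_keyframe(num_inliers, min_inliers=150):
    # Toy keyframe test: when tracking weakens, a full system would spawn a
    # keyframe and run triangulation + bundle adjustment on the local map.
    return num_inliers < min_inliers
```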

CONSTRAINED SLAM FRAMEWORK Same pipeline (new image -> interest point detection -> matching -> pose computation -> keyframe? -> triangulation), but the bundle adjustment is replaced by a Constrained Bundle Adjustment that also takes odometer, IMU, GPS and GIS measurements as constraints.
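
As a rough illustration of what a constrained bundle adjustment cost can look like, the sketch below adds GPS-position penalty terms to the usual reprojection residuals. The parameterization, the weight w_gps and the function names are assumptions of this sketch, not the CEA-LIST formulation; odometer, IMU and GIS terms would be added the same way. Such a residual vector could be fed to scipy.optimize.least_squares after flattening the pose and point parameters.

```python
import numpy as np

def project(K, R, t, X):
    """Pinhole projection of 3D points X (N,3) into pixel coordinates (N,2)."""
    Xc = X @ R.T + t
    uv = Xc @ K.T
    return uv[:, :2] / uv[:, 2:3]

def constrained_ba_residuals(cam_poses, points3d, observations, gps, w_gps=1.0):
    """
    cam_poses:    list of (K, R, t) per keyframe
    points3d:     (N,3) array of map points
    observations: list of (frame_idx, point_idx, measured_uv)
    gps:          (F,3) GPS positions expressed in the map frame
    """
    res = []
    for f, j, uv in observations:                       # reprojection term
        K, R, t = cam_poses[f]
        res.append(project(K, R, t, points3d[j:j + 1])[0] - uv)
    for f, (K, R, t) in enumerate(cam_poses):            # GPS constraint term
        cam_center = -R.T @ t                             # camera position in the world
        res.append(w_gps * (cam_center - gps[f]))
    return np.concatenate(res)
```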

LOCALIZATION IN A KNOWN ENVIRONMENT Off-line: database construction. On-line: camera pose estimation.

LOCALIZATION WITH A DATABASE 1) DATABASE CONSTRUCTION Two-step process: VSLAM constrained to GPS/DEM (inaccuracies remain due to the GPS bias), then refinement of the reconstruction with a global bundle adjustment constrained to buildings/DEM.
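
For the building-constrained refinement step, one possible constraint is a point-to-plane residual that pulls reconstructed points onto known building facades or the DEM. The plane representation, the association list and the weight below are assumptions made for this sketch.

```python
import numpy as np

def plane_residuals(points3d, planes, assoc, w=1.0):
    """
    points3d: (N,3) reconstructed map points
    planes:   (M,4) building/DEM planes as (n_x, n_y, n_z, d) with unit normals
    assoc:    list of (point_idx, plane_idx) point-to-facade associations
    """
    res = []
    for i, k in assoc:
        n, d = planes[k, :3], planes[k, 3]
        res.append(w * (points3d[i] @ n + d))   # signed distance to the associated plane
    return np.asarray(res)
```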

LOCALIZATION WITH A DATABASE 2) ON-LINE LOCALIZATION Three threads share the 3D point cloud. Tracking thread: new image -> interest point detection -> matching with the SLAM map -> pose computation. Mapping thread: triangulation, matching with the database (3D/3D correspondences), local constrained bundle adjustment. Relocalization thread: 2D/3D correspondences against the database to recover camera poses.
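
A toy sketch of this three-thread layout, with Python queues standing in for the shared map and the pre-built database. The thread bodies are placeholders for the processing named on the slide, not the actual implementation.

```python
import queue
import threading

frames = queue.Queue()      # incoming images
keyframes = queue.Queue()   # frames promoted by the tracker
poses = queue.Queue()       # estimated camera poses

def tracking_thread():
    while True:
        img = frames.get()
        if img is None:                 # shutdown signal
            keyframes.put(None)
            break
        # interest point detection + matching with the SLAM map would go here
        poses.put(("pose_of", img))
        keyframes.put(img)              # in reality: only when a keyframe test passes

def mapping_thread():
    while True:
        kf = keyframes.get()
        if kf is None:
            break
        # triangulation, 3D/3D matching with the database and local
        # constrained bundle adjustment would go here

def relocalization_thread(stop):
    while not stop.is_set():
        # 2D/3D correspondences against the database to recover the pose
        stop.wait(0.1)

stop = threading.Event()
workers = [threading.Thread(target=tracking_thread),
           threading.Thread(target=mapping_thread),
           threading.Thread(target=relocalization_thread, args=(stop,))]
for w in workers:
    w.start()
for i in range(3):
    frames.put(f"frame_{i}")
frames.put(None)
stop.set()
for w in workers:
    w.join()
print(poses.qsize(), "poses estimated")
```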

LOCALIZATION FOR AIDED NAVIGATION

AUGMENTED REALITY

IMPROVING VSLAM / GPS FUSION IN URBAN ENVIRONMENT Exploit building models (GIS) to estimate the local bias of the GPS data, then use the corrected GPS data in the VSLAM/GPS fusion (constrained bundle adjustment). [Figure: GPS data with bias vs. GPS data without bias.] Thesis defense of Dorra LARNAOUT.
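
To illustrate the bias-correction idea, the toy sketch below assumes the GPS bias is locally constant over a short window: it is estimated as the offset between recent GPS fixes and positions registered to the building models, then subtracted before the VSLAM/GPS fusion. The constant 2D offset model and all values are assumptions for the sketch.

```python
import numpy as np

def estimate_gps_bias(gps_xy, registered_xy):
    """Least-squares constant offset between GPS fixes and model-registered positions."""
    return (gps_xy - registered_xy).mean(axis=0)

def correct_gps(gps_xy, bias):
    return gps_xy - bias

gps_xy = np.array([[10.2, 5.1], [11.3, 5.9], [12.1, 7.2]])        # biased GPS fixes
registered_xy = np.array([[8.1, 4.0], [9.2, 4.8], [10.0, 6.1]])   # from building registration
bias = estimate_gps_bias(gps_xy, registered_xy)                   # ~[2.1, 1.1]
print(correct_gps(gps_xy, bias))                                  # de-biased GPS positions
```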

IMPROVING VSLAM / GPS FUSION IN URBAN ENVIRONMENT

IMPROVING VSLAM / GPS FUSION IN URBAN ENVIRONMENT

LOCALIZATION: GEOLOCALIZATION TECHNOLOGY PERFORMANCE Technology: incremental motion (SLAM) + constraints, CAD model tracking, absolute positioning by viewpoint recognition, sensor fusion (2D/3D camera, GPS, IMU, odometer, lidar). Accuracy: <1% translation error on the KITTI (SLAM) challenge. Execution runtime: 2x faster for equal quality. Low energy: 3 h of runtime on a Microsoft Surface Pro. Multi-platform: Windows/Linux/Android/macOS, x86/ARM.

ONGOING DEVELOPMENTS Improve the buildings / 3D point cloud association: buildings occluded by trees or parked cars => wrong associations. Scene parsing with deep convolutional networks.

OBJECT DETECTION FOR SITUATION UNDERSTANDING

DEEP MANTA A vision-based CEA-LIST technology for detection and 2D/3D analysis of vehicles from a monocular image. A MANy-TAsk approach based on deep neural networks that simultaneously performs: (1) vehicle detection, (2) 3D dimension estimation, (3) orientation estimation, (4) 3D localization, (5) part localization, (6) part visibility characterization.
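
A minimal PyTorch sketch of a many-task head in the spirit of this slide: one shared feature extractor with separate outputs for several of the listed tasks. The backbone, layer sizes, part count and the use of whole-image crops are illustrative assumptions; the published Deep MANTA pipeline is more involved (region proposals, coarse-to-fine refinement) and recovers 3D localization via 2D/3D part matching, which is not shown here.

```python
import torch
import torch.nn as nn
import torchvision

NUM_PARTS = 36   # assumed number of vehicle parts

class ManyTaskHead(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = torchvision.models.resnet18(weights=None)
        # Keep everything up to the global average pooling -> 512-d features
        self.features = nn.Sequential(*list(backbone.children())[:-1])
        d = 512
        self.detect = nn.Linear(d, 2)               # (1) vehicle vs. background
        self.dims = nn.Linear(d, 3)                 # (2) 3D dimensions (w, h, l)
        self.orient = nn.Linear(d, 2)               # (3) orientation as (cos, sin)
        self.parts = nn.Linear(d, 2 * NUM_PARTS)    # (5) 2D part coordinates
        self.visibility = nn.Linear(d, NUM_PARTS)   # (6) part visibility logits

    def forward(self, x):
        f = self.features(x).flatten(1)
        return {
            "score": self.detect(f),
            "dimensions": self.dims(f),
            "orientation": self.orient(f),
            "parts": self.parts(f).view(-1, NUM_PARTS, 2),
            "visibility": self.visibility(f),
        }

out = ManyTaskHead()(torch.randn(2, 3, 224, 224))   # toy forward pass on a crop batch
```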

DEEP MANTA FEATURES Vehicle detection / orientation estimation / 3D localization: 2D bounding box, orientation, 3D position, 3D box and distance to the camera.

DEEP MANTA FEATURES Vehicle part localization (e.g. headlights, mirrors, ...)

DEEP MANTA FEATURES Detection of occluded vehicles / visibility characterization

DEEP MANTA FEATURES 3D template recognition (3D dimensions)

STATE OF THE ART: OBJECT RECOGNITION Deep convolutional neural networks (CNNs): classification, object detection, fine-grained object recognition, semantic segmentation, object segmentation. Success of object proposals for object detection: less computing time / memory, regions of interest to classify (Selective Search 2013, MCG 2014, BING 2014, Edge Boxes 2014, ...).

STATE OF THE ART: OBJECT PROPOSALS Selective Search for Object Recognition, IJCV 2013; 3D Object Proposals for Accurate Object Class Detection, NIPS 2015; Monocular 3D Object Detection for Autonomous Driving, CVPR 2016.

FASTER R-CNN Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, NIPS 2015. One neural network used for both object proposals and classification.
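
To make the "one network for both proposals and classification" idea concrete, the snippet below runs torchvision's off-the-shelf Faster R-CNN (a region proposal network sharing features with the detection head) on a dummy KITTI-sized image. This illustrates the cited paper's design, not the CEA-LIST Deep MANTA network.

```python
import torch
import torchvision

# Pre-trained Faster R-CNN with a ResNet-50 FPN backbone (downloads weights).
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = torch.rand(3, 375, 1242)            # dummy image in [0, 1], KITTI-like size
with torch.no_grad():
    detections = model([image])[0]          # dict with boxes, labels, scores
print(detections["boxes"].shape, detections["scores"][:5])
```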

DEEP MANTA APPROACH

SMART DATA ANNOTATION Labels required to train the Deep MANTA network: 2D bounding boxes, 2D part coordinates (visible / hidden), part visibility, vehicle 3D dimensions. How can we annotate these data in a short time? A smart annotation process uses 3D models to annotate KITTI images automatically.
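
A sketch of the automatic labelling idea: place a 3D vehicle model at a KITTI ground-truth 3D pose and project its part vertices through the camera calibration to obtain 2D part coordinates "for free". The part positions, the crude visibility test and the calibration values below are illustrative assumptions; KITTI supplies the real per-frame projection matrix.

```python
import numpy as np

# Assumed 3x4 camera projection matrix (KITTI-like values), for illustration only.
P2 = np.array([[721.5, 0.0, 609.6, 44.9],
               [0.0, 721.5, 172.9, 0.2],
               [0.0, 0.0, 1.0, 0.003]])

def project_parts(parts_obj, R, t):
    """parts_obj: (N,3) part coordinates in the 3D model frame; (R, t): object pose."""
    pts_cam = parts_obj @ R.T + t                     # model frame -> camera frame
    pts_h = np.hstack([pts_cam, np.ones((len(pts_cam), 1))])
    uvw = pts_h @ P2.T
    uv = uvw[:, :2] / uvw[:, 2:3]                     # 2D part coordinates in pixels
    visible = pts_cam[:, 2] > 0                       # crude visibility: in front of the camera
    return uv, visible

# Toy model with two "parts" (e.g. left/right mirror), identity orientation, 15 m ahead.
parts = np.array([[-0.9, -0.6, 0.0], [0.9, -0.6, 0.0]])
uv, vis = project_parts(parts, np.eye(3), np.array([0.0, 1.6, 15.0]))
print(uv.round(1), vis)
```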

PERFORMANCE: KITTI BENCHMARK International benchmark for traffic scene analysis (autonomous driving). Participation of CEA-LIST (01/2017). Tasks: vehicle detection (127 competitors), vehicle orientation estimation (64 competitors). Competitors: NVIDIA, Stanford University, Baidu, NEC Laboratories, UCSD, University of Toronto. Deep MANTA setup: dataset of 7,481 training images and 7,518 test images; data augmentation by image flipping; tested deep architectures: GoogLeNet, VGG16; end-to-end learning (~2 days for training).
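
A small sketch of why the "flipping images" augmentation needs care for a part-based model: a horizontal flip must also mirror the 2D part coordinates and swap symmetric left/right parts so the labels stay consistent. The swap table below is an illustrative assumption.

```python
import numpy as np

# Hypothetical left/right part pairs (e.g. left/right mirrors, left/right headlights).
LEFT_RIGHT_SWAP = {0: 1, 1: 0, 2: 3, 3: 2}

def flip_sample(image, parts_uv, width):
    """image: (H,W,3) array; parts_uv: (N,2) part coordinates in pixels."""
    flipped_img = image[:, ::-1, :].copy()
    flipped_uv = parts_uv.copy()
    flipped_uv[:, 0] = width - 1 - flipped_uv[:, 0]           # mirror the x coordinate
    order = [LEFT_RIGHT_SWAP.get(i, i) for i in range(len(parts_uv))]
    return flipped_img, flipped_uv[order]                     # reorder to swap L/R part labels
```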

PERFORMANCE: KITTI BENCHMARK, OBJECT DETECTION [Leaderboard figure: 127 participants; -0.51% / +6.37%; ranked 4th / 107 in 08/2016.]

PERFORMANCE: KITTI BENCHMARK, ORIENTATION ESTIMATION 64 participants; ranked 1st since 08/2016.

DEEP MANTA 3D LOCALIZATION PERFORMANCE

Localization accuracy, 1 meter threshold:
Database    Method            Type    Time (s)  Easy   Moderate  Hard
KITTI VAL1  3DOP              Stereo  3.0       81.97  68.15     59.85
KITTI VAL1  Mono3D            Mono    4.2       48.31  38.98     34.25
KITTI VAL1  Ours (GoogLeNet)  Mono    0.7       65.71  53.79     47.21
KITTI VAL1  Ours (VGG16)      Mono    2.0       69.72  54.44     47.77
KITTI VAL2  3DVP              Mono    40.0      45.61  34.28     27.72
KITTI VAL2  Ours (GoogLeNet)  Mono    0.7       70.90  58.05     49.00
KITTI VAL2  Ours (VGG16)      Mono    2.0       66.88  53.17     44.40

Localization accuracy, 2 meter threshold:
Database    Method            Type    Time (s)  Easy   Moderate  Hard
KITTI VAL1  3DOP              Stereo  3.0       91.46  81.63     72.97
KITTI VAL1  Mono3D            Mono    4.2       74.77  60.91     54.24
KITTI VAL1  Ours (GoogLeNet)  Mono    0.7       89.29  75.92     67.28
KITTI VAL1  Ours (VGG16)      Mono    2.0       91.01  76.38     67.77
KITTI VAL2  3DVP              Mono    40.0      65.73  54.60     45.62
KITTI VAL2  Ours (GoogLeNet)  Mono    0.7       90.12  77.02     66.09
KITTI VAL2  Ours (VGG16)      Mono    2.0       88.32  74.31     63.62
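
A simplified sketch of the underlying criterion for the 1 m / 2 m columns: a predicted vehicle counts as correctly localized when its estimated 3D center lies within the distance threshold of the ground-truth center. The benchmark itself folds this criterion into a precision/recall evaluation per difficulty level (Easy/Moderate/Hard); the matching of detections to ground truth is omitted here, and the numbers are dummies.

```python
import numpy as np

def localization_accuracy(pred_centers, gt_centers, threshold_m):
    """pred_centers, gt_centers: (N,3) matched 3D positions in meters."""
    err = np.linalg.norm(pred_centers - gt_centers, axis=1)
    return float((err <= threshold_m).mean())

pred = np.array([[10.1, 1.6, 20.3], [5.0, 1.5, 33.0]])   # dummy predicted 3D centers
gt = np.array([[10.0, 1.6, 20.0], [5.5, 1.5, 35.5]])     # dummy ground-truth centers
print(localization_accuracy(pred, gt, 1.0), localization_accuracy(pred, gt, 2.0))
```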

VEHICLE DETECTION

FINE-GRAINED RECOGNITION: MAKE AND MODEL

MULTI-CLASS OBJECT DETECTION

PERSPECTIVES: GENERALIZATION, SMART DATA GENERATION, ENHANCED FEATURES, EMBEDDED COMPUTING

THANK YOU FOR YOUR ATTENTION quoc-cuong.pham@cea.fr Commissariat à l'énergie atomique et aux énergies alternatives Institut List CEA SACLAY NANO-INNOV BAT. 861 PC142 91191 Gif-sur-Yvette Cedex - FRANCE www-list.cea.fr Établissement public à caractère industriel et commercial RCS Paris B 775 685 019