VISION FOR AUTOMOTIVE DRIVING
French-Japanese Workshop on Deep Learning & AI, Paris, October 25th, 2017
Quoc Cuong PHAM, PhD, Vision and Content Engineering Lab
AI & MACHINE LEARNING FOR ADAS AND SELF-DRIVING CARS
Sensor fusion, trajectory planning, localization, control strategy, situation understanding, driver model
Supervised, unsupervised, and reinforcement learning; SLAM, pattern recognition, classification, clustering, segmentation, motion planning
2
LOCALIZATION IN LARGE ENVIRONMENTS 3
VISUAL SLAM
Localization in an arbitrary coordinate system; unknown scale in monocular SLAM; error accumulation (drift)
Pipeline: new image -> interest point detection -> matching -> pose computation -> keyframe decision -> triangulation -> bundle adjustment
4
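The keyframe decision in the pipeline above is typically a simple heuristic: promote a frame when tracking against the last keyframe weakens or too much time has passed, so that triangulation keeps a sufficient baseline. A minimal sketch with illustrative thresholds (not the exact CEA-LIST criteria):

```python
def is_new_keyframe(n_matches, n_keyframe_features, frames_since_keyframe,
                    min_ratio=0.6, max_gap=20):
    """Decide whether the current frame becomes a keyframe.

    Promote when few interest points still match the last keyframe
    (tracking is weakening) or when too many frames have elapsed,
    so that triangulation keeps a sufficient baseline.
    Thresholds are illustrative assumptions.
    """
    if n_keyframe_features == 0:
        return True
    ratio = n_matches / n_keyframe_features
    return ratio < min_ratio or frames_since_keyframe >= max_gap
```

Real systems combine several such criteria (parallax, covisibility), but the ratio-plus-gap rule captures the branch shown in the flowchart.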
CONSTRAINED SLAM FRAMEWORK
Same pipeline (new image -> interest point detection -> matching -> pose computation -> keyframe decision -> triangulation), with the bundle adjustment constrained by additional sensors and data: odometer, IMU, GPS, GIS
5
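A constrained bundle adjustment augments the standard reprojection cost with penalty terms tying the estimated cameras to the external sensors. A minimal numpy sketch of such a cost for a single camera with a GPS prior (the pinhole model and the weight `lam` are illustrative assumptions, not the published formulation):

```python
import numpy as np

def project(K, R, t, X):
    """Pinhole projection of 3D points X (N,3) with intrinsics K and pose (R, t)."""
    x = (K @ (R @ X.T + t.reshape(3, 1))).T
    return x[:, :2] / x[:, 2:3]

def constrained_ba_cost(K, R, t, X, obs_2d, cam_centre_gps, lam=1.0):
    """Reprojection error plus a GPS penalty pulling the camera centre
    toward its GPS measurement; `lam` balances the two terms."""
    reproj = project(K, R, t, X) - obs_2d
    cam_centre = -R.T @ t                 # camera centre in world frame
    gps_term = cam_centre - cam_centre_gps
    return np.sum(reproj ** 2) + lam * np.sum(gps_term ** 2)
```

In a full system this cost is summed over all keyframes and landmarks and minimized with a sparse nonlinear least-squares solver.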
LOCALIZATION IN A KNOWN ENVIRONMENT Off-line On-line Camera pose 6
LOCALIZATION WITH A DATABASE 1) DATABASE CONSTRUCTION
Two-step process: VSLAM constrained to GPS/DEM (inaccuracies due to the GPS bias), then refine the reconstruction with a global bundle adjustment constrained to buildings/DEM
7
LOCALIZATION WITH A DATABASE 2) ON-LINE LOCALIZATION
Tracking thread: new image -> interest point detection -> matching with SLAM -> pose computation
Mapping thread: triangulation (2D/3D correspondences + camera poses) -> 3D point cloud -> local constrained bundle adjustment
Relocalization thread: matching with the database (3D/3D correspondences) against the stored 3D point cloud
8
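Pose computation from 2D/3D correspondences is the core operation of the tracking and relocalization threads. Production systems use PnP inside RANSAC; as a self-contained illustration of the underlying idea, the Direct Linear Transform below recovers the 3x4 projection matrix (up to scale) from six or more correspondences:

```python
import numpy as np

def dlt_projection(pts3d, pts2d):
    """Estimate a 3x4 projection matrix P (up to scale) from n >= 6
    2D/3D correspondences with the Direct Linear Transform: each
    correspondence contributes two linear equations in the 12 entries
    of P; the solution is the null-space vector of the stacked system."""
    rows = []
    for (X, Y, Z), (u, v) in zip(pts3d, pts2d):
        rows.append([X, Y, Z, 1, 0, 0, 0, 0, -u*X, -u*Y, -u*Z, -u])
        rows.append([0, 0, 0, 0, X, Y, Z, 1, -v*X, -v*Y, -v*Z, -v])
    A = np.asarray(rows, dtype=float)
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)       # smallest singular vector = flattened P

def reproject(P, pts3d):
    """Project 3D points through P and dehomogenize."""
    X_h = np.hstack([np.asarray(pts3d, dtype=float), np.ones((len(pts3d), 1))])
    x = (P @ X_h.T).T
    return x[:, :2] / x[:, 2:3]
```

Unlike the DLT, PnP solvers exploit known intrinsics and need fewer points, which matters under the outlier rates of real matching.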
LOCALIZATION FOR AIDED NAVIGATION 9
AUGMENTED REALITY 10
IMPROVING VSLAM / GPS FUSION IN URBAN ENVIRONMENT
Exploit building models (GIS); estimate the local bias of the GPS data; use the corrected GPS data in the VSLAM/GPS fusion (constrained bundle adjustment)
GPS data with bias / GPS data without bias
PhD defense of Dorra Larnaout
11
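As a toy illustration of the bias-estimation step: if the VSLAM trajectory (registered to the building models) is taken as the local reference, a locally constant GPS bias reduces to the mean offset between time-aligned positions. This is a deliberate simplification of the thesis approach:

```python
import numpy as np

def estimate_gps_bias(slam_xy, gps_xy):
    """Least-squares estimate of a locally constant 2D GPS bias as the
    mean offset between time-aligned GPS and SLAM positions
    (simplified: the real bias varies slowly along the trajectory)."""
    return np.mean(np.asarray(gps_xy) - np.asarray(slam_xy), axis=0)

def correct_gps(gps_xy, bias):
    """Subtract the estimated bias before feeding GPS into the
    constrained bundle adjustment."""
    return np.asarray(gps_xy) - bias
```

The corrected measurements then enter the constrained bundle adjustment in place of the raw GPS data.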
IMPROVING VSLAM / GPS FUSION IN URBAN ENVIRONMENT 12
IMPROVING VSLAM / GPS FUSION IN URBAN ENVIRONMENT 13
LOCALIZATION: GEOLOCALIZATION TECHNOLOGY PERFORMANCE
Incremental motion (SLAM) + constraints; CAD model tracking; absolute positioning by viewpoint recognition; sensor fusion (2D/3D camera, GPS, IMU, odometer, lidar)
Accuracy: <1% translation error (KITTI SLAM challenge); runtime: 2x faster at equal quality; low energy: 3 h on a Microsoft Surface Pro; multi-platform: Windows/Linux/Android/MacOS, x86/ARM
14
ONGOING DEVELOPMENTS
Improve buildings / 3D point cloud association: buildings occluded by trees or parked cars => wrong associations
Scene parsing with deep convolutional networks
15
OBJECT DETECTION FOR SITUATION UNDERSTANDING 16
DEEP MANTA
A vision-based CEA-LIST technology for detection and 2D/3D analysis of vehicles from a monocular image
A MANy-TAsk approach based on deep neural networks that simultaneously performs: (1) vehicle detection, (2) 3D dimension estimation, (3) orientation estimation, (4) 3D localization, (5) part localization, (6) part visibility characterization
17
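Training several tasks simultaneously in one network is usually done by minimizing a weighted sum of per-task losses. A minimal sketch of that combination (the task names and weights below are illustrative, not the published Deep MANTA values):

```python
def many_task_loss(task_losses, weights):
    """Weighted sum of per-task losses for joint end-to-end training,
    e.g. detection, 3D dimensions, orientation, 3D localization,
    part localization and part visibility. Weights are hyperparameters
    balancing the gradient contribution of each task."""
    assert set(task_losses) == set(weights), "one weight per task"
    return sum(weights[k] * task_losses[k] for k in task_losses)
```

Tuning the weights matters in practice: a task with a naturally larger loss scale would otherwise dominate the shared feature extractor.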
DEEP MANTA FEATURES Vehicle detection / Orientation estimation / 3D localization 2D bounding box Orientation / 3D position 3D box / distance to the camera 18
DEEP MANTA FEATURES Vehicle part localization (e.g. headlights, mirrors, ...) 19
DEEP MANTA FEATURES Detection of occluded vehicles/ Visibility characterization 20
DEEP MANTA FEATURES 3D template recognition (3D dimensions) 21
STATE OF THE ART: OBJECT RECOGNITION
Deep Convolutional Neural Networks (CNNs): classification, object detection, fine-grained object recognition, semantic segmentation, object segmentation
Success of object proposals for object detection: less computing time / memory, regions of interest to classify (Selective Search 2013, MCG 2014, BING 2014, Edge Boxes 2014, ...)
22
STATE OF THE ART: OBJECT PROPOSALS
Selective Search for Object Recognition, IJCV 2013; 3D Object Proposals for Accurate Object Class Detection, NIPS 2015; Monocular 3D Object Detection for Autonomous Driving, CVPR 2016
23
FASTER R-CNN
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, NIPS 2015
One neural network used for both object proposal and classification
24
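Whatever the proposal mechanism, the thousands of overlapping candidate boxes are typically pruned with greedy non-maximum suppression before (or after) classification. A minimal illustrative version, not the exact Faster R-CNN implementation:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, thresh=0.5):
    """Greedy non-maximum suppression: visit boxes by decreasing score,
    keep a box only if it overlaps no already-kept box by more than
    `thresh` IoU. Returns the indices of the kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= thresh for j in keep):
            keep.append(i)
    return keep
```

The same IoU function also underlies the KITTI detection metric, where a detection counts as correct above a fixed overlap threshold.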
DEEP MANTA APPROACH 25
SMART DATA ANNOTATION
Labels required to train the Deep MANTA network: 2D bounding boxes, 2D part coordinates (visible / hidden), part visibility, vehicle 3D dimensions
How can we annotate these data in a short time? Smart annotation process using 3D models to label the KITTI images: automatic labeling
26
PERFORMANCE: KITTI BENCHMARK
International benchmark for traffic scene analysis (autonomous driving); CEA-LIST entry (01/2017)
Tasks: vehicle detection (127 competitors), vehicle orientation estimation (64 competitors)
Competitors: NVIDIA, Stanford University, Baidu, NEC Laboratories, UCSD, University of Toronto
Deep MANTA setup: 7481 training images, 7518 test images; data augmentation: image flipping; tested architectures: GoogLeNet, VGG16; end-to-end learning (~2 days of training)
27
PERFORMANCE: KITTI BENCHMARK, OBJECT DETECTION
127 participants; -0.51% / +6.37%; 4th / 107 in 08/2016
28
PERFORMANCE: KITTI BENCHMARK, ORIENTATION ESTIMATION
1st of 64 participants, holding the top rank since 08/2016
29
DEEP MANTA 3D LOCALIZATION PERFORMANCE (accuracy: vehicles localized within 1 m / 2 m of ground truth)

Within 1 meter:
Database    Method          Type    Time (s)  Easy   Moderate  Hard
KITTI VAL1  3DOP            Stereo  3.0       81.97  68.15     59.85
KITTI VAL1  Mono3D          Mono    4.2       48.31  38.98     34.25
KITTI VAL1  Ours GoogLeNet  Mono    0.7       65.71  53.79     47.21
KITTI VAL1  Ours VGG16      Mono    2.0       69.72  54.44     47.77
KITTI VAL2  3DVP            Mono    40.0      45.61  34.28     27.72
KITTI VAL2  Ours GoogLeNet  Mono    0.7       70.90  58.05     49.00
KITTI VAL2  Ours VGG16      Mono    2.0       66.88  53.17     44.40

Within 2 meters:
Database    Method          Type    Time (s)  Easy   Moderate  Hard
KITTI VAL1  3DOP            Stereo  3.0       91.46  81.63     72.97
KITTI VAL1  Mono3D          Mono    4.2       74.77  60.91     54.24
KITTI VAL1  Ours GoogLeNet  Mono    0.7       89.29  75.92     67.28
KITTI VAL1  Ours VGG16      Mono    2.0       91.01  76.38     67.77
KITTI VAL2  3DVP            Mono    40.0      65.73  54.60     45.62
KITTI VAL2  Ours GoogLeNet  Mono    0.7       90.12  77.02     66.09
KITTI VAL2  Ours VGG16      Mono    2.0       88.32  74.31     63.62

30
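The metric behind these tables is straightforward: the fraction of vehicles whose predicted 3D position falls within a distance threshold (1 m or 2 m) of the ground truth. A minimal sketch:

```python
import numpy as np

def localization_accuracy(pred_pos, gt_pos, threshold):
    """Fraction of vehicles whose predicted 3D position lies within
    `threshold` metres of the matched ground-truth position.
    `pred_pos` and `gt_pos` are (N, 3) arrays of already-matched
    detections (matching itself is done via the 2D detection metric)."""
    d = np.linalg.norm(np.asarray(pred_pos) - np.asarray(gt_pos), axis=1)
    return float(np.mean(d < threshold))
```

The Easy / Moderate / Hard columns correspond to the standard KITTI difficulty regimes (object size, occlusion, truncation).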
VEHICLE DETECTION 31
FINE-GRAINED RECOGNITION: MAKE AND MODEL 32
MULTI-CLASS OBJECT DETECTION 33
PERSPECTIVES
Generalization; smart data generation; enhanced features; embedded computing
34
THANK YOU FOR YOUR ATTENTION
quoc-cuong.pham@cea.fr
Commissariat à l'énergie atomique et aux énergies alternatives, Institut List
CEA SACLAY NANO-INNOV, BAT. 861 PC142, 91191 Gif-sur-Yvette Cedex - FRANCE
www-list.cea.fr
Établissement public à caractère industriel et commercial, RCS Paris B 775 685 019