VISION FOR AUTOMOTIVE DRIVING

1 VISION FOR AUTOMOTIVE DRIVING
French Japanese Workshop on Deep Learning & AI, Paris, October 25th, 2017
Quoc Cuong PHAM, PhD
Vision and Content Engineering Lab

2 AI & MACHINE LEARNING FOR ADAS AND SELF-DRIVING CARS
Sensor Fusion, Trajectory Planning, Localization, Control Strategy, Situation Understanding, Driver Model
Supervised, unsupervised, reinforcement learning
SLAM, pattern recognition, classification, clustering, segmentation, motion planning

3 LOCALIZATION IN LARGE ENVIRONMENTS

4 VISUAL SLAM
Localization in an arbitrary coordinate system
Unknown scale in monocular SLAM
Error accumulation (drift)
Pipeline: New image -> Interest point detection -> Matching -> Pose computation -> Key frame? (if no, continue with the next image) -> Triangulation -> Bundle Adjustment
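The "unknown scale" point can be illustrated with a few lines of Python. This is only a sketch under simplifying assumptions (an unrotated pinhole camera looking along +z; the function names `project` and `scale_scene` are hypothetical, not from the slides): scaling the whole scene and the camera positions by the same factor leaves every image measurement unchanged, so a single camera cannot recover metric scale on its own.

```python
def project(point, cam_center, focal=1.0):
    """Pinhole projection of a 3D point seen from a camera at cam_center.

    Assumption: the camera looks along +z and is not rotated.
    """
    x = point[0] - cam_center[0]
    y = point[1] - cam_center[1]
    z = point[2] - cam_center[2]
    return (focal * x / z, focal * y / z)


def scale_scene(points, cam_centers, s):
    """Scale every 3D point and every camera position by the same factor s."""
    scaled_points = [tuple(s * c for c in p) for p in points]
    scaled_cams = [tuple(s * c for c in cam) for cam in cam_centers]
    return scaled_points, scaled_cams


points = [(1.0, 0.5, 4.0), (-2.0, 1.0, 6.0)]
cams = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0)]  # two camera positions
big_points, big_cams = scale_scene(points, cams, 3.0)

# Every observation is identical in the original and the 3x larger scene.
for p, bp in zip(points, big_points):
    for c, bc in zip(cams, big_cams):
        u, v = project(p, c)
        bu, bv = project(bp, bc)
        assert abs(u - bu) < 1e-12 and abs(v - bv) < 1e-12
```

This ambiguity is one reason a metric sensor (odometer, GPS) is useful alongside the camera.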

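5 CONSTRAINED SLAM FRAMEWORK
Pipeline: New image -> Interest point detection -> Matching -> Pose computation -> Key frame? (if no, continue with the next image) -> Triangulation -> Constrained Bundle Adjustment
Additional inputs shown in the diagram: odometer, IMU, GPS, GIS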
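The idea of constraining a drifting incremental estimate with absolute measurements can be sketched on a toy problem. The example below is not bundle adjustment itself: it is a 1D least-squares stand-in, with made-up data, in which relative steps play the role of visual odometry and noisy absolute fixes play the role of GPS. The function names, the weight and the noise values are all assumptions chosen for illustration.

```python
def dead_reckoning(odometry, start=0.0):
    """Integrate relative steps only; any scale error accumulates as drift."""
    x = [start]
    for step in odometry:
        x.append(x[-1] + step)
    return x


def fuse(odometry, gps, weight=1.0, sweeps=200):
    """Estimate 1D positions from relative steps and absolute fixes.

    Minimizes  sum_i (x[i+1] - x[i] - odometry[i])**2
             + weight * sum_i (x[i] - gps[i])**2
    with Gauss-Seidel sweeps: each x[i] is set to the minimizer of the
    terms it appears in, holding its neighbours fixed.
    """
    n = len(gps)
    assert len(odometry) == n - 1
    x = list(gps)  # initial guess
    for _ in range(sweeps):
        for i in range(n):
            total, count = weight * gps[i], weight
            if i > 0:
                total += x[i - 1] + odometry[i - 1]
                count += 1.0
            if i < n - 1:
                total += x[i + 1] - odometry[i]
                count += 1.0
            x[i] = total / count
    return x


truth = [float(i) for i in range(11)]
odometry = [1.1] * 10  # 10 % scale error, hence drift
gps = [t + (0.3 if i % 2 == 0 else -0.3) for i, t in enumerate(truth)]

drifted = dead_reckoning(odometry)
fused = fuse(odometry, gps)
print("end error, odometry only:", abs(drifted[-1] - truth[-1]))
print("end error, fused:        ", abs(fused[-1] - truth[-1]))
```

In a real constrained bundle adjustment the unknowns are camera poses and 3D points and the first term is a reprojection error; the trade-off between the two terms is the part this sketch is meant to show.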

6 LOCALIZATION IN A KNOWN ENVIRONMENT
Off-line / On-line / Camera pose

7 LOCALIZATION WITH A DATABASE
1) DATABASE CONSTRUCTION
Two-step process:
VSLAM constrained to GPS/DEM (inaccuracies due to the GPS bias)
Refine the reconstruction with a global bundle adjustment constrained to Buildings/DEM

8 LOCALIZATION WITH A DATABASE
2) ON-LINE LOCALIZATION
Diagram labels, in reading order: Database; 3D/3D correspondences; Mapping Thread; Matching with database; Local Constrained Bundle Adjustment; Relocalization Thread; 3D point cloud; 2D/3D correspondences + camera poses; Triangulation; 3D point cloud; New image; Interest point detection; Matching with SLAM; Pose computation; Tracking Thread

9 LOCALIZATION FOR AIDED NAVIGATION

10 AUGMENTED REALITY

11 IMPROVING VSLAM / GPS FUSION IN URBAN ENVIRONMENT
Exploit building models (GIS)
Estimate the local bias of the GPS data
Use corrected GPS data in VSLAM/GPS fusion (constrained bundle adjustment)
Figure labels: GPS data with bias / GPS data without bias
Source: PhD thesis defense of Dorra LARNAOUT
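The bias-correction step can be sketched as follows. The slides do not say how the bias is computed, so this is only an assumption: it supposes that a set of reference positions is already available for the same time stamps (for instance from an alignment with the building models) and takes the local bias to be the mean offset between GPS fixes and those references. `estimate_bias` and `correct` are hypothetical names, and the coordinates are made up.

```python
def estimate_bias(gps_fixes, reference_fixes):
    """Mean 2D offset between GPS fixes and reference positions.

    Assumption: both lists are time-aligned and expressed in the same
    local metric frame (e.g. east/north in meters).
    """
    n = len(gps_fixes)
    bx = sum(g[0] - r[0] for g, r in zip(gps_fixes, reference_fixes)) / n
    by = sum(g[1] - r[1] for g, r in zip(gps_fixes, reference_fixes)) / n
    return (bx, by)


def correct(gps_fixes, bias):
    """Subtract the estimated bias from every GPS fix."""
    return [(x - bias[0], y - bias[1]) for x, y in gps_fixes]


# Made-up window of positions: the GPS is shifted by about (2.0, -1.0) m.
reference = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
noise = [(0.1, -0.1), (-0.1, 0.1), (0.1, 0.1), (-0.1, -0.1)]  # zero mean
gps = [(r[0] + 2.0 + n[0], r[1] - 1.0 + n[1]) for r, n in zip(reference, noise)]

bias = estimate_bias(gps, reference)
corrected = correct(gps, bias)
```

Because the bias is described as local, such an estimate would be recomputed over a sliding window rather than once for the whole trajectory.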

12 IMPROVING VSLAM / GPS FUSION IN URBAN ENVIRONMENT

13 IMPROVING VSLAM / GPS FUSION IN URBAN ENVIRONMENT

14 LOCALIZATION
GEOLOCALIZATION
TECHNOLOGY
  Incremental motion (SLAM) + constraint
  CAD model tracking
  Absolute positioning by viewpoint recognition
  Sensor fusion (2D/3D camera, GPS, IMU, odometer, lidar)
PERFORMANCE
  Accuracy: <1% error in translation
  Execution runtime (KITTI SLAM challenge): 2x faster for equal quality
  Low energy: 3h on a Microsoft Surface Pro
  Multi-platform: Win/Linux/Android/MacOS, X86/ARM

15 ONGOING DEVELOPMENTS
Improve the association between buildings and the 3D point cloud
Buildings occluded by trees or parked cars => wrong associations
Scene parsing with deep convolutional networks

16 OBJECT DETECTION FOR SITUATION UNDERSTANDING

17 DEEP MANTA
A vision-based CEA-LIST technology for detection and 2D/3D analysis of vehicles from a monocular image
A MANy-TAsk approach based on deep neural networks to simultaneously perform:
(1) Vehicle detection
(2) 3D dimension estimation
(3) Orientation estimation
(4) 3D localization
(5) Part localization
(6) Part visibility characterization

18 DEEP MANTA FEATURES
Vehicle detection / Orientation estimation / 3D localization
2D bounding box
Orientation / 3D position
3D box / distance to the camera
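The relation between these outputs (3D dimensions, orientation, 3D position, 3D box, 2D box, distance) can be sketched with basic geometry. The code below is an illustrative assumption, not the Deep MANTA implementation: it uses a pinhole camera with hypothetical intrinsics (`focal`, `u0`, `v0`), camera axes x right / y down / z forward, and a yaw angle about the vertical axis.

```python
import math


def box_corners(center, dims, yaw):
    """Eight corners of a 3D box in camera coordinates.

    center: (x, y, z) of the box centre
    dims:   (length, height, width) in meters
    yaw:    rotation about the vertical (y) axis, in radians
    """
    cx, cy, cz = center
    length, height, width = dims
    c, s = math.cos(yaw), math.sin(yaw)
    corners = []
    for dx in (-length / 2, length / 2):
        for dy in (-height / 2, height / 2):
            for dz in (-width / 2, width / 2):
                # Rotate the offset about the y axis, then translate.
                x = cx + c * dx + s * dz
                z = cz - s * dx + c * dz
                corners.append((x, cy + dy, z))
    return corners


def project(point, focal, u0, v0):
    """Pinhole projection of a camera-frame point to pixel coordinates."""
    x, y, z = point
    return (focal * x / z + u0, focal * y / z + v0)


def bounding_box_2d(corners, focal, u0, v0):
    """Axis-aligned 2D box (u_min, v_min, u_max, v_max) around the projected corners."""
    pts = [project(p, focal, u0, v0) for p in corners]
    us = [u for u, _ in pts]
    vs = [v for _, v in pts]
    return (min(us), min(vs), max(us), max(vs))


def distance_to_camera(center):
    """Euclidean distance from the camera origin to the box centre."""
    return math.sqrt(sum(c * c for c in center))


corners = box_corners((0.0, 0.0, 10.0), (4.0, 2.0, 2.0), 0.0)
box2d = bounding_box_2d(corners, focal=100.0, u0=600.0, v0=180.0)
dist = distance_to_camera((0.0, 0.0, 10.0))
```

The sketch assumes the whole box lies in front of the camera (z > 0 for every corner).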

19 DEEP MANTA FEATURES
Vehicle part localization (e.g., headlights, mirrors, ...)

20 DEEP MANTA FEATURES
Detection of occluded vehicles / Visibility characterization

21 DEEP MANTA FEATURES
3D template recognition (3D dimensions)

22 STATE OF THE ART
STATE OF THE ART: OBJECT RECOGNITION
Deep Convolutional Neural Networks (CNN): classification, object detection, fine-grained object recognition, semantic segmentation, object segmentation
Success of object proposals for object detection: less computing time / memory, regions of interest to classify
Selective Search 2013, MCG 2014, BING 2014, Edge Boxes 2014

23 OBJECT PROPOSALS
STATE OF THE ART: OBJECT PROPOSAL
Selective Search for Object Recognition, IJCV 2013
3D Object Proposals for Accurate Object Class Detection, NIPS 2015
Monocular 3D Object Detection for Autonomous Driving, CVPR 2016

24 FASTER R-CNN
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, NIPS 2015
One neural network used for both object proposal and classification
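Proposal-based detectors usually have to prune heavily overlapping candidate boxes, and the standard tool for that is non-maximum suppression (NMS) driven by intersection over union (IoU). The sketch below is generic and not taken from the slides; the 0.5 threshold and the example boxes are arbitrary choices.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, threshold=0.5):
    """Greedy NMS: keep the best-scoring box, drop boxes overlapping it too much.

    Returns the indices of the kept boxes, highest score first.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)  # the second box overlaps the first and is dropped
```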

25 DEEP MANTA APPROACH

26 SMART DATA ANNOTATION
Labels required to train the Deep MANTA network:
2D bounding boxes
2D part coordinates (visible / hidden)
Part visibility
Vehicle 3D dimensions
How can we annotate these data in a short time?
Smart annotation process using 3D models to annotate KITTI images
Automatic labeling

27 PERFORMANCE: KITTI BENCHMARK
International benchmark for traffic scene analysis (autonomous driving)
Participation of CEA-LIST (01/2017)
Task: vehicle detection (127 competitors)
Task: vehicle orientation estimation (64 competitors)
Competitors: NVIDIA, Stanford University, Baidu, NEC Laboratories, UCSD, University of Toronto
Deep MANTA approach
Dataset: 7481 training images, 7518 test images
Data augmentation: image flipping
Deep architectures tested: GoogLeNet, VGG16
End-to-end learning (~2 days of training)
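When images are flipped for augmentation, the labels have to be flipped consistently. The sketch below shows one way to do that for a 2D box and an orientation angle. It rests on stated assumptions rather than on the slides: coordinates are continuous pixel positions in [0, image_width], and the angle is measured from the horizontal axis that gets mirrored, so a horizontal flip maps it to pi minus itself. The function names are hypothetical.

```python
import math


def flip_box(box, image_width):
    """Mirror a box (x1, y1, x2, y2) about the vertical centre line of the image."""
    x1, y1, x2, y2 = box
    # The left and right edges swap roles after mirroring.
    return (image_width - x2, y1, image_width - x1, y2)


def flip_angle(theta):
    """Mirror an orientation angle measured from the flipped (horizontal) axis.

    A direction (cos t, sin t) becomes (-cos t, sin t), i.e. angle pi - t.
    The result is wrapped back into [-pi, pi].
    """
    flipped = math.pi - theta
    return math.atan2(math.sin(flipped), math.cos(flipped))


flipped_box = flip_box((10.0, 20.0, 50.0, 60.0), image_width=100.0)
flipped_yaw = flip_angle(math.pi / 4)
```

Applying either function twice returns the original label, which is a convenient sanity check for an augmentation pipeline.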

28 PERFORMANCE: KITTI BENCHMARK
OBJECT DETECTION
127 participants
-0.51% / +6.37%
4th / 107 in 08/

29 PERFORMANCE: KITTI BENCHMARK
ORIENTATION ESTIMATION
1st since 08/
participants

30 DEEP MANTA 3D LOCALIZATION PERFORMANCE
Two tables, one for a 1 meter threshold and one for a 2 meter threshold (numeric values not available)
Columns: Database, Method, Type, Time (s), Easy, Moderate, Hard
KITTI VAL1: 3DOP (Stereo); Mono3D, Ours GoogLeNet, Ours VGG (Mono)
KITTI VAL2: 3DVP, Ours GoogLeNet, Ours VGG (Mono)
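A threshold-based 3D localization score of this kind can be sketched as the fraction of vehicles whose predicted 3D position falls within a given distance of the ground truth. The exact evaluation protocol is not described here, so the code below is a simplified assumption: predictions are taken to be already matched one-to-one with ground-truth vehicles, and the coordinates are made up.

```python
import math


def localization_rate(predicted, ground_truth, threshold_m):
    """Fraction of vehicles localized within threshold_m meters of the ground truth.

    Assumption: predicted[i] and ground_truth[i] refer to the same vehicle.
    """
    hits = sum(
        1
        for p, g in zip(predicted, ground_truth)
        if math.dist(p, g) <= threshold_m
    )
    return hits / len(ground_truth)


ground_truth = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0), (-3.0, 0.0, 15.0)]
predicted = [(0.5, 0.0, 10.0), (2.0, 0.0, 21.5), (-3.0, 0.0, 18.0)]

rate_1m = localization_rate(predicted, ground_truth, 1.0)  # errors: 0.5, 1.5, 3.0 m
rate_2m = localization_rate(predicted, ground_truth, 2.0)
```

A benchmark evaluation would also have to handle missed and spurious detections, which this sketch leaves out.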

31 VEHICLE DETECTION

32 FINE-GRAINED RECOGNITION: MAKE AND MODEL

33 MULTI-CLASS OBJECT DETECTION

34 PERSPECTIVES
GENERALIZATION / SMART DATA GENERATION / ENHANCED FEATURES / EMBEDDED COMPUTING

35 THANK YOU FOR YOUR ATTENTION
quoc-cuong.pham@cea.
Commissariat à l'énergie atomique et aux énergies alternatives
Institut List, CEA SACLAY NANO-INNOV, BAT. 861 PC, Gif-sur-Yvette Cedex - FRANCE
www-list.cea.fr
Public establishment of an industrial and commercial nature, RCS Paris B
