Training models for road scene understanding with automated ground truth Dan Levi

1 Training models for road scene understanding with automated ground truth. Dan Levi. With: Noa Garnett, Ethan Fetaya, Shai Silberstein, Rafi Cohen, Shaul Oron, Uri Verner, Ariel Ayash, Kobi Horn, Vlad Goldner, Tomer Peer

2 GM Advanced Technical Center in Israel (ATCI)

3 Agenda: Road scene understanding; Acquiring training data with automated ground truth (AGT); Test cases: general obstacle detection, road segmentation, general obstacle classification, curb detection, freespace; Challenges and limitations; Summary and future work

4 On-board road scene understanding. Static: road edge; road markings, complex lane understanding; signs; obstacles: clutter, construction zone cones. Dynamic: classified objects (cars, pedestrians, bicycles, animals); general obstacles: animals, carts

5 Obstacle detection: general and category based

6 General obstacles, freespace, road segmentation. Road segmentation (vision): mono-camera using semantic segmentation. General obstacle detection. Freespace (all non-flat road delimiters): 3D sensors (stereo, Lidar)

7 Training data: Revisiting Unreasonable Effectiveness of Data in Deep Learning Era [Sun et al. 2017]

8 Manual annotation: The Cityscapes Dataset for Semantic Urban Scene Understanding [Cordts et al. 2016]. Time: ~60 min per image; ~1000 annotators

9 Computer graphics simulated data: photo-realism, scenario generation

10 Simulated data for optical flow: FlowNet: Learning Optical Flow with Convolutional Networks [Dosovitskiy et al. 2015]

11 Automated ground truth (AGT) / Cross-sensor learning: Velodyne LIDAR

12 Depth from a single image: Semi-Supervised Deep Learning for Monocular Depth Map Prediction [Kuznietsov et al. 2017]

13 Map supervised road detection: Map-supervised road detection [Laddha et al. 2016]

14 AGT for road scene understanding: general setup. Supervising sensors; target sensors. Prerequisite: full alignment and synchronization between sensors

15 AGT for road scene understanding: scheme. Supervising sensors; target sensors; task: object detection. Data, then AGT step 1: compute the task on the supervising sensors (offline, temporal); AGT step 2: project the output to the target sensor domain; result: ground truth

16 Automated ground truth / Cross-sensor learning. 1. Solve an easier problem: run time, completeness. 2. Promise: scalability, continuous (unbounded) improvement. 1. Challenging setup. 2. Annotation quality / accuracy. 3. Inherent limitations of the supervisor: learning beyond supervisor capabilities; learning from the same sensor (bootstrapping)

17 AGT for general obstacle detection. Supervising sensors: Velodyne HDL64. Target sensors: front camera. Task: general obstacle detection. Data, AGT, ground truth

18 StixelNet: monocular obstacle detection. Levi, Dan, Noa Garnett, and Ethan Fetaya. StixelNet: A Deep Convolutional Network for Obstacle Detection and Road Segmentation. In BMVC 2015.

19 StixelNet column-based approach [figure: input, output]

20 AGT for obstacle detection, version I: KITTI Dataset [Geiger et al. 2013]

21 AGT for obstacle detection, version I. Velodyne LIDAR. Raw images: 56 sequences (50 train, 6 test); 6,000 train images (every 5th frame) and 800 test images. Ground truth result: 331K training columns and 57K testing columns

22 AGT for general obstacles in image plane. 1. Project Lidar points to the image plane. 2. Interpolate depth to all pixels. 3. Find columns whose depth profile is typical of a transition from road to a roughly vertical obstacle. Figure taken from [Fernandes et al. 2015]
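Steps 1 and 3 of this slide can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a pinhole camera model, points already transformed into the camera frame, and a hypothetical transition test (a run of rows with nearly constant depth). The names `project_to_image` and `obstacle_bottom_row` and the parameters `min_run` and `tol` are illustrative only. Step 2 (depth interpolation) is omitted.

```python
from typing import List, Optional, Tuple

# Assumed camera frame: x right, y down, z forward (metres).
Point3D = Tuple[float, float, float]


def project_to_image(points: List[Point3D], fx: float, fy: float,
                     cx: float, cy: float,
                     width: int, height: int) -> List[Tuple[int, int, float]]:
    """Project camera-frame points with a pinhole model.

    Returns (column, row, depth) for each point that lands inside the image.
    """
    pixels = []
    for x, y, z in points:
        if z <= 0.0:  # behind the camera, cannot be imaged
            continue
        u = int(round(fx * x / z + cx))
        v = int(round(fy * y / z + cy))
        if 0 <= u < width and 0 <= v < height:
            pixels.append((u, v, z))
    return pixels


def obstacle_bottom_row(column_depth: List[float], min_run: int = 3,
                        tol: float = 0.2) -> Optional[int]:
    """Scan one image column from the bottom row upwards (index 0 is the top).

    On a flat road, depth grows while moving up the image; on a roughly
    vertical obstacle it stays nearly constant. Returns the lowest row that
    ends a run of `min_run` rows whose depth varies by less than `tol`
    metres, or None when no such run exists.
    """
    n = len(column_depth)
    for r in range(n - 1, min_run - 2, -1):
        run = column_depth[r - min_run + 1: r + 1]
        if max(run) - min(run) < tol:
            return r
    return None
```

Columns for which `obstacle_bottom_row` returns `None` would simply receive no label under this sketch, which is one way the low coverage noted on the next slide could arise.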

23 General obstacles AGT examples. Limitations: cannot handle close obstacles or clear columns; low coverage (~30%)

24 StixelNet (v1): 5-layer CNN. Input. Layer 1: convolutional, max pooling (8x4). Layer 2: convolutional, max pooling (4x3). Layer 3: fully connected (dense). Layer 4: fully connected (dense). Layer 5: fully connected (dense). Output y

26 Experimental Results (Max Probability)

27 Comparison with stereo: Stereo [Badino et al. 2009] vs. StixelNet

28 AGT for obstacle classification. Supervising sensors: [figure]. Target sensors: front camera. Task: obstacle classification. Data, AGT, ground truth

29 AGT for obstacle classification: image-based detection, Lidar-based verification. Source:

30 Obstacle classification trained net result: pedestrians

31 AGT for general obstacle detection (ver. 2). Supervising sensors: Velodyne HDL64. Target sensors: front camera. Task: general obstacle detection. Data, AGT, ground truth

32 Unified network: StixelNet + object detection + object pose estimation. Noa Garnett, Shai Silberstein, Shaul Oron, Ethan Fetaya, Uri Verner, Ariel Ayash, Vlad Goldner, Rafi Cohen, Kobi Horn, Dan Levi. Real-time category-based and general obstacle detection for autonomous driving. CVRSUAD Workshop, ICCV 2017.

33 Object-centric obstacle detection AGT: estimate and subtract road plane; 3D clustering (objects above 20 cm); detect clear columns: no object above 5 cm plus far enough returns; near obstacles: 1. below Lidar coverage, 2. during training; project to image, smooth; bottom contour via dynamic programming
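The "estimate and subtract road plane" step and the two height thresholds on this slide can be sketched as follows. This is an illustration under stated assumptions, not the pipeline from the talk: the plane is fitted by ordinary least squares to points assumed to lie on the road (a robust estimator would be needed when obstacle points are mixed in), height is measured as the vertical residual, and the function names are hypothetical. Only the 20 cm and 5 cm values come from the slide.

```python
from typing import List, Sequence, Tuple

# Assumed sensor frame: x forward, y left, z up (metres).
Point = Tuple[float, float, float]
Plane = Tuple[float, float, float]  # coefficients of z = a*x + b*y + c


def _det3(m: Sequence[Sequence[float]]) -> float:
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_road_plane(points: List[Point]) -> Plane:
    """Least-squares fit of z = a*x + b*y + c via the normal equations."""
    n = float(len(points))
    sx = sum(p[0] for p in points)
    sy = sum(p[1] for p in points)
    sz = sum(p[2] for p in points)
    sxx = sum(p[0] * p[0] for p in points)
    sxy = sum(p[0] * p[1] for p in points)
    syy = sum(p[1] * p[1] for p in points)
    sxz = sum(p[0] * p[2] for p in points)
    syz = sum(p[1] * p[2] for p in points)
    lhs = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxz, syz, sz]
    d = _det3(lhs)
    if abs(d) < 1e-12:
        raise ValueError("degenerate point set: cannot fit a plane")
    coeffs = []
    for col in range(3):  # Cramer's rule, one coefficient per column
        m = [row[:] for row in lhs]
        for r in range(3):
            m[r][col] = rhs[r]
        coeffs.append(_det3(m) / d)
    return coeffs[0], coeffs[1], coeffs[2]


def height_above_plane(p: Point, plane: Plane) -> float:
    """Vertical residual of a point relative to the fitted plane."""
    a, b, c = plane
    return p[2] - (a * p[0] + b * p[1] + c)


def split_by_height(points: List[Point], plane: Plane,
                    object_min: float = 0.20,
                    clear_max: float = 0.05) -> Tuple[List[Point], bool]:
    """Return (candidate object points, whether the set counts as clear).

    Points higher than `object_min` are object candidates for clustering;
    the set is clear when nothing rises above `clear_max`.
    """
    objects = [p for p in points if height_above_plane(p, plane) > object_min]
    is_clear = all(height_above_plane(p, plane) <= clear_max for p in points)
    return objects, is_clear
```

In such a scheme `split_by_height` would be applied to the points falling in one image column at a time; the clustering, projection, and dynamic-programming steps listed on the slide are not covered here.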

34 General obstacles: old vs. new AGT

35 New general obstacle dataset with fisheye lens camera. Kitti-train: 6K images, 5M instances (columns). Internal-train: 16K images, 20M instances (columns). Kitti-test: K. Internal-test: K

36 StixelNet2: New network architecture

37 Improved results on KITTI

38 Experimental results with new AGT [chart: old vs. new AGT; series: Kitti max Pr., Internal max Pr., Kitti avg. Pr., Internal avg. Pr.]

39 Experimental results with new AGT, edge cases excluded (near, clear) [chart: old vs. new AGT; series: Kitti max Pr., Internal max Pr., Kitti avg. Pr., Internal avg. Pr.]

40 Cross-dataset generalization [table: rows: train on KITTI, train on internal, train on both; columns: kitti-test max, kitti-test avg, internal-test max, internal-test avg]

41 AGT for car pose estimation. Supervising sensors: [figure]. Target sensors: [figure]. Task: pose estimation. IMU. Data, AGT, ground truth

42 AGT for pose estimation: multi-sensor, temporal object detection; pose representation with 8 orientation bins. Source: Dynamic Static
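An 8-bin orientation representation can be illustrated with a short sketch. The slide gives only the number of bins; the layout assumed here (equal 45-degree bins centred on multiples of 45 degrees, with bin 0 centred on yaw 0) and the function names are assumptions for illustration.

```python
import math


def orientation_bin(yaw: float, n_bins: int = 8) -> int:
    """Map a yaw angle in radians to one of `n_bins` equal orientation bins.

    Assumed layout: bin k is centred on k * (2*pi / n_bins), so bin 0
    covers yaw values within half a bin width of zero.
    """
    width = 2.0 * math.pi / n_bins
    shifted = (yaw + width / 2.0) % (2.0 * math.pi)
    # The final modulo guards against floating-point wrap-around to n_bins.
    return int(shifted // width) % n_bins


def bin_center(index: int, n_bins: int = 8) -> float:
    """Centre angle in radians of a bin, under the same assumed layout."""
    return index * 2.0 * math.pi / n_bins
```

For example, `orientation_bin(math.pi / 2)` falls in bin 2 and `orientation_bin(-math.pi / 2)` in bin 6 under this layout; a classifier trained on AGT poses would predict the bin index rather than the continuous angle.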

43 Pose estimation trained with mixed AGT and Manual

44 AGT for Curb detection

45 Curb detection trained net result examples

46 AGT for freespace. Supervising sensors: [figure]. Target sensors: [figure]. Task: freespace. Data, AGT, ground truth

47 AGT for freespace with 3D beams: estimate and subtract road plane; analyze a single Lidar beam; project the limit to the ground plane; project the freespace limit to the image plane, find near and clear. [Figure labels: Velodyne, Velodyne scan direction]

48 Obstacles vs. Freespace AGT

49 Freespace + object detection + car 3D pose

50 Freespace + object detection + car 3D pose

51 Freespace + object detection + car 3D pose

52 Freespace + object detection + car 3D pose

53 Freespace + object detection + car 3D pose

54 Fine-tuning from AGT: road segmentation. 1. Fine-tune on KITTI road segmentation (manually labelled). 2. Graph-cut segmentation. 3. State-of-the-art accuracy among non-anonymous submissions (94.88% MaxF)
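MaxF is the maximum F-measure obtained while sweeping the classification threshold over the per-pixel road confidence. The sketch below shows only that sweep on flat lists of scores and binary labels; the official benchmark protocol has further details that are not reproduced here, and the function name and default threshold grid are assumptions.

```python
from typing import List, Optional, Sequence, Tuple


def max_f_measure(scores: Sequence[float], labels: Sequence[int],
                  thresholds: Optional[List[float]] = None
                  ) -> Tuple[float, Optional[float]]:
    """Return (best F1, threshold achieving it) over a threshold sweep.

    `scores` are per-pixel road confidences, `labels` are 1 for road and
    0 otherwise. A pixel is predicted as road when its score >= threshold.
    """
    if thresholds is None:
        thresholds = [i / 100.0 for i in range(101)]
    best_f, best_t = 0.0, None
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and not y)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y)
        if tp == 0:  # precision and recall are zero or undefined
            continue
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f = 2.0 * precision * recall / (precision + recall)
        if f > best_f:
            best_f, best_t = f, t
    return best_f, best_t
```

Because the threshold is chosen to maximize the score, MaxF measures the quality of the confidence map itself rather than of one fixed binarization.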

55 AGT challenges: How accurate is the AGT?

56 AGT challenges: calibration, synchronization

57 AGT perception mistakes: non-flat road; assumptions / coverage

58 Thank you!
