S7348: Deep Learning in Ford's Autonomous Vehicles. Bryan Goodman, Argo AI, 9 May 2017



2 Ford's 12-Year History in Autonomous Driving. Today: examples from stereo image processing; object detection; using RNNs; motorsports.

3 Stereo Matching Problem. Determining the correspondences in stereo images; calculating the disparities. But what is the correct correspondence? Basic stereo matching algorithm: compare pixels on the same epipolar line in the two images; choose the best match.
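The basic algorithm on this slide can be sketched as follows. This is a minimal illustration, not the method used in the talk: it assumes rectified grayscale images (so the epipolar line is the same image row), uses sum of absolute differences as the matching cost, and the names `match_along_epipolar_line`, `disparity_to_distance`, `half_window`, `focal_length_px` and `baseline_m` are all hypothetical. The disparity-to-distance conversion uses the standard pinhole relation, distance = focal length × baseline / disparity.

```python
import numpy as np

def match_along_epipolar_line(left, right, max_disparity, half_window=1):
    """Disparity map from sum-of-absolute-differences block matching.

    Assumes rectified grayscale images, so the epipolar line of a pixel
    is the same row in the other image.
    """
    height, width = left.shape
    left = left.astype(np.float64)
    right = right.astype(np.float64)
    disparity = np.zeros((height, width), dtype=np.int64)
    w = half_window
    for row in range(w, height - w):
        for col in range(w, width - w):
            patch = left[row - w:row + w + 1, col - w:col + w + 1]
            best_cost, best_d = np.inf, 0
            # Search only along the same row, shifting leftwards in the right image.
            for d in range(0, min(max_disparity, col - w) + 1):
                candidate = right[row - w:row + w + 1,
                                  col - d - w:col - d + w + 1]
                cost = np.abs(patch - candidate).sum()
                if cost < best_cost:  # choose the best match
                    best_cost, best_d = cost, d
            disparity[row, col] = best_d
    return disparity

def disparity_to_distance(disparity, focal_length_px, baseline_m):
    """Convert disparity (pixels) to distance (meters); zero disparity maps to infinity."""
    disparity = np.asarray(disparity, dtype=np.float64)
    with np.errstate(divide="ignore"):
        return np.where(disparity > 0,
                        focal_length_px * baseline_m / disparity,
                        np.inf)

# Toy example: the left image is the right image shifted by 3 pixels.
rng = np.random.default_rng(0)
right_image = rng.integers(0, 256, size=(8, 16))
left_image = np.roll(right_image, 3, axis=1)
disparity = match_along_epipolar_line(left_image, right_image, max_disparity=5)
```

A per-pixel search like this has no way to tell which correspondence is correct when several patches look alike, which is the question the slide raises.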

4 Deep neural networks for stereo matching. The brain can estimate the distance of an object using the visual information from two eyes. We can use deep neural networks. Pipeline: Left Stereo Camera and Right Stereo Camera -> Deep Convolutional Neural Networks -> Post-Processing -> Distance Map Estimation.

5 Proposed deep convolutional neural network. AV driving requires an intelligent distance map estimation, which filters out the objects not of interest. Network I: general network; encoding and decoding layers; retain objects of interest in the training data sets. Diagram (Encoder, Decoder): Conv1, Conv2, Conv3, Conv4, Conv5, Deconv6, Conv6, Deconv7, Conv7, Deconv8, Conv8, Conv9, Deconv9, Conv10, Deconv10, Loss Function.


6 Proposed deep convolutional neural network II. Specialized network: encoding and decoding layers. The cross correlation layers force the network to look for correspondence on the epipolar line. The weights in the encoding layers are shared. Diagram (Encoder, Decoder): left encoder Conv1L, Conv2L, Conv3L, Conv4L; right encoder Conv1R, Conv2R, Conv3R, Conv4R; Conv9, Deconv9, Conv8, Deconv8, Conv7, CC7, Deconv7, Conv6, CC6, Deconv6, Conv5, CC5; Loss Function.
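One way to picture a cross correlation layer that only looks along the epipolar line is the sketch below. It is an assumption-laden illustration, not the layer from the talk: it treats each per-pixel feature vector as a 1x1 "patch", assumes rectified inputs so the search runs along image rows, and uses cosine similarity as the correlation value. The names `cross_correlation_layer` and `max_displacement` are hypothetical.

```python
import numpy as np

def cross_correlation_layer(feat_left, feat_right, max_displacement):
    """Correlate left and right feature maps along image rows.

    feat_left, feat_right: arrays of shape (channels, height, width).
    Returns an array of shape (max_displacement + 1, height, width) whose
    entry [d, y, x] is the cosine similarity between the left feature at
    (y, x) and the right feature at (y, x - d).
    """
    channels, height, width = feat_left.shape

    def normalize(features):
        # Unit-length feature vectors keep the correlation within [-1, 1].
        norm = np.linalg.norm(features, axis=0, keepdims=True)
        return features / np.maximum(norm, 1e-12)

    left_n, right_n = normalize(feat_left), normalize(feat_right)
    out = np.zeros((max_displacement + 1, height, width))
    for d in range(max_displacement + 1):
        # Left column x is compared with right column x - d on the same row.
        out[d, :, d:] = (left_n[:, :, d:] * right_n[:, :, :width - d]).sum(axis=0)
    return out

# Toy example: left features are the right features shifted by 2 columns.
rng = np.random.default_rng(1)
features_right = rng.normal(size=(4, 5, 12))
features_left = np.roll(features_right, 2, axis=2)
correlation = cross_correlation_layer(features_left, features_right, max_displacement=4)
```

In a two-branch network with shared encoder weights, both feature maps would come from the same convolution stack applied to the left and right images, so matching image content produces matching features.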


7 Proposed deep convolutional neural network. Cross correlation (CC) layer: computes CC values between each pair of patches; outputs the CC values for each pair of patches; does not lose any information. Loss function: in AV driving, closer objects are more important than distant ones; assigns more weight to the closer objects; the closer object distance is estimated more accurately. (Plot axes: α, d.)
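The slide does not give the form of the weighting, so the following is only a sketch of the idea that closer objects receive more weight. The decay `1 / (1 + d / scale)`, the `scale` parameter, the L1 error, and the function names are all assumptions made for illustration.

```python
import numpy as np

def distance_weight(true_distance, scale=10.0):
    """Hypothetical weight: 1.0 at distance 0, decreasing as distance grows."""
    return 1.0 / (1.0 + np.asarray(true_distance, dtype=np.float64) / scale)

def weighted_distance_loss(predicted, target, scale=10.0):
    """Weighted mean absolute error that emphasizes errors on close objects."""
    predicted = np.asarray(predicted, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    weights = distance_weight(target, scale)
    return float((weights * np.abs(predicted - target)).sum() / weights.sum())

# The same 1 m error costs more on the near object (5 m) than on the far one (50 m).
target_distances = np.array([5.0, 50.0])
loss_near_error = weighted_distance_loss(np.array([6.0, 50.0]), target_distances)
loss_far_error = weighted_distance_loss(np.array([5.0, 51.0]), target_distances)
```

Because the optimizer is penalized more for mistakes on nearby objects, it tends to fit those distances more tightly, which matches the slide's claim about accuracy at close range.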

8 Performance on synthetic and real stereo data. Synthetic data generation: generate 14,000 pairs of RGB stereo images; synthetic distance maps are only generated for the objects of interest, e.g. cars or pedestrians; Gaussian noise added to the stereo images.
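The noise step can be sketched as below. The slide does not state the noise level, so `sigma` is a placeholder, and the 8-bit RGB format and the function name `add_gaussian_noise` are assumptions.

```python
import numpy as np

def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise to an 8-bit image and clip to [0, 255]."""
    noisy = image.astype(np.float64) + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(np.rint(noisy), 0, 255).astype(np.uint8)

# Independent noise is drawn for the left and right images of a synthetic pair.
rng = np.random.default_rng(0)
clean_left = np.full((4, 6, 3), 128, dtype=np.uint8)
clean_right = np.full((4, 6, 3), 128, dtype=np.uint8)
noisy_left = add_gaussian_noise(clean_left, sigma=5.0, rng=rng)
noisy_right = add_gaussian_noise(clean_right, sigma=5.0, rng=rng)
```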

9 Performance on synthetic and real stereo data. Fine tuning with LIDAR data sets: project LIDAR point clouds onto the camera images; the baseline and optic axes are not the same as the synthetic data. (Figure panels: Left camera, Right camera, Network I, Network II.)
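Projecting a point cloud onto an image to obtain sparse ground-truth distances can be sketched with a standard pinhole camera model. The calibration values below are made up for the example, and the sketch assumes the extrinsic `rotation` and `translation` take LIDAR points into a camera frame whose z axis points forward; `project_points` is a hypothetical name.

```python
import numpy as np

def project_points(points_lidar, rotation, translation, intrinsics, image_size):
    """Project Nx3 LIDAR points into the image and return a sparse depth map.

    Pixels with no LIDAR return stay at 0. When several points land on the
    same pixel, the nearest one is kept.
    """
    width, height = image_size
    points_cam = points_lidar @ rotation.T + translation
    points_cam = points_cam[points_cam[:, 2] > 0]  # drop points behind the camera
    pixels = points_cam @ intrinsics.T
    u = np.rint(pixels[:, 0] / pixels[:, 2]).astype(int)
    v = np.rint(pixels[:, 1] / pixels[:, 2]).astype(int)
    depth_map = np.zeros((height, width))
    for ui, vi, z in zip(u, v, points_cam[:, 2]):
        if 0 <= ui < width and 0 <= vi < height:
            if depth_map[vi, ui] == 0 or z < depth_map[vi, ui]:
                depth_map[vi, ui] = z
    return depth_map

# Made-up calibration: focal length 100 px, principal point (8, 6), 16x12 image.
intrinsics = np.array([[100.0, 0.0, 8.0],
                       [0.0, 100.0, 6.0],
                       [0.0, 0.0, 1.0]])
points = np.array([[0.0, 0.0, 10.0],   # lands on the principal point
                   [0.5, 0.0, 10.0],   # lands 5 px to the right of it
                   [0.0, 0.0, 4.0],    # same pixel as the first point, but nearer
                   [0.0, 0.0, -5.0]])  # behind the camera, discarded
depth_map = project_points(points, np.eye(3), np.zeros(3), intrinsics, (16, 12))
```

The resulting map is sparse, so a fine-tuning loss would only be evaluated at pixels with a nonzero value.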

10 1/2x

11 Comparing Manual Annotation to DNN Model

12 Detection Result. (Figure panels: Original Image, Enhanced Contrast.) Network's detection outperforms human labeler in low-contrast areas. (Legend: pedestrian detection; pedestrian misdetection; detected, but not labeled.)


13 Introducing Recurrence in Detection and Tracking. Use RNNs to detect occluded objects: remember location of static objects; predict location of non-static objects. Diagram, repeated for Image 0, Image 1 and Image 2: Image -> Conv -> Feature Map -> RNN -> Detector.
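How a recurrent state can "remember" an object through an occlusion is illustrated by the toy sketch below. It is not the network from the talk: the elementwise update stands in for a convolutional RNN cell, and the weights `w_in` and `w_rec` are hand-picked rather than learned.

```python
import numpy as np

def recurrent_step(feature_map, hidden, w_in=1.0, w_rec=0.9):
    """One recurrent update; an elementwise stand-in for a convolutional RNN cell."""
    return np.tanh(w_in * feature_map + w_rec * hidden)

def run_sequence(feature_maps):
    """Carry the hidden state across frames and return the state after each one."""
    hidden = np.zeros_like(feature_maps[0], dtype=np.float64)
    states = []
    for feature_map in feature_maps:
        hidden = recurrent_step(feature_map, hidden)
        states.append(hidden)
    return states

# A static object activates column 2 in frames 0 and 1, then is occluded in frame 2.
visible = np.array([[0.0, 0.0, 2.0, 0.0, 0.0]])
occluded = np.zeros((1, 5))
states = run_sequence([visible, visible, occluded])
```

In the occluded frame the input feature is zero, yet the state at column 2 stays well above zero, so a detector reading the state rather than the raw feature map can still report the object.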

14 Orange = ground truth; Green = model prediction

15 Classifying NASCAR images. The Ford team reviews pictures during the race.

16 Classifying NASCAR images. Looking for damage and other performance indicators. (Figure label: Gap.)

17 Results: Boxing the Cars. Using ~2k images labeled with boxes around the vehicles, the model does well detecting cars.

18 Results: Boxing the Cars

19 Classifying NASCAR images. Next, determine the car number: labeled ~30k images.

20 Classifying NASCAR images. Outliers are easy to find in review.

21 Classifying NASCAR images. Human: ???; Model: 78; Confidence: 0.999

22 Classifying NASCAR images. Human: ???; Model: 42; Confidence: 0.985
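One plausible way to surface cases like the two above for review is to flag images where the human label is missing or disagrees with a confident model prediction. The record layout, the 0.95 threshold, the image identifiers, and the third record are hypothetical; only the two model outputs (78 at 0.999, 42 at 0.985) come from the slides.

```python
def find_review_candidates(records, min_confidence=0.95):
    """Return records where the human label differs from a confident model prediction."""
    flagged = [
        record for record in records
        if record["human"] != record["model"]
        and record["confidence"] >= min_confidence
    ]
    # Most confident disagreements first, since those are the most informative.
    return sorted(flagged, key=lambda record: record["confidence"], reverse=True)

records = [
    {"image": "a", "human": None, "model": 78, "confidence": 0.999},  # from slide 21
    {"image": "b", "human": None, "model": 42, "confidence": 0.985},  # from slide 22
    {"image": "c", "human": 17, "model": 17, "confidence": 0.990},    # hypothetical
]
candidates = find_review_candidates(records)
```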

23 Inspecting the Neural Network. (Figure panels: Activated Filter, Input Image.) The model is not a black box. We can see that it is detecting the numbers, which is important for robustness when the paint changes.


24 Argo AI. Argo AI is an artificial intelligence company, established to tackle one of the most challenging applications in computer science, robotics and artificial intelligence: self-driving vehicles. Engineering hubs in Pittsburgh, Southeastern Michigan and the Bay Area of California. For more information regarding Argo AI and its work, please talk to me at GTC or visit:
