Counting Passenger Vehicles from Satellite Imagery

Not everything that can be counted counts, and not everything that counts can be counted
NVIDIA GPU Technology Conference, 02 Nov 2017
Kevin Green, Machine Learning and Computer Vision Scientist
Mahmoud Lababidi, Senior Data Scientist

There Are Three Kinds of People - Those Who Can Count and Those Who Can't

Agenda
- Introduction to Satellite Imagery
- Why Machine Learning for Satellite Imagery?
- Car Counting from Space is Hard
  - Problems
  - Solutions
- Algorithm Overview
  - Classification
  - Segmentation
  - Object Detection
- Production
- Future Work
[Figure: Car Count = 14 Cars]

DigitalGlobe's Satellite Constellation

Why Machine Learning for Satellite Imagery?
- Extracting information from imagery in an automated way allows for timely, low-manpower macroscopic as well as needle-in-haystack exploitations (e.g., Intelligence, Business Analytics, Disaster Relief)
- Some challenges exist in viewing satellite imagery:
  - Environmental (lighting) and atmospheric (clouds) distortion
  - Pixel count for small objects (cars) in 30 cm resolution imagery is small (~4-5 pixel width)
- Automation exists for basic techniques (e.g., vegetation and land use classification); however, for complex tasks such as object detection, these techniques fail
- The capability of transforming overhead imagery content into countable objects will satisfy the analytic needs of commercial and military customers

Car Counting from Space is Hard
Problems
- Counting thousands of cars in 30 cm (1 foot) resolution satellite imagery is visually difficult (cars are roughly 5 pixels wide), tedious, and laborious
- Cars vary in brightness; some cars are dark and blend into other dark surfaces
- Classical computer vision techniques such as edge detection fail in this arena
Our Solutions
- Object Classification
- Segmentation + Morphological Operations
- Object Bounding Box Detection

Algorithm Overview
Input Imagery: WorldView-3
- Classifier: LeNet, followed by Non-maximum Suppression (removes overlapping detection boxes)
- Segmentation: Fully Convolutional Network (FCN), followed by Morphology (Binary Threshold, Convex Hull, Opening)
- Object Detector: Single Shot Detector (SSD), followed by Non-maximum Suppression (removes overlapping detection boxes)
Output: Labeled Cars and Count, Estimated Car Count

Algorithm #1: The LeNet Classifier
Yann LeCun pioneered the convolutional approach to classification, particularly for Optical Character Recognition. The 1998 paper by LeCun and colleagues, Gradient-Based Learning Applied to Document Recognition, has over 9,600 citations.

The LeNet Classifier: Sliding Windows and NMS
- A trained classifier slides a window over an image and places a bounding box (BBOX) on the portions of the image that contain the object of interest
- Non-maximum suppression (NMS) is an algorithm that eliminates the overlapping detection boxes produced by sliding a window over an image
[Figure panels: Sliding Window, Non-maximum Suppression]
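The two steps on this slide can be sketched in a few lines of plain Python. This is a minimal illustration, not the presenters' code: the function names, the window size, the stride, and the 0.3 overlap threshold are all assumptions, and overlap is measured as intersection over union (IoU), a common choice the slides do not confirm. The classifier itself is left out; `scores` stands in for its per-window confidence.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def sliding_windows(width, height, win=8, stride=4):
    """Yield (x1, y1, x2, y2) windows that fit inside a width x height image."""
    for y in range(0, height - win + 1, stride):
        for x in range(0, width - win + 1, stride):
            yield (x, y, x + win, y + win)


def non_max_suppression(boxes, scores, iou_threshold=0.3):
    """Greedy NMS: visit boxes from best to worst score and keep a box
    only if it does not overlap an already-kept box too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Two heavily overlapping hits on one car, plus one separate hit.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores)
print(kept)  # [0, 2]
```

Raising `iou_threshold` keeps more of the overlapping boxes (risking double counts), while lowering it can suppress genuinely adjacent cars.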

The LeNet Classifier Results
Several issues arise when using an image classifier to detect cars:
- Determining what percentage of overlap to allow before suppressing overlapping detection boxes
- More prone to false hits (see figure B)
- Determining the car count is challenging if the BBOX is too large
[Figures A and B]

Algorithm #2: Segmentation with a Fully Convolutional Network, Based on the VGG Neural Network

Segmentation Results
Here are the issues that surface when using a segmentation algorithm to detect cars:
- Segmentation produces exceptional pixel masks containing cars but lacks individual BBOXes
- Fortunately, segmentation has fewer false positives than classification (and, in some cases, object detection)
- Although segmentation is powerful, additional post-processing (e.g., morphology) is needed to extract individual cars
[Figures A and B]

Morphological Operations Used to Extract Cars from Segments

Morphological Operations (Opening/Closing/Convex Hull)
Morphological operations modify the shape of an image in diverse ways:
- Erosion - Erodes the boundary of the image
- Dilation - Expands the boundary
- Opening - Erosion followed by dilation (noise removal)
- Closing - Dilation followed by erosion (hole filler)
- Convex Hull - Shape formed by a rubber band stretched around the foreground image
[Figure panels: Image Erosion, Image Dilation, Opening, Convex Hull, Closing]
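The definitions above (opening as erosion then dilation, closing as dilation then erosion) can be written out directly. The sketch below is illustrative only and makes its own assumptions: binary images are lists of 0/1 rows, the structuring element is a fixed 3x3 square, and pixels outside the image count as background. A production pipeline would more likely call an image-processing library, and the convex hull is not covered here.

```python
OFFSETS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]  # 3x3 square


def erode(img):
    """A pixel stays 1 only if its whole 3x3 neighbourhood is 1.
    Pixels outside the image are treated as background (assumption)."""
    h, w = len(img), len(img[0])
    return [[int(all(0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                     for dy, dx in OFFSETS))
             for x in range(w)] for y in range(h)]


def dilate(img):
    """A pixel becomes 1 if any pixel in its 3x3 neighbourhood is 1."""
    h, w = len(img), len(img[0])
    return [[int(any(0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                     for dy, dx in OFFSETS))
             for x in range(w)] for y in range(h)]


def opening(img):
    """Erosion followed by dilation: removes small specks of noise."""
    return dilate(erode(img))


def closing(img):
    """Dilation followed by erosion: fills small holes."""
    return erode(dilate(img))


# A 3x3 "car" blob plus one isolated noise pixel in the top-right corner.
noisy = [[0] * 7 for _ in range(7)]
for r in range(2, 5):
    for c in range(2, 5):
        noisy[r][c] = 1
noisy[0][6] = 1

# The same blob without noise but with a one-pixel hole in the middle.
holed = [[0] * 7 for _ in range(7)]
for r in range(2, 5):
    for c in range(2, 5):
        holed[r][c] = 1
holed[3][3] = 0

opened = opening(noisy)   # noise pixel removed, blob kept
closed = closing(holed)   # hole filled
```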

Does this still look like a car?
[Figure panels: Car (RGB), Car (Binary Threshold)]

Morphology to Extract Cars
Thresholding followed by morphological operations may not always yield one car blob:
- Figure A illustrates two car components after binary thresholding
- Figure B does show one car component per car blob after applying an opening operation
- However, is the resultant car blob really a car? One final step is to validate the car geometry
[Figures A and B]
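To make the "two car components" problem concrete, here is a small sketch of binary thresholding followed by blob counting. Everything in it is hypothetical: the pixel values, the threshold of 128, the use of 4-connectivity, and the assumption that cars are brighter than the pavement (which, as noted earlier, does not hold for dark cars).

```python
from collections import deque


def threshold(gray, level):
    """Binary threshold: 1 where the pixel is at least `level`, else 0."""
    return [[1 if v >= level else 0 for v in row] for row in gray]


def label_blobs(binary):
    """4-connected component labelling by breadth-first flood fill.
    Returns a list of blobs, each a list of (row, col) pixels."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    blobs = []
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue, blob = deque([(y, x)]), []
            while queue:
                cy, cx = queue.popleft()
                blob.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            blobs.append(blob)
    return blobs


# Hypothetical grayscale patch of ONE bright car whose dark windshield
# (value 40) splits it into two bright halves after thresholding.
patch = [
    [10, 10, 10, 10, 10, 10, 10],
    [10, 200, 200, 40, 200, 200, 10],
    [10, 200, 200, 40, 200, 200, 10],
    [10, 10, 10, 10, 10, 10, 10],
]
blobs = label_blobs(threshold(patch, 128))
print(len(blobs))  # 2 components for a single car
```

Counting blobs straight after thresholding would report two cars here, which is why the morphology and geometry checks are applied before the final count.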

Spatial False Alarm Mitigators (FAMs) Reduce False Detections
Spatial FAMs using oriented bounding boxes were used to eliminate blobs that didn't meet the average dimension criteria for a car:
- Area - Eliminate if too big or too small
- Length/Width Ratio - Eliminate long, skinny boxes
- Length and Width - Eliminate if pixel size is one
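The three checks can be expressed as a single filter over each blob's oriented bounding box. The sketch below is a guess at the shape of such a filter, not the presenters' implementation: the function name and every threshold (`min_area`, `max_area`, `max_ratio`) are hypothetical values chosen only so that a box of roughly 15 x 5 pixels passes. It assumes the oriented box's length and width have already been measured in pixels.

```python
def passes_spatial_fams(length_px, width_px,
                        min_area=20.0, max_area=150.0, max_ratio=4.0):
    """Return True if an oriented bounding box has plausible car geometry.
    All thresholds are illustrative assumptions, not values from the talk."""
    long_side = max(length_px, width_px)
    short_side = min(length_px, width_px)

    # Length and width: reject slivers that are only one pixel across.
    if short_side <= 1:
        return False

    # Area: reject boxes that are too small or too big to be a car.
    area = long_side * short_side
    if not (min_area <= area <= max_area):
        return False

    # Length/width ratio: reject long, skinny boxes.
    if long_side / short_side > max_ratio:
        return False

    return True


# (length, width) of candidate boxes in pixels, all hypothetical.
candidates = [(15, 5), (40, 3), (30, 1), (3, 3), (30, 20)]
car_like = [box for box in candidates if passes_spatial_fams(*box)]
print(car_like)  # [(15, 5)]
```

In practice the thresholds would be tuned against labeled cars at the image's actual resolution.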

Algorithm #2: Segmentation
Conclusion: Segmentation (localization) shows promising results when combined with morphological operations (refinement), enabling us to quickly calculate accurate car counts in satellite imagery
[Figure: Car Count = 14 Cars]

Algorithm #3: Single Shot Detector (SSD)

The SSD Object Detector Results
The Single Shot Detector detects cars and draws BBOXes around them more directly:
- Locates individual cars more accurately in densely packed parking lots
- Also less prone to false hits (see figure B)
- NMS issues may arise if the overlap area isn't calibrated
[Figures A and B]

Competitive Car-Counting Bake-off
Other Commercial Vendor = 2,205

Training (wheels) to Production
GBDX is the platform that uses Amazon EC2 to deploy Docker images of the code and model.
NVIDIA GPU Training:
- NVIDIA GTX 980
- NVIDIA GTX 1080
- NVIDIA TitanX
- NVIDIA M40
- NVIDIA P100
- NVIDIA M1000M (Mobile)
Machine Learning Frameworks:
- TensorFlow
- Keras
- Caffe
Training Speed: 100 batches of 4 300x300 images take 20 minutes to train
Inference Speed: 20 minutes on the strip shown above (approximately 13k x 13k)
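For a rough sense of scale, the quoted speeds can be turned into per-batch and per-tile figures. This is back-of-the-envelope arithmetic under stated assumptions, not a description of the production pipeline: it reads "13k x 13k" as 13,000 x 13,000 pixels and supposes the strip is cut into non-overlapping 300x300 tiles matching the training image size. The slides do not say how inference was actually tiled, and a real system would probably overlap tiles so that cars on tile borders are not cut in half.

```python
import math

STRIP_SIDE_PX = 13_000        # assumption: "13k x 13k" means pixels
TILE_PX = 300                 # assumption: tiles match the 300x300 training size
INFERENCE_SECONDS = 20 * 60   # from the slide: 20 minutes for the strip
TRAIN_SECONDS = 20 * 60       # from the slide: 20 minutes of training
TRAIN_BATCHES = 100           # from the slide
BATCH_SIZE = 4                # from the slide


def tile_boxes(width, height, tile=TILE_PX):
    """Yield non-overlapping (x1, y1, x2, y2) tiles covering the image.
    Tiles on the right and bottom edges are clipped to the image size."""
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            yield (x, y, min(x + tile, width), min(y + tile, height))


tiles = list(tile_boxes(STRIP_SIDE_PX, STRIP_SIDE_PX))
tiles_per_side = math.ceil(STRIP_SIDE_PX / TILE_PX)

seconds_per_tile = INFERENCE_SECONDS / len(tiles)
seconds_per_batch = TRAIN_SECONDS / TRAIN_BATCHES
images_trained = TRAIN_BATCHES * BATCH_SIZE

print(f"{tiles_per_side} x {tiles_per_side} = {len(tiles)} tiles")
print(f"about {seconds_per_tile:.2f} s per tile at inference")
print(f"{seconds_per_batch:.0f} s per training batch, {images_trained} images in total")
```

Under these assumptions the strip breaks into 1,936 tiles, so the 20-minute inference time works out to well under a second per tile.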

AnswerFactory SSD Workflow
1) Define AOI & Select Detect Model (Cars)
2) Select Date Ranges & Auto Update
   - Historical (15 years)
   - Run on all new images in the future
3) Run Model & Get Results
4) Analyze Individual Parking Lots Over Time
   - Employee Parking
   - Resident Parking
   - VIP & Visitor Parking

Provide analysts tips on changing activity levels for enhanced garrison monitoring
[Figure: Military Vehicle Counting chart comparing Object Count and Actual Count for 16APR2015, 29AUG2015, and 2SEP2015; annotation: Significant Changes Detected]

Future Work
- Explore the use of different multi-spectral band combinations for improved car counts
- Explore whether different activations (e.g., Leaky ReLU) might better support detecting dark cars
- Go beyond temporal volume anomalies to include spatially anomalous behavior
- Upcoming xView Challenge, an ImageNet-like competition for satellite imagery
- Our DigitalGlobe colleague Todd Bacastow discussed this earlier today in his talk entitled SpaceNet: Accelerating Automated Mapping with Deep Learning and Labeled Satellite Imagery

Thank you! Questions?