AutoCalib: Automatic Calibration of Traffic Cameras at Scale

1 AutoCalib: Automatic Calibration of Traffic Cameras at Scale
Romil Bhardwaj, Gopi Krishna Tummala*, Ganesan Ramalingam, Ramachandran Ramjee, Prasun Sinha*
Microsoft Research, *The Ohio State University

2 Number of Security Cameras Worldwide
[Chart: number of cameras (millions). Source: IHS]

3 Conventional Traffic Camera Uses
Manual surveillance
Post-facto incident review

4 Emerging Traffic Camera Use Cases
Vehicle speed measurement (without dedicated sensors)
Traffic analytics
Near-miss statistics
All require distance measurements in the scene.

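5 Measuring Distances in an Image
The same length in pixels can correspond to very different real-world distances: in the example image, one 220 px segment spans 34 m and another spans 8 m.
Camera calibration: real-world coordinates (m) <-> image coordinates (px)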

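6 Camera Calibration
\[
y = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \end{bmatrix} x
\]
y: image coordinates
First matrix: intrinsic matrix (focal length, camera center)
Second matrix: extrinsic matrix (rotation R, translation T)
x: real-world coordinates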
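The projection on this slide can be sketched in a few lines of NumPy. This is only an illustration: the function name `project`, the convention that the extrinsic matrix maps world coordinates into the camera frame, and all numeric values are assumptions made here, not taken from the slides.

```python
import numpy as np

def project(K, R, T, x_world):
    """Project a 3-D world point (metres) to pixel coordinates (u, v)."""
    x_cam = R @ x_world + T      # extrinsic matrix: world frame -> camera frame
    y_h = K @ x_cam              # intrinsic matrix: camera frame -> homogeneous pixels
    return y_h[:2] / y_h[2]      # divide by depth to obtain (u, v)

# Hypothetical values, chosen only for illustration.
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)                    # camera axes aligned with the world axes
T = np.array([0.0, 0.0, 10.0])   # world origin lies 10 m in front of the camera

pixel = project(K, R, T, np.array([1.0, 0.5, 0.0]))
print(pixel)
```

Calibration is the inverse task: recovering R and T (and, where unknown, the intrinsic parameters) so that this mapping matches what the camera actually sees.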

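7 Hard Calibration
\[
y = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \end{bmatrix} x
\]
y: image coordinates; intrinsic matrix (focal length, camera center); extrinsic matrix (rotation, translation); x: real-world coordinates
Not scalable!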

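8 Soft Calibration
\[
y = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \end{bmatrix} x
\]
y: image coordinates; intrinsic matrix (focal length, camera center); extrinsic matrix (rotation, translation); x: real-world coordinates
EPnP solver: estimates the extrinsic matrix (R, T) from matched image and real-world points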
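For reference, the problem an EPnP solver addresses can be written out as below. This is the standard Perspective-n-Point statement, not something spelled out on the slide; the symbols K (the intrinsic matrix), n, x_j, y_j and s_j are notation introduced here for illustration.

```latex
\[
s_j \begin{bmatrix} y_j \\ 1 \end{bmatrix} = K \left( R\, x_j + T \right),
\qquad j = 1, \dots, n, \quad n \ge 4
\]
```

Here x_j is a known 3-D point, y_j is its observed pixel position, and s_j > 0 is its unknown depth. Given K and at least four such correspondences, EPnP returns R and T without iterative optimisation.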

9 Soft Calibration - Prior Art
Chessboard: no chessboard patterns in traffic views
Vanishing points: assumption of straight-line motion
Geometric landmarks: assumption of landmarks

10 AutoCalib Overview
Traffic video -> AutoCalib -> calibration estimate (R, T)
AutoCalib: no humans in the loop, robust calibration

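11 AutoCalib - Pipeline
Video frames -> vehicle detection (cropped image) -> keypoint extraction (vehicle keypoints) -> calibration (using vehicle geometric dimensions) -> calibrations set (R_1, T_1), (R_2, T_2), ... -> geometry-based filters -> calibration values
[Diagram: camera axes (X_C, Y_C, Z_C) and ground axes (X_G, Y_G, Z_G), related by R, T]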

12 Vehicle Detection

13 Vehicle Detection
Off-the-shelf DNNs (Fast-RCNN, YOLO) promise state-of-the-art accuracy, but they are expensive and the scene is often empty.
Background subtraction is fast but inaccurate.
Solution: trigger the DNN with background subtraction.
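The idea of triggering the DNN with background subtraction could look roughly like the sketch below. It is an assumption-laden illustration, not the authors' implementation: the running-average background model, the `MotionTrigger` class, the thresholds, and the `dnn_detector` callable are all hypothetical.

```python
import numpy as np

class MotionTrigger:
    """Running-average background model that decides when to run a costly detector."""

    def __init__(self, alpha=0.05, pixel_thresh=25.0, area_thresh=0.01):
        self.alpha = alpha                # background update rate (assumed value)
        self.pixel_thresh = pixel_thresh  # intensity change that counts as foreground
        self.area_thresh = area_thresh    # foreground fraction that fires the trigger
        self.background = None

    def should_detect(self, gray_frame):
        frame = gray_frame.astype(np.float64)
        if self.background is None:       # the first frame initialises the background
            self.background = frame
            return False
        foreground = np.abs(frame - self.background) > self.pixel_thresh
        # Blend the new frame into the background so slow lighting changes are absorbed.
        self.background = (1 - self.alpha) * self.background + self.alpha * frame
        return foreground.mean() > self.area_thresh


def detect_vehicles(frames, dnn_detector, trigger=None):
    """Call dnn_detector (a hypothetical callable) only on frames that show motion."""
    trigger = trigger or MotionTrigger()
    detections = []
    for index, frame in enumerate(frames):
        if trigger.should_detect(frame):
            detections.append((index, dnn_detector(frame)))
    return detections
```

The cheap check runs on every frame, while the expensive detector runs only when enough pixels change, which matters when the scene is often empty.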

14 Key-point Extraction

15 Key-point Selection
Desired properties:
1. Visually distinct: ease of detection
2. Non-planar: robust calibrations

16 Key-point Extraction
Statistical vision-based techniques aren't robust to lighting variations.
DNNs require a lot of labelled data, and no datasets are available.
Transfer-learn a DNN on a smaller dataset.

17 Transfer Learning - Primer
Convolution and pooling layers (generic features) -> fully connected layers (car model classification) -> output: BMW 3 Series

18 Transfer Learning - Primer
Convolution and pooling layers (generic features) -> fully connected layers (now detecting key-points) -> output: key-points (x, y)
Transfer learning: less data, faster training

19 Key-point DNN Dataset
Manually labelled key-points on 486 car images
Image augmentation: original image, crop, rotate, horizontal mirror, horizontal mirror + rotate, horizontal mirror + crop
Total of 10,344 images post augmentation
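When images are augmented for key-point regression, the labels have to be transformed along with the pixels. The sketch below shows this for two of the listed augmentations (horizontal mirror and crop); the function names and the (x, y) pixel convention for key-points are assumptions made for illustration.

```python
import numpy as np

def mirror_horizontal(image, keypoints):
    """Flip an H x W (x C) image left-right and move the (x, y) key-points with it."""
    width = image.shape[1]
    flipped = image[:, ::-1].copy()
    moved = keypoints.astype(np.float64).copy()
    moved[:, 0] = (width - 1) - moved[:, 0]   # x mirrors about the vertical centre line
    return flipped, moved

def crop(image, keypoints, left, top, right, bottom):
    """Crop to rows top:bottom and columns left:right, shifting key-points to match."""
    cropped = image[top:bottom, left:right].copy()
    moved = keypoints.astype(np.float64) - np.array([left, top], dtype=np.float64)
    return cropped, moved
```

If the key-points carry left/right meaning (for example, a left versus a right light), their labels may also need swapping after a mirror; the slides do not say how this was handled.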

20 Key-point DNN Training
GoogLeNet architecture trained on the CUHK CompCars dataset (CVPR '15) for car make/model classification
Replaced the last two fully connected layers with keypoint regression outputs

21 Key-point DNN Performance
~80% of key-points have < 10% error

22 Calibration Estimation
\[
y = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \end{bmatrix} x
\]
y: image coordinates; intrinsic matrix (focal length, camera center); extrinsic matrix (rotation, translation); x: real-world coordinates

23 Vehicle identification at low resolution is hard! (for both humans and machines)

24 Can't identify, so approximate!
Calibrate with the most popular cars (Toyota Prius, Toyota Corolla, Honda Civic, Volkswagen Jetta, BMW 320i, Audi A4, etc.)
n models -> calibrate -> n calibrations: (R_1, T_1), (R_2, T_2), (R_3, T_3), ...

25 Errors in Calibration
Model approximation errors
Key-point prediction errors
Statistical filters to remove outliers and average

26 Key Insight 1
Ground plane should be consistent across all calibrations.

27 The Orientation Filter
1. For calibration (R_i, T_i), its Z-axis orientation $\vec{z}_i$ is defined by the vector $R^i_{*,3}$ (the third column of $R_i$).
2. Let $\vec{z}_{avg} = \mathrm{Average}(R^i_{*,3})$.
3. Pick the n% of calibrations with the least deviation between $\vec{z}_i$ and $\vec{z}_{avg}$.
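The three steps of the orientation filter can be sketched as follows. Several details are assumptions made here because the slide does not specify them: the Z axis is taken as the third column of each rotation matrix, the average is an arithmetic mean renormalised to unit length, and deviation is measured as the angle between the two vectors.

```python
import numpy as np

def orientation_filter(rotations, keep_fraction=0.75):
    """Return indices of calibrations whose Z axis is closest to the average Z axis."""
    z_axes = np.stack([R[:, 2] for R in rotations])   # third column of each R_i
    z_avg = z_axes.mean(axis=0)
    z_avg /= np.linalg.norm(z_avg)                    # renormalise the mean direction
    cosines = np.clip(z_axes @ z_avg, -1.0, 1.0)
    deviation = np.arccos(cosines)                    # angle to the average, in radians
    n_keep = max(1, int(round(keep_fraction * len(rotations))))
    return np.sort(np.argsort(deviation)[:n_keep])
```

Calibrations whose implied ground-plane normal disagrees most with the consensus are discarded, which is what Key Insight 1 asks for.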

28 Key Insight 2
The distance d to a fixed point p must be consistent across calibrations.

29 The Displacement Filter
Focus region: the region where cars are detected
For each calibration:
1. Point p_i = projection of the center of the focus region onto the ground plane using (R_i, T_i)
2. d_i = distance of p_i to the camera
Pick the middle n% and filter out the rest.
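A minimal sketch of the displacement filter is given below, under stated assumptions: the ground plane is Z = 0 in the world frame, the extrinsic parameters map world to camera coordinates (x_cam = R x_world + T), and the intrinsic matrix K is known. The function names are hypothetical.

```python
import numpy as np

def ground_point(K, R, T, pixel):
    """Intersect the viewing ray through `pixel` with the ground plane Z = 0."""
    centre = -R.T @ T                                   # camera centre in the world frame
    ray = R.T @ np.linalg.solve(K, np.array([pixel[0], pixel[1], 1.0]))
    scale = -centre[2] / ray[2]                         # ray parameter where Z reaches 0
    return centre + scale * ray, centre

def displacement_filter(K, calibrations, focus_centre, keep_fraction=0.5):
    """Keep the middle `keep_fraction` of calibrations ranked by distance d_i."""
    distances = []
    for R, T in calibrations:
        p_i, centre = ground_point(K, R, T, focus_centre)
        distances.append(np.linalg.norm(p_i - centre))  # d_i
    order = np.argsort(distances)
    n_keep = max(1, int(round(keep_fraction * len(order))))
    start = (len(order) - n_keep) // 2                  # window centred on the median
    return np.sort(order[start:start + n_keep]), np.array(distances)
```

Keeping the middle of the ranking rather than the smallest values removes calibrations that place the focus point either implausibly close to or implausibly far from the camera.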

30 Filtering Overview
(R_1, T_1), (R_2, T_2), (R_3, T_3), ...
-> Orientation filter (75%)
-> Displacement filter (50%)
-> Orientation filter (75%)
-> Average rotation matrix: (R_avg, T_1), (R_avg, T_2), ...
-> Displacement filter (pick median)
-> (R_final, T_final)
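The slide does not say how the average rotation matrix is computed. One common choice, shown here purely as an assumption, is the chordal mean: average the matrices element-wise, then project the result back onto the nearest valid rotation using an SVD.

```python
import numpy as np

def average_rotation(rotations):
    """Chordal mean: project the element-wise mean onto the nearest rotation matrix."""
    mean = np.mean(np.stack(rotations), axis=0)
    U, _, Vt = np.linalg.svd(mean)
    # Flip the last axis if needed so the result is a proper rotation (det = +1).
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])
    return U @ D @ Vt
```

The projection step matters because an element-wise mean of rotation matrices is generally not itself a rotation.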

31 Implementation
Azure service: 4 Tesla K80s, 224 GB RAM
< 12% error with ~8 minutes of video

32 Evaluation - Dataset
350+ hours from 10 traffic cameras in Seattle
Resolution: 640x360 to 1280x720
Ground-truth distances and calibration estimated using Google Earth
[Figure: camera image and Google Earth view with matching landmarks A, B, D, E, F, G and measured distances (8 m, 8 m, 9 m, 12 m)]

33 Evaluation

34 AutoCalib vs Manual Calibration
[Bar chart: ground distance measurement, RMS error (%), for cameras C1-C10; series: manual calibration, AutoCalib estimate]
AutoCalib achieves < 12% RMS error in measuring distances.
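The slides report RMS error as a percentage without giving the formula. One plausible reading, assumed here, is the root mean square of the relative distance errors; the function name and the sample numbers are illustrative only.

```python
import numpy as np

def rms_percent_error(estimated_m, ground_truth_m):
    """Root mean square of relative distance errors, expressed in percent."""
    estimated = np.asarray(estimated_m, dtype=np.float64)
    truth = np.asarray(ground_truth_m, dtype=np.float64)
    relative = (estimated - truth) / truth
    return 100.0 * np.sqrt(np.mean(relative ** 2))

# Hypothetical measurements: two ground distances, each estimated 10% off.
error = rms_percent_error([8.8, 10.8], [8.0, 12.0])
```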

35 AutoCalib vs Prior Art
[Bar chart: ground distance measurement, RMS error (%), for cameras C1-C10; series: AutoCalib, VP approach [1]]
AutoCalib outperforms prior state-of-the-art approaches.
[1] Dubská et al., Fully Automatic Roadside Camera Calibration for Traffic Surveillance. IEEE ITS 2015

36 Does more video data help?
AutoCalib converges with increasing vehicle detections.

37 Application - Speed Measurement

38 AutoCalib Summary
Camera calibration: enables distance measurements; highly manual today
AutoCalib: scalable automatic calibration; uses DNNs to analyze vehicle geometry
Experiments: < 12% error in measuring distances; calibrates with a few hundred detections

AutoCalib: Automatic Traffic Camera Calibration at Scale

Romil Bhardwaj (Microsoft Research), Gopi Krishna Tummala* (The Ohio State University), Ganesan Ramalingam (Microsoft Research), Ramachandran Ramjee
ABSTRACT