Cluster-based 3D Reconstruction of Aerial Video

Cluster-based 3D Reconstruction of Aerial Video Scott Sawyer (scott.sawyer@ll.mit.edu) MIT Lincoln Laboratory HPEC 12 12 September 2012 This work is sponsored by the Assistant Secretary of Defense for Research and Engineering under Air Force contract FA8721-05-C-0002. Opinions, interpretations, conclusions, and recommendations are those of the author and are not necessarily endorsed by the United States Government.

Cluster SfM - 2 Given a collection of photos

Cluster SfM - 3 determine their 3D structure

determine their 3D structure Cluster SfM - 4

Outline Feature-based 3D Reconstruction Reconstructing Aerial Video Cluster-based Structure from Motion Results Cluster SfM - 5

Outline Feature-based 3D Reconstruction Reconstructing Aerial Video Cluster-based Structure from Motion Results Cluster SfM - 6

Structure from Motion (SfM) Photo Collection Feature Extraction Feature Matching Sparse Reconstruction Dense Reconstruction 3D Point Cloud Applications: 3D Maps Robotic Navigation Video Geo-registration Cluster SfM - 7

Feature Extraction Photo Collection Feature Extraction Feature Matching Sparse Reconstruction Dense Reconstruction 3D Point Cloud Scale Invariant Feature Transform (SIFT) Identifies distinctive keypoints in an image and provides 128-dimensional descriptor vector Based on finding local extrema in scale space Invariant to scale, orientation and lighting conditions (to an extent) and aspect angles up to 30 degrees Published by David Lowe in 1999 and widely used in computer vision Cluster SfM - 8

Feature Matching Photo Collection Feature Extraction Feature Matching Euclidean Distance Approximate Nearest Neighbors (ANN) Euclidean distance between SIFT keypoint descriptors used to determine matches Tree-based approximation reduces computation SIFT points matched between each pair of images O(n2) Cluster SfM - 9 Sparse Reconstruction Dense Reconstruction 3D Point Cloud

Sparse Reconstruction Photo Collection Feature Extraction Feature Matching Sparse Reconstruction Dense Reconstruction 3D Point Cloud Projective Structure from Motion X j Given m images and n 3D points (with point correspondences) Estimate m projection matrices (P i, i = 1,, m) and n 3D points (X j, j = 1,, n) from the mn correspondences (x ij ) x 1j x 3j Iteratively minimize re-projection error ( bundle adjustment ): m E( P, X) = D n i= 1 j= 1 ( ) 2 x, P X ij i j P 1 x 2j P 2 P 3 Cluster SfM - 10

Dense Reconstruction Photo Collection Feature Extraction Feature Matching Sparse Reconstruction Dense Reconstruction 3D Point Cloud Identify Patches of Rigid Structure Mature software ( PMVS2 ) forms patched-based reconstruction (open source, parallel) Removes spurious and nonrigid points This step not included in benchmark analysis Cluster SfM - 11

Outline Feature-based 3D Reconstruction Reconstructing Aerial Video Cluster-based Structure from Motion Results Cluster SfM - 12

Reconstructing Aerial Video Lincoln Laboratory performed a campaign of aerial data collections in 2011: Aerial platform circles ground subject of interest Camera gimbal remains fixed on ground point Video frames synchronized to GPS Challenge: Processing must keep up with rate of collection (e.g. one-hour mission timeline) Cluster SfM - 13

Cluster SfM - 14 Sample Video

Bundler Runtime (hours) 60 40 20 0 Bundler (single-core) 100 200 300 400 500 Image Set Size Works on unordered image collections (no assumptions about input) Sequential implementation (single-core) Optimization solution considered baseline (truth) Feat. Extr. Feat. Match Bundle Adj. Bundler (v. 0.4) by Noah Snavely Cluster SfM - 15

VisualSFM VisualSFM (multi-core & GPU) Runtime (hours) 30 20 10 0 100 200 300 400 500 Image Set Size Feat. Extr. Feat. Match Bundle Adj. Restructures optimization to better enable parallelization of dense matrix kernels Feature extraction and bundle adjustment use GPU acceleration (CUDA) Multi-threaded feature matching (pthreads) Run on quad-core Intel Xeon E5620 (2.4 GHz), 6 GB RAM, nvidia Quadro FX1800 (64 compute cores) VisualSFM (v. 0.5.15) by Changchang Wu Cluster SfM - 16

Outline Feature-based 3D Reconstruction Reconstructing Aerial Video Cluster-based Structure from Motion Results Cluster SfM - 17

LLGrid Computing Environment Job submission Scheduler Scheduler assigns each job to a single core (no threading or shared memory) Custom Python framework for submitting jobs, mapping data and synchronizing between steps Master Node Jobs may or may not be on the same node Network File System (Lustre) Inter-process communication supported via file system or network Compute Nodes Number of cores used varied experimentally up to 64 Cluster SfM - 18

Cluster-based SfM Overview Step JPEG Photo JPEG Photo JPEG Photo Runtime Growth 1. Feature Extraction Job 1 Job 2 Job N p O(N) Remap SIFT keypoints 2. Feature Matching Job 1 Job 2 Job N p O(N 2 ) Remap Matched Feature Graph 3. Reconstruction Job 1 Job M O(N 2 /M) Remap Partial Point Clouds 4. Point Cloud Fusion Job 1 Job 2 Job N p O(P 2 /M) Combine Results Single Job Final Point Cloud Cluster SfM - 19 N: Input photo count; M: Partition count; P: Sparse point count

Feature Matching Photo ID Matching Pair Matrix Photo ID 1 2 3 4 N 1 0 1 1 1 1 2 0 0 1 1 1 3 0 0 0 1 1 4 0 0 0 0 1 (1,10) (2,12) (3,15) Matching Graph (notional) (1,50) (5,81) (4,75) N-1 0 0 0 0 1 N 0 0 0 0 0 (2,20) Vertices described by tuple (Photo ID, Keypoint ID) If (i,j) = 1, then match photo i to photo j Matching is most expensive step, growing O(N 2 ) with number of photos Results of matching represented as graph G: Vertices V represent SIFT points Edges E represent SIFT matches Cluster SfM - 20

Bundle Adjustment Data Mapping Flight path partitioned using block-cyclic mapping: N photos mapped to M partitions Assume photo set covers one revolution Less than 30 Example: Block size: 2; M=3 Flight Path Angular gap between photos should be less than 30 degrees (to help with SIFT matching) Photos per partition (N/M) must be greater than 25 Block size and partition count set automatically Reconstructing partial photo sets independently creates challenges: Point clouds are in different arbitrary coordinate systems Many points are duplicated across the partial point clouds 3D points have shorter view lists (fewer iterative improvements) Cluster SfM - 21

Identifying Duplicate Points Each 3D point has a view list comprised of SIFT keypoints (vertices in G). View lists from different partitions will always be disjoint because photos are mapped to exactly one partition. (1,10) Subgraph of G (5,81) (3,15) (2,12) View list A (4,75) View list B 3D points are considered duplicates if any vertices in B are in the neighborhood of A. Cluster SfM - 22

Transforming to a Common Space Determine rotation, scale and translation based on duplicate points: Robustly eliminate outlier 3D points Solve for transformation matrix that minimizes error Combine duplicate points as a weighted average Sparse point clouds after independent bundle adjustment on 4 partitions Fused sparse point cloud after transformation Cluster SfM - 23

Cluster SfM - 24 Final Point Cloud

Outline Feature-based 3D Reconstruction Reconstructing Aerial Video Cluster-based Structure from Motion Results Cluster SfM - 25

Runtime Performance Cluster-based SfM (64 cores) Bundler (single-core) Runtime (hours) 1 0.8 0.6 0.4 0.2 0 100 200 300 400 Image Set Size Feat. Match & Extr. Bundle Adj. Fuse Runtime (hours) Runtime (hours) 60 40 20 0 30 20 10 0 100 200 300 400 VisualSFM Image (multi-core Set Size & GPU) Feat. Extr. Feat. Match Bundle Adj. 100 200 300 400 Image Set Size Feat. Extr. Feat. Match Bundle Adj. Cluster SfM - 26

Speed-Up & Scalability Cluster-based SfM Speed-up 45 Speed-up (t(1)/t(n p )) 40 35 30 25 20 15 10 5 0 0 16 32 48 64 N=75 100 200 300 400 Cluster Size Cluster SfM Scalability (400 photos) 8 16 32 64 Partitions 8 16 16 16 Frame rate (per min.) Points per Frame (x10 3 ) 1.2 2.4 4.3 7.6 2.0.87.87.87 Number of Processors (N p ) Cluster SfM - 27

Reconstruction Quality Sparse Point Cloud Density 300 Points (x1,000) 250 200 150 100 50 Bundler VisualSFM Cluster (N/M=25) 0 50 100 150 200 250 300 350 400 Image Set Size RMS Error of Reconstructed Points (meters) Input size (N) 100 200 300 400 VisualSFM 0.20 7.15 1.05 0.44 4 Partitions 0.10 Cluster-based SfM 8 Partitions 0.27 12 Partitions 0.26 16 Partitions 0.24 Cluster SfM - 28

Summary Creating 3D reconstructions of photo collections scales well to the cluster 64-core cluster shows 14x speedup over multi-core workstation with GPU Our algorithm effectively parallelizes reconstruction for the case of aerial video around a ground point For algorithms that aren t pleasantly parallel, deep knowledge or application and computing environment necessary to parallelize Future work: Reduce feature matching complexity for different types of input sets Point cloud geo-registration by matching against reference data (e.g. LIDAR or satellite imagery) Cluster SfM - 29

Acknowledgements Scott Sawyer (scott.sawyer@ll.mit.edu) SIGMA Team: Karl Ni Alex Vasile Nick Armstrong-Crews Ethan Phelps Nadya Bliss Prof. Noah Snavely (Cornell University) Cluster SfM - 30