Supplementary Note to Detecting Building-level Changes of a City Using Street Images and a 2D City Map


Daiki Tetsuka    Takayuki Okatani
Graduate School of Information Sciences, Tohoku University
okatani@vision.is.tohoku.ac.jp

Preprocessing: SfM and registration

As mentioned in the main paper, the estimation of building existence (the thick-line box of Fig. 4 of the main paper) is preceded by SfM, followed by registration of its result with the map. These preprocessing steps are summarized here; illustrative code sketches of the individual steps are collected at the end of this section.

We employ a standard approach to SfM. Feature points are first extracted with SURF [1] from each image of the sequence, and then matched by descriptor similarity to obtain putative correspondences between each neighboring pair of images. Outliers are discarded using RANSAC with the five-point algorithm [5], yielding a set of point trajectories and initial estimates of the camera poses. The 3D positions of the tracked points are then estimated by triangulation, followed by robust bundle adjustment.

Note that we match feature points only between consecutive images in the sequence; thus, once a scene point ceases to be tracked (e.g., when it leaves the image area), it is treated as a new point if it reappears in later images. With an appropriate matching scheme, such points could be correctly identified as a single scene point, which would enable loop closure and could improve the accuracy of the reconstruction. We do not do so, to avoid the resulting increase in computational cost. Performing SfM in this open-loop manner yields a highly sparse structure and hence a small computational cost, which is important since we are considering city-scale reconstruction. Moreover, the accuracy of the camera trajectories is not a particular problem in our case, as we have GPS data for each camera position. In short, when the vehicle drives through the same part of the city multiple times, the same scene points may be reconstructed as different points; this is appropriately handled by our method described later.

The SfM reconstruction thus obtained is then registered with the map by finding a similarity transformation such that the transformed camera positions are as close as possible to their GPS data in the least-squares sense. As the SfM reconstruction inevitably suffers from drift, we divide the whole sequence into subsequences of about 100 images each, or equivalently about 200 meters of travel distance, and perform SfM on each subsequence; this also reduces the total computational cost. So as not to overlook buildings at the ends of a subsequence, the original sequence is divided so that neighboring subsequences overlap. An example of an SfM reconstruction after registration with the map is shown in Fig. 1.

In our 2D map of a city, each building is represented as a polygon approximating its ground projection, from which the original 3D shape of the building cannot be recovered. We therefore assume that each polygon depicts the outer walls of the building's first floor and that these walls are at least five meters high. When comparing the SfM point cloud with the buildings, we use only the points up to five meters above ground level and neglect all others.
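As a concrete illustration of the matching and triangulation steps above, the following is a minimal sketch for one consecutive image pair using OpenCV. It is not the authors' implementation: the parameter choices (Hessian threshold, ratio test, RANSAC settings) are assumptions, SURF requires the opencv-contrib package, and the intrinsic matrix K is assumed known.

```python
import cv2
import numpy as np

def match_and_triangulate(img1, img2, K):
    """SURF matching + five-point RANSAC + triangulation for one pair."""
    # Extract SURF features (requires opencv-contrib-python).
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
    kp1, des1 = surf.detectAndCompute(img1, None)
    kp2, des2 = surf.detectAndCompute(img2, None)

    # Putative correspondences by descriptor similarity (Lowe ratio test).
    matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des1, des2, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]
    pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in good])

    # Discard outliers with RANSAC and the five-point algorithm.
    E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC,
                                   prob=0.999, threshold=1.0)
    # Initial relative camera pose from the essential matrix.
    _, R, t, mask = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
    in1, in2 = pts1[mask.ravel() > 0], pts2[mask.ravel() > 0]

    # Triangulate inlier correspondences; world frame = first camera.
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([R, t])
    X = cv2.triangulatePoints(P1, P2, in1.T, in2.T)
    return (X[:3] / X[3]).T, R, t   # Nx3 points and relative pose
```

In the full pipeline these pairwise estimates would be chained along the sequence and refined by bundle adjustment, which is omitted here.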
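The map registration step is a standard absolute-orientation problem with unknown scale. Below is a sketch of the closed-form least-squares solution (Umeyama's method); the paper does not state which solver is used, so this is only one reasonable choice. GPS coordinates are assumed to have been converted to a local metric frame.

```python
import numpy as np

def fit_similarity(src, dst):
    """Least-squares similarity transform with dst ~ s * R @ src + t.

    src: (N,3) SfM camera positions; dst: (N,3) corresponding GPS
    positions in a local metric frame. Closed-form Umeyama solution.
    """
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_s, dst - mu_d
    cov = dst_c.T @ src_c / len(src)            # 3x3 cross-covariance
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1                            # avoid a reflection
    R = U @ S @ Vt                              # optimal rotation
    var_src = (src_c ** 2).sum() / len(src)
    s = np.trace(np.diag(D) @ S) / var_src      # optimal scale
    t = mu_d - s * R @ mu_s                     # optimal translation
    return s, R, t
```

After fitting, every point X of the SfM reconstruction is mapped into the map frame as s * R @ X + t.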
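The division into overlapping subsequences can be sketched as follows. The overlap length is our assumption; the paper only states that neighboring subsequences overlap at their ends.

```python
def split_with_overlap(n_images, length=100, overlap=20):
    """Start/end indices of overlapping subsequences of ~`length` images,
    so that buildings near subsequence boundaries are seen twice."""
    starts = range(0, max(1, n_images - overlap), length - overlap)
    return [(s, min(s + length, n_images)) for s in starts]
```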
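Finally, the five-meter height cut on the point cloud can be sketched as below, assuming the registered points use a z-up metric frame and that the local ground elevation is known.

```python
import numpy as np

def filter_points_by_height(points, ground_z, max_height=5.0):
    """Keep only points within `max_height` meters above ground level,
    matching the five-meter wall assumption above. `points` is (N,3)."""
    z_rel = points[:, 2] - ground_z
    return points[(z_rel >= 0.0) & (z_rel <= max_height)]
```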
More images for the results shown in the main paper

In the main paper, we report the results of the proposed change detection method for three cities. Additional images omitted from the main paper are shown in Figs. 11, 12, and 13; they show how the proposed method iteratively judges the existence of buildings. Note that Fig. 11 is an enlarged version of Fig. 8 of the main paper.

Analysis of failure cases

In the experiments reported in the main paper (Table 1), the proposed method did not achieve 100% accuracy, and a certain number of errors remain in the results. The failure cases fall into two types, although the boundary between them is somewhat blurred: those attributable to limitations of the proposed method, and those attributable to the fundamental difficulty of labeling each building as existing or non-existing. The former occur mostly when the assumptions of the method are violated. Figure 14(a) shows examples. The image was captured against backlight, which hampers the extraction of feature points and results in a failure. The same image also shows a building that lost its walls to the tsunami or the fire it caused; this leads to the same difficulty, as does the case where walls lack sufficient texture. Some buildings are difficult even for humans to judge as existing or non-existing, which causes the second type of failure. Figure 14(b) shows an example, where only the foundations of buildings remain after the tsunami. Figure 14(c) shows another example, a partially demolished building. Because of the nature of the disaster, there are many such buildings in the cities considered in our experiments. Whether they should be labeled as existing or non-existing depends on the application; in our experiments, we label all of these buildings as existing in the ground truth.

Figure 11. Intermediate results at different iteration counts for Otsuchi Town (an enlarged version of Fig. 8 of the main paper). Panels, from left to right: initial state, iterations 1-6 (final), and the ground truth. Best viewed on a color monitor. Green, blue, and red polygons indicate the buildings that have not been judged yet, those judged to be existing, and those judged to be non-existing, respectively. The camera trajectory is displayed in dark blue.

Figure 12. Intermediate results at different iteration counts for Miyagino Ward. Panels, from left to right: initial state, iterations 1-4 (final), and the ground truth.

Figure 13. Intermediate results at different iteration counts for Kamaishi City. Panels, from left to right: initial state, iterations 1-3 (final), and the ground truth.

Figure 14. Example failure cases. (a) Backlight and buildings without walls. (b) Only building foundations remain. (c) A partially demolished building. (d) The reconstructed point cloud overlaid on the original building shape.