Photo Tourism: Exploring Photo Collections in 3D

Similar documents
Noah Snavely Steven M. Seitz. Richard Szeliski. University of Washington. Microsoft Research. Modified from authors slides

Photo Tourism: Exploring Photo Collections in 3D

Modeling and Recognition of Landmark Image Collections Using Iconic Scene Graphs

Homographies and RANSAC

Image correspondences and structure from motion

Camera Calibration. COS 429 Princeton University

Photo Tourism: Exploring Photo Collections in 3D

Large Scale 3D Reconstruction by Structure from Motion

A Systems View of Large- Scale 3D Reconstruction

Instance-level recognition

Region Graphs for Organizing Image Collections

The Light Field and Image-Based Rendering

Local Feature Detectors

Multi-view stereo. Many slides adapted from S. Seitz

Instance-level recognition

Lecture 15: Image-Based Rendering and the Light Field. Kayvon Fatahalian CMU : Graphics and Imaging Architectures (Fall 2011)

BSB663 Image Processing Pinar Duygulu. Slides are adapted from Selim Aksoy

Instance-level recognition part 2

Chaplin, Modern Times, 1936

SIMPLE ROOM SHAPE MODELING WITH SPARSE 3D POINT INFORMATION USING PHOTOGRAMMETRY AND APPLICATION SOFTWARE

CS 4495 Computer Vision A. Bobick. Motion and Optic Flow. Stereo Matching

CS 4495 Computer Vision A. Bobick. Motion and Optic Flow. Stereo Matching

Structure from Motion CSC 767

Structure from motion

Local features: detection and description. Local invariant features

BIL Computer Vision Apr 16, 2014

Image-Based Modeling and Rendering

Midterm Examination CS 534: Computational Photography

Image Based Reconstruction II

From Structure-from-Motion Point Clouds to Fast Location Recognition

Image stitching. Announcements. Outline. Image stitching

Image stitching. Digital Visual Effects Yung-Yu Chuang. with slides by Richard Szeliski, Steve Seitz, Matthew Brown and Vaclav Hlavac

Structure from motion

Image Stitching. Slides from Rick Szeliski, Steve Seitz, Derek Hoiem, Ira Kemelmacher, Ali Farhadi

Instance-level recognition II.

Miniature faking. In close-up photo, the depth of field is limited.

Final project bits and pieces

CS6670: Computer Vision

Dense 3D Reconstruction. Christiano Gava

Stereo and Epipolar geometry

Mosaics. Today s Readings

Stereo Epipolar Geometry for General Cameras. Sanja Fidler CSC420: Intro to Image Understanding 1 / 33

Today s lecture. Image Alignment and Stitching. Readings. Motion models

Stereo vision. Many slides adapted from Steve Seitz

Painting-to-3D Model Alignment Via Discriminative Visual Elements. Mathieu Aubry (INRIA) Bryan Russell (Intel) Josef Sivic (INRIA)

Local features: detection and description May 12 th, 2015

CS 2770: Intro to Computer Vision. Multiple Views. Prof. Adriana Kovashka University of Pittsburgh March 14, 2017

Step-by-Step Model Buidling

arxiv: v1 [cs.cv] 28 Sep 2018

Image warping and stitching

Computational Photography: Advanced Topics. Paul Debevec

Computer Vision Lecture 17

Computer Vision Lecture 17

Lecture 10: Multi view geometry

Structure from Motion

Multiple View Geometry

Patch Descriptors. EE/CSE 576 Linda Shapiro

Local features and image matching. Prof. Xin Yang HUST

C280, Computer Vision

Object Recognition and Augmented Reality

Structure from Motion

Patch Descriptors. CSE 455 Linda Shapiro

Image warping and stitching

Hough Transform and RANSAC

Camera Geometry II. COS 429 Princeton University

Algorithms for Image-Based Rendering with an Application to Driving Simulation

Geometry for Computer Vision

Dense 3D Reconstruction. Christiano Gava

Feature Transfer and Matching in Disparate Stereo Views through the use of Plane Homographies

Midterm Wed. Local features: detection and description. Today. Last time. Local features: main components. Goal: interest operator repeatability

Camera Drones Lecture 3 3D data generation

Structure from motion

3D model search and pose estimation from single images using VIP features

Structure from Motion. Introduction to Computer Vision CSE 152 Lecture 10

Fundamental matrix. Let p be a point in left image, p in right image. Epipolar relation. Epipolar mapping described by a 3x3 matrix F

Srikumar Ramalingam. Review. 3D Reconstruction. Pose Estimation Revisited. School of Computing University of Utah

Epipolar Geometry and Stereo Vision

Morphable 3D-Mosaics: a Hybrid Framework for Photorealistic Walkthroughs of Large Natural Environments

What have we leaned so far?

Image-based Modeling and Rendering: 8. Image Transformation and Panorama

CS4670: Computer Vision

Image warping and stitching

Lecture 10: Multi-view geometry

CS664 Lecture #19: Layers, RANSAC, panoramas, epipolar geometry

Epipolar Geometry CSE P576. Dr. Matthew Brown

Automatic Image Alignment

Automatic Image Alignment

A Scalable Collaborative Online System for City Reconstruction

1 Projective Geometry

The SIFT (Scale Invariant Feature

Image stitching. Digital Visual Effects Yung-Yu Chuang. with slides by Richard Szeliski, Steve Seitz, Matthew Brown and Vaclav Hlavac

Srikumar Ramalingam. Review. 3D Reconstruction. Pose Estimation Revisited. School of Computing University of Utah

Advanced Digital Photography and Geometry Capture. Visual Imaging in the Electronic Age Lecture #10 Donald P. Greenberg September 24, 2015

EE795: Computer Vision and Intelligent Systems

Image-Based Modeling and Rendering

Prof. Trevor Darrell Lecture 18: Multiview and Photometric Stereo

Epipolar Geometry and Stereo Vision

RANSAC: RANdom Sampling And Consensus

CS4670: Computer Vision

Sparse 3D Model Reconstruction from photographs

Transcription:

Photo Tourism: Exploring Photo Collections in 3D SIGGRAPH 2006 Noah Snavely Steven M. Seitz University of Washington Richard Szeliski Microsoft Research 2006 2006 Noah Snavely Noah Snavely

Reproduced with permission of Yahoo! Inc. 2005 by Yahoo! Inc. YAHOO! and the YAHOO! logo are trademarks of Yahoo! Inc.

Photo Tourism

Photo Tourism overview Input photographs Scene reconstruction Relative camera positions and orientations Point cloud Sparse correspondence Photo Explorer

Related work Image-based modeling Debevec, et al. SIGGRAPH 1996

FAÇADE: ADE: Modeling and Rendering Architecture from Photographs P. Debevec, C. J. Taylor, J. Malik SIGGRAPH 1997

Overview Take a few widely spaced photographs Build simple underlying model of scene Use correspondences between photos to adjust scene parameters Paste photos back onto simple geometry of scene for realistic façade Slide credit: D. Luebke

Photogrammetric Modeling User builds a simple geometric model using blocks: primitive solid shapes Example: boxes, wedges, prisms, frusta, surfaces of revolution User marks correspondences between images and model System fits model to images Slide credit: D. Luebke

Photogrammetric Modeling The system needs to solve for the parameters of blocks Height, width, translation, rotation, etc. Slide credit: D. Luebke

Photogrammetric Modeling Knowns: image segments to block edge correspondences Unknowns: block parameters, camera position/orientation Architectural constraints reduce the number of unknowns Slide credit: D. Luebke

Photogrammetric Modeling Architectural constraints Roof prism lies flush on building block Stacked tower blocks share center axis Slide credit: D. Luebke

Example result Three of 12 input photographs for the University High School, Urbana, IL: Slide credit: D. Luebke

View-Dependent Texture Mapping Given the model, treat each camera position as a slide projector Some images overlap! Idea: pick image taken from viewpoint closest to desired rendering viewpoint Better: use weighted average or do texture mapping on a per-pixel basis Slide credit: D. Luebke

View-Dependent Texture Mapping

Photogrammetric Modeling The Campanile, Berkeley

Photogrammetric Modeling Model of Berkeley campus constructed from 15 photographs:

The Campanile Movie http://www.debevec.org/campanile/

Related work Image-based modeling Debevec, et al. SIGGRAPH 1996 Schaffalitzky and Zisserman ECCV 2002 Brown and Lowe 3DIM 2005 Image-based rendering Aspen Movie Map Lippman, et al., 1978 Photorealistic IBR: Levoy and Hanrahan, SIGGRAPH 1996 Gortler, et al, SIGGRAPH 1996 Seitz and Dyer, SIGGRAPH 1996 Aliaga, et al, SIGGRAPH 2001 and many others

Photo Tourism overview Scene reconstruction Input photographs Photo Explorer

Scene reconstruction Automatically estimate position, orientation, and focal length of cameras 3D positions of feature points Feature detection Pairwise feature matching Incremental structure from motion Correspondence estimation

Feature detection Detect features using SIFT [Lowe, IJCV 2004]

Feature detection Detect features using SIFT [Lowe, IJCV 2004]

Feature detection Detect features using SIFT [Lowe, IJCV 2004]

Feature matching Match features between each pair of images

Feature matching Refine matching using RANSAC [Fischler & Bolles 1987] to estimate fundamental matrices between pairs

Correspondence estimation Link up pairwise matches to form connected components of matches across several images Image 1 Image 2 Image 3 Image 4

Structure from motion p 1 p 4 p 3 p 2 minimize f(r,t,p) p 5 p 6 p 7 Camera 1 Camera 3 R 1,t 1 Camera 2 R 3,t 3 R 2,t 2

Incremental structure from motion

Incremental structure from motion

Incremental structure from motion

Reconstruction performance For photo sets from the Internet, 20% to 75% of the photos were registered Unregistered photos: Running time: < 1 hour for 80 photos > 1 week for 2600 photo

Photo Tourism overview Input photographs Scene reconstruction Photo Explorer

Photo Explorer

Photo Tourism overview Input photographs Scene reconstruction Photo Explorer Navigation Rendering Annotations

Object-based based browsing

Object-based based browsing Visibility Resolution Head-on view

Relation-based browsing Find all similar images Find all details Find all zoom outs Zoom in Move left Move right Zoom out

Relation-based browsing Image A Image B

Relation-based browsing Image A Image B

Relation-based browsing to the right of Image A Image B Image C

Relation-based browsing to the right of Image A Image B Image C

Relation-based browsing to the right of Image A Image B is detail of Image C Image D

Relation-based browsing to the right of Image A Image B is detail of Image C Image D

Relation-based browsing to the left of to the right of Image A Image B is zoom-out of is detail of is detail of is detail of is zoom-out of is zoom-out of Image C is detail of Image D

Overhead map

Prague Old Town Square

Photo Tourism overview Input photographs Scene reconstruction Photo Explorer Navigation Rendering Annotations

Rendering

Rendering

Rendering

Rendering transitions

Rendering transitions

Rendering transitions

Rendering transitions Camera A Camera B

Photo Tourism overview Input photographs Scene reconstruction Photo Explorer Navigation Rendering Annotations

Annotations

Annotations Reproduced with permission of Yahoo! Inc. 2005 by Yahoo! Inc. YAHOO! and the YAHOO! logo are trademarks of Yahoo! Inc.

Annotations

Microsoft Photosynth Project Demo

Limitations / Future work Not all photos can be reliably matched Better feature detection / matching Integrating GPS & other localization info. Structure from motion scalability ECCV 2008 paper by Li et al. Improved navigation, transitions SIGGRAPH 2008

Finding Paths Through the World s Photos N. Snavely, R. Garg, S. Seitz, R. Szeliski SIGGRAPH 2008

Advances over Photo Tourism Better view interpolation for landmarks that are not well approximated by planar facades Fluid, freeform 3D navigation Automatic discovery of scene-specific controls: orbits, panoramas Path planning for transitions between different viewpoints

System components Input photographs, structure from motion Viewpoint scoring function Estimates the quality of rendering a particular input image from a given viewpoint Navigation controls Automatically discover orbits, panoramas, representative views Path planning Rendering engine For any user position, find best scoring input image and reproject it from the current viewpoint Orbit stabilization, appearance stabilization

Viewpoint Scoring Function Ideally, measure the difference between the reprojected view and a real photo of the scene captured at a given location Quality criteria: Angular deviation Field of view Resolution

Scene-Specific Specific Navigation Controls Orbits An orbit is a distribution of views in a circle converging toward the same object An orbit is specified by an axis and an image Scoring possible orbits: quality, length, convergence, object-centeredness

Scene-Specific Specific Navigation Controls Orbits An orbit is a distribution of views in a circle converging toward the same object An orbit is specified by an axis and an image Scoring possible orbits: quality, length, convergence, object-centeredness Orbit enumeration: first find likely axes, then evaluate orbit scoring function for every image and that axis

Detected orbits Statue of Liberty Pantheon

Scene-Specific Specific Navigation Controls Orbits Panoramas A set of images taken from approximately the same viewpoint and covering a good range of viewing directions Canonical views Scene summarization algorithm of Simon et al. [ICCV 2007] Choose canonical view for each orbit and panorama, show as thumbnail in scene viewer

Path Planning for Transitions Automatically compute path from one image to another, show multiple reprojected images between the two endpoints (unlike Photo Tourism) Discrete solution: find shortest path in a transition graph defined by input views Transition cost between two images: computed based on viewpoint scoring function The discrete shortest path can be improved by smoothing

Path Planning Example

Orbit Stabilization Images are reprojected using a proxy plane Planes fit to scene points can cause jerky motion, especially for orbits For orbit stabilization, choose a proxy plane that passes through the orbit axis and is (roughly) parallel to the source image plane

Appearance Stabilization Compute distance between geometrically and photometrically aligned images Similarity mode: favor transitions between similar images Color compensation: estimate gain and offset between two images, adjust next image to match the previously viewed one

Appearance Stabilization All images Similar images Color compensation

Modeling and Recognition of Landmark Image Collections Using Iconic Scene Graphs Xiaowei Li, Changchang Wu, Christopher Zach, Svetlana Lazebnik, Jan-Michael Frahm ECCV 2008

Overview All images Appearancebased clustering, geometric verification Iconic images Pairwise matching of iconic images SFM Graph cut Iconic scene graph Reconstructed components Components of iconic scene graph

The approach 1. Appearance-based clustering Goal: find groups of images with roughly similar viewpoints and scene conditions

The approach 1. Appearance-based clustering Goal: find groups of images with roughly similar viewpoints and scene conditions Run k-means clustering with gist descriptors (Oliva & Torralba, 2001)

The approach 1. Appearance-based clustering 2. Geometric verification and iconic image selection Perform feature-based geometric matching between a few top images from each cluster QDEGSAC (Frahm & Pollefeys, 2006) for robust estimation of fundamental matrix or homography Select an iconic image for each cluster as the image with the most total inliers

The approach 1. Appearance-based clustering 2. Geometric verification and iconic image selection

The approach 1. Appearance-based clustering 2. Geometric verification and iconic image selection 3. Construction of iconic scene graph Perform geometric matching between every pair of iconic images Create a weighted edge for every pair related by a homography or a fundamental matrix

The approach 1. Appearance-based clustering 2. Geometric verification and iconic image selection 3. Construction of iconic scene graph

The approach 1. Appearance-based clustering 2. Geometric verification and iconic image selection 3. Construction of iconic scene graph 4. Finding graph components Use tags to eliminate semantically irrelevant isolated nodes Run normalized cuts to break up the rest of the graph into smaller tightly connected sub-graphs

The approach 1. Appearance-based clustering 2. Geometric verification and iconic image selection 3. Construction of iconic scene graph 4. Finding graph components

The approach 1. Appearance-based clustering 2. Geometric verification and iconic image selection 3. Construction of iconic scene graph 4. Finding graph components 5. Structure from motion Perform SFM separately on each component Maximum-weight spanning tree determines the order of incorporating images into the 3D model If possible, merge component models using geometric relationships along edges that were originally cut Register additional non-iconic images to the models (optional)

Hierarchical browsing Level 1: components of iconic scene graph Level 2: iconic images belonging to each component Level 3: images inside the gist cluster of each iconic Level 1 Level 2 Level 3

Statue of Liberty dataset 45284 images 196 iconic images

Statue of Liberty dataset 45284 images Las Vegas Tokyo New York Registered images in largest model: 871 Points visible in 3+ views: 18675

Notre Dame dataset 10840 images 105 iconic images

Notre Dame dataset 10840 images Registered images in largest model: 337 Points visible in 3+ views: 30802

San Marco dataset 43557 images Registered images in largest model: 749 Points visible in 3+ views: 39307