Photo Tourism: Exploring Photo Collections in 3D


Transcription:

Click! Click! Oooo!! Click! Zoom click! Click! Some other camera noise!! Click! Click! Ahhh! Click! Click!
Photo Tourism: Exploring Photo Collections in 3D
Overview of Research at Microsoft, 2007
Jeremy Kuzub. Presented to the Systems and Computer Engineering Faculty, Carleton University, 2007

A stack of photos is not how we see reality! Imagine instead exploring a 3D space where each photo hangs as a window into the real 3D environment:

Challenge: Where was this photo taken? Can we algorithmically derive the 3D spatial relations of a large group of 2D images? Can we use this to generate an interactive photo browser that puts photos in spatial context?

More challenges: Can the group of photos include:
- different cameras
- different times of day and year
- different viewing angles and focal lengths
- various text annotations (Flickr, Google Images, etc.)?
Can the data set come from the real world rather than from controlled tests?

The New Photo Collection Interface:
- Scene visualization: fly around popular world sites in 3D by morphing between photos.
- Object-based photo browsing: show me more images that contain this object or part of the scene.
- Where was I? Tell me where I was when I took this picture.
- What am I looking at? Tell me about objects visible in this image by transferring annotations from similar images.
Key ingredients: natural transitions between photos, finding and using spatial relations, deriving original camera locations, and identifying objects within photos.

And if your Mom asks? Mom's query: "What do you do in that lab all day with computers?" Answer: I am a specialist in image-based modeling (IBM) and image-based rendering (IBR)! I specialize in synthesizing new views of a scene from a set of input photographs, Mom!

The Process: Photo Collection → IB Modeling → IB Rendering → User Navigation System, enriched by image annotations ("Cool Statue!", etc.), yielding photo collection context and understanding!

Image-Based Modeling Process: reconstructing cameras from photo pairs.
- Unique feature identification (SIFT algorithm) → uniquely identifiable points in images 1 and 2.
- Matching of points between images 1 and 2 (nearest-neighbor algorithm) → matched point pairs in images 1 and 2.
- Determine the best motion vectors between matched points (RANSAC algorithm) → unique point motion tracks between images 1 and 2.
- Reconstruct camera parameters and 3D locations from the tracks (Structure from Motion algorithm).
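The RANSAC step above can be sketched with a toy robust estimator: sample a minimal hypothesis from one correspondence, count how many matches agree, and keep the best. This is a simplified stand-in (a pure 2D translation model rather than the epipolar geometry the system actually fits), with illustrative names and thresholds:

```python
import numpy as np

def ransac_translation(pts1, pts2, threshold=3.0, iters=200, seed=0):
    """Toy RANSAC: robustly estimate a 2D translation mapping pts1 -> pts2.

    Repeatedly sample a minimal set (one correspondence suffices for a pure
    translation), count inliers within `threshold` pixels, keep the best
    hypothesis, then refine it on all of its inliers.
    """
    rng = np.random.default_rng(seed)
    best_t, best_inliers = None, np.zeros(len(pts1), dtype=bool)
    for _ in range(iters):
        i = rng.integers(len(pts1))
        t = pts2[i] - pts1[i]                       # hypothesis from one match
        err = np.linalg.norm(pts1 + t - pts2, axis=1)
        inliers = err < threshold
        if inliers.sum() > best_inliers.sum():
            best_t, best_inliers = t, inliers
    # Refine: average the displacement over the winning hypothesis's inliers.
    best_t = (pts2[best_inliers] - pts1[best_inliers]).mean(axis=0)
    return best_t, best_inliers
```

The same sample/score/refine loop carries over when the model is a fundamental matrix estimated from SIFT matches; only the minimal sample size and the error metric change.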

Example

From photo pairs to photo sets: new photos are added to the existing reconstruction one at a time, which reduces computational complexity. Each new photo must share some common unique points with the existing photo pair. Bundle adjustment is used to make sure all the cameras agree on the point locations. This becomes computationally intensive with larger photo sets, since agreement is sought among ALL cameras sharing common points (bundle adjustment can take minutes to hours to days).
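The quantity bundle adjustment minimizes is the total reprojection error: project every reconstructed 3D point into every camera that observes it and sum the squared pixel discrepancies. A minimal sketch, assuming an idealized pinhole camera with no distortion (the real system uses a richer camera model):

```python
import numpy as np

def project(point_3d, cam_R, cam_t, focal):
    """Pinhole projection of a world point into one camera (no distortion)."""
    p = cam_R @ point_3d + cam_t          # world -> camera coordinates
    return focal * p[:2] / p[2]           # perspective divide to pixels

def reprojection_error(points_3d, cameras, observations):
    """Sum of squared pixel errors that bundle adjustment minimizes.

    cameras:      list of (R, t, focal) tuples, one per camera.
    observations: list of (cam_index, point_index, observed_xy) detections.
    """
    total = 0.0
    for ci, pi, xy in observations:
        R, t, f = cameras[ci]
        total += np.sum((project(points_3d[pi], R, t, f) - xy) ** 2)
    return total
```

Bundle adjustment searches over all camera parameters and all 3D point positions at once to drive this sum down, which is why the cost grows so quickly with the number of photos.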

Pinning down all cameras: once the locations of the cameras are determined relative to one another, it is beneficial to lock down the entire scene in absolute space (i.e., which way is north?). A single camera or point in the group with associated GPS coordinates can lock down all cameras, but more data leads to more accuracy. This allows better context for a photo set: matching of photo sets to existing datasets, such as geo-survey data or existing 3D geometry.
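To make the idea concrete, here is a hypothetical geo-registration sketch: given two anchor cameras whose positions are known both in model space and in absolute (GPS-derived) 2D coordinates, fit a similarity transform (scale, rotation, translation) and apply it to every camera. The function name and two-anchor setup are illustrative; with more GPS-tagged cameras one would fit the transform in a least-squares sense instead:

```python
import numpy as np

def georegister_2d(model_xy, anchors):
    """Map model-space camera positions into absolute 2D coordinates.

    anchors: [(model_pt_a, world_pt_a), (model_pt_b, world_pt_b)], two
    cameras with known positions in both frames. Returns transformed points.
    """
    (ma, wa), (mb, wb) = [(np.asarray(m, float), np.asarray(w, float))
                          for m, w in anchors]
    dm, dw = mb - ma, wb - wa
    scale = np.linalg.norm(dw) / np.linalg.norm(dm)       # model -> world scale
    ang = np.arctan2(dw[1], dw[0]) - np.arctan2(dm[1], dm[0])
    c, s = np.cos(ang), np.sin(ang)
    Rot = np.array([[c, -s], [s, c]])                     # 2D rotation matrix
    pts = np.asarray(model_xy, float)
    return (scale * (Rot @ (pts - ma).T)).T + wa
```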

Image-Based Rendering Process: reconstructing the scene in an intuitive way. Inputs: the photo set and a database of each photo's camera location in 3D space (from image-based modeling). Components: a user interface with navigation and a 3D rendering engine. Output: context and understanding!

User interface:
- Flying 3D navigation
- Top-down map
- View of the current photo
- Photos spatially related to the current photo
- Annotations for objects within photos

Navigation Window: representing a sparse 3D space. The 3D structure of the space derived from the photos is a point cloud. This sparse point cloud gives an impression of the environment, especially during transitions between camera locations; the photos fill in the details. This removes the difficulty of texture-mapping a full environment and gives more flexibility in the input image set (Flickr, Google, etc.). [Slide shows a Space Shuttle image set: point cloud representation with camera frusta.]

Alternate presentation: camera frusta, semi-transparent photos, and line segments from SfM. (Not ultimately selected for the productized version of Photosynth.)

Photo Transitions: tweening a path from Camera 1 to Camera 2. Movement from one camera (one photo) to another uses:
- linear interpolation of camera position and rotation
- smooth motion using acceleration and deceleration
- a smooth change in camera focal length during movement
- distorting the images as planes in 3D during the transition
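The tweening steps above can be sketched as follows. This is a minimal interpolation sketch, assuming rotations are stored as unit quaternions and using a standard smoothstep easing curve for the acceleration/deceleration (the slide does not specify the exact easing function):

```python
import numpy as np

def smoothstep(u):
    """Ease-in/ease-out weight: zero velocity at both endpoints."""
    return 3 * u**2 - 2 * u**3

def slerp(q0, q1, w):
    """Spherical linear interpolation between unit quaternions (rotations)."""
    q0, q1 = np.asarray(q0, float), np.asarray(q1, float)
    dot = np.dot(q0, q1)
    if dot < 0:                      # take the shorter arc
        q1, dot = -q1, -dot
    if dot > 0.9995:                 # nearly parallel: fall back to lerp
        q = q0 + w * (q1 - q0)
        return q / np.linalg.norm(q)
    theta = np.arccos(dot)
    return (np.sin((1 - w) * theta) * q0 + np.sin(w * theta) * q1) / np.sin(theta)

def tween_camera(cam_a, cam_b, u):
    """Interpolate position, orientation, and focal length along the path.

    cam = (position, quaternion, focal); u in [0, 1] along the transition.
    """
    w = smoothstep(u)
    pos = (1 - w) * cam_a[0] + w * cam_b[0]      # linear position blend
    rot = slerp(cam_a[1], cam_b[1], w)           # rotation blend
    focal = (1 - w) * cam_a[2] + w * cam_b[2]    # smooth focal-length change
    return pos, rot, focal
```

During the transition the two photos are rendered as textured planes in 3D and cross-faded, which hides the sparsity of the underlying geometry.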

Sample Transition between images

Annotation Transfer: original annotations (Flickr, Facebook, etc.) are algorithmically transferred to other photos. This leverages common identifiable points (from the SIFT algorithm) across multiple images: annotations covering a set of points in one image can be transferred to other images that contain those same points. The algorithm must determine which points are part of the annotated area; points too close to or too far from the camera (out of plane) must be eliminated. The algorithm that transfers annotations to other photos uses a weighting function based on:
- a high number of relevant points in the photo
- the best viewing angle over all the points
- the annotated object occupying a high percentage of the photo
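A hypothetical scoring function combining the three cues above might look like this. The weights, the linear angle falloff, and the function name are all illustrative assumptions, not taken from the paper:

```python
def transfer_score(n_relevant, n_region_points, view_angle_deg, coverage,
                   w_points=0.5, w_angle=0.3, w_coverage=0.2):
    """Illustrative weighting for picking the best photo to transfer an
    annotation to, combining the three cues on the slide.

    n_relevant:      annotated-region points visible in the candidate photo
    n_region_points: total points belonging to the annotated region
    view_angle_deg:  angle off the frontal view (0 = head-on)
    coverage:        fraction of the candidate frame the object fills (0..1)
    """
    point_term = n_relevant / max(n_region_points, 1)
    angle_term = max(0.0, 1.0 - abs(view_angle_deg) / 90.0)  # frontal is best
    return w_points * point_term + w_angle * angle_term + w_coverage * coverage
```

A candidate seeing all of the region's points, head-on, filling the frame scores highest; oblique or partial views score lower.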

Time as a Dimension: using SIFT, common identifiable points in photos can be found with some invariance to weather, lighting, time of day, season, and year. (Caveat: SIFT is only partially invariant to lighting changes.) Images matched in this way allow a user to navigate through time as well as space.

Image Sorting by Similarity: images are ordered so that adjacent images share the highest number of common identifiable points. This lets a user navigate in a way most similar to touring (walking, etc.). Thumbnail images along the bottom of the interface resemble conventional image browsing. [Slide shows adjacent photos with 4 common points.]
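One simple way to realize such an ordering is a greedy chain: starting from one photo, repeatedly append the unvisited photo that shares the most keypoints with the one just placed. This is a sketch of the idea, not necessarily the paper's exact algorithm:

```python
def order_by_similarity(shared, start=0):
    """Greedy ordering of photos by shared-keypoint count.

    shared[i][j] = number of common identifiable points between photos i, j.
    Returns a visiting order starting at `start`.
    """
    n = len(shared)
    order, remaining = [start], set(range(n)) - {start}
    while remaining:
        last = order[-1]
        nxt = max(remaining, key=lambda j: shared[last][j])  # most overlap
        order.append(nxt)
        remaining.remove(nxt)
    return order
```

Greedy chaining is cheap (quadratic in the number of photos) but can paint itself into a corner; finding the globally best chain is a traveling-salesman-style problem.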

Navigation Tool: zoom-out search function. Find other photos that contain all the points of the current photo. Criterion for success: the bounding box of all these points must appear smaller in the candidate zoom-out photo. The same idea can also function as a "find details" or zoom-in function. [Slide shows the original photo with 4 points and a candidate zoomed-out photo.]
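The bounding-box criterion can be sketched directly; the shrink `ratio` margin is an assumed parameter added here to avoid flagging near-identical views as zoomed out:

```python
def is_zoomed_out(points_current, points_candidate, ratio=0.8):
    """Check the slide's criterion: the bounding box of the shared points
    is smaller (by the assumed margin `ratio`) in the candidate photo.

    points_*: (x, y) pixel positions of the same matched points in the
    current photo and in the candidate photo.
    """
    def bbox_area(pts):
        xs = [p[0] for p in pts]
        ys = [p[1] for p in pts]
        return (max(xs) - min(xs)) * (max(ys) - min(ys))
    return bbox_area(points_candidate) < ratio * bbox_area(points_current)
```

Swapping the two arguments turns the same test into the zoom-in ("find details") direction.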

Limitations:
- Speed of bundle adjustment: it can take hours or days to fit the SIFT-determined keypoints, and this grows with the number of photos in a set. Fortunately it is a one-time process and does not affect photo-browsing performance.
- Representation of 3D space: only keypoints are rendered in 3D space. It is impractical to map photos as textures onto 3D geometry, since the photo set is incomplete and camera parameters cannot be determined accurately enough.

Advantages:
- Image sets can come from the wild: Flickr, Google, Facebook, or any online photo database.
- A robust and intuitive navigation system gives spatial context to large sets of photos.
- Photo sets can be locked to geographic data after the fact.
- Additional text annotations of objects within photos can be automatically transferred to other photos of the same object.
- Photos can be added to the sets organically at a later date.
- Wow factor!

Pretty amazing demos: Google "photosynth". Photo Tourism: Exploring Photo Collections in 3D. Noah Snavely, University of Washington; Steven M. Seitz, University of Washington; Richard Szeliski, Microsoft Research.