3d shape reconstruction from photographs: a Multi-View Stereo approach


Carlos Hernández (Google), George Vogiatzis (Aston University), Yasutaka Furukawa (Google)
http://carlos-hernandez.org/cvpr2010/index.html

Talk plan: Introduction; Multi-View Stereo pipeline; Fusion of occlusion-robust depth-maps; Region growing.

3d model: a digital copy of a real object. It allows us to inspect details of the object, measure its properties, and reproduce it in a different material. Many applications: cultural heritage preservation, computer games and movies, city modelling, e-commerce, 3d object recognition and scene analysis.

Applications: cultural heritage. SCULPTEUR European project.

Applications: art. BODY / SPACE / FRAME, Antony Gormley, Lelystad, Holland. Block Works, Precipitate III, 2004, mild steel blocks, 80 x 46 x 66 cm. Domain Series, Domain VIII Crouching, 1999, mild steel bar, 81 x 59 x 63 cm.

Applications: structural engineering.

Applications: computer games.

Applications: 3D indexation. SCULPTEUR European project: medical, industrial and cultural heritage indexation.

Applications: archaeology. Forma Urbis Romae project, 1186 fragments. Fragments of the City: Stanford's Digital Forma Urbis Romae Project, David Koller, Jennifer Trimble, Tina Najbjerg, Natasha Gelfand, Marc Levoy, Proc. Third Williams Symposium on Classical Architecture, Journal of Roman Archaeology supplement, 2006.

Applications: large scale modelling [Furukawa10] [Pollefeys08] [Cornelis08] [Goesele07].

Scanning technologies: laser scanner, coordinate measuring machine. Very accurate, but very expensive and complicated to use. Examples: Minolta scanner, the Michelangelo project, Contura CMM.

Scanning technologies: structured light [Zhang02].

3d shape from photographs: estimate a 3d shape that would generate the input photographs given the same material, viewpoints and illumination. (Forward model: geometry, material, viewpoint and illumination together produce an image; reconstruction inverts it.)

3d shape from photographs: appearance strongly depends on the material and lighting. Real vs. replica; textured vs. textureless; rigid vs. deforming.

3d shape from photographs: appearance strongly depends on the material and lighting, and no single algorithm exists that deals with every type of scene (textured or textureless, rigid or deforming).

Photograph-based 3d reconstruction is practical: fast, non-intrusive, low cost, and easily deployable outdoors. Its drawbacks are low accuracy and results that depend on the material properties of the scene.

Talk plan: the Multi-View Stereo pipeline. The pipeline goes from image acquisition to camera pose to 3d reconstruction; the rest of the talk covers fusion of occlusion-robust depth-maps and region growing.

Image acquisition. Studio image acquisition: controlled environment. Uncontrolled environments: hand-held cameras with unknown illumination; Internet photos with unknown content; video with small motion between frames and a huge amount of data. Examples: studio, outdoor and Internet image acquisition.

Video image acquisition.

Multi-view stereo pipeline: image acquisition, camera pose, 3d reconstruction.

Camera pose: robotic arm and fiduciary markers for small scenes; Structure-from-Motion and SfM from unorganized photographs for large scenes.

Fiduciary markers: ARToolkit, Bouguet's MATLAB Toolbox (www.vision.caltech.edu/bouguetj/calib_doc/), robust planar patterns.

Structure from motion: input sequence, 2d features, 2d tracks, 3d points, motion estimation result.

Structure-from-Motion from unordered image collections [Brown05, Snavely06, Agarwal09]: image clustering, pose initialization, bundle-adjustment. phototour.cs.washington.edu/bundler
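To make the pipeline concrete, here is a minimal two-view sketch of the kind of processing a structure-from-motion system performs, written with OpenCV. It is an illustration, not the speakers' code; the intrinsic matrix K and the image paths are placeholders.

```python
# Minimal two-view structure-from-motion sketch (illustration only).
import cv2
import numpy as np

K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])            # placeholder intrinsics

img1 = cv2.imread("view1.jpg", cv2.IMREAD_GRAYSCALE)   # placeholder paths
img2 = cv2.imread("view2.jpg", cv2.IMREAD_GRAYSCALE)

# 2d features and matches (tracks over two frames)
sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)
matches = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True).match(des1, des2)

pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

# Motion estimation: essential matrix and relative pose
E, inliers = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)

# 3d points by triangulation (camera 1 at the origin)
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([R, t])
X_h = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)   # 4xN homogeneous points
X = (X_h[:3] / X_h[3]).T                              # Nx3 sparse point cloud
```

A full system such as Bundler repeats this over many views, clusters images, and refines all poses and points jointly with bundle adjustment.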

Camera pose (recap): robotic arm, fiduciary markers, Structure-from-Motion, SfM from unorganized photographs.

Multi-view stereo pipeline: image acquisition, camera pose, 3d reconstruction. The 3d reconstruction step recovers a 3d surface from the camera poses and the images using 3d photo-consistency; 3d reconstruction = 3d segmentation.

Photo-consistency of a 3d point: a photo-consistent point projects to similar colours in the images that see it, while a non photo-consistent point does not.
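As a toy illustration of the idea (not from the talk), the photo-consistency of a single 3d point can be scored by projecting it into the cameras that see it and measuring how much the sampled colours agree; the camera models and images are assumed given.

```python
import numpy as np

def point_photo_consistency(X, cameras, images):
    """Score agreement of the colours that a 3d point X projects to.

    cameras: list of (K, R, t) with x ~ K (R X + t); images: list of HxW(x3) arrays.
    Returns the negative colour variance, so higher means more photo-consistent.
    """
    samples = []
    for (K, R, t), img in zip(cameras, images):
        x_cam = R @ X + t
        if x_cam[2] <= 0:                      # behind the camera: not visible
            continue
        u, v, w = K @ x_cam
        col, row = int(round(u / w)), int(round(v / w))
        if 0 <= row < img.shape[0] and 0 <= col < img.shape[1]:
            samples.append(np.atleast_1d(img[row, col]).astype(float))
    if len(samples) < 2:
        return -np.inf                         # cannot measure consistency
    samples = np.stack(samples)
    return -float(np.var(samples, axis=0).mean())
```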

Photo-consistency of a 3d patch.

Challenges of photo-consistency: camera visibility, and failure of the comparison metric due to repeated texture, lack of texture, or specularities.

Multi-view stereo algorithms. Comparison and evaluation: A Comparison and Evaluation of Multi-View Stereo Reconstruction Algorithms, S. Seitz et al., CVPR 2006, vol. 1, pages 519-526; http://vision.middlebury.edu/mview/

A quick history of algorithms: Representing stereo data with the Delaunay triangulation, O. Faugeras et al., Artificial Intelligence, 44(1-2):41-87, 1990. A multiple-baseline stereo, M. Okutomi and T. Kanade, TPAMI, 15(4):353-363, 1993. Object-centered surface reconstruction: Combining multi-image stereo and shading, P. Fua and Y. Leclerc, International Journal of Computer Vision, vol. 16:35-56, 1995. A portable three-dimensional digitizer, Y. Matsumoto et al., Int. Conf. on Recent Advances in 3D Imaging and Modeling, 197-205, 1997. Photorealistic Scene Reconstruction by Voxel Coloring, S. M. Seitz and C. R. Dyer, CVPR, 1067-1073, 1997. Variational principles, surface evolution, PDE's, level set methods and the stereo problem, O. Faugeras and R. Keriven, IEEE Trans. on Image Processing, 7(3):336-344, 1998.

Recently there have been many new algorithms with very good accuracy and completeness. Almost all deal with a small number of images (~100), the main exception being [Pollefeys08]. They are offline algorithms, with no feedback.

Best flexible algorithms (summary, pros, cons):

Region growing. Summary: starts from a cloud of 3d points and grows small flat patches maximizing photo-consistency. Pros: provides the best overall results due to a plane-based photo-consistency. Cons: many tunable parameters, i.e., difficult to tune to get optimal results.

Depth-map fusion. Summary: fuses a set of depth-maps computed using occlusion-robust photo-consistency. Pros: elegant pipeline, plug-n-play blocks, easily parallelizable. Cons: the photo-consistency metric is simple and not optimal; it suffers when images are not well textured or are low resolution.

Bird's-eye view: depth-map fusion. 1. Compute depth hypotheses. 2. Volumetrically fuse the depth-maps. 3. Extract the 3d surface.
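A rough sketch of step 2, my own illustration rather than the speakers' method: depth-maps can be fused volumetrically by accumulating truncated signed distances along each camera's line of sight into a voxel grid, after which the surface of step 3 is the zero level set (e.g. via marching cubes). Camera models and depth-maps are assumed given.

```python
import numpy as np

def fuse_depth_maps(voxel_centers, cameras, depth_maps, trunc=0.05):
    """Average truncated signed distances from several depth-maps onto voxels.

    voxel_centers: (N, 3) world points; cameras: list of (K, R, t);
    depth_maps: list of HxW arrays of depth along the optical axis.
    Returns (tsdf, weights); the surface is the zero crossing of tsdf.
    """
    tsdf = np.zeros(len(voxel_centers))
    weights = np.zeros(len(voxel_centers))
    for (K, R, t), depth in zip(cameras, depth_maps):
        x_cam = voxel_centers @ R.T + t               # voxels in the camera frame
        z = x_cam[:, 2]
        front = z > 0                                 # only voxels in front of the camera
        uv = x_cam[front] @ K.T
        u = np.round(uv[:, 0] / uv[:, 2]).astype(int)
        v = np.round(uv[:, 1] / uv[:, 2]).astype(int)
        h, w = depth.shape
        inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
        idx = np.flatnonzero(front)[inside]           # indices of usable voxels
        sdf = depth[v[inside], u[inside]] - z[idx]    # >0 means in front of the surface
        keep = sdf > -trunc                           # ignore voxels far behind the surface
        idx = idx[keep]
        tsdf[idx] += np.clip(sdf[keep], -trunc, trunc)
        weights[idx] += 1.0
    nz = weights > 0
    tsdf[nz] /= weights[nz]
    return tsdf, weights
```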


Bird's-eye view: region growing. 1. Fitting step: a local surface patch is fitted, iterating visibility. 2. Filter step: visibility is explicitly enforced. 3. Expand step: successful patches are used to initialise the active boundary.

Patch-based MVS and its Applications. Why patches (oriented points)? [10 mins] Algorithmic details [30 mins] Applications [20 mins]

What is a patch? A patch consists of a position (x, y, z), a normal (nx, ny, nz) and an extent (radius); it is a tangent plane approximation of the surface (compare: mesh vs. patch).
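The patch above maps directly onto a small data structure; the sketch below is my own minimal version (the talk shows no code), with the visible-image set V(p) that appears later already included.

```python
from dataclasses import dataclass, field
import numpy as np

@dataclass
class Patch:
    """Oriented point: a tangent-plane approximation of the surface."""
    position: np.ndarray                  # c(p): 3d centre (x, y, z)
    normal: np.ndarray                    # n(p): unit normal (nx, ny, nz)
    extent: float                         # radius, chosen so p covers ~9x9 pixels
    visible_images: list = field(default_factory=list)   # V(p): image indices

    def plane_point(self, a, b, e1, e2):
        """Point at in-plane offsets (a, b) along orthonormal axes e1, e2."""
        return self.position + a * self.extent * e1 + b * self.extent * e2
```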

Why patches? Flexible, but hard to enforce regularization. Regularization is not really necessary, though, because a local image patch (9x9 pixels) is descriptive enough [Goesele et al., CVPR06], [Furukawa et al., CVPR07]. Patches also extract pure 3d data without interpolation: from the images to patches (pure 3d data), then scene analysis from pure 3d data or meshing with smart interpolation [Sinha ICCV09, Furukawa ICCV09], rather than meshing with standard interpolation.

Patches vs. multiple depth-maps (could be my biased view). Patches: a single global 3D model, clean 3D points, but hard to make fast (complex algorithm). Depth-maps: multiple redundant 3D models, noisy without merging, but easy to make fast.

Patch-based MVS and its Applications: why patches (oriented points)? [10 mins]; algorithmic details [30 mins]; applications [20 mins].

Patch-based MVS: [Lhuillier and Quan, PAMI 05], [Furukawa and Ponce, CVPR 07], [Habbecke and Kobbelt, CVPR 07]. [Furukawa and Ponce 07] has been shown to work very well on the multi-view stereo evaluation (http://vision.middlebury.edu/mview, D. Scharstein et al.). The rest of this part covers preliminaries and the algorithm, illustrated on input images #1, #2, #3.

Patch definition: position, normal, and extent. A patch p is defined by its position c(p), its normal n(p), and its visible images V(p); its extent is set so that p covers roughly 9x9 pixels in the images of V(p).

Photo-consistency N(I, J, p) of p between two images I and J: the patch is sampled on a grid, projected into both images, and the sampled pixel colors are compared (Ixy: pixel color in image I; Jxy: pixel color in image J).
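The transcript does not spell the metric out, but in [Furukawa and Ponce 07] N(I, J, p) is the normalized cross-correlation of colours sampled on a grid over the patch and projected into I and J. The sketch below is my own rough version of that idea; the 3d grid points and the projection functions are assumed given, and projections are assumed to fall inside the images.

```python
import numpy as np

def sample_colors(image, project, grid_points):
    """Bilinearly sample image colors at the projections of 3d grid points.

    Assumes every projection lands strictly inside the image.
    """
    vals = []
    for X in grid_points:
        u, v = project(X)                       # 3d point -> pixel coordinates
        c0, r0 = int(np.floor(u)), int(np.floor(v))
        du, dv = u - c0, v - r0
        patch = image[r0:r0 + 2, c0:c0 + 2].astype(float)
        vals.append((1 - dv) * ((1 - du) * patch[0, 0] + du * patch[0, 1]) +
                    dv * ((1 - du) * patch[1, 0] + du * patch[1, 1]))
    return np.asarray(vals, dtype=float).ravel()

def pairwise_photo_consistency(patch_grid, image_i, proj_i, image_j, proj_j):
    """N(I, J, p): normalized cross-correlation of the grid samples in I and J."""
    a = sample_colors(image_i, proj_i, patch_grid)
    b = sample_colors(image_j, proj_j, patch_grid)
    a = (a - a.mean()) / (a.std() + 1e-8)
    b = (b - b.mean()) / (b.std() + 1e-8)
    return float(np.mean(a * b))                # in [-1, 1], 1 = perfect match
```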

Photo-consistency N(p) of p with its visible images V(p) = {I1, I2, ..., In}: in [Furukawa and Ponce 07], the average of the pairwise scores between a reference image and the other images in V(p).

Reconstruct patch p. Given initial estimates of the position c(p), the normal n(p) and the visible images V(p), refine c(p) and n(p):

    {c(p), n(p)} = argmax over {c(p), n(p)} of N(p)

where c(p) is restricted to move along a viewing ray (1 DOF) and n(p) is parameterised by two angles (2 DOF).
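A rough sketch of that refinement, my own rather than the authors' implementation, under the same parameterisation (depth along the reference viewing ray plus two spherical angles for the normal) and using a generic optimizer; photo_consistency(c, n) stands for the N(p) above and is assumed given.

```python
import numpy as np
from scipy.optimize import minimize

def refine_patch(c0, n0, ray_origin, ray_dir, photo_consistency):
    """Maximise N(p) over depth along the viewing ray (1 DOF) and the
    normal's spherical angles (2 DOF). photo_consistency(c, n) -> float."""
    d0 = np.dot(c0 - ray_origin, ray_dir)              # initial depth along the ray
    theta0 = np.arccos(np.clip(n0[2], -1.0, 1.0))      # initial normal angles
    phi0 = np.arctan2(n0[1], n0[0])

    def unpack(params):
        d, theta, phi = params
        c = ray_origin + d * ray_dir
        n = np.array([np.sin(theta) * np.cos(phi),
                      np.sin(theta) * np.sin(phi),
                      np.cos(theta)])
        return c, n

    def cost(params):                                  # minimise -N(p)
        c, n = unpack(params)
        return -photo_consistency(c, n)

    res = minimize(cost, x0=[d0, theta0, phi0], method="Nelder-Mead")
    return unpack(res.x)
```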

Verify a patch: textures may match by accident, so the photo-consistency must be reasonably high. Verification process: keep only the high photo-consistency images in V(p), and accept the patch if |V(p)| >= 3.

Patch-based MVS [Furukawa and Ponce 07]: preliminaries, then the algorithm. Algorithm overview: #1. feature detection, #2. initial feature matching, #3. patch expansion and filtering (illustrated on input images #1, #2, #3).

Feature detection: extract local maxima of the Harris corner detector (corners) and of the Difference of Gaussians (blobs).
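A quick illustration of the detection step with OpenCV (my sketch, not the authors' code): Harris responses for corners and a Difference-of-Gaussians for blobs, keeping local maxima above a threshold; the neighbourhood size and threshold ratio are arbitrary choices.

```python
import cv2
import numpy as np

def detect_features(gray, cell=32, thresh_ratio=0.01):
    """Local maxima of Harris (corners) and Difference-of-Gaussians (blobs)."""
    gray = gray.astype(np.float32)
    harris = cv2.cornerHarris(gray, blockSize=2, ksize=3, k=0.04)
    dog = cv2.GaussianBlur(gray, (0, 0), 1.0) - cv2.GaussianBlur(gray, (0, 0), 2.0)

    def local_maxima(resp):
        thresh = thresh_ratio * resp.max()
        dil = cv2.dilate(resp, np.ones((cell, cell), np.uint8))   # max filter
        ys, xs = np.where((resp == dil) & (resp > thresh))        # strict peaks
        return list(zip(xs.tolist(), ys.tolist()))

    return local_maxima(harris), local_maxima(np.abs(dog))
```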

Algorithm overview (recap): #1. feature detection, #2. initial feature matching, #3. patch expansion and filtering.

#2. Initial feature matching. For a matched feature pair, initialise a patch p: c(p) by triangulation; n(p) parallel to Image1's viewing direction; V(p) = {Image1, Image2}. Further visible images are then added: Image3 joins V(p) if N(Image1, Image3, p) >= 0.5, giving V(p) = {Image1, Image2, Image3}.
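A sketch of that initialisation, mine rather than the authors' code, with the 0.5 threshold from the slide treated as a tunable parameter: triangulate the match, point the normal towards the reference camera, then collect visible images by thresholding the pairwise score.

```python
import numpy as np

def init_patch(x1, x2, P1, P2, camera_centers, pair_score, thresh=0.5):
    """Initialise a patch from a feature match between image 1 and image 2.

    x1, x2: pixel coordinates; P1, P2: 3x4 projection matrices;
    camera_centers: list of 3d camera centres;
    pair_score(i, j, c, n): pairwise photo-consistency, stands in for N(Ii, Ij, p).
    """
    # c(p): triangulation of the match (DLT, homogeneous least squares)
    A = np.stack([x1[0] * P1[2] - P1[0],
                  x1[1] * P1[2] - P1[1],
                  x2[0] * P2[2] - P2[0],
                  x2[1] * P2[2] - P2[1]])
    _, _, vt = np.linalg.svd(A)
    c = vt[-1][:3] / vt[-1][3]

    # n(p): parallel to the reference viewing direction (towards camera 1)
    n = camera_centers[0] - c
    n /= np.linalg.norm(n)

    # V(p): reference pair plus images whose pairwise score clears the bar
    V = [0, 1] + [i for i in range(2, len(camera_centers))
                  if pair_score(0, i, c, n) >= thresh]
    return c, n, V
```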

Initial feature matching, continued. With V(p) = {Image1, Image2, Image3}, refine c(p) and n(p) by maximising the photo-consistency:

    {c(p), n(p)} = argmax over {c(p), n(p)} of N(p)

Then run the verification step: update V(p) and check that |V(p)| >= 3.

Initial feature matching (figures): once a patch is accepted, the image cells it projects to are marked as occupied.

Initial feature matching is repeated for all the image features, leaving a set of patches and occupied cells.

Algorithm overview (recap): #1. feature detection, #2. initial feature matching, #3. patch expansion and filtering.

#3. Patch expansion: pick a patch and look at the occupied pixels around it.

Patch expansion. Pick a patch p and look for neighboring empty pixels; if all neighbors are occupied, do nothing. Otherwise, identify the neighboring empty pixels and reconstruct a new patch q visible in such an empty pixel: c(q) is where the tangent plane of p intersects the viewing ray of that pixel.

Patch expansion, continued. The new patch q is initialised with c(q) at the intersection of p's tangent plane with the pixel's viewing ray, n(q) = n(p) and V(q) = V(p); c(q) and n(q) are then refined as before, and q goes through patch verification.
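The initialisation of c(q) is a ray-plane intersection; a small sketch of it (my own), assuming the empty pixel's viewing ray and p's tangent plane are given:

```python
import numpy as np

def init_expanded_patch(c_p, n_p, ray_origin, ray_dir):
    """Seed a new patch q from patch p and an empty pixel's viewing ray.

    c(q) = intersection of p's tangent plane with the ray,
    n(q) = n(p); q is then refined and verified like any other patch.
    """
    denom = np.dot(n_p, ray_dir)
    if abs(denom) < 1e-8:                        # ray parallel to the plane
        return None
    t = np.dot(n_p, c_p - ray_origin) / denom    # distance along the ray
    if t <= 0:                                   # intersection behind the camera
        return None
    c_q = ray_origin + t * ray_dir
    return c_q, n_p.copy()
```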

Patch expansion is repeated for every patch and for every neighboring empty pixel.

Patch filtering: visibility consistency. In the illustrated configuration, patch p1 conflicts with patches p2, ..., p6 seen by Image1 to Image5; p1 is filtered out if

    |V(p1)| N(p1) < sum over i = 2..6 of N(pi)

i.e. if its support is weaker than the combined support of the conflicting patches.

Patch-based MVS [Furukawa and Ponce 07] results: #1 feature detection, #2 initial feature matching, #3 patch expansion and filtering. Example datasets: Skull, 24 images of 2000x2000 pixels; Face, 4 images of 1400x2200 pixels (courtesy of Industrial Light & Magic).
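A literal transcription of the patch-filtering test above into code (my sketch; the scores N and the visible-image sets V are assumed to have been computed earlier):

```python
def should_filter(p1_score, p1_num_visible, conflicting_scores):
    """Visibility-consistency filter: drop p1 if its support, weighted by how
    many images see it, is weaker than the total support of the patches that
    conflict with it (p2..p6 in the slide's example)."""
    return p1_num_visible * p1_score < sum(conflicting_scores)

# e.g. should_filter(N_p1, len(V_p1), [N_p2, N_p3, N_p4, N_p5, N_p6])
```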

28