Intrinsic3D: High-Quality 3D Reconstruction by Joint Appearance and Geometry Optimization with Spatially-Varying Lighting


Intrinsic3D: High-Quality 3D Reconstruction by Joint Appearance and Geometry Optimization with Spatially-Varying Lighting
R. Maier¹,², K. Kim¹, D. Cremers², J. Kautz¹, M. Nießner²,³
¹ NVIDIA  ² TU Munich  ³ Stanford University
International Conference on Computer Vision (ICCV) 2017

Overview: Motivation & State-of-the-art, Approach, Results, Conclusion

Motivation
Recent progress in Augmented Reality (AR) / Virtual Reality (VR), e.g. Microsoft HoloLens and HTC Vive, creates a requirement for high-quality 3D content for AR, VR and gaming (e.g. NVIDIA VR Funhouse). Usually this content is modelled manually (e.g. in Maya). At the same time, commodity RGB-D sensors such as the Asus Xtion are widely available and enable efficient methods for 3D reconstruction.
Challenge: how to reconstruct high-quality 3D models, with the best possible geometry and color, from low-cost depth sensors?

State-of-the-art: RGB-D based 3D Reconstruction
Goal: from a stream of RGB-D frames of a scene, recover the 3D shape that maximizes geometric consistency. Real-time, robust and fairly accurate geometric reconstructions:
- KinectFusion: Real-time Dense Surface Mapping and Tracking, Newcombe et al., ISMAR 2011.
- DynamicFusion: Reconstruction and Tracking of Non-rigid Scenes in Real-time, Newcombe et al., CVPR 2015.
- BundleFusion: Real-time Globally Consistent 3D Reconstruction using On-the-fly Surface Re-integration, Dai et al., ToG 2017.

State-of-the-art: Voxel Hashing Baseline
RGB-D based 3D reconstruction framework providing initial camera poses and a sparse SDF reconstruction (Real-time 3D Reconstruction at Scale using Voxel Hashing, Nießner et al., ToG 2013).
Challenges:
- (Slightly) inaccurate and over-smoothed geometry
- Bad colors
- Inaccurate camera pose estimation
- Input data quality (e.g. motion blur, sensor noise)
Goal: high-quality reconstruction of geometry and color.


State-of-the-art
High-Quality Colors [Zhou2014]: optimize camera poses and image deformations to optimally fit the initial (possibly wrong) reconstruction. But: high-quality images are required, and no geometry refinement is involved. (Color Map Optimization for 3D Reconstruction with Consumer Depth Cameras, Zhou and Koltun, ToG 2014)
High-Quality Geometry [Zollhöfer2015]: adjust camera poses in advance (bundle adjustment) to improve color; use shading cues (RGB) to refine geometry (shading-based refinement of surface & albedo). But: RGB is fixed, so there is no color refinement based on the refined geometry. (Shading-based Refinement on Volumetric Signed Distance Functions, Zollhöfer et al., ToG 2015)
Idea: jointly optimize for geometry, albedo and the image formation model to simultaneously obtain high-quality geometry and appearance!

Our Method
- Temporal view sampling & filtering techniques (input frames)
- Joint optimization of surface & albedo (Signed Distance Field) and image formation model
- Lighting estimation using Spatially-Varying Spherical Harmonics (SVSH)
- Optimized colors (by-product)


Approach Overview
RGB-D input → SDF Fusion → Temporal view sampling / filtering → Shading-based Refinement (Shape-from-Shading), consisting of Spatially-Varying Lighting Estimation and Joint Appearance and Geometry Optimization (surface, albedo, image formation model) → High-Quality 3D Reconstruction


RGB-D Data
Example: Fountain dataset, 1086 RGB-D frames. Sensor: Primesense, depth 640x480 px, color 1280x1024 px, ~10 Hz. Poses estimated using Voxel Hashing. Source: http://qianyi.info/scenedata.html


Signed Distance Fields
Volumetric 3D model representation; voxel grid: dense (e.g. KinectFusion) or sparse (e.g. Voxel Hashing). Each voxel stores a Signed Distance Function (SDF) value (the signed distance to the closest surface), color values and weights. (A Volumetric Method for Building Complex Models from Range Images, Curless and Levoy, SIGGRAPH 1996)

Signed Distance Fields: Fusion of depth maps
Integrate depth maps into the SDF with their estimated camera poses; voxel updates use a weighted running average. Extract the iso-surface as a triangle mesh with Marching Cubes. (Marching Cubes: A High Resolution 3D Surface Construction Algorithm, Lorensen and Cline, SIGGRAPH 1987)
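The weighted-average voxel update mentioned above follows the classic Curless & Levoy scheme. A minimal per-voxel sketch (not the authors' implementation; the truncation band `trunc` and the weight cap `max_weight` are illustrative values):

```python
def integrate_sample(sdf, weight, d_new, w_new=1.0, trunc=0.05, max_weight=100.0):
    """Running weighted average for one voxel of a truncated SDF.

    sdf, weight : current truncated signed distance and its accumulated weight
    d_new       : signed distance observed in the new depth map
    """
    d_new = max(-trunc, min(trunc, d_new))        # truncate the new observation
    sdf = (weight * sdf + w_new * d_new) / (weight + w_new)
    weight = min(weight + w_new, max_weight)       # cap so the model stays responsive
    return sdf, weight

# toy usage: fuse three noisy observations of a surface ~0.01 m in front of the voxel
d, w = 0.0, 0.0
for obs in (0.012, 0.009, 0.011):
    d, w = integrate_sample(d, w, obs)
```

With equal per-observation weights the running average converges to the plain mean of the truncated observations, which is why fusion denoises depth maps.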


Keyframe Selection
Compute a per-frame blur score for each color image (compare e.g. Frame 81 vs. Frame 84) and select the frame with the best score within a fixed-size window as keyframe. (The Blur Effect: Perception and Estimation with a New No-Reference Perceptual Blur Metric, Crete et al., SPIE 2007)
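The windowed selection itself is simple once per-frame blur scores exist. A sketch under the assumption that lower score means sharper (any no-reference blur metric, such as the Crete et al. one cited above, can supply the scores):

```python
def select_keyframes(blur_scores, window=10):
    """Pick, in each non-overlapping window of frames, the index of the
    frame with the best (lowest) blur score."""
    keyframes = []
    for start in range(0, len(blur_scores), window):
        chunk = blur_scores[start:start + window]
        best = min(range(len(chunk)), key=chunk.__getitem__)
        keyframes.append(start + best)
    return keyframes

# toy usage: two windows of three frames each
idx = select_keyframes([0.7, 0.3, 0.5, 0.9, 0.2, 0.4], window=3)  # -> [1, 4]
```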

Sampling / Filtering
Sampling of voxel observations: sample from the selected keyframes only. For each voxel, collect observations in the input views: the voxel center is transformed and projected into each input view. Observation weights are view-dependent (based on normal and depth). Filter observations: keep only the best 5 observations by weight.
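The collect-weight-filter step can be sketched as follows. The weighting function here is illustrative (favor head-on, close-range views); the paper's exact weighting may differ:

```python
import math

def view_weight(cos_angle, depth, sigma_d=1.0):
    """Illustrative view-dependent observation weight: prefer views that see
    the surface head-on (large cos_angle between view ray and normal) and
    from close by (small depth)."""
    return max(cos_angle, 0.0) * math.exp(-(depth ** 2) / (2 * sigma_d ** 2))

def best_observations(weights_by_view, keep=5):
    """Keep only the `keep` highest-weighted observations of one voxel.
    weights_by_view: {view_id: weight} for the views the voxel projects into."""
    ranked = sorted(weights_by_view.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:keep]
```

Keeping only a handful of best views per voxel both speeds up the later optimization and discards grazing-angle or distant observations.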

Approach: the Shading-based Refinement (Spatially-Varying Lighting Estimation and Joint Appearance and Geometry Optimization of surface, albedo and image formation model) is double-hierarchical, i.e. coarse-to-fine both in the SDF volume and in the RGB-D input.

Shape-from-Shading
Shading equation: the shading is the product of the albedo and the lighting, with the lighting evaluated at the surface normal.
Shading-based refinement, intuition: high-frequency changes in surface geometry correspond to shading cues in the input images.
- Estimate the lighting given surface and albedo (intrinsic material properties)
- Estimate surface and albedo given the lighting: minimize the difference between estimated shading and input luminance
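The shading equation described above can be written out for the Lambertian image formation model with second-order Spherical Harmonics; symbols are named here for exposition:

```latex
B(v) \;=\; a(v)\,\sum_{m=1}^{9} l_m\, H_m\big(n(v)\big)
```

where $B$ is the shading at voxel $v$, $a$ its albedo, $n$ the surface normal (derived from the SDF gradient), $H_m$ the SH basis functions and $l_m$ the lighting coefficients.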


Lighting Estimation: Spherical Harmonics (SH)
Encode the incident lighting for a given surface point; smooth for Lambertian surfaces. The SH basis functions H_m are parametrized by the unit normal n, and a good approximation needs only 9 SH basis functions (2nd order). Estimate the SH coefficients from the observations.
Shortcoming: purely directional lighting cannot represent the scene lighting for all surface points simultaneously.
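Evaluating the 9 second-order SH basis functions at a unit normal is a few lines. A sketch using the common real-SH constants (scaling conventions differ across papers, so treat the exact factors as one possible convention):

```python
def sh_basis(n):
    """Real second-order Spherical Harmonics basis at a unit normal n = (x, y, z)."""
    x, y, z = n
    return [
        0.282095,                                   # l=0
        0.488603 * y, 0.488603 * z, 0.488603 * x,   # l=1
        1.092548 * x * y, 1.092548 * y * z,         # l=2
        0.315392 * (3.0 * z * z - 1.0),
        1.092548 * x * z,
        0.546274 * (x * x - y * y),
    ]

def shade(albedo, sh_coeffs, n):
    """Lambertian shading: albedo times the SH-encoded lighting at normal n."""
    return albedo * sum(l * h for l, h in zip(sh_coeffs, sh_basis(n)))
```

Given sampled luminance values and normals, the 9 coefficients per lighting model can then be estimated by linear least squares.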

Spatially-Varying Lighting: Subvolume Partitioning
Partition the SDF volume into subvolumes and estimate independent SH coefficients for each subvolume; obtain per-voxel SH coefficients through tri-linear interpolation.
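The per-voxel tri-linear interpolation of subvolume coefficients can be sketched as below (a generic trilinear blend of the 8 surrounding coefficient vectors, not the authors' code):

```python
def trilerp_sh(corner_coeffs, fx, fy, fz):
    """Tri-linearly interpolate SH coefficient vectors stored at the 8
    surrounding subvolume centers. corner_coeffs[i][j][k] is the coefficient
    vector at corner (i, j, k) in {0,1}^3; (fx, fy, fz) in [0,1]^3 is the
    voxel's fractional position between the centers."""
    out = [0.0] * len(corner_coeffs[0][0][0])
    for i in (0, 1):
        for j in (0, 1):
            for k in (0, 1):
                w = ((fx if i else 1.0 - fx)
                     * (fy if j else 1.0 - fy)
                     * (fz if k else 1.0 - fz))
                for m, c in enumerate(corner_coeffs[i][j][k]):
                    out[m] += w * c
    return out
```

Interpolating the coefficients (rather than switching per subvolume) keeps the estimated lighting continuous across subvolume borders.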

Spatially-Varying Lighting: Joint Optimization
Estimate the SVSH coefficients for all subvolumes jointly:
- Data term: similarity between estimated shading and input luminance
- Laplacian regularizer: smooth illumination changes between neighboring subvolumes
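The joint estimation above can be written as an energy over all subvolume coefficients; the notation here is chosen for exposition (B the estimated shading, I the input luminance, N(s) the neighbors of subvolume s, lambda a trade-off weight):

```latex
E_{\text{svsh}}(\ell_1,\dots,\ell_S)
\;=\;
\underbrace{\sum_{v}\big(B(v) - I(v)\big)^2}_{\text{data term}}
\;+\;
\lambda \underbrace{\sum_{s}\Big\|\!\sum_{s' \in \mathcal{N}(s)}\big(\ell_s - \ell_{s'}\big)\Big\|_2^2}_{\text{Laplacian regularizer}}
```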


Joint Optimization: Shading-based SDF optimization
Joint optimization of geometry, albedo and the image formation model (camera poses and camera intrinsics), with:
- Gradient-based shading constraint (data term)
- Volumetric regularizer: smoothness in distance values (Laplacian)
- Surface stabilization constraint: stay close to the initial distance values
- Albedo regularizer: constrain albedo changes based on chromaticity (Laplacian)
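The four terms above combine into one energy; written schematically (symbols named for exposition: D the signed distances, a the albedos, T the camera poses, pi the intrinsics, lambda_* the term weights):

```latex
E(D, a, \mathcal{T}, \pi)
\;=\;
\sum_{v}\Big(
\lambda_g\, E_{\text{grad}}(v)
+ \lambda_v\, E_{\text{vol}}(v)
+ \lambda_s\, E_{\text{stab}}(v)
+ \lambda_a\, E_{\text{albedo}}(v)
\Big)
```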

Joint Optimization: Shading Constraint (data term)
Idea: maximize consistency between the estimated voxel shading and the sampled intensities in the input luminance images (gradients are used for robustness). Uses the best views per voxel and the respective view-dependent weights. The shading allows for optimization of surface (through the normal) and albedo; the voxel center is transformed and projected into each input view, so the sampling also allows for optimization of camera poses and camera intrinsics.

Recolorization: Optimal colors
Recompute voxel colors after the optimization at each level. Sampling: sample from keyframes only; collect, weight and filter observations; then take the weighted average of the observations.
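The recolorization step reduces to a weighted average over the filtered observations of each voxel. A minimal sketch (the observation list is assumed to be already weighted and filtered as in the sampling stage):

```python
def recolorize(observations):
    """Weighted average color for one voxel.
    observations: list of (weight, (r, g, b)) pairs."""
    total = sum(w for w, _ in observations)
    if total == 0:
        return (0.0, 0.0, 0.0)   # no valid observations: leave the voxel black
    return tuple(sum(w * c[i] for w, c in observations) / total
                 for i in range(3))

# toy usage: a close head-on view (weight 3) dominates a grazing view (weight 1)
color = recolorize([(1.0, (1.0, 0.0, 0.0)), (3.0, (0.0, 1.0, 0.0))])
```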


Ground Truth: Geometry, Frog (synthetic)
Panels: ground truth, Fusion, Zollhöfer et al. '15, and Ours.

Ground Truth: Quantitative Results, Frog (synthetic)
Generated a synthetic RGB-D dataset (noise on depth and camera poses); quantitative surface accuracy evaluation; color coding shows absolute distances to ground truth.
Mean absolute deviation: Ours 0.222 mm (std. dev. 0.269 mm) vs. Zollhöfer et al. 0.278 mm (std. dev. 0.299 mm), i.e. 20.14% more accurate.

Qualitative Results
Relief (geometry): input color, Fusion, Zollhöfer et al. '15, Ours.
Fountain (appearance): input color, Fusion, Zollhöfer et al. '15, Ours.
Lion, Tomb Statuary, Gate, Hieroglyphics and Bricks: input color plus geometry (ours) and appearance (ours), Fusion vs. Ours.

Shading: Global SH vs. SVSH, Fountain
Panels: luminance, albedo, and for both Global SH and SVSH the estimated shading together with its difference image.


Conclusion
High-Quality 3D Reconstruction of Geometry and Appearance:
- Temporal view sampling & filtering techniques
- Spatially-Varying Lighting estimation
- Joint optimization of surface & albedo (SDF) and image formation model
- Optimized texture as by-product

Thank you! Questions?
Robert Maier, Technical University of Munich, Computer Vision Group
robert.maier@in.tum.de
https://vision.in.tum.de/members/maierr