3D Computer Vision. Dense 3D Reconstruction II. Prof. Didier Stricker. Christiano Gava

Kaiserslautern University, http://ags.cs.uni-kl.de/
DFKI, Deutsches Forschungszentrum für Künstliche Intelligenz, http://av.dfki.de

Outline
Previous lecture: dense 3D reconstruction
- 2-view reconstruction (triangulation, matching)
- Multi-view reconstruction (overview)
Today:
- The PMVS algorithm
- Introduction to Variational Methods
- Application of Variational Methods to dense 3D reconstruction
Next lecture: Structured Light I

The PMVS Algorithm

Overview of the PMVS algorithm
- PMVS stands for Patch-based Multi-View Stereopsis
- Input: set of accurately calibrated cameras
- Output: dense (or semi-dense) 3D point cloud

PMVS - motivation
Furukawa, Y., Ponce, J., Accurate, Dense and Robust Multi-View Stereopsis, TPAMI, 2008

PMVS - Assumptions
- Camera calibration is available (projection matrices are known)
- Lambertian surfaces
- Perspective distortions can be approximated by affine transformations
- Sufficiently textured scene
- Depth changes smoothly

PMVS - Building Blocks
- Initialization (feature detection + guided matching)
- Expansion
- Filtering

PMVS - patch model
- The surface is locally approximated by a small rectangle: the patch
- A patch p is a rectangle modeled by:
  - Its 3D position c(p)
  - Its 3D normal vector n(p)
  - x and y axes (not shown)
  - A reference image R(p)
- The reference image is the one with the best view of the patch
- The size of patch p is determined by its projection on R(p) (μ × μ pixels)

PMVS - image model
- Images are divided into cells of β × β pixels (usually β = 2)
- The idea is to reconstruct at least 1 patch per cell
- Small cells lead to a high density of the final point cloud

PMVS - simplification
- Perspective distortions are approximated by affine transformations
- The algorithm samples the patch projections in the images and then computes the normalized cross-correlation (NCC) between them
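
The NCC score mentioned above can be sketched as follows. This is a minimal illustration, not the PMVS implementation: the function name `ncc` and the 5 × 5 sample grid are assumptions, and the projection and sampling of the patch into each image are left out.

```python
import numpy as np

def ncc(a, b, eps=1e-12):
    """Normalized cross-correlation of two equally sized sample grids.

    Returns a value in [-1, 1]; 0.0 is returned when one grid is constant,
    because the score is undefined in that case.
    """
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    a = a - a.mean()          # remove brightness offset
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom < eps:
        return 0.0
    return float(a @ b / denom)

# Hypothetical 5 x 5 intensity samples of one patch seen in two images.
rng = np.random.default_rng(0)
patch = rng.random((5, 5))
same_up_to_gain = 2.0 * patch + 0.3   # same texture, different gain and offset

score_same = ncc(patch, same_up_to_gain)
score_flipped = ncc(patch, -patch)
```

Because the mean is subtracted and the result is normalized, the score is unchanged by a gain and offset in brightness, which is one reason NCC suits photo-consistency checks across images.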

PMVS - initialization
- Feature extraction
  - DoG
  - Harris corners
- Matching along the epipolar line
- Create a patch
- Optimize position and orientation of all patches
- After initialization we have a sparse reconstruction of the scene/object

PMVS - expansion
- Reconstructed (3D) points from the initialization serve as seeds for an expansion algorithm (region growing)
- A seed patch is expanded in the following way:
  - The new patch will project onto a neighboring cell
  - Its position is set to the intersection of the back-projected ray and the plane of its parent patch
  - Same orientation (the normal vector is propagated)
  - Same reference image
- Optimize position and orientation of the new patch
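
The position of an expanded patch is a ray-plane intersection. The sketch below shows that one step under stated assumptions: the function name `expand_position`, the camera at the origin and the example values are illustrative and not taken from the PMVS code.

```python
import numpy as np

def expand_position(cam_center, ray_dir, parent_center, parent_normal, eps=1e-9):
    """Intersect the back-projected ray o + t*d with the parent patch plane.

    The plane passes through parent_center with normal parent_normal.
    Returns the 3D intersection point, or None if the ray is (nearly)
    parallel to the plane or the intersection lies behind the camera.
    """
    o = np.asarray(cam_center, dtype=float)
    d = np.asarray(ray_dir, dtype=float)
    c = np.asarray(parent_center, dtype=float)
    n = np.asarray(parent_normal, dtype=float)

    denom = n @ d
    if abs(denom) < eps:
        return None                     # ray parallel to the patch plane
    t = (n @ (c - o)) / denom
    if t <= 0:
        return None                     # intersection behind the camera
    return o + t * d

# Parent patch 2 units in front of a camera at the origin, facing the camera.
# The ray goes through the center of a neighboring cell (hypothetical values).
new_center = expand_position(
    cam_center=[0.0, 0.0, 0.0],
    ray_dir=[0.1, 0.0, 1.0],
    parent_center=[0.0, 0.0, 2.0],
    parent_normal=[0.0, 0.0, -1.0],
)
```

The new patch would then inherit the parent's normal and reference image before its position and orientation are optimized.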

PMVS - filtering
- Enforces global visibility consistency

PMVS - loop: expansion and filtering
- The algorithm iterates between expansion and filtering until the point cloud is dense enough (3 iterations are sufficient according to the authors)

PMVS - more results
Furukawa, Y., Curless, B., Seitz, S. M., Szeliski, R., Towards Internet-scale Multi-view Stereo, CVPR, 2010

PMVS - final remarks
- The algorithm is a heuristic sequence of nicely engineered processing steps
- Works fairly well when the assumptions are fulfilled
- Many high-resolution images combined with a small β lead to memory allocation issues
- Some interesting questions:
  - Why are 3 iterations of expansion + filtering enough?
  - Is the influence of parameters (e.g. β, μ) explicitly modeled?
  - Is the resulting point cloud the optimal solution?
  - If the pictures had been taken rotated by e.g. 45°, would the result be the same?

Introduction to Variational Methods
(based on the lectures of Prof. D. Cremers, TU München)

Variational Methods
- Variational methods are a class of optimization methods
- Mathematically transparent: instead of implementing a heuristic sequence of processing steps, one defines beforehand the properties the solution should have (no more cookbook recipes)
- Especially suitable for infinite-dimensional problems and spatially continuous representations
- Popular applications:
  - Dense multi-view reconstruction
  - Image denoising/restoration
  - Image segmentation
  - Tracking
  - Optical flow
  - Motion estimation

Variational Methods
Why are variational methods good for dense 3D reconstruction?
- Transparent assumptions, normally explicitly modeled
- Typically fewer parameters
- The idea is to associate an energy (or cost function) with every possible solution, then find the solution with the minimal cost
- Variational methods are easy to fuse: energies/cost functions can simply be added
- Analyzing the cost functions allows statements about the existence and uniqueness of solutions

Example: Image denoising
- Input: (noisy) image f
- Sought: (noise-free) approximation u
- What properties should the solution u have?
  - It should be as similar as possible to f
  - It should be spatially smooth
Image credits: Treiber, M. A., Optimization for Computer Vision: An Introduction to Core Concepts and Methods, Springer, 2013

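Example: Image denoising
- u should be as similar as possible to f: $E_{data}(u) = \int_\Omega (u(x) - f(x))^2\,dx$ measures how close to f the approximation u is
- u should be spatially smooth: $E_{smoothness}(u) = \int_\Omega |\nabla u(x)|^2\,dx$ measures how noise-free the approximation u is
$E(u) = E_{data}(u) + \lambda E_{smoothness}(u) = \int_\Omega (u - f)^2\,dx + \lambda \int_\Omega |\nabla u|^2\,dx, \quad \lambda > 0$
where $\nabla u = (\partial u / \partial x, \partial u / \partial y)^T$ is the spatial gradient.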

Example: Image denoising
$E(u) = \int_\Omega (u - f)^2\,dx + \lambda \int_\Omega |\nabla u|^2\,dx$
- E(u) can be seen as an energy assigned to a given function u(x)
- These energies are known as functionals
- But how to minimize functionals where the argument is itself a function?
[Figure: E(u) plotted over a single u axis, with the minimum marked by dE/du = 0. This picture is misleading: there are in fact infinitely many dimensions.]
- Nevertheless, a necessary condition for a minimum is $\frac{dE}{du} = 0$

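The Euler-Lagrange equation
$E(u) = \int_\Omega (u - f)^2\,dx + \lambda \int_\Omega |\nabla u|^2\,dx$
- This functional may be written as $E(u) = \int_\Omega \mathcal{L}(u, \nabla u, x)\,dx$ with $\mathcal{L} = (u - f)^2 + \lambda |\nabla u|^2$
- For functionals of this form, a necessary condition for minimizers is $\frac{dE}{du} = \frac{\partial \mathcal{L}}{\partial u} - \operatorname{div} \frac{\partial \mathcal{L}}{\partial (\nabla u)} = 0$ for differentiable E(u)
- The main idea behind variational methods is to find solutions of the Euler-Lagrange equation of a given functional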
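
As a worked example, the Euler-Lagrange equation can be written out for the quadratic denoising energy $E(u) = \int_\Omega (u - f)^2\,dx + \lambda \int_\Omega |\nabla u|^2\,dx$. The derivation below is a sketch that assumes u is sufficiently smooth and that natural (zero normal derivative) boundary conditions hold; $\Delta$ denotes the Laplacian.

```latex
\begin{align*}
\mathcal{L}(u, \nabla u, x) &= (u - f)^2 + \lambda\,|\nabla u|^2 \\
\frac{\partial \mathcal{L}}{\partial u} &= 2\,(u - f) \\
\frac{\partial \mathcal{L}}{\partial (\nabla u)} &= 2\lambda\,\nabla u \\
\frac{dE}{du} = \frac{\partial \mathcal{L}}{\partial u}
  - \operatorname{div}\frac{\partial \mathcal{L}}{\partial (\nabla u)}
  &= 2\,(u - f) - 2\lambda\,\Delta u = 0 \\
\Longrightarrow \quad u - \lambda\,\Delta u &= f
\end{align*}
```

The result is a linear partial differential equation in u, so in this case the minimizer can be found by solving a linear system after discretization.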

The Euler-Lagrange equation
- The Euler-Lagrange equation is a necessary condition, not a sufficient condition
- For the general case of non-convex functionals, the Euler-Lagrange equation is not a sufficient condition
[Figure: non-convex E(u) plotted over u, with dE/du = 0 at points that are not the global minimum.]
- Is there a case in which the Euler-Lagrange equation is a sufficient condition? Yes!
- When the functional is (strictly) convex, we can state that:
  - The solution exists
  - The solution is unique
[Figure: convex E(u) plotted over u.]

Example: Image denoising
Now back to the image denoising example:
$E(u) = \int_\Omega (u - f)^2\,dx + \lambda \int_\Omega |\nabla u|^2\,dx$ (both terms are convex)
- E(u) is convex, so there is a solution u(x) that globally minimizes E(u)
- Moreover, this solution u(x) is unique!
Image credits: Treiber, M. A., Optimization for Computer Vision: An Introduction to Core Concepts and Methods, Springer, 2013
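
A discrete version of this convex energy can be minimized with plain gradient descent. The sketch below is one possible discretization, not the method used to produce the lecture's figures: forward differences, the step size, λ = 2 and the synthetic test image are all assumptions chosen for illustration.

```python
import numpy as np

def grad(u):
    """Forward differences with zero difference at the last column/row."""
    gx = np.zeros_like(u)
    gy = np.zeros_like(u)
    gx[:, :-1] = u[:, 1:] - u[:, :-1]
    gy[:-1, :] = u[1:, :] - u[:-1, :]
    return gx, gy

def div(px, py):
    """Discrete divergence, the negative adjoint of grad (for outputs of grad)."""
    dx = px.copy()
    dy = py.copy()
    dx[:, 1:] -= px[:, :-1]
    dy[1:, :] -= py[:-1, :]
    return dx + dy

def energy(u, f, lam):
    """Discrete E(u) = sum (u - f)^2 + lam * sum |grad u|^2."""
    gx, gy = grad(u)
    return float(np.sum((u - f) ** 2) + lam * np.sum(gx ** 2 + gy ** 2))

def tikhonov_denoise(f, lam=2.0, iters=200):
    """Gradient descent on the discrete energy, started at u = f."""
    tau = 1.0 / (2.0 + 16.0 * lam)   # 1 / Lipschitz bound of the gradient
    u = f.copy()
    energies = [energy(u, f, lam)]
    for _ in range(iters):
        gx, gy = grad(u)
        # dE/du = 2 (u - f) - 2 lam * Laplacian(u)
        u = u - tau * (2.0 * (u - f) - 2.0 * lam * div(gx, gy))
        energies.append(energy(u, f, lam))
    return u, energies

# Synthetic smooth image plus Gaussian noise (illustrative data only).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0 * np.pi, 32)
clean = np.tile(np.sin(x), (32, 1))
noisy = clean + 0.2 * rng.standard_normal(clean.shape)
denoised, energies = tikhonov_denoise(noisy)
```

Since the energy is convex and the step size respects the Lipschitz bound, the energy decreases monotonically and the iteration approaches the unique global minimizer regardless of the starting point.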

Choice of the smoothness term
- The term $\int_\Omega |\nabla u|^2\,dx$ (also known as the Tikhonov regularizer) removes noise, but tends to oversmooth sharp edges
- Is there a smoothness term that removes noise but preserves edges?
- The term $\int_\Omega |\nabla u|\,dx$ is known as Total Variation. It was introduced in 1992 by Rudin, Osher and Fatemi in the paper "Nonlinear total variation based noise removal algorithms" and is also referred to as the ROF model

Total Variation
- Is it still convex? Yes!
[Figure: E_smoothness = |u'| plotted over u'; the derivative is not defined at u' = 0.]
- Total Variation (and its extensions) is widely used as a regularization term in approaches based on variational methods

Example: Image denoising
[Figure: original image; noisy input f; solution based on the Tikhonov regularizer; solution based on the ROF regularizer.]
Image credits: Treiber, M. A., Optimization for Computer Vision: An Introduction to Core Concepts and Methods, Springer, 2013

Variational Methods
Interesting applications to image and video processing
Strekalovskiy, E., Cremers, D., Real-Time Minimization of the Piecewise Smooth Mumford-Shah Functional, ECCV, 2014

Application of Variational Methods to dense 3D reconstruction

DTAM
- DTAM: Dense Tracking and Mapping in Real-Time (Newcombe, Lovegrove and Davison, ICCV 2011)
- Input: single hand-held RGB camera
- Goal:
  - Dense tracking: recover camera poses
  - Dense mapping: recover scene geometry
- In other words, the algorithm reconstructs the dense scene geometry from estimated camera poses and recovers the camera poses using the dense scene geometry
- Here we are interested in dense 3D reconstruction, so we will assume the camera poses are known
[Figure: input image and dense 3D map.]

DTAM - motivation
Newcombe, R., Lovegrove, S., Davison, A., DTAM: Dense Tracking and Mapping in Real-Time, ICCV, 2011

DTAM - Dense Reconstruction
How is the scene represented?

DTAM - The Cost Volume
- $C_r$ is called the cost volume
- Scene geometry is modeled as an inverse depth map ($\xi$)
- For each pixel in image $I_r$: sample the cost volume and compute a photometric error using neighboring images
- A cost can then be assigned to each voxel
- How is the cost of each voxel computed?

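DTAM - The Photometric Error
In the notation of the DTAM paper:
$C_r(\mathbf{u}, d) = \frac{1}{|\mathcal{I}(r)|} \sum_{m \in \mathcal{I}(r)} \| \rho_r(I_m, \mathbf{u}, d) \|_1$
- $(\mathbf{u}, d)$: pixel position and inverse depth hypothesis
- $|\mathcal{I}(r)|$: number of images in the neighborhood of $I_r$
- $\rho_r(I_m, \mathbf{u}, d)$: photometric error between $I_r$ and $I_m$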

DTAM - The Photometric Error
Example

DTAM - Uniform regions
- The inverse depth map can be computed by minimizing the photometric error (exhaustive search over the volume): $\xi(\mathbf{u}) = \arg\min_d C_r(\mathbf{u}, d)$
- Uniform (or textureless) regions are prone to false minima
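
The exhaustive search is a per-pixel argmin over the sampled inverse depths. The sketch below assumes the cost volume is stored as an array of shape (D, H, W) with one slice per inverse depth sample; the function name `wta_inverse_depth` and the synthetic costs are illustrative and not taken from DTAM.

```python
import numpy as np

def wta_inverse_depth(cost_volume, inv_depths):
    """Winner-take-all inverse depth map from a cost volume.

    cost_volume: array of shape (D, H, W), averaged photometric error
                 for each inverse depth sample and pixel.
    inv_depths:  array of shape (D,), the sampled inverse depth values.
    Returns an (H, W) map holding, per pixel, the sample of minimal cost.
    """
    best = np.argmin(cost_volume, axis=0)   # (H, W) indices into inv_depths
    return inv_depths[best]

# Synthetic 2 x 2 example: the cost grows with the distance to a known index.
inv_depths = np.linspace(0.1, 1.0, 10)
true_idx = np.array([[2, 7],
                     [5, 5]])
levels = np.arange(10)[:, None, None]
cost_volume = np.abs(levels - true_idx[None, :, :]).astype(float)

xi = wta_inverse_depth(cost_volume, inv_depths)
```

In a textureless region the costs along the depth axis are nearly equal, so the argmin is decided by noise; this is the false-minimum problem that motivates adding a regularization term.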

DTAM - Uniform regions
- The photometric error is not discriminative enough inside uniform regions
- Assumption: depth should vary smoothly in uniform regions

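DTAM - Variational Formulation
In the notation of the DTAM paper:
$E_\xi = \int_\Omega \left\{ g(\mathbf{u}) \, \| \nabla \xi(\mathbf{u}) \|_\epsilon + \lambda \, C_r(\mathbf{u}, \xi(\mathbf{u})) \right\} d\mathbf{u}$
- $g(\mathbf{u}) \, \| \nabla \xi(\mathbf{u}) \|_\epsilon$: regularization term
- $\lambda \, C_r(\mathbf{u}, \xi(\mathbf{u}))$: data term
- $g(\mathbf{u})$: pixelwise weight
- $\nabla \xi(\mathbf{u})$: linear in $\xi$
- $\| \cdot \|_\epsilon$: Huber norm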

DTAM - Why the Huber norm?
- Total Variation (or L1 norm): penalization of gradient magnitudes
  - Allows sharp discontinuities in the solution
  - Favors sparse, piecewise-constant solutions
- Problem: staircasing
- This problem can be reduced by using quadratic (or L2 norm) penalization for small gradient magnitudes
Image credits: Werlberger et al., Anisotropic Huber-L1 Optical Flow, BMVC, 2009

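DTAM - Variational Formulation
- The Huber norm is a composition of the L1 and L2 norms: $\| x \|_\epsilon = \| x \|_2^2 / (2\epsilon)$ if $\| x \|_2 \le \epsilon$, and $\| x \|_1 - \epsilon / 2$ otherwise
- The regularization term is convex
- The data term (photometric error) is non-convex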
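
The Huber penalty is quadratic below a threshold ε and linear above it. The sketch below evaluates it on gradient magnitudes; the function name `huber` and the value ε = 0.1 are assumptions for illustration, and for a gradient vector the input would be its Euclidean length.

```python
import numpy as np

def huber(r, eps):
    """Huber penalty applied to magnitudes r.

    Quadratic (L2-like) for |r| <= eps, linear (L1-like) above, with the
    two branches meeting at |r| = eps with the same value and slope.
    """
    r = np.abs(np.asarray(r, dtype=float))
    return np.where(r <= eps, r ** 2 / (2.0 * eps), r - eps / 2.0)

eps = 0.1
small = huber(0.05, eps)     # quadratic branch: small gradients are smoothed
at_knee = huber(0.1, eps)    # both branches give eps / 2 here
large = huber(1.0, eps)      # linear branch: large jumps are penalized like TV
```

Small gradients are penalized quadratically, which reduces staircasing, while large gradients are penalized linearly, so sharp depth discontinuities remain affordable.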

DTAM - Video
Newcombe, R., Lovegrove, S., Davison, A., DTAM: Dense Tracking and Mapping in Real-Time, ICCV, 2011

Variational Methods: more examples on 3D reconstruction
Shan, Q., Curless, B., Furukawa, Y., Hernandez, C., Seitz, S. M., Occluding Contours for Multi-View Stereo, CVPR, 2014

References
- Furukawa, Y., Ponce, J., Accurate, Dense and Robust Multi-View Stereopsis, TPAMI, 2008
- Furukawa, Y., Curless, B., Seitz, S. M., Szeliski, R., Towards Internet-scale Multi-view Stereo, CVPR, 2010
- Rudin, L. I., Osher, S., Fatemi, E., Nonlinear total variation based noise removal algorithms, Physica D, 1992
- Strekalovskiy, E., Cremers, D., Real-Time Minimization of the Piecewise Smooth Mumford-Shah Functional, ECCV, 2014
- Newcombe, R. A., Lovegrove, S. J., Davison, A. J., DTAM: Dense Tracking and Mapping in Real-Time, ICCV, 2011
- Shan, Q., Curless, B., Furukawa, Y., Hernandez, C., Seitz, S. M., Occluding Contours for Multi-View Stereo, CVPR, 2014
- Lectures "Variational Methods for Computer Vision" and "Multiple View Geometry" by Prof. D. Cremers, TU München (available online)
- Treiber, M. A., Optimization for Computer Vision: An Introduction to Core Concepts and Methods, Springer, ISBN 978-1-4471-5282-8, 2013

Interesting links
- Lectures on Variational Methods (Prof. Daniel Cremers): https://www.youtube.com/user/cvprtum/videos
- PMVS software version 2: http://www.di.ens.fr/pmvs/
- Videos on YouTube:
  - https://www.youtube.com/watch?v=wskbathexym
  - https://www.youtube.com/watch?v=jzm1pwfdeke
  - https://www.youtube.com/watch?v=clywuukixra
  - https://www.youtube.com/watch?v=agc94lcl2pm
  - https://www.youtube.com/watch?v=ofhfor2nrxu
  - https://www.youtube.com/watch?v=z9moptiawru
  - https://www.youtube.com/watch?v=df9whgibcqa (DTAM)
  - https://www.youtube.com/watch?v=ixrymyjwf3i

Thank you!