Visualization 2D-to-3D Photo Rendering for 3D Displays


Sumit K Chauhan (1), Divyesh R Bajpai (2), Vatsal H Shah (3)

(1) Information Technology, Birla Vishvakarma Mahavidhyalaya, sumitskc51@gmail.com
(2) Information Technology, Birla Vishvakarma Mahavidhyalaya, divyeshbajpai@yahoo.com
(3) Assistant Professor, Birla Vishvakarma Mahavidhyalaya, vatsal.shah@bvmengineering.com

ABSTRACT - We describe a computationally fast and effective approach to 2D-to-3D conversion of an image pair for three-dimensional rendering, on stereoscopic displays, of scenes that include a ground plane. The stereo disparities of all other scene elements (background, foreground objects) are computed after statistical segmentation and geometric localization of the ground plane. The disparity maps generated with our approach are accurate enough to provide users with a convincing 3D impression.

I. INTRODUCTION

The recent advent of commercial 3D screens and visualization devices has renewed interest in computer vision techniques for 2D-to-3D conversion. The appeal of 2D-to-3D conversion is two-fold. First, direct production of 3D media content with specialised capturing equipment, such as a stereoscopic video camera, is still quite expensive. Second, a facility for converting monocular videos to stereo format would support the full 3D visualization of already existing content, such as vintage movies.

A stereoscopic camera system consists of a pair of cameras producing a stereo (left and right) pair of images. Different disparities (i.e., shifts of corresponding scene points between the left and right visual channels) are interpreted by the human brain as corresponding variations of scene depth.

In this paper, we describe a practical and effective approach to 2D-to-3D conversion of an image pair, under the basic assumption that the scene contains a ground plane. Once such a plane is segmented in the images by a statistical algorithm [4], the remaining scene elements (background and foreground objects, the latter segmented in a semi-automatic way) can be acquired and rendered. In particular, the background is modelled as a vertical ruled surface following the ground boundary deformation, whereas foreground objects are flat, vertical layers standing on the ground plane.

The paper is organized as follows. The next section discusses the theoretical aspects of the approach, from geometric modelling and estimation (Subsection A) to image segmentation and stereoscopic rendering (Subsection B). Experimental results are presented and discussed in Section III. Finally, conclusions and directions for future work are given in Section IV.

II. THE APPROACH

Given an uncalibrated monocular view I of a 3D scene, our goal is to synthesize the corresponding stereo pair (I_l, I_r) for a virtual stereoscopic system by exploiting a second view J of the same scene. The cameras corresponding to the actual views are placed in general position and are therefore not in a stereoscopic configuration. I and J are referred to as the reference and support images, respectively. The roles of I and J can be swapped, so each of them can be rendered on a 3D TV screen. By exploiting the support image, epipolar geometry estimation and camera self-calibration are first carried out. Automatic ground segmentation then allows recovering the homography induced by the ground plane on the two actual views.
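As a concrete reminder of how disparity encodes depth (a standard relation from multi-view geometry, cf. [8], not stated explicitly in this paper): for a parallel stereo rig with focal length f and baseline b, a scene point at depth Z projects with disparity

$$d = \frac{f\, b}{Z},$$

so nearer points produce larger disparities, and points at infinity produce none.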

A. Geometric Estimation

We now develop the theory for warping the image of the ground plane onto the stereoscopic pair I_l and I_r. Since the theory is general, in the following we refer to any given planar region π in the scene. Explicitly, the homography warping the image of π in I onto the right view I_r is

$$H_r = K_i \left( I - \frac{s\, n_\pi^T}{d_\pi} \right) K_i^{-1}, \qquad (1)$$

where K_i is the calibration matrix of view I and s = [δ/2, 0, 0]^T, δ being the baseline between the virtual cameras. The homography H_l for the left view has the same form, but with s = [-δ/2, 0, 0]^T. The plane orientation n_π can be computed as

$$n_\pi = K_i^T\, {}^{i}l_\pi, \qquad (2)$$

where {}^{i}l_π is the vanishing line of π in image I. The signed distance d_π can be obtained by triangulating any two corresponding points under the homography H_π and imposing that π pass through the triangulated 3D point. The infinite homography is H_∞ = K_j R K_i^{-1}, and the homography

$$H_p = H_\pi^{-1} H_\infty \qquad (3)$$

mapping I onto itself is actually a planar homology, i.e., a special planar transformation having a line of fixed points (the axis) and a distinct fixed point (the vertex) not on that line. Given the fundamental matrix F between the two views, the three-parameter family of homographies induced by a world plane π is

$$H_\pi = A - {}^{j}e_i\, v^T, \qquad (4)$$

where F = [{}^{j}e_i]_× A is any decomposition (up to scale) of the fundamental matrix, and {}^{j}e_i is the epipole of view I in image J (in other words, {}^{j}e_i^T F = 0). The matrix A can be chosen as

$$A = [{}^{j}e_i]_\times F. \qquad (5)$$

Both the fundamental matrix F and the ground plane homography H_π are robustly computed by running the RANSAC algorithm [10] on SIFT correspondences [5].

A.1. Camera Self-Calibration

Camera self-calibration follows the approach of [6], which exploits the fundamental matrix F between I and J. In our notation, F is defined by

$${}^{j}x^T F\, {}^{i}x = 0 \qquad (6)$$

for any two corresponding points {}^{i}x in I and {}^{j}x in J. In [6], the internal camera matrices K_i and K_j are estimated by forcing the matrix E = K_j^T F K_i to have the properties of an essential matrix; this is achieved by minimizing the difference between the two non-zero singular values of E, which must be equal. The Levenberg-Marquardt algorithm is used, so an initial guess for K_i and K_j is required.

Figure 1. Reference image I; support image J.
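The robust estimation step lends itself to a compact implementation. Below is a minimal, hypothetical sketch (Python with OpenCV >= 4.4; function and variable names are ours, not from the paper) of SIFT matching with RANSAC estimation of F, plus construction of the virtual-view homographies of Eq. (1). Note that here H_π is fitted directly to correspondences with cv2.findHomography, whereas the paper derives it from F via Eqs. (4)-(5).

```python
# Hypothetical sketch of the robust geometric estimation step: SIFT
# correspondences between I and J [5], followed by RANSAC [10] estimation
# of the fundamental matrix F and of a plane-induced homography H_pi.
import cv2
import numpy as np

def estimate_F_and_Hpi(img_i, img_j):
    sift = cv2.SIFT_create()
    kp_i, des_i = sift.detectAndCompute(img_i, None)
    kp_j, des_j = sift.detectAndCompute(img_j, None)

    # Lowe's ratio test on nearest-neighbour matches.
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = matcher.knnMatch(des_i, des_j, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]

    pts_i = np.float32([kp_i[m.queryIdx].pt for m in good])
    pts_j = np.float32([kp_j[m.trainIdx].pt for m in good])

    # RANSAC for the fundamental matrix (Eq. 6: x_j^T F x_i = 0) ...
    F, _ = cv2.findFundamentalMat(pts_i, pts_j, cv2.FM_RANSAC, 1.0, 0.999)
    # ... and for the plane-induced homography; in the paper only points
    # on the segmented ground region would contribute here.
    H_pi, _ = cv2.findHomography(pts_i, pts_j, cv2.RANSAC, 3.0)
    return F, H_pi

def virtual_view_homographies(K_i, n_pi, d_pi, delta):
    """Eq. (1): H = K_i (I - s n_pi^T / d_pi) K_i^{-1}, s = [+/-delta/2, 0, 0]^T."""
    def H(s_x):
        s = np.array([[s_x], [0.0], [0.0]])
        return K_i @ (np.eye(3) - s @ n_pi.reshape(1, 3) / d_pi) @ np.linalg.inv(K_i)
    return H(-delta / 2.0), H(delta / 2.0)   # (H_l, H_r)
```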

B. Stereo Pair Generation and Rendering

This section specializes Eq. (1) to the case of a scene that includes a planar ground, and then explains how to warp the background and foreground objects properly, given the image of the ground plane. Figure 1 shows the images I and J used to illustrate the various rendering phases.

B.1. Ground Plane Virtual View Generation

The ground plane is segmented in the images I and J using the classification algorithm proposed in [4]. Figure 2 shows the ground plane classification for the reference image of Figure 1, together with the ground plane vanishing line (dashed line), computed after camera self-calibration and estimation of the ground plane homography H_π. The two resulting virtual views (I_l, I_r) of the ground plane are shown in Figure 3.

Figure 2. Ground classification for image I: the brighter the color, the more probable the ground region. The dashed line is the recovered ground plane vanishing line.

Figure 3. The two virtual views, I_l and I_r, for the ground plane of image I.
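As an illustration of this step, a minimal sketch (hypothetical names; the mask would come from thresholding the classifier output of [4], and H_l, H_r from Eq. (1)) of warping the segmented ground into the two virtual views:

```python
# Illustrative sketch: render the two virtual views of the segmented
# ground plane by warping I with the Eq. (1) homographies.
import cv2

def render_ground_views(img_i, ground_mask, H_l, H_r):
    """img_i: reference image; ground_mask: uint8 mask of the ground
    region; H_l, H_r: 3x3 homographies for the left/right virtual views."""
    h, w = img_i.shape[:2]
    ground = cv2.bitwise_and(img_i, img_i, mask=ground_mask)
    # Warp the ground pixels, and the mask too so we know which warped
    # pixels are valid (the rest must be filled by background/foreground).
    I_l = cv2.warpPerspective(ground, H_l, (w, h))
    I_r = cv2.warpPerspective(ground, H_r, (w, h))
    mask_l = cv2.warpPerspective(ground_mask, H_l, (w, h))
    mask_r = cv2.warpPerspective(ground_mask, H_r, (w, h))
    return (I_l, mask_l), (I_r, mask_r)
```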

Figure 4. Background generation for I_r: top border of the background where not occluded, and recovery of the background for the occluded part of the ground top border.

B.2. Background Generation

Given the warped ground plane, the background of the scene is generated column-wise, a direct consequence of modelling the background as a ruled surface perpendicular to the ground plane. For each point p on the top border of the ground in I, the corresponding point in I_r and I_l is recovered. Figure 4 shows an example of occlusion by a foreground object (the statue) and shows that the missing background data have been correctly filled in.

B.3. Displacement of the Foreground Objects

Foreground objects are segmented in a semi-automatic way with the GrabCut tool [7]. They are rendered in the images as flat, frontal objects, since the texture of an object is usually sufficient to give the user the impression of the local depth variation due to the object's shape. Figure 5 shows the final stereo pair (a, b), the two images superimposed (c), and the disparity map (d).

B.4. Stereoscopic Display on a 3D Screen

For a parallel-camera stereoscopic system, points at infinity have zero disparity and thus appear to the user to lie on the screen surface when the images are displayed on a 3D TV without modification. When a limited disparity range [-a, b] is introduced, the nearest and furthest points are associated with the extreme values of that range. The zero-disparity plane is then no longer located at infinity, but is frontal to the cameras, in a region between the nearest and furthest points.
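A minimal sketch of such a remapping (hypothetical names; which extreme of the range appears in front of the screen depends on the display's parallax convention):

```python
# Hypothetical sketch of the disparity remapping of Subsection B.4: map
# the raw disparity range linearly onto a display budget [-a, b], moving
# the zero-disparity plane from infinity to a plane between the nearest
# and furthest scene points.
import numpy as np

def remap_disparity(disp, a, b):
    """disp: raw disparity map (larger = nearer); a, b: positive limits."""
    d_min, d_max = float(disp.min()), float(disp.max())
    t = (disp - d_min) / max(d_max - d_min, 1e-9)  # normalize to [0, 1]
    return -a + t * (a + b)  # furthest point -> -a, nearest point -> +b
```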

Figure 5. The final stereo pair: (a) left image I_l; (b) right image I_r; (c) the two images superimposed; (d) the disparity map.

Figure 6. The horse example: (a) reference image I; (b) support image J; (c) left stereoscopic image I_l; (d) right stereoscopic image I_r; (e) disparity map; (f) fronto-parallel view of the ground plane.

III. EXPERIMENTAL RESULTS

Figures 6(a) and (b) show the reference and support images of the horse pair, together with their associated ground plane vanishing lines. Figures 6(c) and (d) show the obtained stereoscopic pair, featuring coincident vanishing lines; notice, at the bottom left (c) and bottom right (d) corners, the black (i.e., empty) regions arising after warping the original images. Figure 6(e) shows the resulting disparity map, and Figure 6(f) a fronto-parallel view of the ground plane.

Figures 7(a) and (b) illustrate the bushes pair, in which two partially self-occluding foreground objects are present. Notice, in both Figures 7(c) and (d), the small blurred regions, especially evident to the left (c) and right (d) of the closer bush, due to color interpolation inside occluded background areas.
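The disparity map of Figure 7(f), discussed below, was computed with the dense stereo method of [9]. As a rough stand-in for such a dense-stereo baseline, one might use OpenCV's semi-global block matcher (emphatically not the second-order-prior method of [9]); a sketch:

```python
# Rough stand-in for a dense-stereo baseline (OpenCV SGBM, not [9]).
import cv2

def dense_disparity(img_l, img_r, num_disp=128, block=5):
    gray_l = cv2.cvtColor(img_l, cv2.COLOR_BGR2GRAY)
    gray_r = cv2.cvtColor(img_r, cv2.COLOR_BGR2GRAY)
    sgbm = cv2.StereoSGBM_create(
        minDisparity=0, numDisparities=num_disp, blockSize=block,
        P1=8 * block * block, P2=32 * block * block)
    # compute() returns fixed-point disparities scaled by 16.
    return sgbm.compute(gray_l, gray_r).astype(float) / 16.0
```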

Figure 7. The bushes example: (a) reference image I; (b) support image J; (c) left stereoscopic image I_l; (d) right stereoscopic image I_r; (e) disparity map with our approach; (f) disparity map with a dense stereo approach.

The good quality of the disparity map obtained with our approach is confirmed by a visual comparison against the disparity map of Figure 7(f), which was obtained with a state-of-the-art dense stereo approach [9]. As is evident from the disparity map of Figure 7(e), the two bushes are correctly rendered as belonging to two distinct depth layers.

IV. CONCLUSIONS AND FUTURE WORK

We have described and discussed a simple yet fast and effective approach to 2D-to-3D conversion of an image pair for parallel stereoscopic displays, in which the disparities of all scene elements are generated after statistical segmentation and geometric localization of the ground plane in the scene. Future work will address (1) extending the approach to videos, which raises the problem of temporal consistency among frames; (2) relaxing the ground plane assumption; and (3) implementing an automatic method to determine the optimal range of disparities for 3D perception.

V. REFERENCES

[1] S. Coren, L. M. Ward, and J. T. Enns. Sensation and Perception. Harcourt Brace, 1993.
[2] A. Criminisi, M. Kemp, and A. Zisserman. Bringing pictorial space to life: computer techniques for the analysis of paintings. In on-line Proc. Computers and the History of Art, 2002.
[3] M. Guttman, L. Wolf, and D. Cohen-Or. Semi-automatic stereo extraction from video footage. In Proc. IEEE International Conference on Computer Vision, 2009.
[4] D. Hoiem, A. Efros, and M. Hebert. Recovering surface layout from an image. International Journal of Computer Vision, 75(1), 2007.
[5] D. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2):91-110, 2004.
[6] P. Mendonça and R. Cipolla. A simple technique for self-calibration. In Proc. Conf. Computer Vision and Pattern Recognition, 1999.
[7] C. Rother, V. Kolmogorov, and A. Blake. GrabCut: interactive foreground extraction using iterated graph cuts. ACM Transactions on Graphics (SIGGRAPH), 2004.
[8] R. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, 2004.
[9] O. Woodford, P. Torr, I. Reid, and A. Fitzgibbon. Global stereo reconstruction under second-order smoothness priors. IEEE Trans. on Pattern Analysis and Machine Intelligence, 31(12):2115-2128, 2009.
[10] M. Fischler and R. Bolles. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Comm. of the ACM, 24(6):381-395, 1981.