Real-time Generation and Presentation of View-dependent Binocular Stereo Images Using a Sequence of Omnidirectional Images


Abstract

This paper presents a new method to generate and present arbitrarily directional binocular stereo images from a sequence of omnidirectional images in real time. The sequence is taken by moving an omnidirectional image sensor in a static real environment; the motion of the sensor is constrained to a plane, and the sensor's route and speed are known. In the proposed method, a fixed-length portion of the sequence is buffered in a computer and used to generate arbitrarily directional binocular stereo images via a subset of the plenoptic function. With this method a user can look around a distant scene with a rich 3-D sensation and without significant time delay. This paper describes the principle of real-time generation of binocular stereo images and introduces a prototype system for view-dependent stereo image generation and presentation.

1. Introduction

In recent years, the acquisition of a remote scene for telepresence or surveillance has attracted great interest [1]. The key issue in telepresence is to provide both a wide field of view and a stereoscopic effect. A telepresence system that expands the field of view has been developed by Onoe et al. [8, 9]. This system uses an omnidirectional image sensor and generates view-dependent images without significant time delay; however, the generated images are monoscopic. One approach to providing both a wide field of view and a stereoscopic effect is to use rotating stereo cameras, but this approach suffers from the time delay between a change of viewing direction and the corresponding change of the displayed image, caused by physically rotating the remote stereo cameras.

Recently, image-based rendering techniques have been proposed to generate a virtual environment from a set of photographs. QuickTime VR [3] is a system that uses these techniques: the user can alter the viewing direction, but the viewpoint is fixed. A typical approach to relaxing this fixed-position constraint is the plenoptic function [2, 6]. The Lightfield [5] and the Lumigraph [4], both defined as 4-D plenoptic functions, make it possible to render images of a scene from any viewpoint within a bounding box without recovering depth values. Using these plenoptic functions, arbitrarily directional binocular stereo images can be obtained; however, such methods require a large number of images to render arbitrary views.

In this paper we propose a new method to generate arbitrarily directional binocular stereo images from a sequence of omnidirectional images in real time by using a subset of the plenoptic function. A sequence of omnidirectional images is taken by moving an omnidirectional image sensor in a static real environment. We constrain the motion of the sensor to a plane, and the sensor's route and speed are known. Compared with the Lightfield and the Lumigraph, the proposed method has a smaller data size, because we constrain the viewpoint to the proximity of the sensor's route.

This paper is structured as follows. We first briefly describe the video-rate omnidirectional image sensor in Section 2. Real-time generation of arbitrarily directional binocular stereo images is then discussed in Section 3. We describe a prototype system for view-dependent stereo image generation and presentation, as well as experimental results, in Section 4. We finally summarize the present work in Section 5.
2. Omnidirectional Image Sensor

We employ HyperOmni Vision [11] as the omnidirectional image sensor in our study. HyperOmni Vision uses a hyperboloidal mirror and a standard CCD video camera, as shown in Figure 1(a); its geometry is illustrated in Figure 1(b).

Figure 1. HyperOmni Vision ver.2a: (a) appearance; (b) geometrical configuration.

The hyperboloidal mirror has two focal points, an inner focal point O_M and an outer focal point O_C. The optical center of the camera lens is placed at O_C. Given world coordinates (X, Y, Z) as shown in Figure 1(b), the hyperboloidal mirror surface is represented as follows:

    (X^2 + Y^2) / a^2 - Z^2 / b^2 = 1    (Z > 0).    (1)

The inner focal point O_M of the mirror is at (0, 0, c) and the outer focal point O_C is at (0, 0, -c) in the world coordinate system, where c = sqrt(a^2 + b^2). A ray going toward the inner focal point O_M is reflected by the mirror and focused on the outer focal point O_C. Thus the CCD camera obtains a central projection of a 360-degree field of view via the hyperboloidal surface. A point P(X, Y, Z) in world coordinates is projected to a point p(x, y) on the image plane. This projection is represented as:

    x = X f (b^2 - c^2) / ((b^2 + c^2)(Z - c) - 2 b c sqrt(X^2 + Y^2 + (Z - c)^2)),
    y = Y f (b^2 - c^2) / ((b^2 + c^2)(Z - c) - 2 b c sqrt(X^2 + Y^2 + (Z - c)^2)).    (2)

An omnidirectional image taken by HyperOmni Vision is shown in Figure 2(a). Generating perspective or panoramic images from the omnidirectional image is relatively easy, because the projection is a central projection and can be converted by a simple mapping [8]. Figure 2(b) is a perspective image computed from the omnidirectional image shown in Figure 2(a). This transformation is performed by calculating the correspondence of every point in the perspective image to a point in the omnidirectional image.

Figure 2. Omnidirectional image (a) and perspective image (b).
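For concreteness, equation (2) can be written down directly. The following is a minimal Python sketch; the mirror constants a and b and the lens focal length f are hypothetical placeholders, since the paper does not report numeric values.

```python
import math

# Hypothetical mirror and lens parameters; the paper gives no numeric values.
a, b, f = 23.4, 18.0, 3.8
c = math.sqrt(a**2 + b**2)   # focal points of the hyperboloid sit at (0, 0, +/-c)

def project(X, Y, Z):
    """Project world point P(X, Y, Z) to image point p(x, y) via equation (2)."""
    denom = (b**2 + c**2) * (Z - c) - 2 * b * c * math.sqrt(X**2 + Y**2 + (Z - c)**2)
    return (X * f * (b**2 - c**2) / denom,
            Y * f * (b**2 - c**2) / denom)
```

Inverting this mapping for every pixel is what the perspective-image conversion of Figure 2 amounts to.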

3. Real-time Generation of Novel Binocular Views

3.1. Rendering a Novel View

We use a sequence of omnidirectional images taken by HyperOmni Vision moving along a straight line in a static scene; the route and speed of this straight-line motion are known. Given such a sequence, we can render a novel view at an arbitrary point located in the plane on which the sequence was taken.

Figure 3. Illustration of generating novel views: (a) top view; (b) side view.

The basic idea of novel view generation is illustrated in Figure 3. HyperOmni Vision takes a sequence of omnidirectional images while moving from the point X to the point Y. Let us describe how to generate a perspective image of W by H pixels from the viewpoint A. The generation is done by computing all pixels on the image plane. For example, when computing the pixels on the vertical line that includes the point P, the pixels are taken from the omnidirectional image captured at the point Q where the line PA intersects the line XY, because the ray AP is the same as the ray QP captured at the point Q [4, 5]. The vertical line through P is rendered by generating a perspective image of 1 by H pixels from the omnidirectional image captured at Q and scaling the resulting strip to H pixels. It should be noted that in this case the focal length is |QP|, and the vertical field of view is the same as the vertical field of view with which the vertical line of H pixels would be generated from the point A. Similarly, we compute the entire image for viewpoint A by using the omnidirectional images obtained from the point R to the point S.
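The column-wise lookup above can be sketched in a few lines. The following 2-D (top-view) Python sketch places the sensor route on the x-axis and, for one image column, returns the index of the nearest captured frame at Q and the focal length |QP| of the 1-by-H strip. The pinhole geometry, the variable names, and the unit image-plane distance are our own illustrative assumptions, not the paper's implementation.

```python
import math

def column_source(viewpoint, yaw_deg, col, W, hfov_deg, spacing, n_frames):
    """For column `col` of a W-pixel-wide novel view at `viewpoint` (off the
    route, which is the x-axis), find the nearest captured frame at Q and the
    focal length |QP| used to render the 1 x H strip (Section 3.1)."""
    ax, ay = viewpoint
    focal_px = (W / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
    alpha = math.atan2(col - W / 2.0, focal_px)      # column angle off the optical axis
    theta = math.radians(yaw_deg) + alpha            # world direction of the ray AP
    dx, dy = math.cos(theta), math.sin(theta)
    if abs(dy) < 1e-9:
        return None                                   # ray parallel to the route
    t = -ay / dy                                      # Q = A + t * (dx, dy) lies on y = 0
    qx = ax + t * dx
    frame = min(n_frames - 1, max(0, round(qx / spacing)))  # nearest capture point
    d_ap = 1.0 / math.cos(alpha)                      # |AP| for an image plane 1 unit out
    qp = abs(d_ap - t)                                # |QP|: focal length of the strip
    return frame, qp
```

The round() call anticipates the nearest-neighbor frame selection discussed next, since capture points are discrete.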

This method causes some vertical distortion in the generated image. If the distances between the image sensor and the objects in the scene were known, we could reduce this distortion; in practice, however, we do not know the distances and cannot correct it. The vertical distortion decreases as the distance between the objects in the scene and the sensor increases [10].

A sequence of omnidirectional images is captured at video rate, which means that the omnidirectional images on the line XY are obtained at discrete points. Thus, the omnidirectional image required to generate a novel view may not exist. In such a case, we generate the novel view using the nearest-neighbor omnidirectional image.

3.2. Generation of Stereo Images

We generate arbitrarily directional binocular stereo images from a sequence of omnidirectional images by using the method described in the previous section. Here we assume that both the left and right eyes are located on the plane where the sequence of images was taken. Images for the left and right eyes can then be computed by specifying two different viewpoints separated by a given distance.

3.3. Reduction of Correspondence Calculation

To generate binocular stereo images with the proposed method, we must calculate the correspondence between the stereo images and the omnidirectional images. Such a computation is time consuming and difficult to perform in real time. We have therefore developed the following method to achieve real-time generation of stereo images.

Omnidirectional images exist at discrete points on the route of HyperOmni Vision, say T1 to T5 in Figure 4. An image to be generated can be vertically divided according to the omnidirectional images used for rendering each region. In Figure 4, a perspective image is generated from three omnidirectional images captured at T2, T3 and T4, and can therefore be vertically divided into three areas. We place a grid over every area as shown in Figure 4 and compute the exact transformation only for the grid points. Stereo images are then efficiently computed from the clipped omnidirectional images and the grid points by an image-warping technique with bilinear interpolation [7]. The flow of the efficient stereo image generation algorithm is as follows (a code sketch is given at the end of this subsection).

1. The destination stereo images are vertically divided according to the number of omnidirectional images used.
2. An m x n grid is placed over every divided area, where m = ceil(16w/W) and n = 12 (w is the width of the area and W is the width of the computed image).
3. The exact transformation is computed for every grid point.
4. Hardware texture mapping generates the rectangular areas of the stereo images.

Figure 4. Division of the image and grid placement.

With this algorithm we can efficiently generate a pair of binocular stereo images for any viewing direction.
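The grid-based reduction can be illustrated with a short sketch. Here exact_map stands in for the full stereo-to-omnidirectional transformation, and software bilinear interpolation stands in for the hardware texture mapping of step 4; the function and its parameter names are our own illustration, and the reading m = ceil(16w/W) is our reconstruction of the grid formula.

```python
import math

def warp_area(exact_map, x0, w, H, W):
    """Approximate source coordinates for one w x H destination area starting at
    column x0 of a W-pixel-wide image, evaluating exact_map only at grid points."""
    m = math.ceil(16 * w / W)                  # grid columns (our reading of Sec. 3.3)
    n = 12                                     # grid rows, as given in the paper
    gx = [x0 + i * w / m for i in range(m + 1)]
    gy = [j * H / n for j in range(n + 1)]
    corners = [[exact_map(x, y) for x in gx] for y in gy]   # exact mapping here only
    out = {}
    for y in range(H):
        v = y / H * n
        j = min(int(v), n - 1)
        fv = v - j
        for x in range(w):
            u = x / w * m
            i = min(int(u), m - 1)
            fu = u - i
            # Bilinear blend of the four surrounding exact samples.
            (x00, y00), (x01, y01) = corners[j][i], corners[j][i + 1]
            (x10, y10), (x11, y11) = corners[j + 1][i], corners[j + 1][i + 1]
            sx = (1 - fv) * ((1 - fu) * x00 + fu * x01) + fv * ((1 - fu) * x10 + fu * x11)
            sy = (1 - fv) * ((1 - fu) * y00 + fu * y01) + fv * ((1 - fu) * y10 + fu * y11)
            out[(x0 + x, y)] = (sx, sy)        # where to sample the omni image
    return out
```

With W = 640 and n = 12, this grid spacing works out to roughly 40 pixels in each direction, so only a few hundred exact correspondences are computed per frame instead of one per pixel.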
4. Prototype System for Immersive Telepresence

We have implemented the proposed method and constructed a prototype system for stereo image generation and presentation. The prototype system generates binocular stereo images from a sequence of temporarily recorded omnidirectional images in real time and continuously presents binocular stereo images for the user's arbitrary viewing direction.

4.1. System Overview

The prototype system configuration is illustrated in Figure 5. The system uses a graphics workstation (SGI Onyx2 IR2, MIPS R10000 195 MHz, 16 CPUs) with a video board (DIVO), HyperOmni Vision ver.2a, a robot (Nomad-200), a 3-D magnetic tracker (POLHEMUS 3SPACE FASTRAK) and a head-mounted display (HMD) (OLYMPUS Mediamask).

Figure 5. Hardware configuration of the prototype system.

The flow of processing in the system is as follows.

1. A robot carrying HyperOmni Vision is moved along a straight line at constant speed under the workstation's control, and the latest sequence of omnidirectional images is buffered in the workstation.
2. The user's viewing direction is sent to the workstation from the 3-D magnetic tracker attached to the HMD.
3. Binocular stereo images for the user's viewing direction are generated in real time and displayed on the HMD.

The process of generating binocular stereo images is executed on a single CPU of the workstation.

4.2. Experiments

In our experiment on binocular stereo image generation and presentation, the robot carrying HyperOmni Vision is moved along a straight line at a constant speed of 17.5 cm per second. The sequence of omnidirectional images for the latest two seconds is buffered in the workstation memory. The system generates binocular stereo images such that the center of the eyes is located at the point where the central omnidirectional image of the buffer was acquired; the user therefore looks around the scene from the viewpoint that the robot passed one second ago. The distance between the eyes for the binocular stereo images is set to 7 cm. The horizontal field of view of each image is 60 degrees, and the computed images are 640 x 480 pixels.

Figure 6 shows a sampled input sequence of omnidirectional images over a period of 35 seconds.

Figure 6. An input sequence of omnidirectional images (arrows show the moving direction of the sensor).

Generated binocular stereo images are shown in Figure 7. The binocular parallax can be clearly observed between the left and right images of each stereo pair; it is easily recognized by watching the chair, which is closer to the sensor, in stereo images (a), (b) and (c) of Figure 7. Vertical distortions are also observed in the stereo images; nevertheless, we confirmed that a user can look around the scene with a rich 3-D sensation and without any sense of incompatibility.

A pair of stereo images is computed in 0.017 seconds, and the images displayed on the HMD are updated every 0.033 seconds (video rate). In the present setup, stereo images can be generated when the user looks at an angle of 40 to 140 degrees or -40 to -140 degrees against the direction in which the robot is moving; the shortage of omnidirectional images buffered in the workstation prevents generating stereo images when the viewing direction is close to the robot's direction of motion. The number of omnidirectional images used with respect to viewing direction is illustrated in Figure 8, whose vertical axis shows the number of omnidirectional images required for generating each image of the stereo pair. As shown in Figure 8, the number of omnidirectional images used for viewing directions from 40 to 140 degrees is the same as that for directions from -40 to -140 degrees.
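The figures above determine the buffer geometry. The following quick check assumes a 30 frame/s capture rate (consistent with the 0.033 s display update interval); the derived values are our own arithmetic, not numbers quoted in the paper.

```python
speed = 17.5                   # robot speed, cm/s
fps = 30                       # assumed video rate (1 / 0.033 s)
buffer_s = 2.0                 # length of the buffered sequence, s

frames = int(buffer_s * fps)   # 60 omnidirectional images held in memory
spacing = speed / fps          # ~0.58 cm between successive capture points
route = speed * buffer_s       # 35 cm of route covered by the buffer
lag = buffer_s / 2             # the viewpoint trails the robot by 1 s
eye_off = 7.0 / 2              # each eye sits 3.5 cm from the buffer center

print(frames, round(spacing, 2), route, lag, eye_off)
```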

Figure 7. A sequence of generated binocular stereo images (left and right image of each pair) over a period of 35 seconds. Viewing directions are (a) 52, (b) 96, (c) 122, (d) -56, (e) -81 and (f) -118 degrees from the moving direction, respectively.

Figure 8. The number of omnidirectional images used with respect to viewing direction.

5. Conclusions

In this paper we have proposed a new method for generating novel binocular stereo views in an arbitrary direction from a sequence of omnidirectional images in real time. We have constructed a prototype stereo image generation and presentation system based on the proposed method. In our experiments, a user could look around a distant real scene and clearly perceive the parallax in the generated binocular stereo images. Future work includes generating stereo images in all directions and compensating for the vertical distortions.

References

[1] Special issue on immersive telepresence. IEEE MultiMedia, 4(1):17-56, 1997.
[2] E. H. Adelson and J. Bergen. The plenoptic function and the elements of early vision. Computational Models of Visual Processing, pages 29-38, 1991.
[3] S. E. Chen. QuickTime VR - an image-based approach to virtual environment navigation. Proc. SIGGRAPH '95, pages 29-38, August 1995.
[4] S. J. Gortler, R. Grzeszczuk, R. Szeliski, and M. F. Cohen. The lumigraph. Proc. SIGGRAPH '96, pages 43-54, August 1996.
[5] M. Levoy and P. Hanrahan. Light field rendering. Proc. SIGGRAPH '96, pages 31-42, August 1996.
[6] L. McMillan and G. Bishop. Plenoptic modeling: An image-based rendering system. Proc. SIGGRAPH '95, pages 39-46, August 1995.
[7] J. Neider, T. Davis, and M. Woo. OpenGL Programming Guide. Addison-Wesley, 1993.
[8] Y. Onoe, K. Yamazawa, H. Takemura, and N. Yokoya. Telepresence by real-time view-dependent image generation from omnidirectional video streams. Computer Vision and Image Understanding, 71(2):154-165, August 1998.
[9] Y. Onoe, K. Yamazawa, H. Takemura, and N. Yokoya. Visual surveillance and monitoring system using an omnidirectional video camera. Proc. 14th IAPR Int. Conf. on Pattern Recognition (14ICPR), I:588-592, August 1998.
[10] K. Yamaguchi, K. Yamazawa, H. Takemura, and N. Yokoya. Real-time generation and presentation of arbitrarily directional binocular stereo images using a sequence of omnidirectional images. IEICE Tech. Report, PRMU99-159, pages 67-72, November 1999 (in Japanese).
[11] K. Yamazawa, Y. Yagi, and M. Yachida. Obstacle detection with omnidirectional image sensor HyperOmni Vision. Proc. 1995 IEEE Int. Conf. on Robotics and Automation, pages 1062-1067, May 1995.