The Occlusion Camera


EUROGRAPHICS 2005 / M. Alexa and J. Marks (Guest Editors), Volume 24 (2005), Number 3

The Occlusion Camera

Chunhui Mei, Voicu Popescu, and Elisha Sacks
Purdue University

Abstract

We introduce the occlusion camera: a non-pinhole camera with 3D distorted rays. Some of the rays sample surfaces that are occluded in the reference view, while the rest sample visible surfaces. The extra samples alleviate disocclusion errors. The silhouette curves are pushed back, so nearly visible samples become visible. A single occlusion camera covers the entire silhouette of an object, whereas many depth images are required to achieve the same effect. Like regular depth images, occlusion-camera images have a single layer, thus the number of samples they contain is bounded by the image resolution, and connectivity is defined implicitly. We construct and use occlusion-camera images in hardware. An occlusion-camera image does not guarantee that all disocclusion errors are avoided. Objects with complex geometry are rendered using the union of the samples stored by a planar pinhole camera and an occlusion-camera depth image.

Categories and Subject Descriptors (according to ACM CCS): I.3.3 [Computer Graphics]: Three-Dimensional Graphics and Realism.

1. Introduction

Image-based rendering (IBR) creates novel views of a 3D scene by interpolating between reference color and geometry samples. One advantage is quality: IBR algorithms attempt to transfer the high quality of the reference images to the desired image. Another advantage is efficiency: the desired image is rendered using reference samples, which avoids the full complexity of the scene.

In what are now more than 10 years of IBR, great progress has been made. However, the fundamental challenges of IBR do not yet have complete solutions. One challenge is modeling: gathering the reference color and geometry samples remains difficult, and IBR researchers have investigated problems traditionally associated with computer vision, such as depth extraction and registration. Another challenge is inadequate sampling: the reference samples do not always suffice for a good reconstruction of the desired image. Particularly challenging are view-dependent effects. One example is view-dependent appearance due to phenomena such as reflection, refraction, and blending at depth discontinuities. Another example is view-dependent geometry due to occlusions as the camera and/or scene objects move.

This paper addresses the problem of disocclusion errors, an artifact caused by missing samples for surfaces that are visible from the desired view but not from the reference view. Disocclusion errors occur even for small translations of the view. The problem is challenging because disocclusion errors are scattered in the scene: they occur wherever motion parallax occurs, which, ironically, is one of the most important cues for a user exploring a 3D scene. Current methods for alleviating the problem of

Figure 1: When a single depth image is used (A), disocclusion errors occur (B). The occlusion-camera reference image (C) stores samples for the lid, bottom, and sides of the teapot, alleviating the problem of disocclusion errors (D).

disocclusion errors are based on using several reference views. The approach has diminishing returns: each additional reference view comes at the same price as the first but contributes fewer and fewer new samples.

Our approach is based on a novel camera model, the occlusion camera, which is a non-pinhole generalization of the planar pinhole camera. A 3D radial distortion centered at an image-plane point called a pole allows the occlusion camera to see around objects (occluders) along the ray defined by the pole. This way the occlusion camera also samples surfaces that are hidden from the reference view but are close to the silhouette of the occluder and are thus likely to become disoccluded in nearby views. Occlusion-camera images alleviate the disocclusion problem associated with regular depth images (Figure 1). Like depth images, they have a single layer, so the rendering cost is bounded by their resolution and connectivity is defined implicitly by the regular grid of samples.

A planar pinhole camera acquires one sample along each ray. The 3D distortion trades (u, v) image resolution for resolution along the ray: the occlusion camera image stores samples that are on the same ray in the corresponding planar pinhole camera image. In Figure 1C, every lid sample L_i has a corresponding body sample B_i that lies on the same ray in A.

The occlusion camera does not, and should not, attempt to store all the samples in the scene, since, like regular depth images, it is asked to provide view-dependent occlusion culling and level-of-detail. In Figure 1C much of the back of the teapot body is missing; those samples will not become visible until the view changes considerably from the reference view, when a new reference image should be used.

The occlusion camera does not guarantee that all disocclusion errors are avoided: for complex scenes, the set of samples captured by the occlusion camera is not necessarily a superset of the set of samples captured by the corresponding planar pinhole camera. For this reason we render such scenes with the union of the sets of samples captured by the two cameras. The planar pinhole and the occlusion camera images can be merged efficiently, like any two depth images. The samples contributed by the occlusion camera image are samples that are almost visible in the reference view and are thus needed from nearby views. The combination produces good results for considerable translations away from the reference viewpoint.

The rays of the occlusion camera are non-concurrent, non-intersecting segments (Figure 2), and there is exactly one ray through each 3D point in the field of view of the camera. These two facts imply that, as with planar pinhole cameras, there is a one-to-one mapping between a scene plane and the image plane, so an occlusion camera reference image can be constructed with the feed-forward graphics pipeline, by forward-projecting vertices and then backward-rasterizing triangles. Occlusion-camera reference images are constructed efficiently on the GPU.

Figure 2: Visualization of an actual occlusion camera. The left/right image shows the rays for the central/bottom row of pixels. The 3D radial distortion centered at the pole is applied between two z planes. Each ray is defined by two segments: COP to near distortion plane (blue), and between the distortion planes (green). The distortion brings the green segments closer to the pole ray (red). Rays that intersect the pole ray are clipped (left). The image plane can be placed anywhere between the COP and the near distortion plane.
The paper is organized as follows. We review prior work in Section 2. Section 3 describes the occlusion camera model. Section 4 gives the algorithm for constructing occlusion-camera reference images and its GPU implementation. Section 5 describes using occlusion-camera reference images to render novel views of the scene. Results and discussion conclude the paper.

2. Prior work

McMillan and Bishop introduced 3D image warping, an IBR method that models and renders the scene with depth images [MB95]. The reference image is warped by projecting its depth-and-color samples onto the desired view. The desired image can be reconstructed by splatting, a technique that approximates the footprint of warped samples, or by connecting the warped samples in a triangle mesh and rendering the mesh. Using a single depth image causes disocclusion errors, and warping several depth images for each view greatly increases the cost of the method.

In post-rendering warping, Mark et al. investigate accelerating conventional rendering by warping two reference images [MMB97]. The reference views have to be very close to the desired view, which requires the system to update the reference images several times per second. Even so, disocclusion errors occur.

Layered representations handle disocclusion errors by storing samples that are occluded from the reference view in additional layers at each pixel. Max proposes rendering trees using multi-layered z-buffers, which are built with a modified z-buffer algorithm [MO95]. Occluded samples are stored at deeper layers rather than being discarded. Keeping all samples along a ray generates high, uneven depth complexity, and many samples are never needed in nearby images. Limiting the number of samples along a ray requires a method for finding which samples are going to be visible in nearby views.

Figure 3: Normal (left) and reverse (right) planar pinhole camera image.

The Layered Depth Image (LDI) [SGH98, CBL99] solves the problem of choosing which samples to store by constructing the LDI from depth images from nearby views. The resulting LDI stores the union of the construction images, thus each of its samples is going to be needed at least when the desired view matches the construction view that contributed the sample. LDI construction requires rendering the scene multiple times. A disadvantage common to all layered representations is that sample connectivity cannot be easily computed, so layered reference images are typically used by splatting.

None of the techniques reviewed so far guarantees the absence of disocclusion errors: if a surface was not sampled by any of the depth images used to construct the LDI, a disocclusion error occurs. The vacuum buffer [PL01] provides a conservative sample selection method for IBR. Given a desired view and a set of depth images, the technique finds the sub-volumes of the view frustum where disocclusion errors could occur. The method needs the desired view, which implies running a complex z-buffer algorithm for every frame. Also, verifying all view-frustum sub-volumes that could contain missed surfaces requires processing many depth images. Since most of the suspected sub-volumes turn out to be empty, conservatively checking for disocclusion errors is expensive.

The disocclusion error problem is the dual of the problem of occlusion culling, which has been much studied in conventional rendering. The reference depth image provides an over-aggressive occlusion culling solution for the desired image. Instead of attempting to undo some of the culling, we propose to use a camera model that prevents disocclusion errors in the first place.

Non-pinhole camera models have been studied in computer vision for 3D imaging applications. Examples include the pushbroom camera [GH97], which collects rays in parallel planes by sweeping a line, and the two-slit camera [Paj02], which captures all rays passing through two non-coplanar lines. The pushbroom and the two-slit cameras, as well as the pinhole and orthographic cameras, are subclasses of the general linear camera, which collects linear combinations of three rays [YM04a]. We discuss using a general linear camera to address the disocclusion error problem in the next section.

Figure 4: Reverse planar pinhole camera.

Computer graphics researchers have also explored non-pinhole cameras. The light field [LH96] and lumigraph [GGS*96] are 2D arrays of planar pinhole cameras. The 4D ray database is queried during rendering, bypassing the need for geometry. The method does not have the problem of disocclusions and supports all view-dependent effects; however, the database grows to impractical sizes for large scenes and large viewing volumes. Light field approaches that separate sampling geometry from sampling view-dependent appearance use conventional geometric models [WAA*00], and the problem of disocclusion errors becomes the problem of occlusion culling.

Multiple-center-of-projection cameras [RB98] sample the scene using a vertical slit that slides along a user-chosen path. The interactive approach provides good sampling-rate and coverage control. The resulting depth-and-color images are view independent, like a model, and fail to provide occlusion culling and level-of-detail adaptation.
Because of the arbitrary path, 3D points cannot be projected directly onto the multiple-center-of-projection image, and constructing such an image has to be done by ray tracing. Camera models developed for multiperspective rendering [WFH*97, YM04b] produce images that simulate 3D animation when viewed with a constrained camera motion but are not suitable for arbitrary views.

3. The occlusion camera model

To avoid disocclusion errors we set out to develop a camera model that has the following properties: (1) in addition to visible samples, the camera should also collect samples close to the silhouette line of the occluder; (2) each 3D point should project to a unique image-plane location, with a closed-form projection equation; (3) there should be a one-to-one mapping between the plane of a scene triangle and the image plane. Properties (2) and (3) ensure images can be rendered with the new camera using the feed-forward graphics pipeline.

Reverse planar pinhole camera model

The first option we investigated was a simple camera model we called the reverse planar pinhole camera (Figure 4). The perspective foreshortening is reversed between the near and far planes. The camera is equivalent to a planar pinhole camera with an opposite view, which collects the farthest sample along each ray.
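The farthest-sample rule is the only change relative to a regular z-buffer. A minimal sketch, assuming a simple software z-buffer (Python; the names and array shapes are illustrative, not the paper's implementation):

```python
import numpy as np

def splat(zbuf, u, v, z, reverse=False):
    # A regular pinhole z-buffer keeps the nearest sample along each ray;
    # the reverse planar pinhole camera keeps the farthest one instead.
    keep = z > zbuf[v, u] if reverse else z < zbuf[v, u]
    if keep:
        zbuf[v, u] = z
    return keep

# Reverse camera: clear depth to the near extreme (0) so farther samples win.
zbuf = np.zeros((480, 720))
splat(zbuf, 20, 10, 5.0, reverse=True)   # accepted: 5.0 > 0.0
splat(zbuf, 20, 10, 3.0, reverse=True)   # rejected: the closer sample is discarded
```

On the fixed-function pipeline the same effect is obtained with an opposite-view camera and a flipped depth test, as noted below.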

Figure 3 shows that the reverse planar pinhole camera has property (1): the right image has samples that are beyond the silhouette of the bunny as seen by a regular pinhole camera; the right side of the face, the top of the head, and the base are only visible in the right image. Such a reference image could be used to render novel nearby views of the bunny, with the originally hidden samples preventing disocclusion errors. Reverse planar pinhole camera images can be constructed easily, even with the fixed graphics pipeline, by using an opposite-view camera and flipping the z-test logic; thus the reverse planar pinhole camera also has properties (2) and (3).

The reverse planar pinhole camera has the advantage of great simplicity, but it has an important disadvantage: its ability to sample hidden surfaces decreases considerably for surfaces that are close to the view direction (red pole ray in Figure 4). In Figure 5 the scene consists of two boxes, one vertical and thin, and one horizontal and thick; the reference view direction is shown with the red arrow. The reverse planar pinhole camera reference image poorly samples the left and right faces of the thin box and the top face of the thick box. Moreover, the sampling rate decreases towards the front (see the line frequency on the poorly sampled faces in Figure 5).

Figure 5: Reverse planar pinhole camera (left) and 3D radially distorted camera image (right).

The reverse planar pinhole camera is a special case of the general linear camera, which captures rays that are a linear combination of three rays. Any linear camera has the same problem since, given a z plane, the amount of reverse foreshortening decreases to 0 towards the center of the image. We therefore add a fourth desideratum for the occlusion camera: (4) the distortion magnitude of a 3D point should be independent of the point's (x, y) coordinates.

Occlusion camera model

We have developed an occlusion camera model that has the four desired properties; the problem of the reverse planar pinhole camera is avoided (Figure 5, right). We started from a planar pinhole camera and, in order to see around all sides of an occluder, added a radial distortion centered at a pole chosen as the projection of the centroid of the occluder. The distortion pulls out samples according to their depth: farther samples are displaced by larger amounts. This way, two samples that are on the same ray in the undistorted image move to different distorted image locations, and hidden samples that are close to the silhouette in the undistorted image become visible.

In Figure 6, the 3D radial distortion moves the projection of scene point P from its undistorted location to P_d. The distortion amount is given by a linear expression in 1/z, where z is the camera-space z-coordinate of P; larger z (smaller 1/z) values produce larger distortion amounts. The distortion occurs in the image plane, in a direction away from the pole. A second scene point R that occludes P in the undistorted planar pinhole camera image is distorted less, since it has a smaller z. Due to the different distortions, the projections P_d and R_d do not coincide, and the sample P is part of the occlusion camera reference image.

The distortion of the occlusion camera is only superficially similar to the radial distortion characteristic of real-world lenses. The fundamental difference is that real-world lenses do not distort according to the depth of the sample; how far the ray has traveled to reach the lens is irrelevant.

The occlusion camera model is defined by a planar pinhole camera PPHC that gives the reference view and by a six-tuple (u_0, v_0, z_n, z_f, d_n, d_f) that specifies the 3D distortion.
(u_0, v_0) are the image plane coordinates of the pole, (z_n, z_f) are the near and far z planes between which the distortion is applied, and (d_n, d_f) are the distortion magnitudes for points on the planes z_n and z_f. The distortion magnitude varies linearly in 1/z between z_n and z_f. The pole (u_0, v_0) is typically chosen as the PPHC projection of the centroid of the occluder, z_n is chosen epsilon closer than the closest point of the occluder, z_f is chosen to encompass the bounding box of the scene, and d_n is set to 0. d_f controls how far the silhouette of the occluder is receded. Using Figure 1 again, a larger d_f would show more of the back of the teapot in C, and very large values would have all samples on the back of the teapot clear the front face. In Figure 8, d_f is sufficiently large to move all red checker background samples away from the cube.

To provide room for the displaced samples, the original reference image resolution is extended by 2·d_f in each direction, which preserves the originally intended field of view. The occlusion camera images shown in this paper and in the video have a resolution of 720 + 200·2 (= 1120) by 480 + 200·2 (= 880).

Figure 6: The 3D radial distortion is applied between the near and far distortion planes. The projection of P is displaced from its undistorted location to P_d. A closer sample R, seen along the same ray in the undistorted image, is displaced less, to R_d. A, P, and C all project at P_d in the distorted image. The planar pinhole camera ray (COP, P_d) has segment AB replaced with AC.
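To make the six-tuple parameterization concrete, here is a minimal sketch of the model description and of the distortion profile it implies (Python; the class name and layout are ours, not the paper's):

```python
from dataclasses import dataclass

@dataclass
class OcclusionCamera:
    # Six-tuple that specifies the 3D distortion; it is added to a planar
    # pinhole camera PPHC (not modeled here) that gives the reference view.
    u0: float  # pole: image-plane projection of the occluder centroid
    v0: float
    zn: float  # near distortion plane, epsilon closer than the occluder
    zf: float  # far distortion plane, encompassing the scene bounding box
    dn: float  # distortion magnitude at zn (typically 0)
    df: float  # distortion magnitude at zf: how far the silhouette recedes

    def distortion(self, z: float) -> float:
        # The magnitude varies linearly in 1/z between zn and zf
        # (see Equation 1 below): d(zn) = dn and d(zf) = df.
        t = (1.0 / self.zn - 1.0 / z) / (1.0 / self.zn - 1.0 / self.zf)
        return self.dn + t * (self.df - self.dn)
```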

Projection equation

Given a 3D point P(x, y, z) with z between z_n and z_f, its occlusion camera (PPHC, (u_0, v_0, z_n, z_f, d_n, d_f)) image plane coordinates (u_d, v_d) are given by Equation 1:

(u, v, z) = PPHC(P)
d(z) = d_n + (d_f − d_n) · (1/z_n − 1/z) / (1/z_n − 1/z_f)
(u_d, v_d) = (u, v) + d(z) · (u − u_0, v − v_0) / |(u − u_0, v − v_0)|

Equation 1: Occlusion camera projection.

The undistorted coordinates (u, v) decide the direction but do not affect the distortion magnitude, which is completely defined by z. Points closer than z_n project at their undistorted PPHC projection; because of our choice of z_f, there are no points beyond z_f. All 3D points in the view frustum of PPHC have exactly one projection onto the image plane, except for the points on the pole ray, for which the distortion direction is undefined. In the limit, each of these points projects onto a circle centered at the pole with a radius given by the distortion magnitude corresponding to the point's z (Figure 1A, Figure 9).

Occlusion camera rays (unprojection)

Using the projection equation, one can define the rays of the occlusion camera as the loci of scene points that project to the same occlusion camera image plane location. The points that project at an occlusion camera image plane location (u_d, v_d) are found by varying z from z_n to z_f. For a given z, the distortion magnitude d(z) is computed according to Equation 1 and then the 3D point P is computed according to Equation 2:

(u, v) = (u_d, v_d) − d(z) · (u_d − u_0, v_d − v_0) / |(u_d − u_0, v_d − v_0)|
P = Plane(z) ∩ PPHC.Ray(u, v)

Equation 2: Occlusion camera ray.

The distortion direction is now computed using the known distorted coordinates, and the distortion is applied in the opposite direction to find the undistorted coordinates (u, v). The point P is obtained as the point on PPHC ray (u, v) that is at depth z. For z smaller than z_n, the point is found directly on the PPHC ray, without distortion.

Since 1/z is linear in screen space, and since the distortion also varies linearly in 1/z, the rays of the occlusion camera are straight between z_n and z_f. Using Figure 6 again and assuming d_n = 0, the occlusion camera ray at P_d consists of segments (COP, A) and (A, C): the distortion replaces (A, B) with (A, C), and d_f equals the length of segment (C, P_d).

In Equation 2 the undistortion is not allowed to cross the pole: the undistortion amount cannot be larger than the distance from (u_d, v_d) to (u_0, v_0). This effectively clips the ray segments at the pole ray, as seen in Figure 2, left. Not all (u_d, v_d, z) triples correspond to a scene point. Figure 7 shows how the rays of the planar pinhole camera are broken at the near distortion plane: the yellow rectangle on the near distortion plane is gradually morphed into the star shape on the far distortion plane. A larger d_f would collapse the rectangle to a point.

Figure 7: Visualization of an actual occlusion camera.

The occlusion camera sees around an occluder along all image plane directions, and thus has property (1) (see Figure 8). The occlusion camera is not a pinhole camera, since the lines of the ray segments are not concurrent. However, the occlusion camera does not suffer from projection ambiguity (Equation 1) and thus possesses property (2). This is an important difference compared to the non-pinhole camera models used to describe the non-zero aperture of real-world cameras and its consequences for focus and depth of field. Property (4) follows from Equation 1.
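The projection and unprojection can be sketched numerically as follows (a minimal Python sketch; the PPHC projection itself is abstracted away, and the function names are ours):

```python
import numpy as np

def distortion(z, zn, zf, dn, df):
    # Distortion magnitude, linear in 1/z between the two planes (Equation 1).
    t = (1.0 / zn - 1.0 / z) / (1.0 / zn - 1.0 / zf)
    return dn + t * (df - dn)

def project(u, v, z, pole, zn, zf, dn, df):
    # Equation 1: displace the PPHC projection (u, v) away from the pole
    # by d(z). Undefined on the pole ray itself, where p has zero length.
    if z <= zn:
        return u, v                      # closer than zn: undistorted
    p = np.array([u - pole[0], v - pole[1]])
    ud, vd = np.array([u, v]) + distortion(z, zn, zf, dn, df) * p / np.linalg.norm(p)
    return ud, vd

def undistort(ud, vd, z, pole, zn, zf, dn, df):
    # Equation 2: undo the distortion at a known z. The caller then
    # intersects the PPHC ray through (u, v) with the plane at depth z.
    q = np.array([ud - pole[0], vd - pole[1]])
    d = min(distortion(z, zn, zf, dn, df), np.linalg.norm(q))  # clip at the pole ray
    u, v = np.array([ud, vd]) - d * q / np.linalg.norm(q)
    return u, v
```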
The occlusion camera has property (3), as explained next.

4. Occlusion camera image construction

A straightforward way of constructing an occlusion camera image is to render the scene triangles conventionally using the planar pinhole camera PPHC and then to distort every sample created, before z-buffering. This forward-mapping approach has to solve the problem of maintaining surface continuity, a problem that has been studied extensively in IBR [MB95, PZB*00, PEL*00, RL00]. One possible solution is splatting, a technique that replaces the point samples with overlapping surface elements, which prevents gaps. Another solution is to connect the samples using the connectivity implicitly defined in the undistorted image and to rasterize the distorted mesh conventionally.

Figure 8: The occlusion camera captures 5 faces of the cube and the complete background (left). Disocclusion errors are avoided (middle). Right: comparison to using a depth image.

4.1. Occlusion camera triangle rasterization

We avoid the difficulties of forward mapping by rasterizing the triangles directly in the distorted domain, with the following steps:

- Estimate the triangle bounding box in the distorted domain.
- For each pixel (u_d, v_d) in the bounding box:
  - Compute the undistorted (u, v) and z.
  - Z-buffer (u_d, v_d, z).
  - Evaluate the edge equations using (u, v).
  - Compute the color c.
  - Set (u_d, v_d, c).

Since the edges of the distorted triangle are curved, we estimate the bounding box by distorting a few points on the perimeter. The undistorted coordinates (u, v) and z are computed simultaneously according to Equation 3:

1/z(u, v) = A·u + B·v + C
d(u, v) = D·u + E·v + F
(u, v) = (u_d, v_d) − d(u, v) · (u_d − u_0, v_d − v_0) / |(u_d − u_0, v_d − v_0)|
(u, v) = (u_d, v_d) − (D·u + E·v + F) · (u_d − u_0, v_d − v_0) / |(u_d − u_0, v_d − v_0)|

Equation 3: Occlusion camera triangle rasterization.

For a given triangle t, 1/z is linear in undistorted screen space. The coefficients A, B, and C are computed by solving the linear system of three equations 1/z_ti = A·u_ti + B·v_ti + C, where i = 0, 1, 2 and (u_ti, v_ti, z_ti) is the PPHC projection of vertex i of t. The 1/z expression is plugged into the distortion magnitude equation (see Equation 1) to compute D, E, and F; these coefficients define the linear variation of the distortion magnitude in undistorted screen space. The undistorted coordinates (u, v) are the distorted coordinates (u_d, v_d) minus the distortion vector, whose direction is known since (u_d, v_d) and (u_0, v_0) are known. The last of the four equations is a linear system of two equations with two unknowns u and v, which, once known, are used to recover z from the first equation.

Once the triangle sample is known, rasterization proceeds as usual. Edge equations are evaluated in the undistorted domain, where edges are straight. Equation 3 provides a unique triangle-plane point for each (u_d, v_d) pair, except for the case in which the plane is aligned with one of the occlusion camera rays (a silhouette triangle). Silhouette triangles collapse to a segment, just as they do for regular pinhole cameras. (A silhouette triangle in the undistorted image is a triangle whose plane passes through the center of projection of the planar pinhole camera.) The unique solution in the general case allows establishing a backward mapping from the distorted image plane to the triangle plane. The most expensive part of the rasterization is the 2D vector normalization required to compute the distortion direction at each pixel. We have implemented the rasterization algorithm described above on the GPU.

Figure 9: Occlusion camera images generated on the GPU at 11 fps (bunny, 70 Ktris) and at 3 fps / 0.6 fps (Happy Buddha, 293 Ktris / 1 Mtris).

4.2. Hardware implementation

Implementing a novel rasterization algorithm requires programmability at the triangle level, but existing GPUs feature programmability only at the vertex and fragment levels. Vertex programs cannot access the vertex data of the other two vertices shared by a triangle, so the data of all three vertices is passed to the GPU with every vertex. Another difficulty comes from the fact that the edges of the projected triangle are curved. Our solution is to render each triangle by issuing a quad drawing command that rasterizes the bounding box of the curved-edge triangle. The vertex program computes:

- the 2D vertices of the quad, as the distorted-domain bounding box of a set of triangle perimeter points,
- the coefficients D, E, F (see Equation 3),
- the edge equations,
- and all the other linear expressions needed for regular rasterization (i.e. screen-space or model-space interpolation of texture coordinates, color, and normals).
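As a concrete illustration of Equation 3, here is a CPU-side sketch of the per-pixel undistortion that the fragment program evaluates (Python; the coefficient-fitting helper and all names are ours, not the paper's GPU code):

```python
import numpy as np

def fit_linear(p0, p1, p2):
    # Fit w = X*u + Y*v + Z through three (u, v, w) samples, e.g. the 1/z
    # values at the three PPHC-projected vertices of a triangle.
    m = np.array([[p[0], p[1], 1.0] for p in (p0, p1, p2)])
    return np.linalg.solve(m, np.array([p[2] for p in (p0, p1, p2)]))

def undistort_pixel(ud, vd, pole, abc, def_):
    # abc  = (A, B, C): 1/z = A*u + B*v + C in undistorted screen space.
    # def_ = (D, E, F): distortion magnitude d = D*u + E*v + F.
    A, B, C = abc
    D, E, F = def_
    n = np.array([ud - pole[0], vd - pole[1]])
    n /= np.linalg.norm(n)                      # distortion direction (per pixel)
    # (u, v) = (ud, vd) - (D*u + E*v + F) * n  ->  a 2x2 linear system in (u, v)
    M = np.array([[1.0 + D * n[0], E * n[0]],
                  [D * n[1], 1.0 + E * n[1]]])
    rhs = np.array([ud - F * n[0], vd - F * n[1]])
    u, v = np.linalg.solve(M, rhs)
    z = 1.0 / (A * u + B * v + C)               # recover z from the first equation
    return u, v, z
```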
Using the parameters passed by the vertex program, the fragment program computes the undistorted coordinates (u, v), performs the sidedness tests, shades and textures the fragment, and then returns color and z. Figure 9 shows examples of occlusion camera images rendered with our GPU programs.

5. Rendering with occlusion-camera images

Triangle mesh rendering

The occlusion camera image stores a single layer of depth-and-color samples. Once the occlusion camera image is constructed, we read the color and z-buffers back and build a triangle mesh by unprojecting the samples to create 3D vertices (Equation 2). The mesh is rendered in hardware with per-vertex color and can be re-lit. We have also implemented a technique for rendering with the occlusion camera image that avoids reading back to main memory: the reference image is transferred to GPU memory and processed with a vertex program that undistorts the depth and color samples, forming triangles that are rasterized to render the desired view.
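The readback path can be sketched as follows, assuming an unproject(u_d, v_d, z) that implements Equation 2 and returns a 3D point (Python; names ours):

```python
import numpy as np

def oci_to_mesh(depth, unproject):
    # Unproject every sample of the single-layer occlusion-camera image to
    # a 3D vertex, then triangulate using the implicit grid connectivity,
    # exactly as for a regular depth image.
    h, w = depth.shape
    verts = np.array([unproject(ud, vd, depth[vd, ud])
                      for vd in range(h) for ud in range(w)])
    tris = []
    for r in range(h - 1):
        for c in range(w - 1):
            i = r * w + c
            tris.append((i, i + 1, i + w))          # upper-left triangle
            tris.append((i + 1, i + w + 1, i + w))  # lower-right triangle
    return verts, np.array(tris)
```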

Figure 10: Set difference on occlusion camera images.

Merging

An occlusion camera image is less prone to disocclusion errors than a regular depth image, since it also stores samples that are close to the silhouette as seen from the reference view. Such samples are visible in nearby views, filling in the gaps that would otherwise form and extending the range of views for which the reference image is usable.

Occlusion camera images do not guarantee that all needed samples are present; it can happen that even some samples visible from the reference view are missing from the occlusion camera image. In the Buddha occlusion camera image shown in Figure 9, the feet of the statue are not visible, although they are part of a regular depth image. We avoid this problem by rendering a regular depth image DI together with the samples of the corresponding occlusion camera image OCI that are not part of DI. We have implemented an occlusion-camera-image set-difference operator, which also accepts depth images as operands since they are a special case of occlusion camera image: a sample in the first image is unprojected, and the resulting 3D point is projected onto the image plane of the second image; if the z values are similar, the sample is marked as shared (Figure 10; a minimal sketch is given below). Rendering DI + (OCI − DI) guarantees that no DI samples are lost while avoiding disocclusion errors (Figures 11, 12, and 13).

6. Results, discussion, and future work

We have presented a technique for alleviating disocclusion errors. Occlusion-camera images share the advantages of depth images: they have a single layer, which bounds the complexity (even when used in conjunction with a depth image) and provides implicit connectivity, and they are constructed and used in hardware. The distortion parameter allows handling occlusions on a continuous scale and provides an effective heuristic for deciding whether a surface point is likely to be visible in nearby views.

Our current GPU implementation constructs occlusion camera images from geometry at an average rate of 700 Ktris per second (see Figure 9). The timing data reported in this paper was measured on a Pentium 4 2 GHz 1 GB system with an NVIDIA GeForce 6800 256 MB AGP graphics card. The models used are courtesy of the Stanford 3D Scanning Repository [Sta05]. The occlusion camera image construction time is dominated by the time spent in the fragment program. We will investigate speeding up our implementation by deriving a tighter approximation of the bounding box of the distorted triangle.

Figure 11: Samples contributed by the OCI are shown in pink in the middle image.

The 256 MB of GPU memory currently allow processing 1 Mtris at a time. The lack of programmability at the triangle level prevents us from reusing vertices in the case of shared-vertex meshes and forces us to draw triangles individually. Moreover, the vertex data needs to be replicated four times, once for each vertex of the quad needed to rasterize the curved-edge triangle. Removing this programmability limitation would provide a considerable reduction in memory needs. The Thai statue shown in Figure 13 has 10 Mtris and its occlusion camera image was constructed on the CPU in 20 minutes. The original model is rendered by the fixed graphics pipeline at 2 fps; the DI + (OCI − DI) samples are rendered at refresh rate (40 fps on our system).
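A minimal sketch of the set-difference operator described under Merging (Python; the sample-access and projection interface is an assumption of ours, and eps stands in for the unspecified depth-similarity tolerance):

```python
def oci_set_difference(oci, di, eps):
    # Keep only the OCI samples that the depth image DI does not already
    # store; rendering DI plus the kept samples realizes DI + (OCI - DI).
    kept = []
    for (ud, vd, z, color) in oci.samples():
        p = oci.unproject(ud, vd, z)   # 3D point of the OCI sample (Equation 2)
        u2, v2, z2 = di.project(p)     # reproject into the depth image
        shared = di.inside(u2, v2) and abs(z2 - di.depth_at(u2, v2)) < eps
        if not shared:
            kept.append((ud, vd, z, color))
    return kept
```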
The occlusion camera gathers samples all around the silhouette of the occluder without requiring rendering the scene multiple times, and without decreasing the sampling rate along the main view direction. Multiple-depth-image approaches, including LDIs, require rendering the scene several times: typically an LDI is built from at least four reference images captured from viewpoints that box the reference viewpoint, and an additional central image is needed to satisfy the sampling rate requirement.

In this paper we have considered the case of occluders. In order to address disocclusion errors occurring at the edges of a portal (e.g. a window in an architectural model), the maximum distortion should occur on the near plane and no distortion should occur on the far plane. The frame of the portal is enlarged in the occlusion camera reference image, and more of the interior of the adjacent cell is sampled.

Path prediction can also be used to tune the occlusion camera model for improved disocclusion error avoidance. A predicted future view defines epipolar lines in the current view, and all occlusions and disocclusions occur along epipolar lines. Placing the pole at the epipole generates occlusion camera images that alleviate most disocclusion errors.

Our current approach requires handling objects individually. In the single-object case good results are obtained by placing the distortion pole at the projection of the center of the object. Handling complex inside-looking-out scenes by subdivision into objects can lead to a high depth complexity.

Figure 12: Happy Buddha model. The same occlusion camera reference image alleviates disocclusion errors on all sides of the statue.

We will investigate more complex occlusion camera models, generated by 3D distortions controlled by more than one pole and also by line segments and curves in the image plane. The ray pattern of the occlusion camera and the regular images produced indicate that the approach could probably also support several geometry-processing tools, such as view-independent simplification, surface parameterization, and 3D morphing.

Figure 13: Thai statue model.

7. Acknowledgments

This work was supported by the Purdue University Visualization Center and by NSF grants SCI-0417458 and CCR-0306214. We thank the anonymous reviewers.

References

[CBL99] C.-F. Chang, G. Bishop, and A. Lastra. LDI Tree: A Hierarchical Representation for Image-Based Rendering. Proc. SIGGRAPH 99 (1999).
[GGS*96] S. Gortler, R. Grzeszczuk, R. Szeliski, M. Cohen. The Lumigraph. Proc. SIGGRAPH 96, 43-54 (1996).
[GH97] R. Gupta, R. I. Hartley. Linear Pushbroom Cameras. IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 19, no. 9 (1997), 963-975.
[LH96] M. Levoy and P. Hanrahan. Light Field Rendering. Proc. SIGGRAPH 96, 31-42 (1996).
[MB95] L. McMillan and G. Bishop. Plenoptic Modeling: An Image-Based Rendering System. Proc. SIGGRAPH 95, 39-46 (1995).
[MMB97] W. Mark, L. McMillan, G. Bishop. Post-Rendering 3D Warping. Proc. 1997 Symposium on Interactive 3D Graphics, Providence, Rhode Island, April 27-30, 1997.
[MO95] N. Max and K. Ohsaki. Rendering Trees from Precomputed Z-Buffer Views. Rendering Techniques 95: Proc. Eurographics Rendering Workshop 1995, 45-54, Dublin, June 1995.
[Paj02] T. Pajdla. Geometry of Two-Slit Camera. Research Report CTU-CMP-2002-02, 2002.
[PEL*00] V. Popescu, J. Eyles, A. Lastra, et al. The WarpEngine: An Architecture for the Post-Polygonal Age. Proc. ACM SIGGRAPH 2000.
[PL01] V. Popescu, A. Lastra. The Vacuum Buffer. Proc. ACM Symposium on Interactive 3D Graphics, Chapel Hill, 2001.
[PZB*00] H. Pfister, M. Zwicker, J. van Baar, M. Gross. Surfels: Surface Elements as Rendering Primitives. Proc. SIGGRAPH 2000, 335-342 (2000).
[RB98] P. Rademacher, G. Bishop. Multiple-Center-of-Projection Images. Proc. ACM SIGGRAPH 98 (1998), 199-206.
[RL00] S. Rusinkiewicz, M. Levoy. QSplat: A Multiresolution Point Rendering System for Large Meshes. Proc. SIGGRAPH 2000.
[SGH98] J. Shade, S. Gortler, L. He, et al. Layered Depth Images. Proc. SIGGRAPH 98, 231-242 (1998).
[Sta05] The Stanford 3D Scanning Repository. http://graphics.stanford.edu/data/3dscanrep/
[WAA*00] D. N. Wood, D. I. Azuma, K. Aldinger, et al. Surface Light Fields for 3D Photography. Proc. SIGGRAPH 2000, 287-296.
[WFH*97] D. N. Wood, A. Finkelstein, J. F. Hughes, et al. Multiperspective Panoramas for Cel Animation. Proc. ACM SIGGRAPH 97 (1997), 243-250.
[YM04a] J. Yu and L. McMillan. General Linear Cameras. 8th European Conference on Computer Vision (ECCV 2004), Volume 2, 14-27.
[YM04b] J. Yu and L. McMillan. A Framework for Multiperspective Rendering. Proc. Eurographics Symposium on Rendering (EGSR), 2004.