A Computer Vision Approach for 3D Reconstruction of Urban Environments


P. Mordohai 1, J.-M. Frahm 1, A. Akbarzadeh 2, B. Clipp 1, C. Engels 2, D. Gallup 1, P. Merrell 1, C. Salmi 1, S. Sinha 1, B. Talton 1, L. Wang 2, Q. Yang 2, H. Stewenius 2, H. Towles 1, G. Welch 1, R. Yang 2, M. Pollefeys 1 and D. Nistér 2

1 University of North Carolina, Chapel Hill, NC, USA
{mordohai, jmf, bclipp, gallup, pmerrell, salmi, taltonb, herman, welch, marc}@cs.unc.edu
2 University of Kentucky, Lexington, KY, USA
{amir, engels, qingxiong.yang, stewe}@vis.uky.edu, {lwangd, ryang, dnister}@cs.uky.edu

Abstract

We present an approach for the automatic reconstruction of detailed 3D models of outdoor scenes using computer vision techniques. Our system collects video, GPS and INS data, which are processed in real time to produce geo-registered 3D models that represent the geometry and appearance of the world. These models are generated without manual measurements or markers in the scene and can be used for visualization from arbitrary viewpoints and for the documentation and archiving of large areas. Furthermore, virtual objects and buildings can be inserted into them to generate previews of potential alterations to the environment. The GPS/INS measurements allow us to geo-register the pose of the camera at the time each frame was captured. Several frames showing the same part of the scene serve as input to the processing pipeline, which estimates the positions of points in the world from pixel correspondences and generates the 3D models. These models take the form of a triangular mesh and a set of images that provide texture for the mesh.

1. INTRODUCTION

Detailed 3D models of cities have typically been made from aerial data, in the form of range or passive images combined with other modalities, such as measurements from a Global Positioning System (GPS). While these models can be useful for navigation, they provide little additional information compared to maps in terms of visualization. Buildings and other landmarks cannot be easily recognized, since facades are poorly reconstructed from aerial images due to the highly oblique viewing angles. To achieve high-quality ground-level visualization, one needs to capture data from the ground. Recently, the acquisition of ground-level videos, primarily of cities, using cameras mounted on vehicles has been the focus of several research and commercial initiatives, such as Google Earth and Microsoft Virtual Earth, that aim at generating high-quality visualizations. Such visualizations can be in the form of panoramic mosaics [1,2], simple geometric models [3] or accurate and detailed 3D models such as the ones produced by our method. The construction of mosaics and simple geometric models is not very computationally intensive, but the resulting visualizations are either as simple as the employed models or limited to rotations around pre-specified viewpoints. For instance, the system of Cornelis et al. [3] reconstructs three planes, the ground and two facades parallel to the driving direction, for each frame. While it can provide effective visualization from the perspectives from which the videos were captured, the visual quality from other viewpoints is limited.

In this paper, we investigate the reconstruction of accurate 3D models from a large set of images, with emphasis on urban environments. These models offer considerably more flexible visualization in the form of walk-throughs and fly-overs. Moreover, due to the use of GPS information, the models are accurately geo-located and measurements with a precision of a few cm can be made on them. Examples of ground-level reconstructions from our system can be seen in Figure 1. The massive amounts of data needed to generate 3D models of complete cities pose significant challenges for both the data collection and processing systems, which we address in this paper.

Figure 1: Screenshots of reconstructed models

The video acquisition system consists of eight cameras mounted on a vehicle, with a quadruple of cameras looking to each side. The cameras have a resolution of 1024×768 pixels and a frame rate of 30 Hz. Additionally, the acquisition system employs an Inertial Navigation System (INS) and a GPS, which are synchronized with the cameras, to enable geo-registration of the cameras. As the vehicle is driven through urban environments, the captured video is stored on disk drives in the vehicle. After a capture session, the drives are moved from the vehicle to a computer cluster for processing. Processing entails estimating the vehicle's trajectory, recovering the camera poses at the time instances when frames were captured, reconstructing 3D points based on stereo correspondence and finally creating a model in the form of a polygonal mesh with associated texture maps. We show results on several video sequences that consist of up to hundreds of thousands of frames and also present a quantitative evaluation of the accuracy and completeness of our reconstructions by comparing them with an accurately surveyed model of a building.

This paper is organized as follows: the next section is an overview of related work; Section 3 presents the acquisition system; Section 4 discusses different pose estimation alternatives; Section 5 describes our 3D reconstruction pipeline; Section 6 shows experimental results; and Section 7 concludes the paper.

2. RELATED WORK

The research community has devoted a lot of effort to the modeling of man-made environments using a combination of sensors and modalities. Here, we briefly review work relying on ground-based imaging, since it is more closely related to our project. An equal, if not larger, volume of work exists for aerial imaging. The goal is the accurate reconstruction of urban or archaeological sites, including both geometry and texture, in order to obtain models useful for visualization, for quantitative analysis in the form of measurements at large or small scales, and potentially for studying their evolution through time. A natural choice to satisfy the requirement of modeling both geometry and appearance is the combined use of active range scanners and digital cameras. Stamos and Allen [4] used such a combination, while also addressing the problems of registering the two modalities, segmenting the data and fitting planes to the point cloud. El-Hakim et al. [5] propose a methodology for selecting the most appropriate modality among range scanners, ground and aerial images and CAD models. Früh and Zakhor [6] developed a system that is similar to ours, since it is also mounted on a vehicle and captures large amounts of data in continuous mode, in contrast to the previous approaches, which captured data from a few distinct viewpoints. Their system consists of two laser scanners, one for map construction and registration and one for geometry reconstruction, and a digital camera for texture acquisition. A system with a similar configuration, but smaller size, that also operates in continuous mode was presented by Biber et al. [7]. Laser scanners have the advantage of providing accurate 3D measurements directly. On the other hand, they can be cumbersome and expensive. Researchers in photogrammetry and computer vision address the problem of reconstruction relying solely on passive sensors (cameras) in order to increase the flexibility of the system while decreasing its size, weight and cost. The challenges are due mostly to the well-documented inaccuracies of 3D reconstruction from 2D measurements, a problem that has received a lot of attention in both computer vision and photogrammetry. Due to the size of the relevant literature, we refer interested readers to the books by Faugeras [8] and Hartley and Zisserman [9] for computer vision based treatments and to the Manual of Photogrammetry [10] for the photogrammetric approach. It should be noted that, while close-range photogrammetry solutions would be effective for 3D modeling of individual buildings, they are often interactive and would not efficiently scale to a complete city.

Among the first attempts at 3D reconstruction of large-scale environments using computer vision techniques was the MIT City Scanning project [11]. The goal of the project was the construction of panoramas from a large collection of images of the MIT campus. A semi-automatic approach, under which simple geometric primitives are fitted to the data, was proposed by Debevec et al. [12]. Compelling models can be reconstructed even though fine details are not modeled but treated as texture instead. Rother and Carlsson [13] show that multiple-view reconstruction can be formulated as a linear estimation problem given a known fixed plane that is visible in all images. Werner and Zisserman [14] presented an automatic method, inspired by [12], that fits planes and polyhedra to sparse reconstructed primitives by examining the support they receive via a modified version of the space sweep algorithm [15].

Research on large-scale urban modeling includes the 4D Atlanta project described by Schindler and Dellaert in [16], which also examines the evolution of the model through time. Recently, Cornelis et al. [3] presented a system for real-time modeling of cities that employs a very simple model for the geometry of the world. Specifically, they assume that the scenes consist of three planes: the ground and two facades orthogonal to it. A stereo algorithm that matches vertical lines across two calibrated cameras is used to reconstruct the facades, while objects that are not consistent with the facades or the ground are suppressed.

We approach the problem using passive sensors only, building upon the experience from the intensive study of structure from motion and shape reconstruction within the computer vision community [17,18,19,20,21]. The emphasis in our project is on the development of a fully automatic system that is able to operate in continuous mode without user intervention or the luxury of precise viewpoint selection, since capturing is performed from a moving vehicle and is thus constrained to the vantage points of the streets. Our system design is also driven by the performance goal of being able to process the large video datasets in a time at most equal to that of acquisition.

3. ACQUISITION SYSTEM AND CALIBRATION

In this section, we describe the hardware used in our data acquisition system, which records video, GPS and INS data to a set of hard drives. Each video frame and sensor measurement is associated with a timestamp, which serves to synchronize all data for further processing. We then give an overview of the calibration procedure that is employed to estimate the intrinsic parameters of the cameras as well as their position and orientation relative to the vehicle.

3.1 Data acquisition system

The on-vehicle video acquisition system consists of two main sub-systems: an eight-camera digital video recorder (DVR) and an Applanix GPS/INS (model POS LV) navigation system. The DVR streams the raw images to disk, and the Applanix system tracks the vehicle's position and orientation. The DVR is built with eight Point Grey Research (PGR) Flea color cameras, with one quadruple of cameras for each side of the vehicle. Each camera has a field of view of approximately 40°×30°, and within a quadruple the cameras are arranged with minimal overlap in field of view.

As shown in Figure 2, three cameras are mounted in the same plane to create a horizontal field of view of approximately 120 degrees. The fourth camera is tilted upward to create an overall vertical field of view of approximately 60 degrees together with the side-looking camera.

Figure 2: Left: A camera quadruple such as the ones used in our system. Right: A frame of each camera arranged according to the quadruple configuration.

The cameras are interfaced via IEEE-1394 connectors to eight Windows-Intel VME-based processor modules from Concurrent Technologies, each with two high-capacity SATA data drives. The eight-camera DVR is capable of streaming Bayer-pattern images to disk at 30 Hz. The CCD exposure on all cameras is synchronized using IEEE-1394 synchronization units from PGR. With each video frame recorded, one of the VME processor modules also records a GPS-timestamp message from the Applanix navigation system. These messages are synchronized to the CCD exposure by means of an external TTL signal output by one of the cameras. The GPS timestamps are used to recover the position and orientation of the vehicle at the corresponding time from the post-processed trajectory. Recovering camera positions and orientations, however, requires knowledge of the hand-eye calibration between the cameras and the GPS/INS system, which is established during system calibration.

3.2 Calibration

The first stage of calibration is the estimation of the intrinsic parameters of each camera, which determine the mapping of rays emanating from the optical center to pixels in the image. These parameters include the focal length, aspect ratio and principal point, as well as nonlinear parameters such as lens distortion.

All parameters are jointly estimated using a planar calibration object and applying Zhang's algorithm [22] as implemented by Ilie and Welch [23]. During video collection, we set the camera gains to the automatic setting to adjust for lighting variations. The changes in gain are handled by the reconstruction module described in Section 5.

We do not attempt to estimate the external relationships between the cameras, since there is little or no overlap in their fields of view. Instead, we estimate the calibration of the cameras relative to the GPS/INS coordinate system. This process is termed hand-eye or lever-arm calibration. The hand-eye transformation, which consists of a rotation and a translation between the camera coordinate system and the vehicle coordinate system (given in UTM coordinates by the GPS sensor), is computed by detecting the positions of a few surveyed points in a few frames taken from different viewpoints. The pose of the camera can be determined for each frame from the projections of the known 3D points, and the pose of the vehicle is given by the sensor measurements. The optimal linear transformation can be computed using this information. After the hand-eye transformation for each camera has been computed, all 3D positions and orientations of points, surfaces or cameras can be transformed to UTM coordinates. This is accomplished using the timestamps, which serve as an index into the vehicle's trajectory, to retrieve the vehicle's pose at the time instance an image was captured. We can then compute the pose of a particular camera by applying the appropriate hand-eye transformation to the vehicle pose.
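To make this composition concrete, the following sketch shows how a camera pose in UTM coordinates could be assembled from the two transforms described above. It is an illustration only: the function name and the use of 4×4 homogeneous matrices are our own choices, and the trajectory lookup by timestamp is assumed to happen elsewhere.

```python
import numpy as np

def camera_pose_utm(T_vehicle_to_utm: np.ndarray,
                    T_camera_to_vehicle: np.ndarray) -> np.ndarray:
    """Compose the timestamped vehicle pose with the hand-eye transform.

    T_vehicle_to_utm   : 4x4 rigid transform taking vehicle-frame points
                         to UTM coordinates (looked up in the post-processed
                         trajectory via the frame's GPS timestamp).
    T_camera_to_vehicle: 4x4 hand-eye transform estimated during
                         calibration for this particular camera.
    Returns the 4x4 camera-to-UTM pose.
    """
    return T_vehicle_to_utm @ T_camera_to_vehicle

# A reconstructed point X_cam (in camera coordinates) is then geo-registered
# with a single matrix-vector product:
#   X_utm_h = camera_pose_utm(T_v, T_he) @ np.append(X_cam, 1.0)
```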

4. POSE ESTIMATION

Accurate camera poses are critical for 3D reconstruction, during which the depth of each pixel is estimated. The reconstructed 3D points are placed on the ray going through the optical center and the corresponding pixel, at a distance from the optical center equal to the estimated depth. If the camera pose is not estimated correctly, these points will be misplaced and the resulting 3D model will be distorted or completely incomprehensible. In this section, we present three alternatives for estimating camera pose, with varying degrees of reliance on the GPS/INS measurements and vision-based tracking. It should be pointed out that, since the cameras have very little or no overlap, our system can be viewed as eight single-camera systems, unless a sophisticated multi-camera tracking module is employed.

The first option is to estimate camera poses relying solely on image data. This process is called structure from motion and has received a lot of attention in the computer vision literature [24,25,26,17,18,19,20]. Processing begins by automatically detecting 2D features in a frame and locating the corresponding features in subsequent frames. For this purpose, we use an implementation of the KLT tracking algorithm [27] on the Graphics Processing Unit (GPU) [28], which can achieve throughputs faster than 30 Hz. Since the intrinsic parameters of the cameras are known, we can use Nistér's five-point algorithm [19], which is very fast and robust to degenerate scene configurations. This option produces a trajectory for each camera which is Euclidean in the sense that angles and length ratios are preserved between the actual and estimated poses. A global rotation, translation and scaling transformation is necessary to align the camera trajectory and the models with the world. The inability to geo-register the model and determine its absolute size, along with the drift that typically occurs in vision-based tracking, are the limitations of this option. Its advantage is that it does not require the expensive INS sensor and thus reduces the cost of the system significantly.

The second option is to combine vision-based pose estimation with a low-end GPS sensor. Unlike vision-based tracking, GPS sensors are globally accurate, typically within 2-3m, but susceptible to short-term drift. The advantage of this option is that it combines two measurements with complementary characteristics: it can benefit from the small-scale consistency of the vision measurements and the large-scale consistency of the GPS data. This option produces geo-registered models of known size while keeping the cost of the system in a reasonable range.

The third option, which is the one we employ for most of the results shown in this paper, is to rely on a very accurate GPS/INS system to compute the trajectory. Our experiments have shown that the Applanix system can provide pose estimates that are accurate enough to be used as inputs for 3D reconstruction. Even though these estimates could be improved to some extent using vision-based measurements, we use this alternative because it enables our system to achieve real-time performance by bypassing vision-based pose estimation.
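As an illustration of the first, purely vision-based option, the sketch below chains KLT tracking with essential-matrix estimation using OpenCV. OpenCV's RANSAC-based findEssentialMat is built on a five-point minimal solver in the spirit of [19]; this is of course not the paper's GPU pipeline, and the feature-detection and RANSAC parameters are illustrative choices.

```python
import cv2
import numpy as np

def relative_pose(img0, img1, K):
    """Estimate relative camera motion between two calibrated frames.

    img0, img1 : consecutive BGR frames; K : 3x3 intrinsic matrix.
    Features are tracked from img0 to img1 with pyramidal KLT, then the
    essential matrix is estimated with RANSAC and decomposed into R, t.
    """
    g0 = cv2.cvtColor(img0, cv2.COLOR_BGR2GRAY)
    g1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
    pts0 = cv2.goodFeaturesToTrack(g0, maxCorners=1000,
                                   qualityLevel=0.01, minDistance=8)
    pts1, status, _ = cv2.calcOpticalFlowPyrLK(g0, g1, pts0, None)
    ok = status.ravel() == 1                      # keep successful tracks
    pts0, pts1 = pts0[ok], pts1[ok]
    E, inliers = cv2.findEssentialMat(pts0, pts1, K, method=cv2.RANSAC,
                                      prob=0.999, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, pts0, pts1, K, mask=inliers)
    return R, t
```

Note that t is recovered only up to scale, which is exactly why the global rotation, translation and scaling transformation (or the GPS data) mentioned above is needed to geo-register the trajectory.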

5. 3D RECONSTRUCTION

The 3D reconstruction module consists of three parts: multiple-view stereo, which computes depthmaps given images and the corresponding camera poses; depthmap fusion, which merges the depthmaps; and model construction, which generates the 3D mesh and defines the mapping of texture onto the faces of the mesh. Note that the video sequence of each camera is processed separately here.

5.1 Multiple-view stereo

The multiple-view stereo module takes as input the camera poses and images from a single video stream and produces a depthmap for each frame. It uses a modified version of the plane-sweep algorithm of Collins [15], which is an efficient multi-image matching technique. Conceptually, a plane is swept through space in steps along a predefined direction, typically parallel to the optical axis. At each position of the plane, all views are projected onto it. If a point on the plane is at the correct depth, then, according to the brightness constancy assumption, all pixels that project onto it should have consistent intensities. We measure this consistency by summing the absolute intensity differences in square aggregation windows defined in the reference image, for which the depthmap is computed. The hypothesis with the minimum cost (sum of absolute differences) is selected as the depth estimate for each pixel. We opted for the plane-sweep algorithm because it can be easily parallelized and is thus amenable to a very fast GPU implementation, as shown by Yang and Pollefeys [29].

We have made several augmentations to the original algorithm to improve the quality of the reconstructions and to take advantage of the structure exhibited by man-made environments, which typically contain planar surfaces that are orthogonal to each other. We begin by identifying the scene's principal plane orientations, combining information from the trajectory of the vehicle as well as the distribution of the tracked features, if vision-based tracking is performed. Once these principal plane orientations are known, we can sweep a plane along each of them. Three orientations are typically sufficient to overcome the bias of a single sweep towards planes orthogonal to the direction of the sweep and to reconstruct slanted surfaces. As mentioned in Section 3, a key feature of our stereo module is that it compensates for global camera gain changes without impacting the real-time performance [30]. This is very important for outdoor applications, where the brightness range often far exceeds the dynamic range of the camera.
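The following sketch illustrates the core of a single fronto-parallel sweep with SAD costs and window aggregation. It is a minimal CPU version, not the paper's multi-direction GPU implementation; the pose convention (R, t mapping reference-camera coordinates into each other view) and the parameter values are assumptions made for the example.

```python
import cv2
import numpy as np

def plane_sweep_depthmap(ref, others, K, poses, depths, win=9):
    """Minimal fronto-parallel plane-sweep stereo (winner-take-all).

    ref    : float32 grayscale reference image.
    others : list of float32 grayscale images from nearby frames.
    K      : 3x3 intrinsic matrix; poses : list of (R, t) taking
             reference-camera coordinates into each other camera.
    depths : depth hypotheses (plane positions along the optical axis).
    """
    Kinv = np.linalg.inv(K)
    n = np.array([0.0, 0.0, 1.0])          # plane normal = optical axis
    h, w = ref.shape
    best_cost = np.full((h, w), np.inf, np.float32)
    depthmap = np.zeros((h, w), np.float32)
    for d in depths:
        cost = np.zeros((h, w), np.float32)
        for img, (R, t) in zip(others, poses):
            # Homography induced by the plane n.X = d (reference frame):
            # maps reference pixels to pixels of the other view.
            H = K @ (R + np.outer(t, n) / d) @ Kinv
            warped = cv2.warpPerspective(
                img, H, (w, h),
                flags=cv2.INTER_LINEAR | cv2.WARP_INVERSE_MAP)
            cost += np.abs(ref - warped)   # absolute intensity differences
        cost = cv2.boxFilter(cost, -1, (win, win))  # window aggregation
        better = cost < best_cost                   # keep cheapest plane
        depthmap[better] = d
        best_cost[better] = cost[better]
    return depthmap
```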

The stereo module is invoked to produce a depthmap for, typically, every frame, using between 4 and 10 other images to evaluate the cost of each depth estimate. Because it is called this frequently, stereo consumes the largest portion of our computational resources. Nevertheless, it can achieve a throughput of more than 100 frames per second using a single high-end commodity GPU.

5.2 Depthmap fusion

Multiple-view stereo provides a depthmap for every frame. Since we do not enforce consistency between depthmaps during stereo, we need to enforce it in a separate depthmap fusion step. Fusion serves two purposes: it improves the quality of the depth estimates by ensuring that estimates for the same point are consistent across multiple depthmaps, and it produces more economical representations by merging multiple depth estimates into one. Related work includes volumetric [31] and viewpoint-based [32,33] approaches. We select a viewpoint-based approach, since a volumetric one would be impractical due to memory and processing requirements. Given depthmaps from a set of consecutive frames, the depthmap fusion step resolves conflicts between computed depths and produces a depthmap in which most of the noise has been removed. By dividing the processing into two stages we are able to achieve very high throughput, mostly because we are able to use a computationally cheap stereo algorithm and because both stages are suitable for hardware-accelerated GPU implementations.

Fusion begins by rendering all depthmaps onto the reference depthmap, which is typically the central one. Visibility reasoning is used to detect conflicts and determine which depth estimates are the most consistent. Our approach considers three types of visibility relations for the hypothesized depths in the reference view with respect to depth estimates originating from other views. These are illustrated in Figure 3. In view i, the observed surface passes behind A, since A′ is the depth observed along that ray in view i. There is a conflict between these two measurements, since view i would not be able to observe A′ if there were truly a surface at A. We say that A violates the free space of A′. The second relation occurs between B and B′, which are in close proximity of each other. We say that these depth estimates support each other.

The third relationship is illustrated by C and C′. C′, which is rendered from view i into the reference view, is in front of C. There is a conflict between these two measurements, since it would not be possible to observe C if there were truly a surface at C′. We say that C′ occludes C.

Figure 3: Illustration of visibility reasoning. A, B and C are depth estimates of the reference depthmap and A′, B′ and C′ are depth estimates of depthmap i rendered onto the reference depthmap.

We have developed a rigorous formulation based on the notion of stability of a depth estimate, which aims to determine the validity of a depth estimate by rendering multiple depthmaps into the reference view, as well as the reference depthmap into the other views, in order to detect occlusions and free-space violations. We have also developed an approximate alternative formulation that selects and validates only one hypothesis, based on the accumulated confidence of the most likely depth candidate. Both algorithms enable us to perform video-based reconstruction at tens of frames per second. Details have been omitted due to lack of space.
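A minimal sketch of the support/conflict reasoning described above is given below. It assumes the neighboring depthmaps have already been rendered into the reference view, and it uses a simple vote between supports and conflicts rather than the paper's stability or confidence formulations; the tolerance eps is an assumed parameter.

```python
import numpy as np

def fuse_depths(ref_depth, rendered, eps=0.02):
    """Approximate visibility-based depthmap fusion (a sketch).

    ref_depth : (H, W) depth hypotheses of the reference view.
    rendered  : (N, H, W) depthmaps of N nearby views rendered into the
                reference view (the rendering step is assumed done).
    A rendered estimate within a relative tolerance eps supports the
    hypothesis; one in front of it occludes it; one behind it means the
    hypothesis violates that view's free space. Pixels whose supports do
    not outnumber conflicts are discarded (set to 0); accepted depths are
    replaced by the mean of the hypothesis and its supporting estimates.
    """
    rel = (rendered - ref_depth) / np.maximum(ref_depth, 1e-6)
    valid = rendered > 0
    support = valid & (np.abs(rel) <= eps)
    occlusion = valid & (rel < -eps)   # rendered surface in front of hypothesis
    free_space = valid & (rel > eps)   # hypothesis blocks the other view's ray
    n_sup = support.sum(axis=0)
    n_con = occlusion.sum(axis=0) + free_space.sum(axis=0)
    keep = n_sup > n_con
    sup_sum = np.where(support, rendered, 0.0).sum(axis=0)
    return np.where(keep, (sup_sum + ref_depth) / (n_sup + 1), 0.0)
```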

5.3 Model construction

The depthmap fusion step generates a depth estimate for each pixel of the reference view, which is passed to the model construction module along with the image taken from that viewpoint. The output is a model in the form of a set of triangles in 3D whose faces are piecewise planar approximations of the scene surfaces. Each triangle is associated with a part of the image to add appearance to the geometry. Exploiting the fact that the image plane can serve as a reference for both geometry and appearance, we can construct the triangular mesh rapidly. We employ a multi-resolution quad-tree algorithm in order to minimize the number of triangles while maintaining geometric accuracy, as in [34]. The vertices of the model are constructed by placing points in 3D along the rays of the reference camera at distances equal to the fused depth estimates. We then split the image into fairly large quads and attempt to fit triangles to the vertices corresponding to the corners of the quads. If the triangles pass a planarity test they are kept; otherwise the quad is subdivided and the process is repeated at a finer resolution. A part of a multi-resolution mesh can be seen in Figure 4. The known correspondence between the vertices of the model and pixel positions in the images allows us to determine where each triangle projects, and thus which part of the image to use for its appearance. The model construction stage also suppresses the generation of duplicate representations of the same surface. This is necessary since one cannot predict a priori whether a surface is visible in more than one fused depthmap.

Figure 4: Left: Input image. Right: Illustration of multi-resolution triangular mesh
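The sketch below illustrates the quad-tree subdivision idea: a quad is kept when the fused depths inside it are well approximated by interpolating its corner depths, and is split otherwise. The specific planarity test (relative deviation from bilinearly interpolated corner depths) and the thresholds are our assumptions, not the system's exact criteria; each accepted quad would then be emitted as two textured triangles whose vertices are back-projected along the camera rays.

```python
import numpy as np

def quadtree_quads(depth, x0, y0, size, tol=0.05, min_size=4, out=None):
    """Recursive multi-resolution quad-tree meshing sketch (cf. [34]).

    depth : (H, W) fused depthmap; (x0, y0) top-left corner and size of
    the current quad (size assumed a power of two, quad within bounds).
    Returns a list of (x0, y0, size) quads that passed the planarity test.
    """
    if out is None:
        out = []
    block = depth[y0:y0 + size + 1, x0:x0 + size + 1]
    c = block[0, 0], block[0, -1], block[-1, 0], block[-1, -1]  # corners
    u = np.linspace(0.0, 1.0, block.shape[1])
    v = np.linspace(0.0, 1.0, block.shape[0])
    # Bilinear interpolation of the corner depths over the quad.
    plane = (c[0] * np.outer(1 - v, 1 - u) + c[1] * np.outer(1 - v, u)
             + c[2] * np.outer(v, 1 - u) + c[3] * np.outer(v, u))
    err = np.max(np.abs(block - plane) / np.maximum(block, 1e-6))
    if err <= tol or size <= min_size:
        out.append((x0, y0, size))     # planar enough: keep this quad
    else:
        half = size // 2               # otherwise recurse into 4 children
        for dy in (0, half):
            for dx in (0, half):
                quadtree_quads(depth, x0 + dx, y0 + dy, half,
                               tol, min_size, out)
    return out
```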

6. EXPERIMENTAL RESULTS

Our processing pipeline achieves very high frame rates by leveraging both the CPU and the GPU of a high-end personal computer. An implementation of our software targeted towards speed can generate 3D models at rates faster than those of video collection. This can be achieved with the following settings: GPS/INS-only pose estimation, 7 color images as input to the multiple-view stereo module, which evaluates 48 plane positions, and 11 depthmaps as input to the fusion stage. This configuration achieves a frame rate of more than 50 Hz on a PC with a dual-core AMD Opteron processor at 2.4 GHz and an NVidia GeForce 8800 series GPU. Screenshots of 3D models generated by our system can be seen in Figure 5.

Figure 5: Screenshots of 3D models from urban and sub-urban environments

6.1 Quantitative evaluation

To evaluate the ability of our method to reconstruct urban environments, a 3,000-frame video of the exterior of a Firestone store was captured using two cameras. One of the cameras was pointed horizontally towards the middle and bottom of the building, and the other was pointed up 30 degrees towards the building. The videos from each camera were processed separately. For the evaluation, the Firestone building was surveyed to an accuracy of 6mm and the reconstructed model was directly compared with the surveyed model (Figure 6). Several objects, such as parked cars, are visible in the video but were not surveyed. Several of the doors of the building were left open, causing some of the interior of the building to be reconstructed. Since accurate measurements of these objects were not available in the ground truth, the reconstructed objects were not considered in the evaluation.

For all of the remaining parts, the distance from each point to the nearest point on the ground truth model was calculated to measure the accuracy of the reconstruction. A visualization showing these distances is provided in Figure 6, in which blue, green and red denote errors of 0cm, 30cm and 60cm or above, respectively. A histogram of these distances is also shown in Figure 6. 83% of the points were reconstructed within 5cm of the ground truth model. (The building is 80m by 40m.) We also evaluated the completeness of our model by reversing the roles of the model and the ground truth. In this case, we measure the distance from each point in the ground truth model to the nearest point in the reconstructed model. Points in the ground truth that have been reconstructed poorly, or have not been reconstructed at all, result in large values for this distance. Figure 6 also shows a visualization of the completeness evaluation, using the same color scheme as the accuracy evaluation. Note that some parts of the building remain invisible during the entire sequence.
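The accuracy and completeness measures described above are symmetric nearest-neighbor distances, which can be sketched as follows for point sets (the actual evaluation compares against a surveyed mesh rather than a sampled point cloud).

```python
import numpy as np
from scipy.spatial import cKDTree

def accuracy_completeness(recon_pts, truth_pts, thresh=0.05):
    """Point-to-nearest-point evaluation of a reconstruction.

    recon_pts, truth_pts : (N, 3) and (M, 3) arrays of 3D points.
    Accuracy: distance from each reconstructed point to its nearest
    ground-truth point. Completeness: the same with the roles reversed.
    Also returns the fraction of reconstructed points within thresh
    (e.g. 0.05 m, matching the 5 cm figure quoted above).
    """
    acc = cKDTree(truth_pts).query(recon_pts)[0]   # accuracy distances
    comp = cKDTree(recon_pts).query(truth_pts)[0]  # completeness distances
    return acc, comp, float(np.mean(acc <= thresh))
```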

Figure 6: Reconstruction and evaluation results for the Firestone building. Top left: reconstructed model. Top right: surveyed model. Bottom left: accuracy evaluation (blue, green and red denote errors of 0cm, 30cm and 60cm or above, respectively). Bottom right: completeness evaluation (color coding same as bottom left).

7. CONCLUSIONS

We have described a system aimed at real-time, detailed, geo-registered 3D urban reconstruction from video captured by a multi-camera system and GPS/INS measurements. The quality of the reconstructions, with respect to both accuracy and completeness, is very satisfactory, especially considering the very high processing speeds we achieve.

The system can scale to very large environments, since it does not require markers or manual measurements, and it is capable of capturing and processing large amounts of data in real time by utilizing both the CPU and GPU. The models can be viewed from arbitrary viewpoints, thus providing more effective visualization than mosaics or panoramas. One can virtually walk around the model at ground level, fly over it in arbitrary directions, and zoom in or out. Since the models are metric, they enable the user to make accurate measurements by selecting points or surfaces on them. Parts of a model can be removed, or other models can be inserted to augment the existing model. The additions can be either reconstructed models from a different location or synthetic; effective visualizations of potential modifications to an area can be generated this way. Finally, our system can be used for archiving and change detection.

Future challenges include a scheme for better utilization of the multiple video streams, to improve the quality of pose estimation and to reconstruct each surface under the optimal viewing conditions while reducing the redundancy of the overall model. Along a different axis, we intend to investigate ways of decreasing the cost and size of our system, by further exploring the implementation that uses a low-end INS and GPS (or even only GPS) instead of the Applanix system, and by compressing the videos on the data acquisition platform.

Acknowledgements

This work has been supported in part by the DARPA UrbanScape program and the DTO VACE program.

References

1. Teller, S., Antone, M., Bodnar, Z., Bosse, M., Coorg, S., Jethwa, M. and Master, N., Calibrated, registered images of an extended urban area, Int. Journ. of Computer Vision, 2003, 53(1).
2. Román, A., Garg, G. and Levoy, M., Interactive design of multiperspective images for visualizing urban landscapes, in IEEE Visualization, 2004.
3. Cornelis, N., Cornelis, K. and Van Gool, L., Fast compact city modeling for navigation pre-visualization, in Int. Conf. on Computer Vision and Pattern Recognition, 2006.

4. Stamos, I. and Allen, P.K., Geometry and texture recovery of scenes of large scale, Computer Vision and Image Understanding, 2002, 88(2).
5. El-Hakim, S.F., Beraldin, J.-A., Picard, M. and Vettore, A., Effective 3D modeling of heritage sites, in 4th International Conference on 3D Imaging and Modeling, 2003.
6. Früh, C. and Zakhor, A., An automated method for large-scale, ground-based city model acquisition, Int. J. of Computer Vision, 2004, 60(1).
7. Biber, P., Fleck, S., Staneker, D., Wand, M. and Strasser, W., First experiences with a mobile platform for flexible 3D model acquisition in indoor and outdoor environments - the Waegele, in ISPRS Working Group V/4: 3D-ARCH.
8. Faugeras, O.D., Three-Dimensional Computer Vision: A Geometric Viewpoint, MIT Press.
9. Hartley, R. and Zisserman, A., Multiple View Geometry in Computer Vision, Cambridge University Press.
10. American Society of Photogrammetry, Manual of Photogrammetry (5th edition), ASPRS Publications.
11. Teller, S., Automated urban model acquisition: Project rationale and status, in Image Understanding Workshop, 1998.
12. Debevec, P., Taylor, C.J. and Malik, J., Modeling and rendering architecture from photographs: A hybrid geometry- and image-based approach, in SIGGRAPH, 1996.
13. Rother, C. and Carlsson, S., Linear multi view reconstruction and camera recovery using a reference plane, Int. J. of Computer Vision, 2002, 49(2-3).
14. Werner, T. and Zisserman, A., New techniques for automated architectural reconstruction from photographs, in European Conf. on Computer Vision, 2002.

15. Collins, R.T., A space-sweep approach to true multi-image matching, in Int. Conf. on Computer Vision and Pattern Recognition, 1996.
16. Schindler, G. and Dellaert, F., Atlanta world: an expectation maximization framework for simultaneous low-level edge grouping and camera calibration in complex man-made environments, in Int. Conf. on Computer Vision and Pattern Recognition, 2004.
17. Pollefeys, M., Koch, R. and Van Gool, L., Self-calibration and metric reconstruction in spite of varying and unknown intrinsic camera parameters, Int. Journ. of Computer Vision, 1999, 32(1).
18. Nistér, D., Automatic dense reconstruction from uncalibrated video sequences, PhD Thesis, Royal Institute of Technology KTH, Stockholm, Sweden.
19. Nistér, D., An efficient solution to the five-point relative pose problem, IEEE Trans. on Pattern Analysis and Machine Intelligence, 2004, 26(6).
20. Pollefeys, M., Van Gool, L., Vergauwen, M., Verbiest, F., Cornelis, K., Tops, J. and Koch, R., Visual modeling with a hand-held camera, Int. Journ. of Computer Vision, 2004, 59(3).
21. Pollefeys, M., Van Gool, L., Vergauwen, M., Cornelis, K., Verbiest, F. and Tops, J., Image-based 3D recording for archaeological field work, Computer Graphics and Applications, 2003, 23(3).
22. Zhang, Z., A flexible new technique for camera calibration, IEEE Trans. on Pattern Analysis and Machine Intelligence, 2000, 22(11).
23. Ilie, A. and Welch, G., Ensuring color consistency across multiple cameras, in Int. Conf. on Computer Vision, 2005.
24. Faugeras, O., Luong, Q.-T. and Maybank, S., Camera self-calibration: theory and experiments, in European Conf. on Computer Vision, 1992.

25. Hartley, R., Euclidean reconstruction from uncalibrated views, in Applications of Invariance in Computer Vision, 1994.
26. Koch, R., Pollefeys, M., Heigl, B., Van Gool, L.J. and Niemann, H., Calibration of hand-held camera sequences for plenoptic modeling, in Int. Conf. on Computer Vision, 1999.
27. Lucas, B.D. and Kanade, T., An iterative image registration technique with an application to stereo vision, in Int. Joint Conf. on Artificial Intelligence, 1981.
28. Sinha, S., Frahm, J.-M., Pollefeys, M. and Genc, Y., Feature tracking and matching in video using programmable graphics hardware, Machine Vision and Applications.
29. Yang, R. and Pollefeys, M., Multi-resolution real-time stereo on commodity graphics hardware, in Int. Conf. on Computer Vision and Pattern Recognition, 2003.
30. Kim, S.J., Gallup, D., Frahm, J.-M., Akbarzadeh, A., Yang, Q., Yang, R., Nistér, D. and Pollefeys, M., Gain adaptive real-time stereo streaming, in Int. Conf. on Vision Systems.
31. Curless, B. and Levoy, M., A volumetric method for building complex models from range images, in SIGGRAPH, 1996.
32. Narayanan, P.J., Rander, P.W. and Kanade, T., Constructing virtual worlds using dense stereo, in Int. Conf. on Computer Vision, 1998.
33. Koch, R., Pollefeys, M. and Van Gool, L.J., Multi viewpoint stereo from uncalibrated video sequences, in European Conf. on Computer Vision, 1998.
34. Pajarola, R., Overview of quadtree-based terrain triangulation and visualization, Tech. Rep. UCI-ICS-02-01, Information & Computer Science, University of California, Irvine, 2002.


More information

Real-Time Video-Based Rendering from Multiple Cameras

Real-Time Video-Based Rendering from Multiple Cameras Real-Time Video-Based Rendering from Multiple Cameras Vincent Nozick Hideo Saito Graduate School of Science and Technology, Keio University, Japan E-mail: {nozick,saito}@ozawa.ics.keio.ac.jp Abstract In

More information

Using temporal seeding to constrain the disparity search range in stereo matching

Using temporal seeding to constrain the disparity search range in stereo matching Using temporal seeding to constrain the disparity search range in stereo matching Thulani Ndhlovu Mobile Intelligent Autonomous Systems CSIR South Africa Email: tndhlovu@csir.co.za Fred Nicolls Department

More information

Lecture 10: Multi-view geometry

Lecture 10: Multi-view geometry Lecture 10: Multi-view geometry Professor Stanford Vision Lab 1 What we will learn today? Review for stereo vision Correspondence problem (Problem Set 2 (Q3)) Active stereo vision systems Structure from

More information

Morphable 3D-Mosaics: a Hybrid Framework for Photorealistic Walkthroughs of Large Natural Environments

Morphable 3D-Mosaics: a Hybrid Framework for Photorealistic Walkthroughs of Large Natural Environments Morphable 3D-Mosaics: a Hybrid Framework for Photorealistic Walkthroughs of Large Natural Environments Nikos Komodakis and Georgios Tziritas Computer Science Department, University of Crete E-mails: {komod,

More information

3-D Shape Reconstruction from Light Fields Using Voxel Back-Projection

3-D Shape Reconstruction from Light Fields Using Voxel Back-Projection 3-D Shape Reconstruction from Light Fields Using Voxel Back-Projection Peter Eisert, Eckehard Steinbach, and Bernd Girod Telecommunications Laboratory, University of Erlangen-Nuremberg Cauerstrasse 7,

More information

Lecture 10: Multi view geometry

Lecture 10: Multi view geometry Lecture 10: Multi view geometry Professor Fei Fei Li Stanford Vision Lab 1 What we will learn today? Stereo vision Correspondence problem (Problem Set 2 (Q3)) Active stereo vision systems Structure from

More information

A Statistical Consistency Check for the Space Carving Algorithm.

A Statistical Consistency Check for the Space Carving Algorithm. A Statistical Consistency Check for the Space Carving Algorithm. A. Broadhurst and R. Cipolla Dept. of Engineering, Univ. of Cambridge, Cambridge, CB2 1PZ aeb29 cipolla @eng.cam.ac.uk Abstract This paper

More information

Video Georegistration: Key Challenges. Steve Blask Harris Corporation GCSD Melbourne, FL 32934

Video Georegistration: Key Challenges. Steve Blask Harris Corporation GCSD Melbourne, FL 32934 Video Georegistration: Key Challenges Steve Blask sblask@harris.com Harris Corporation GCSD Melbourne, FL 32934 Definitions Registration: image to image alignment Find pixel-to-pixel correspondences between

More information

Data Acquisition, Leica Scan Station 2, Park Avenue and 70 th Street, NY

Data Acquisition, Leica Scan Station 2, Park Avenue and 70 th Street, NY Automated registration of 3D-range with 2D-color images: an overview 44 th Annual Conference on Information Sciences and Systems Invited Session: 3D Data Acquisition and Analysis March 19 th 2010 Ioannis

More information

3D Computer Vision. Structured Light II. Prof. Didier Stricker. Kaiserlautern University.

3D Computer Vision. Structured Light II. Prof. Didier Stricker. Kaiserlautern University. 3D Computer Vision Structured Light II Prof. Didier Stricker Kaiserlautern University http://ags.cs.uni-kl.de/ DFKI Deutsches Forschungszentrum für Künstliche Intelligenz http://av.dfki.de 1 Introduction

More information

Large Scale 3D Reconstruction by Structure from Motion

Large Scale 3D Reconstruction by Structure from Motion Large Scale 3D Reconstruction by Structure from Motion Devin Guillory Ziang Xie CS 331B 7 October 2013 Overview Rome wasn t built in a day Overview of SfM Building Rome in a Day Building Rome on a Cloudless

More information

Comparing Aerial Photogrammetry and 3D Laser Scanning Methods for Creating 3D Models of Complex Objects

Comparing Aerial Photogrammetry and 3D Laser Scanning Methods for Creating 3D Models of Complex Objects www.bentley.com Comparing Aerial Photogrammetry and 3D Laser Scanning Methods for Creating 3D Models of Complex Objects A Bentley White Paper Cyril Novel Senior Software Engineer, Bentley Systems Renaud

More information

Model Refinement from Planar Parallax

Model Refinement from Planar Parallax Model Refinement from Planar Parallax A. R. Dick R. Cipolla Department of Engineering, University of Cambridge, Cambridge, UK {ard28,cipolla}@eng.cam.ac.uk Abstract This paper presents a system for refining

More information

Stable Vision-Aided Navigation for Large-Area Augmented Reality

Stable Vision-Aided Navigation for Large-Area Augmented Reality Stable Vision-Aided Navigation for Large-Area Augmented Reality Taragay Oskiper, Han-Pang Chiu, Zhiwei Zhu Supun Samarasekera, Rakesh Teddy Kumar Vision and Robotics Laboratory SRI-International Sarnoff,

More information

DENSE 3D POINT CLOUD GENERATION FROM UAV IMAGES FROM IMAGE MATCHING AND GLOBAL OPTIMAZATION

DENSE 3D POINT CLOUD GENERATION FROM UAV IMAGES FROM IMAGE MATCHING AND GLOBAL OPTIMAZATION DENSE 3D POINT CLOUD GENERATION FROM UAV IMAGES FROM IMAGE MATCHING AND GLOBAL OPTIMAZATION S. Rhee a, T. Kim b * a 3DLabs Co. Ltd., 100 Inharo, Namgu, Incheon, Korea ahmkun@3dlabs.co.kr b Dept. of Geoinformatic

More information

Long-term motion estimation from images

Long-term motion estimation from images Long-term motion estimation from images Dennis Strelow 1 and Sanjiv Singh 2 1 Google, Mountain View, CA, strelow@google.com 2 Carnegie Mellon University, Pittsburgh, PA, ssingh@cmu.edu Summary. Cameras

More information

Mobile Point Fusion. Real-time 3d surface reconstruction out of depth images on a mobile platform

Mobile Point Fusion. Real-time 3d surface reconstruction out of depth images on a mobile platform Mobile Point Fusion Real-time 3d surface reconstruction out of depth images on a mobile platform Aaron Wetzler Presenting: Daniel Ben-Hoda Supervisors: Prof. Ron Kimmel Gal Kamar Yaron Honen Supported

More information

DERIVING PEDESTRIAN POSITIONS FROM UNCALIBRATED VIDEOS

DERIVING PEDESTRIAN POSITIONS FROM UNCALIBRATED VIDEOS DERIVING PEDESTRIAN POSITIONS FROM UNCALIBRATED VIDEOS Zoltan Koppanyi, Post-Doctoral Researcher Charles K. Toth, Research Professor The Ohio State University 2046 Neil Ave Mall, Bolz Hall Columbus, OH,

More information

EE795: Computer Vision and Intelligent Systems

EE795: Computer Vision and Intelligent Systems EE795: Computer Vision and Intelligent Systems Spring 2012 TTh 17:30-18:45 FDH 204 Lecture 14 130307 http://www.ee.unlv.edu/~b1morris/ecg795/ 2 Outline Review Stereo Dense Motion Estimation Translational

More information

3D Modeling of Complex environments

3D Modeling of Complex environments SPIE Proceedings Vol 4309, Videometrics VII, San Jose, Jan 21-26, 2001 3D Modeling of Complex environments Sabry F. El-Hakim Visual Information Technology (VIT) Group Institute For Information Technology,

More information

Dense 3D Reconstruction from Autonomous Quadrocopters

Dense 3D Reconstruction from Autonomous Quadrocopters Dense 3D Reconstruction from Autonomous Quadrocopters Computer Science & Mathematics TU Munich Martin Oswald, Jakob Engel, Christian Kerl, Frank Steinbrücker, Jan Stühmer & Jürgen Sturm Autonomous Quadrocopters

More information

Inexpensive Construction of a 3D Face Model from Stereo Images

Inexpensive Construction of a 3D Face Model from Stereo Images Inexpensive Construction of a 3D Face Model from Stereo Images M. Shahriar Hossain, Monika Akbar and J. Denbigh Starkey Department of Computer Science, Montana State University, Bozeman, MT 59717, USA.

More information

3D Reconstruction from Scene Knowledge

3D Reconstruction from Scene Knowledge Multiple-View Reconstruction from Scene Knowledge 3D Reconstruction from Scene Knowledge SYMMETRY & MULTIPLE-VIEW GEOMETRY Fundamental types of symmetry Equivalent views Symmetry based reconstruction MUTIPLE-VIEW

More information

Capturing and View-Dependent Rendering of Billboard Models

Capturing and View-Dependent Rendering of Billboard Models Capturing and View-Dependent Rendering of Billboard Models Oliver Le, Anusheel Bhushan, Pablo Diaz-Gutierrez and M. Gopi Computer Graphics Lab University of California, Irvine Abstract. In this paper,

More information

Vehicle Dimensions Estimation Scheme Using AAM on Stereoscopic Video

Vehicle Dimensions Estimation Scheme Using AAM on Stereoscopic Video Workshop on Vehicle Retrieval in Surveillance (VRS) in conjunction with 2013 10th IEEE International Conference on Advanced Video and Signal Based Surveillance Vehicle Dimensions Estimation Scheme Using

More information

DETECTION OF 3D POINTS ON MOVING OBJECTS FROM POINT CLOUD DATA FOR 3D MODELING OF OUTDOOR ENVIRONMENTS

DETECTION OF 3D POINTS ON MOVING OBJECTS FROM POINT CLOUD DATA FOR 3D MODELING OF OUTDOOR ENVIRONMENTS DETECTION OF 3D POINTS ON MOVING OBJECTS FROM POINT CLOUD DATA FOR 3D MODELING OF OUTDOOR ENVIRONMENTS Tsunetake Kanatani,, Hideyuki Kume, Takafumi Taketomi, Tomokazu Sato and Naokazu Yokoya Hyogo Prefectural

More information

Exterior Orientation Parameters

Exterior Orientation Parameters Exterior Orientation Parameters PERS 12/2001 pp 1321-1332 Karsten Jacobsen, Institute for Photogrammetry and GeoInformation, University of Hannover, Germany The georeference of any photogrammetric product

More information

3D Modeling of Outdoor Environments by Integrating Omnidirectional Range and Color Images

3D Modeling of Outdoor Environments by Integrating Omnidirectional Range and Color Images 3D Modeling of Outdoor Environments by Integrating Omnidirectional Range and Color Images Toshihiro ASAI, Masayuki KANBARA, Naokazu YOKOYA Nara Institute of Science and Technology 8916-5 Takayama Ikoma

More information

[Youn *, 5(11): November 2018] ISSN DOI /zenodo Impact Factor

[Youn *, 5(11): November 2018] ISSN DOI /zenodo Impact Factor GLOBAL JOURNAL OF ENGINEERING SCIENCE AND RESEARCHES AUTOMATIC EXTRACTING DEM FROM DSM WITH CONSECUTIVE MORPHOLOGICAL FILTERING Junhee Youn *1 & Tae-Hoon Kim 2 *1,2 Korea Institute of Civil Engineering

More information

Modeling, Combining, and Rendering Dynamic Real-World Events From Image Sequences

Modeling, Combining, and Rendering Dynamic Real-World Events From Image Sequences Modeling, Combining, and Rendering Dynamic Real-World Events From Image s Sundar Vedula, Peter Rander, Hideo Saito, and Takeo Kanade The Robotics Institute Carnegie Mellon University Abstract Virtualized

More information