A Desktop 3D Scanner Exploiting Rotation and Visual Rectification of Laser Profiles

Carlo Colombo, Dario Comanducci, and Alberto Del Bimbo
Dipartimento di Sistemi ed Informatica, Via S. Marta 3, I-50139 Florence, Italy
{colombo, comandu, delbimbo}@dsi.unifi.it

Abstract. We describe a low cost system for metric 3D scanning from uncalibrated images based on rotational kinematic constraints. The system is composed of a turntable, an off-the-shelf camera and a laser stripe illuminator. System operation is based on the construction of the virtual image of a surface of revolution (SOR), from which two imaged SOR cross-sections are obtained automatically; internal camera calibration is performed by exploiting the very object being scanned. Shape acquisition is finally obtained by laser profile rectification and collation. Experiments with real data are reported, providing insight into both camera calibration and shape reconstruction performance. System accuracy appears to be adequate for desktop applications.

1 Introduction

The availability of 3D object models has recently disclosed new opportunities in application fields such as medicine, architecture and cultural heritage. The growing demand for 3D models makes the development of low cost acquisition systems necessary. Automatic 3D model acquisition technology has evolved considerably in the last few years; overviews of the field can be found in [1], [3]. Several 3D scanning devices are commercially available, and extensive lists of vendors are maintained at various web sites. The most popular techniques are time-of-flight laser scanning and image-based 3D reconstruction. The latter, which relies on computer vision, can be further subdivided into active and passive methods. Active methods employ structured light projected onto the scene [12]; most commercial products (by Cyberware and Minolta, to cite a few) exploit this approach.
Structured-light scanners are computationally straightforward and very accurate; yet, they typically require expensive components. On the other hand, passive methods are computationally more challenging, but also less expensive. Examples of passive methods are stereo triangulation [7] and geometric scene constraints [8]. All vision-based methods require the pre-processing step of camera calibration. Calibration algorithms exhibit a trade-off between geometric accuracy and flexibility of use. Very high accuracies are typically required for laboratory or industrial applications, and obtained with special and expensive 3D calibration patterns. Conversely, results
from projective geometry have recently been used to develop flexible yet reasonably accurate calibration approaches for desktop vision applications, using prior knowledge about scene structure [1] or camera motion [9], [11], [13]. In this paper, we present a desktop laser-based scanning system which combines the good accuracy of structured light approaches with the low cost of approaches based on self-calibration. The system is composed of a turntable, an off-the-shelf camera (with zero skew, known aspect ratio, and no radial distortion) and a laser stripe illuminator, which makes visible a vertical slice of the rotating object being acquired (Fig. 1).

Fig. 1. Scanning system layout.

System operation is based on the construction, by observation of an object rotating on the turntable, of the virtual image of a surface of revolution (SOR), from which camera self-calibration is carried out using the algorithm described in [5], exploiting the very object being scanned. 3D shape reconstruction is then obtained by metric rectification and collation of the laser profiles extracted from subsequent frames of the sequence. Unlike other vision-based turntable approaches working in unstructured light conditions (see e.g. [11], [6]), no point triangulation or tracking is needed here, so that textureless objects can also be dealt with. The system runs in a completely automatic way, thanks to a multiresolution segmentation algorithm derived from [4] that extracts all the virtual SOR parameters (imaged curves, fixed entities, characteristic homologies) required for the computations. The paper is organized as follows. Section 2 illustrates virtual imaged SOR generation and segmentation. Section 3 addresses system (laser, camera and turntable) calibration from the virtual SOR and other image data.
Metric 3D reconstruction is discussed in Section 4, and experimental results with real data are provided in Section 5. Finally, concluding remarks are given in Section 6.
2 Virtual Imaged SOR Construction and Analysis

Fig. 2(a,b) shows two frames of a video sequence of a general object undergoing turntable motion. In 3D space, the volume swept by the moving object is enclosed by a surface of revolution, referred to as the virtual SOR. The image of the virtual SOR (see Fig. 2(c)) is obtained by superposition of the differences between the current and the first frame of the sequence.

Fig. 2. Virtual imaged SOR generation from a turntable sequence of an arbitrarily-shaped object. (a), (b): Two frames of the sequence. (c): The image of the virtual SOR generated by frame superposition.

The virtual SOR image thus obtained is analyzed in order to extract automatically the images of the top and bottom SOR cross-sections (two ellipses), together with the parameters of the projective transformation (harmonic homology) characterizing the imaged SOR symmetry (see Fig. 3). The harmonic homology maps the imaged SOR silhouette onto itself, and is parameterized by an axis l_s (2 dof), i.e. the image of the rotation axis, and a vertex v (2 dof). The automatic virtual SOR segmentation approach is inspired by [4].

Fig. 3. (a): The imaged geometry of the virtual SOR. (b): The estimated ellipses superimposed on Fig. 2(c).
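In matrix form, a harmonic homology with vertex v and axis l_s is W = I - 2 v l_s^T / (v^T l_s): it is an involution (applying it twice gives the identity) and fixes every point on the axis. The sketch below is our own NumPy illustration of this transformation, not the authors' code; the numeric vertex, axis and point values are arbitrary examples.

```python
import numpy as np

def harmonic_homology(v, l):
    """Harmonic homology with vertex v and axis l (homogeneous 3-vectors).

    W maps each silhouette point to its symmetric counterpart across the
    imaged axis; W @ W equals the identity (involution)."""
    v = np.asarray(v, dtype=float)
    l = np.asarray(l, dtype=float)
    return np.eye(3) - 2.0 * np.outer(v, l) / (v @ l)

# Example (arbitrary values): reflect an image point and map it back.
W = harmonic_homology(v=[400.0, 300.0, 1.0], l=[1.0, 0.0, -320.0])
p = np.array([100.0, 50.0, 1.0])   # a putative silhouette point
q = W @ p                          # its homologous partner
```

Since W is its own inverse, silhouette point pairs can be generated in either direction, which is what makes the consensus-based search over point pairs convenient.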
The approach consists of searching simultaneously for the four parameters of the harmonic homology and for the silhouette point pairs corresponding through it. This is achieved by solving an optimization problem involving edge points extracted from the image according to a multiresolution scheme. A first estimate of the homology is obtained by running the RANdom SAmple Consensus (RANSAC) algorithm at the lowest resolution level of a Gaussian pyramid, where the homology is well approximated by a simple axial symmetry (2 dof). New and better estimates of the full harmonic homology are then obtained by propagating the parameters through all the levels of the Gaussian pyramid, up to the original image. In particular, the homology and the silhouette point pairs are computed from the edges of each level by the Iterative Closest Point (ICP) algorithm. The last step is to obtain the two elliptical imaged cross-sections from the silhouette point pairs. The results of this segmentation step for the sequence of Fig. 2 are shown in Fig. 3. The segmentation approach exploits the well known tangency condition between each imaged cross-section and the silhouette. This condition allows us to construct a conic pencil for each silhouette point pair, and to look inside all possible conic pencils for the two ellipses receiving the largest consensus from the silhouette points.

3 System Calibration

The virtual imaged SOR entities are strictly related to the image of the absolute conic ω, which embeds the camera calibration information [7]. In particular, the axis and vertex of the harmonic homology are in pole-polar relationship with respect to this conic:

l_s = ω v. (1)

Moreover, since SOR cross-sections are circles lying in 3D planes orthogonal to the axis of rotation, they all intersect at the circular points of the turntable plane.
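To make the use of constraint (1) concrete, the sketch below shows one possible least-squares set-up for calibrating a natural camera from the pole-polar relation l_s ~ ω v together with the circular-point constraint introduced next. This is our illustrative reconstruction in NumPy, not the authors' implementation: ω is parameterized up to scale as [[1, 0, a], [0, 1, b], [a, b, c]], so that u0 = -a, v0 = -b and f = sqrt(c - u0^2 - v0^2), and both constraints become linear in (a, b, c).

```python
import numpy as np

def skew(u):
    """Cross-product matrix: skew(u) @ x == np.cross(u, x)."""
    return np.array([[0, -u[2], u[1]],
                     [u[2], 0, -u[0]],
                     [-u[1], u[0], 0]], dtype=float)

def calibrate_natural(ls, v, ci):
    """Recover (f, u0, v0) of a natural camera from the homology axis ls
    and vertex v (pole-polar w.r.t. omega) plus one imaged circular point
    ci of the turntable plane. Illustrative least-squares sketch only."""
    ls = np.asarray(ls, float)
    ls = ls / np.linalg.norm(ls)
    v1, v2, v3 = np.asarray(v, float)
    # omega @ v = M @ (a, b, c) + m0, linear in the unknowns
    M = np.array([[v3, 0, 0], [0, v3, 0], [v1, v2, v3]], dtype=float)
    m0 = np.array([v1, v2, 0.0])
    # cross(ls, omega @ v) = 0  ->  (skew(ls) @ M) x = -skew(ls) @ m0
    A = skew(ls) @ M
    b = -skew(ls) @ m0
    # ci^T omega ci = 0 (complex): real and imaginary parts give 2 rows
    p0, p1, p2 = np.asarray(ci, complex)
    row = np.array([2 * p0 * p2, 2 * p1 * p2, p2 ** 2])
    rhs = -(p0 ** 2 + p1 ** 2)
    A = np.vstack([A, row.real, row.imag])
    b = np.concatenate([b, [rhs.real, rhs.imag]])
    # normalize row scales before solving (helps conditioning)
    n = np.maximum(np.linalg.norm(A, axis=1), 1e-12)
    x = np.linalg.lstsq(A / n[:, None], b / n, rcond=None)[0]
    u0, v0 = -x[0], -x[1]
    return np.sqrt(x[2] - u0 ** 2 - v0 ** 2), u0, v0
```

Only three of these constraints are independent, matching the 3 dof of the natural camera model.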
Their projections in the image, i and j, which can be obtained as the complex conjugate solutions of the intersection between the two imaged cross-sections, are also related to the image of the absolute conic:

i^T ω i = j^T ω j = 0. (2)

The system resulting from eqs. (1) and (2) provides four linear constraints on ω, whose coefficients can be computed as in [5] from the two ellipses extracted before. Since only three out of the four constraints above are actually independent, they can be used to calibrate a natural camera (zero skew and known aspect ratio: 3 dof) from the virtual SOR image. System calibration also calls for a procedure for laser plane rectification, i.e. the recovery of the vanishing line of the plane including the laser stripe. Assuming that the laser plane is fixed, orthogonal to the turntable plane, and passing through the turntable axis, its vanishing line m is computed as follows. First, the image l_b of the laser-turntable line, at which the laser stripe and the turntable intersect, is computed. This is obtained by running the RANSAC algorithm on putative laser points
extracted by maximum intensity search over image lines (see Fig. 4(a)).

Fig. 4. (a): Robust laser-turntable line detection. (b): Computing the vanishing line of the laser plane.

Now, the turntable vanishing line l can be computed as the line through i and j. This line is used to recover the vanishing point of the rotation axis, v, from the pole-polar relationship

l = ω v. (3)

Moreover, the vanishing point of the laser-turntable line is computed as x = l × l_b. Since x and v are the vanishing points of two distinct directions in the laser plane, m can simply be computed as the line through these two points (see also Fig. 4(b)).

4 Metric 3D Reconstruction

Given the calibration ω, it is possible to rectify any plane whose vanishing line is known. This property is exploited here to reconstruct an object placed on the rotating turntable by laser profile rectification. To this aim, the laser plane vanishing line m is first intersected with ω, and the imaged circular points of the laser plane, namely i and j, are computed. Hence, the rectifying homography is obtained as described in [7], and eventually metric 3D reconstruction of the object can take place. Fig. 5 illustrates metric rectification for one of the laser profiles obtained at scanning time. In (a), the profile distorted by perspective projection is shown, together with the virtual SOR entities used for its rectification. In (b), the rectified profile is shown. It is worth noting that, in order to preserve (up to scale) the original object shape and achieve a correct metric reconstruction, the rectifying homography must be applied to both the profile and the imaged axis of rotation, mapped onto the y axis in (b). Provided that the frame acquisition rate is known and the turntable speed is constant, the 3D object is constructed by placing subsequent rectified profiles at equally
spaced angles.

Fig. 5. (a): The geometry for laser profile rectification. (b): The rectified profile.

The method above supports reconstruction up to a scale factor. If a Euclidean reconstruction is required, the scale factor can be fixed given one length in the scene. The height of the virtual SOR (which is also the height of the object used to generate it) can be conveniently used for this purpose. Indeed, Fig. 4 shows that the imaged centers x_t and x_b of the top and bottom cross-sections of the virtual SOR can be used, after rectification of the imaged rotation axis, to fix the scale factor. The two points are easily computed from their pole-polar relationship with the vanishing line l via their associated ellipses.

5 Experimental Results

In order to assess the performance of the scanning system, real-world tests were carried out, measuring accuracy for both the calibration and reconstruction tasks.

5.1 Calibration accuracy

Tab. 1 reports the results of a real-world experiment. The ground truth for the experiment was computed with a 3D calibration grid and the standard Tsai algorithm, adapted to the natural camera model. The table reports the ground truth vs. estimated values and the percentage error for each of the internal calibration parameters. It is worth noting that the principal point is more sensitive to noise than the focal length. This may be explained by the fact, reported in the literature, that the accuracy of the principal point (but not that of the focal length) depends not only on image noise, but also on the relative position of the imaged SOR axis w.r.t. the principal point itself. In particular, the estimation uncertainty increases as the imaged axis of symmetry gets closer to the principal point.
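For completeness, the rectifying homography used in Section 4 can be built from the imaged circular points of the laser plane by diagonalizing the conic dual to the circular points, following the standard construction in [7]. The NumPy sketch below is our own illustration, assuming the two imaged circular points are an exact complex-conjugate pair:

```python
import numpy as np

def rectifying_homography(ci, cj):
    """Planar metric rectification from imaged circular points ci, cj.

    C = ci cj^T + cj ci^T is the imaged conic dual to the circular
    points: real, symmetric, rank 2 when cj is the conjugate of ci.
    With C = U diag(w1, w2, 0) U^T, the homography
    H = diag(1/sqrt(w1), 1/sqrt(w2), 1) @ U^T maps C to its canonical
    form diag(1, 1, 0), i.e. H rectifies the plane up to a similarity."""
    C = np.real(np.outer(ci, cj) + np.outer(cj, ci))
    w, U = np.linalg.eigh(C)       # eigenvalues in ascending order
    w, U = w[::-1], U[:, ::-1]     # put the two positive ones first
    return np.diag([1 / np.sqrt(w[0]), 1 / np.sqrt(w[1]), 1.0]) @ U.T
```

Because the result is defined only up to a similarity, angles and length ratios in the rectified plane are metric, while the absolute scale must be fixed separately, as discussed above.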
5.2 Reconstruction accuracy

In order to assess reconstruction accuracy quantitatively, the rectangular box of known dimensions shown in Fig. 6(a) was used. Fig. 6(b) shows the point cloud model obtained with our scanning system. The model points are arranged according to a circular pattern, which reflects the acquisition scheme used.

Fig. 6. (a): A rectangular box of known size. (b): The reconstructed point cloud model.

Tabs. 2 and 3 provide a quantitative insight into reconstruction performance. In particular, Tab. 2 offers a comparison between real and measured object lengths (in mm). Reconstructed box dimensions are obtained by fitting a least squares box to the point cloud (see Fig. 7). The box height is estimated by running the Least Median of Squares (LMS) algorithm on the z-distribution of the top face points, in order to reduce outlier influence; outliers are represented as dark points in Fig. 7. Tab. 3 shows the computed angles between pairs of object faces (ground truth values are 90 and 180 degrees). Faces were estimated from the point cloud via planar least squares fitting. Angular errors range between 0 and 2 degrees. Reconstruction results for a more complex object are shown in Fig. 8. The acquired model reproduces the shape and proportions of the original object. Yet, due to the presence of self-occlusions, not all the points of the object surface were acquired (this is particularly evident for the area below the boy's jacket). A low-occlusion approach could be used to find the best object position for scanning, and reduce this kind of problem [2].

Table 1. Calibration with a turntable sequence.

Parameter          Ground Truth   Estimate   Error (%)
focal length       398.46         390.17     2.08
x principal point  167.22         186.62     11.6
y principal point  121.7          98.6       19.1
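The planar least squares fitting behind Tab. 3 can be sketched as follows. This is our own version, not the authors' code: each face normal is the singular vector of the centered point cloud with the smallest singular value, and the angle is folded into [0, 90] degrees because the unsigned normal leaves orientation ambiguous (so a 180-degree ground truth corresponds to 0 here unless normals are oriented outward first).

```python
import numpy as np

def fit_plane_normal(pts):
    """Total least squares plane fit: the normal is the right singular
    vector of the centered cloud with the smallest singular value."""
    centered = pts - pts.mean(axis=0)
    return np.linalg.svd(centered, full_matrices=False)[2][-1]

def face_angle_deg(pts_a, pts_b):
    """Angle between two fitted faces, in [0, 90] degrees (unsigned
    normals; orient normals consistently to distinguish 90 from 180)."""
    c = abs(fit_plane_normal(pts_a) @ fit_plane_normal(pts_b))
    return np.degrees(np.arccos(np.clip(c, 0.0, 1.0)))
```

A robust variant in the spirit of the LMS height estimate would discard points far from the fitted plane and refit; the plain SVD fit above is the non-robust core step.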
Fig. 7. Box point cloud and least squares box (green lines); z-outlier points are shown in dark (blue). (a): Top view. (b), (c): Side views.
Table 2. Box acquisition experiment: results on lengths (in mm).

Dimension   Ground Truth   Estimate   Error (%)
Height      44.0           45.19      2.7
Width       127.0          127.72     0.57
Depth       175.0          175.2      0.11

Table 3. Box acquisition experiment: results on angles (in degrees).

Face     Side 1   Side 2   Side 3   Side 4
Upper    91.6     90.2     91.3     90.0
Side 1            178.0    89.7     90.7
Side 2                     89.3     90.2
Side 3                              178.6

6 Conclusions and Future Work

We have presented a vision-based system for low cost 3D scanning using a turntable and a laser stripe. The system exploits rotational constraints for camera calibration and object reconstruction by profile rectification. Experimental results demonstrate the effectiveness of the approach for desktop applications. Future work will address the removal of the constancy constraints set on turntable speed and image acquisition rate. Another constraint to be relaxed is that on laser plane position and orientation.

References

1. F. Bernardini and H.E. Rushmeier. The 3D model acquisition pipeline. Computer Graphics Forum, 21(2):149-172, 2002.
2. B.-T. Chen, W.-S. Lou, C.-C. Chen and H.-C. Lin. A 3D scanning system based on low-occlusion approach. In Proc. 2nd International Conference on 3D Digital Imaging and Modeling (3DIM '99), pages 506-515, 1999.
3. F. Chen, G.M. Brown, and M. Song. Overview of three-dimensional shape measurements using optical methods. Optical Engineering, 39(1):10-22, 2000.
4. C. Colombo, D. Comanducci, A. Del Bimbo, and F. Pernici. Accurate automatic localization of surfaces of revolution for self-calibration and metric reconstruction. In Proc. IEEE Workshop on Perceptual Organization in Computer Vision, 2004.
5. C. Colombo, A. Del Bimbo, and F. Pernici. Metric 3D reconstruction and texture acquisition of surfaces of revolution from a single uncalibrated view. IEEE Trans. on PAMI, 27(1):99-114, 2005.
6. A.W. Fitzgibbon, G. Cross, and A. Zisserman. Automatic 3D model construction for turn-table sequences. In R. Koch and L.
Van Gool, eds., 3D Structure from Multiple Images of Large-Scale Environments, pages 155-170. Springer-Verlag, 1998.
7. R.I. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, 2nd ed., 2004.
8. D. Liebowitz, A. Criminisi, and A. Zisserman. Creating architectural models from images. In Proc. EuroGraphics, vol. 18, pages 39-50, 1999.
9. P.R.S. Mendonça, K.-Y.K. Wong, and R. Cipolla. Epipolar geometry from profiles under circular motion. IEEE Trans. on PAMI, 23(6):604-616, 2001.
10. M. Pollefeys. Self-calibration and metric 3D reconstruction from uncalibrated image sequences. PhD thesis, K.U. Leuven, 1999.
11. L. Quan, G. Jiang, H.T. Tsui and A. Zisserman. Geometry of single axis motions using conic fitting. IEEE Trans. on PAMI, 25(10):1343-1348, 2003.
12. C. Rocchini, P. Cignoni, C. Montani, and R. Scopigno. A low cost 3D scanner based on structured light. Computer Graphics Forum (Eurographics 2001 Conf. Issue), 20(3):299-308, 2001.
13. J.Y. Zheng. Acquiring 3D models from sequences of contours. IEEE Trans. on PAMI, 16(2):163-178, 1994.

Fig. 8. (a): Four views of a complex object. (b): Point cloud model. (c): Solid model.