Fast Window Based Stereo Matching for 3D Scene Reconstruction

The International Arab Journal of Information Technology, Vol. 0, No. 3, May 203 209 Fast Winow Base Stereo Matching for 3D Scene Reconstruction Mohamma Mozammel Chowhury an Mohamma AL-Amin Bhuiyan Department of Computer Science an Engineering, Jahangirnagar University, Banglaesh Abstract: Stereo corresponence matching is a key problem in many applications like computer an robotic vision to etermine Three-Dimensional (3D epth information of objects which is essential for 3D reconstruction. This paper presents a 3D reconstruction technique with a fast stereo corresponence matching which is robust in tackling aitive noise. The aitive noise is eliminate using a fuzzy filtering technique. Experimentations with groun truth images prove the effectiveness of the propose algorithms. Keywors: Stereo corresponence, isparity, winow cost, stereo vision, fuzzy filtering, 3D moel. Receive February 2, 200; accepte October 24, 200; publishe online March, 202. Introuction In computer an robot vision, stereo corresponence matching plays an important role. With the help of stereo corresponence algorithms stereo vision systems can be automatize. Stereo corresponence techniques are useful for machine vision applications to etermine Three-Dimensional (3D epth information of objects. They are require in applications like 3D reconstruction, robot navigation an control, autonomous vehicles, 3D growth monitoring, stereo enoscopy, an so on. Several stereo corresponence algorithms have been evelope to fin matching pairs from two image sequences [, 2, 3, 7]. But they exhibit very high computational cost. The extremely long computational time require to match stereo images is the main obstacle on the way to the practical application of stereo vision systems. However, stereo vision applications are often subject to strict real-time requirements. In applications, such as robotics, where the environment being moele is continuously changing, these operations must also be fast to allow a continuous upate of the matching set, from which 3D information is extracte. Approaches to the stereo corresponence matching can be broaly classifie into two categories:. Feature-base, 2. Intensity-base matching techniques. In the feature-base approaches, features such as, ege elements, corners, line segments, curve segments etc., are first extracte from the images, an then the matching process is applie to the attributes associate with the etecte features. The main problem of this approach is that ege elements an corners are easy to etect, but may suffer from occlusion; line an curve segments require extra computation time, but are more robust against occlusion. In the intensity-base matching, the matching process is applie irectly to the intensity profiles of the two images. In this metho, as the two cameras are place on the same horizontal baseline, the corresponing points in both images must lie on the same horizontal scanline. Such stereo configurations reuce the search for corresponences from two imensions (the entire image to one-imension. Unfortunately, like all constraine optimization problems, whether the system woul converge to the global minima is still an open problem. An alternative approach in intensity-base stereo matching, commonly known as the winow-base metho, where matching is establishe on those regions in the images that are interesting; for instance, regions that contain high variation of intensity values in the horizontal, vertical, an iagonal irections. After the interesting regions are etecte, a simple correlation scheme is applie in the matching process; a match is assigne to regions that are highly correlate in the two images. The problem associate with this winow-base approach is that the size of the correlation winows must be carefully chosen. If the correlation winows are too small, the intensity variation in the winows will not be istinctive enough, an many false matches may result. If they are too large, resolution is lost, since neighboring image regions with ifferent isparities will be combine in the measurement []. To overcome the limitations of these stereo corresponence algorithms this research proposes a fast winow base stereo corresponence metho for 3D scene reconstruction. In aition, a major task in the fiel of stereo vision is to extract information from stereo images corrupte by noise. For this purpose, a fuzzy filtering technique has been implemente.

20 The International Arab Journal of Information Technology, Vol. 0, No. 3, May 203 This paper is organize as follows: Section 2 escribes the basic principle of stereo corresponance estimation an 3D reconstruction. Section 3 then escribes our propose metho for noise filtering. Section 4 presents the process of etermining stereo isparity. The algorithm for calculating winow cost an isparity is illustrate in section 5. In section 6, we have presente our propose moel for 3D scene reconstruction. Section 7 shows the experimental results an iscussions. Finally, section 8 conclues the paper. istance between the left an right cameras, an f is the focal length of the camera lens. Figure 2 shows that the two images of an object are obtaine by the left an right cameras observing a common scene. This pair of stereo images allows us to obtain the 3D information about the object. Once we have obtaine a istance map of the scene, we can then measure the shape an volume of objects [2, 3]. Object 2. Stereo Corresponence an 3D Reconstruction z The overall stereo vision process for 3D reconstruction is illustrate in Figure. Left Camera f f Right Camera Input Image Pair Fuzzy Filtering Corresponence Matching Disparity Map 3D Scene Reconstruction Figure. Funamental steps employe in stereo vision. There are two problems associate with stereo vision:. The corresponence problem-in which features corresponing to the same entities in 3D are to be matche across the image frames. 2. The reconstruction problem-in which 3D information is to be reconstructe from the corresponences. In stereo vision, two images of the same scene are taken from slightly ifferent viewpoints using two cameras, place in the same lateral plane. For most pixels in the left image there is a corresponing pixel in the right image in the same horizontal line. Fining the same points in two images is calle corresponence matching an is the funamental computation task unerlying stereo vision. The ifference in the coorinates of the corresponing pixels is known as isparity, which is inversely proportional to the istance of the objects from the camera. Using the isparity, the epth can be efine by the following equation: bf z = ( where, is the isparity, z is the istance of the object point from the camera (the epth, b is the base Figure 2. Stereo vision using two cameras place in the same lateral plane. 3. Noise Elimination with Fuzzy Filtering b Base Line To extract information from stereo images corrupte by noise, we have employe a special fuzzy filtering metho. This filter employs fuzzy rules for eciing the gray level of a pixel within a winow in the image. This is a variation of the Meian filter an Neighborhoo Averaging filter with fuzzy values. The algorithm of fuzzy filtering inclues the following steps:. The gray values of the neighborhoo pixels (n n winow are store in an array an then sorte in ascening or escening orer. 2. Fuzzy membership value is assigne for each neighbor pixels. This step has the following characteristics: a. A Π-shape membership function is use. b. The highest an lowest gray values get the membership value 0. c. Membership value is assigne to the mean value of the gray levels of the neighborhoo pixels. 3. We consier only 2 k+ pixels (k/2<=n 2 in the sorte pixels, an they are the meian gray value an k previous an forwar gray values in the sorte list. 4. The gray value that has the highest membership value will be selecte an place as output. Figure 3 shows a test image with aitive noise an the corresponing output image after Fuzzy filtering. The memberships function for the gray levels in the test image is shown in Figure 4.

Fast Winow Base Stereo Matching for 3D Scene Reconstruction 2 a Original image. b Noise image. c Fuzzy filtere image. Figure 3. Test image, noisy image an its corresponing output image after fuzzy filtering (using a 3 3 mask. Membership value 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0. 0 Meian -0. 0 2 4 6 8 0 Pixels Figure 4. Memberships function for the gray levels in the test image. 4. Estimation of Stereo Disparity Selecte Stereo corresponence or isparity is conventionally etermine base on matching winows of pixels by using Sum of Square Differences (SSD, Sum of Absolute Differences (SAD, or normalize correlation techniques [4, 5, 6, 5]. To etermine the corresponence of a pixel in the left image using SSD, SAD or normalize correlation techniques, the winow costs are compute for all caniate pixels in the right image within the search range. The pixel in the right image that gives the best winow cost is the corresponing pixel of the left image. The basics of stereo corresponence matching are as follows: For each epipolar line For each pixel in the left image compare with every pixel on same epipolar line in right image pick pixel with minimum match cost Winow-base stereo matching technique is wiely use ue to its efficiency an ease of implementation. However, Barnar an Fishler [2] point out a problem in the selection of a winow with fixe size an shape. Many researchers propose aaptive winow methos using winows of ifferent shapes an size epening on local variations of intensity an isparity [0, 4]. But in aaptive winow algorithms, the computation time is relatively higher than that of fixe winow algorithms [8, 9, 6, 7]. To overcome this problem an to achieve a substantial gain in accuracy with less expense of computation time we have propose a fast an very simple algorithm. Some applications, like autonomous vehicle an robot navigation, virtual reality an stereo image coing in 3D-TV, require a very fast estimation of ense stereo corresponence. It is aime that this pruning proposal will be useful in such situations for speey etermination of ense isparity. 5. Winow Cost an Disparity Estimation Algorithm With a view to estimate ense isparity, for every pixel in the left image our goal is to fin the corresponing pixel in the right image. We assume that the stereo images are rectifie, which means that the corresponing epipolar lines are horizontal an on the same height. In winow-base algorithms, to etermine the corresponence of a pixel in the left image the winow costs, which are the SSD or the SAD or the normalize correlation values, nee to be compute for all caniate pixels in the right image within the search range by the following equation. WC ( x, y, w w x y 2 [ f L( x+ i, y+ j f R ( x+ i+, y + j ], i= j= w w x y = fl( x+ i, y+ j f R ( x+ i+, y+ j, i= j= w w x y + + + + + f L( x i, y j fr ( x i, y j i= j= w w x y w w x y 2 2 f L ( x+ i, y + j f R ( x+ i+, y + i= j= i= j= j, for for for SSD SAD Correlatio n where f L (x, y an f R (x, y are the intensities of the pixel at a position (x, y the left an right images, respectively, W C (x, y, is the winow cost of a pixel at position (x, y in the left image with isparity, w x an w y are the winow with an height, respectively. The pixel in the right image that gives the best winow cost, i.e., the minimum SSD or SAD value or the maximum correlation value inicates the corresponing pixel of the pixel in the left image. The computation time for isparity estimation of a pixel epens on image size an isparity search range. In irect search, it requires to compute the winow costs for all caniate pixels within the search range, - max to + max. The conventional isparity estimation algorithm using irect search is escribe bellow: For each pixel (x, y in the left image, For ' =- max to + max o Calculate W c (x,y, '. En For Fin best W c (x,y, W c (x,y, '. Disparity of (x,y=. (2

22 The International Arab Journal of Information Technology, Vol. 0, No. 3, May 203 A fast technique for isparity estimation has been implemente by excluing unlikely corresponences. This pruning mechanism is base on the stereo matching constraint that the corresponing pixels shoul be close in color or intensity. To etermine the corresponence of a pixel in the left image we just compute the winow cost for caniate pixels in the right image within the search range if their intensity ifferences with respect to the pixel of the left image are less than a threshol value, δ. The propose pruning algorithm inclues the following steps: For each pixel (x, y in the left image,. For ' =- max to + max o.. If f L (x,y-f(x+ ',y <threshol then Calculate W C (x,y, '. 2. Fin best W C (x,y, Є W C (x,y, '. 3. Disparity of (x,y=. Estimating the corresponing points, we can compute the isparity map. The isparity map then can be converte to a 3D map of the scene (i.e., recover the 3D structure if the stereo geometry is known. 6. Reconstruction of 3D Scene The estimate stereo isparity fiels can be converte into ense epth information by camera geometry. As a result, the 3D moel of the real scene is reconstructe from the original images an isparity information. To reconstruct the 3D image we have to compute the z coorinate of each point in the left camera frame. For a 3D point (x, y, z we take x an y from the first stereo image an z is compute from the gray level intensity at (x, y of the epth map. Figure 5 illustrates the stereo imaging which involves in obtaining two separate image views of an object point P. The istance between the centers of the two camera lenses b is calle the base line. Our objective is to fin the coorinates (x p,y p,z p of the point P having image points (x L,y L an (x R,y R in the left an right camera frame, respectively. Let s recover the position of P from its projections P L an P R : y p z p bf = x L x R b.xl x p = xl xr b.yl b.yr = = (5 x x x x L R x L -x R = is the isparity (i.e., the istance between the corresponing points in the two images when the images are superimpose. Therefore, b.xl xp =, L b.yl y p =, R bf (3 (4 z p = (6 Using equation 6, we can easily obtain the 3D coorinates of the point-clou of the object, which will help to reconstruct the 3D scene. z y x Figure 5. Moel of the stereo imaging technique. 7. Results an Discussions z f (x L,y L We have reconstructe the 3D scene from the test stereo images with a fast an robust stereo corresponence matching technique. The effectiveness of the stereo corresponence algorithm has been justifie for ifferent types of images of ifferent resolutions with simple an complex backgroun. In orer to emonstrate the effectiveness of the algorithm, we present the processing results from synthetic an real image pairs, incluing ones with groun-truth values for quantitative comparison with other methos. Experiments were carrie out on a Pentium IV 2.GHz PC with 52MB RAM. The algorithm has been implemente using Visual C++. The ense isparity map is obtaine using some stanar stereo images (Tsukuba Hea which ense groun truth is known. Figure 6 illustrates the gray scale stereo images incluing the groun truth, an estimate isparity image. The groun truth image is histogram equalize for visualization purpose. The reconstructe 3D scene is shown in Figure 7. Disparities of the left image are estimate by SSD metho without an with consiering a threshol value for ifferent winow sizes. Table summarizes the obtaine isparity estimation results for the Tsukuba Hea image using SSD metho. The accuracies represent the percentage of correct isparities (i.e., same value as that of groun truth. It is also foun that a threshol value of δ=30 is a goo choice on the basis of a trae-off between accuracy an computation time. Figure 8 illustrates the runtime snapshot of isparity estimation results. The computational time versus winow size graph is shown in Figure 9 which reveals that the computational cost increases with the size of the winow. Therefore, a winow of size 3 3 performs better results than any other winow size, although x p x L P L X R P R b-(x L -x R b x p -b (x R,y R x p -x L P(x p,y p,z p z p -f x

Fast Winow Base Stereo Matching for 3D Scene Reconstruction 23 some variable winow size has been use for reucing the computational cost for better isparity estimation. Table. Disparity estimation accuracy (in % for ifferent threshol values. Threshol Value (δ δ=0 δ=20 δ=25 δ=30 Winow Size Correct Matching Accuracy (% 7 7 67.0 72.3 7 7 65.9 70. 7 7 66. 70.5 7 7 66. 3 7.9 Reuction of Computation Time (% Table 2 shows the comparison of our propose metho with some other methos [, 2, 3, 4, 5]. A winow of size 7 7 size is applie for all methos. Experimental results preict that our propose metho provies a reuction of 45% of computational time an increase of 20% accuracy of correct matching. 0 44 42 39 Table 2. Computational time reuction (% an increase in correct matching with propose pruning metho. Disparity Estimation Metho Conventional Winow Base Metho [-3] Aaptive Winow Metho [4] Average Winow Metho [5] Winow Size Computation Time Reuction(% Correct Matching Accuracy Increase (% 7 7 0 0 7 7 2 3 7 7 33 7 Propose Metho 7 7 45 20 Table 3. Parameters use in 3D reconstruction. Parameter Values Winow Size 3 3 Threshol level δ=30 Search range for isparity estimation K=2 Focal length of the cameras F=35mm Base istance B=5cm a Left image. b Right image. Figure 8. Runtime snapshot of isparity estimation accuracy (in % measurements. c Groun truth image. Estimate isparity image. Figure 6. A stanar stereo image pair (Tsukuba Hea with groun truth. We use a simple metho for the reconstruction of the 3D scene of the test image pairs from the estimate 3D coorinates an original image information. The focal length of the stereo cameras use in this simulation is 35mm, the baseline istance is 5cm an pixel size is 0.65mm. Table 3 shows the parameters use in the reconstruction process. Figure 7. Reconstructe 3D image. Computation Cost (ms 20 00 80 60 40 20 0 Figure 9. Computational cost vs. winow size. 8. Conclusions 3x3 5x5 7x7 9x9 x 3x3 5x5 Winow Size A 3D reconstruction technique from stereo image pair is propose. We have reconstructe 3D information of real images from the estimate isparity fiels an camera parameters. A fast stereo vision system has been evelope that analyzes grayscale or color images to estimate the isparity map for 3D scene reconstruction. To eal with aitive noise, a robust an novel approach capable of fast estimation of stereo corresponence is propose. A fuzzy rule-base filtering is employe for noise elimination which can remove the non interesting points from the stereo images. A total of about 20 images are use to investigate the effectiveness of the propose algorithms. Among them only a few points in the

24 The International Arab Journal of Information Technology, Vol. 0, No. 3, May 203 images are foun false. Experimental results emonstrate that the success rate of more than 95% is achieve. The main reason behin the failure of those images in fining isparity image is the mismatch of epipolar geometry. The main target of this research is to buil a robust an fast stereo vision system for realtime applications. Our next plan is to evelop a complete 3D moel from multi-view images. References [] Barnar S., Stereo Matching by Hierarchical, Microcanonical Annealing, in Proceeings of the International Joint Conference on Artificial Intelligence, Italy, pp. 832-835, 987. [2] Barnar S. an Fishler M., Stereo Vision, in Proceeings of Encyclopeia of Artificial Intelligence, New York, pp. 083-090, 987. [3] Calin G. an Roa V., Real-Time Disparity Map Extraction: A Dual Hea Stereo Vision System, Latin American Applie Research, vol. 37, no., pp. 2-24 2007. [4] Chowhury M., Monal A., an Bhuiyan M., 3D Imaging using Stereo Vision, in Proceeings of the 7 th International Conference on Computer an Information Technology, pp. 307-32, 2004. [5] Chowhury M. an Bhuiyan M., A New Approach for Disparity Map Determination, Daffoil International University Journal of Science an Technology, vol. 4, no., pp. 9-3, 2009. [6] Fua P., A Parallel Stereo Algorithm that Prouces Dense Depth Maps an Preserves Image Features, Machine Vision an Applications, vol. 6, no., pp. 35-49, 993. [7] Fusiello A., Trucco E., an Roberto V., Efficient Stereo with Multiple Winowing, in Proceeings of the IEEE Conference on Computer Vision an Pattern Recognition, USA, pp. 858-863, 997. [8] Hannah M., A System for Digital Stereo Image Matching, Photogrammetric Engineering an Remote Sensing, vol. 55, no. 2, pp. 765-770, 989. [9] Hoff W. an Ahuja N., Surface From Stereo: Integrating Feature Matching, Disparity Estimation an Contour Detection, IEEE Transaction on Pattern Analysis an Machine Intelligence, vol., no. 2, pp. 2-36, 989. [0] Hu T. an Huang M., A New Stereo Matching Algorithm for Binocular Vision, International Journal of Hybri Information Technology, vol. 3, no., pp. -8, 200. [] Intille S. an Bobick A., Disparity-Space Images an Large Occlusion Stereo, in Proceeings of the European Conference on Computer Vision, Sween, pp. 79-86, 994. [2] Kanae T. an Okutomi M., A Stereo Matching Algorithm with an Aaptive Winow: Theory an Experiment, IEEE Transaction on Pattern Analysis an Machine Intelligence, vol. 6, no. 9, pp. 920-932, 994. [3] Lmaati E., Oirrak A., Kaioui M., Ouahman A., an Sagal M., 3D Moel Retrieval Base on 3D Discrete Cosine Transform, The International Arab Journal of Information Technology, vol. 7, no. 3, pp. 264-270, 200. [4] Matonang M. an Haron H., Hybri Computing Algorithm in Representing Soli Moel, The International Arab Journal of Information Technology, vol. 7, no. 4, pp. 4-49, 200. [5] Muhlmann K., Maier D., Hesser J., an Manner R., Calculating Dense Disparity Maps from Color Stereo Images an Efficient Implementation, in Proceeings of the IEEE Conference on Computer Vision an Pattern Recognition, Kauai, pp. 30-36, 200. [6] Uin M., Chowhury M., Shioyama T., an Monal M., Fast Winow Base Approach for Stereo Matching, Journal of Science, vol. 27, pp. 45-54, 2004. [7] Veksler O., Stereo Matching by Compact Winows Via Minimum Ratio Cycle, in Proceeings of the IEEE International Conference on Computer Vision, Vancouver, pp. 540-547, 200. Mohamma Mozammel Chowhury is working as an Associate Professor in the Department of Computer Science an Engineering, Jahangirnagar University, Savar, Dhaka, Banglaesh. He receive his BSc egree in Electronics an Computer Science an MS egree in Computer Science an Engineering from the same university. He has publishe more than 25 articles in repute journals an conference proceeings. His research interest inclues: image processing, computer vision, robotics, 3 scene reconstruction, computer graphics an machine intelligence. Mohamma AL-Amin Bhuiyan is a Professor in the Department of Computer Science an Engineering, Jahangirnagar University, Savar, Dhaka, Banglaesh. He got the PhD egree in Electrical engineering from Osaka City University, Japan, he i the post octoral research from National Institute of Informatics, Japan an University of Hull, UK. His research interest inclues: robotics, image processing, computer vision, face recognition, human-robot symbiosis.