Vol.31, No.5    ACTA AUTOMATICA SINICA    September, 2005

A Robust Two Feature Points Based Depth Estimation Method 1)

ZHONG Zhi-Guang    YI Jian-Qiang    ZHAO Dong-Bin
(Laboratory of Complex Systems and Intelligence Sciences, Institute of Automation, Chinese Academy of Sciences, Beijing 100080)
(E-mail: {zhiguang.zhong, jianqiang.yi, dongbin.zhao}@mail.ia.ac.cn)

Abstract  This paper presents a novel depth estimation method based on feature points. Two points are selected arbitrarily from an object and their distance in space is assumed to be known. The proposed technique can estimate their depths simultaneously from two images taken before and after a camera moves, even when the motion parameters of the camera are unknown. In addition, this paper analyzes ways to enhance the precision of the estimated depths and presents a feature point image coordinate search algorithm to increase the robustness of the proposed method. The search algorithm automatically finds more accurate image coordinates of the feature points, starting from their detected image coordinates. Experimental results demonstrate the efficiency of the presented method.

Key words  Depth estimation, image sequence, motion vision, mobile robot navigation

1 Introduction

The focus of this paper is the depth-from-motion problem, which has many important applications such as mobile robot navigation. Reported methods for this problem are mainly optical-flow based, gradient-based, and feature-based. Among them, the optical-flow based methods [1,2] are very sensitive to errors in optical flow computation, since the estimation of optical flow is an ill-posed problem solvable only by introducing additional constraints (e.g., smoothness). The gradient-based methods [3,4], also called direct methods, have difficulty recovering scene depth in the vicinity of the focus of expansion [5]. Further, reported direct methods are generally based on the brightness change constraint equation with a least-squares estimator, which is sensitive to residual errors that are not Gaussian distributed [4]. Both groups of methods generally require the image motion to be small, usually less than several pixels or even one pixel [6].

The feature-based method has been studied extensively because it places fewer constraints on the motion model or scene depth. Selected features can be points, lines, curves, etc. Only feature-point based methods are considered in this paper. The number of selected feature points can be two or more. For example, Or et al. [7] estimated depths from a set of feature points whose three-dimensional coordinates in space are known before the camera moves. Oberkampf et al. [8] presented a method for depth estimation using four or more coplanar feature points. Fujii et al. [9] assumed that a camera moves along its focal axis and that there are two feature points on the object with the same depth in the camera coordinate system, and then estimated their depths from their distance difference in two consecutive images. Huber et al. [10] introduced a similar technique; however, instead of relying on this assumption, their method directly selects two feature points on an object that have the same depth in the camera coordinate system.

This paper proposes a novel depth estimation method based on two feature points. Different from Fujii [9] and Huber [10], the proposed method selects the feature points arbitrarily and does not require the camera to move along its focal axis. In addition, it only requires that the distance in space between the two feature points be known, while the camera motion can be unknown. From two consecutive images taken as a camera approaches its object in an arbitrary direction, the presented technique can estimate simultaneously the depths of the two feature points before and after the camera moves. Further, since the detected image coordinates of the feature points may not be very accurate due to factors such as image noise, this paper also analyzes ways to enhance the estimation precision and presents a feature point image coordinate search algorithm to increase the robustness of this technique. The search algorithm automatically finds more accurate image coordinates of the feature points, starting from their detected image coordinates.

1) Supported by the National 973 Plan (2003CB517106) and National Natural Science Foundation of P. R. China (60475030)
Received January 6, 2004; in revised form July 6, 2005
Copyright © 2005 by Editorial Office of Acta Automatica Sinica. All rights reserved.

2 Depth estimation

As shown in Fig. 1, points A and B are the selected feature points and l is their distance in space. Points C_1 and C_2 represent the projection centers of the camera before and after it moves. It is assumed that the camera may rotate before and after the motion but undergoes pure translation during the motion, so that the two camera coordinate systems share the same orientation. l_{a1}, etc., denote the lengths of the line segments C_1A, etc.

Fig. 1  Depth estimation principle

Assume that the intrinsic parameters of the camera and the image coordinates of the feature points are known. Then, according to Fig. 1, we have

    l_{a1} m_{a1} - l_{a2} m_{a2} = l_{b1} m_{b1} - l_{b2} m_{b2}                (1)

where m_{ai} and m_{bi} (i = 1, 2) are the unit direction vectors toward A and B in the two consecutive views, respectively. Under the above assumptions, they are all known. Taking the scalar triple product of both sides of (1) with m_{a1} and m_{b1} eliminates the terms containing these two vectors and yields

    l_{a2} (m_{a2}, m_{a1}, m_{b1}) = l_{b2} (m_{b2}, m_{a1}, m_{b1})            (2)

Here, (·, ·, ·) denotes the scalar triple product. If (m_{a2}, m_{a1}, m_{b1}) is not zero, that is, m_{a2}, m_{a1} and m_{b1} are not coplanar, it is easy to get

    l_{a2}/l_{b2} = (m_{b2}, m_{a1}, m_{b1}) / (m_{a2}, m_{a1}, m_{b1}) = \mu_2   (3)

Since, by the law of cosines,

    l^2 = l_{a2}^2 + l_{b2}^2 - 2 l_{a2} l_{b2} (m_{a2} \cdot m_{b2})             (4)

where (m_{a2} \cdot m_{b2}) represents the dot product, the above two equations lead to

    l_{a2} = \mu_2 l_{b2},    l_{b2} = l / \sqrt{1 + \mu_2^2 - 2\mu_2 (m_{a2} \cdot m_{b2})}    (5)

Similarly, taking the scalar triple product of (1) with m_{a2} and m_{b2}, we can get

    l_{a1} = \mu_1 l_{b1},    l_{b1} = l / \sqrt{1 + \mu_1^2 - 2\mu_1 (m_{a1} \cdot m_{b1})}    (6)

where

    \mu_1 = (m_{b1}, m_{a2}, m_{b2}) / (m_{a1}, m_{a2}, m_{b2})                   (7)

Therefore, the depths of feature points A and B in the original and the consecutive camera coordinate systems can be computed as

    d_{ai} = l_{ai} m_{ai}(3),    d_{bi} = l_{bi} m_{bi}(3),    i = 1, 2          (8)

where m_{ai}(3) and m_{bi}(3) represent the third (i.e., Z) element of the vectors m_{ai} and m_{bi}. If (m_{a2}, m_{a1}, m_{b1}) is zero, then points C_1, C_2, A and B are coplanar. For this case the depth estimation principle is similar and it is not discussed in this paper.
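The closed-form relations (3)-(8) translate directly into a few lines of code. Below is a minimal NumPy sketch (not part of the original paper) of that computation; the function names triple, segment_lengths and estimate_depths, and the choice of representing each m as a unit direction vector obtained from the pixel coordinates and the known intrinsics, are illustrative assumptions.

```python
import numpy as np

def triple(u, v, w):
    # Scalar triple product (u, v, w) = u . (v x w)
    return float(np.dot(u, np.cross(v, w)))

def segment_lengths(m_a1, m_b1, m_a2, m_b2, l):
    """Lengths l_a1, l_b1, l_a2, l_b2 of C1A, C1B, C2A, C2B from Eqs. (3)-(7).

    m_* are unit direction vectors from the camera center toward the feature
    point, all expressed in the same orientation (the camera only translates
    between the two views); l is the known distance |AB|.
    """
    mu2 = triple(m_b2, m_a1, m_b1) / triple(m_a2, m_a1, m_b1)        # Eq. (3)
    mu1 = triple(m_b1, m_a2, m_b2) / triple(m_a1, m_a2, m_b2)        # Eq. (7)
    l_b2 = l / np.sqrt(1 + mu2**2 - 2 * mu2 * np.dot(m_a2, m_b2))    # Eq. (5)
    l_b1 = l / np.sqrt(1 + mu1**2 - 2 * mu1 * np.dot(m_a1, m_b1))    # Eq. (6)
    return mu1 * l_b1, l_b1, mu2 * l_b2, l_b2

def estimate_depths(m_a1, m_b1, m_a2, m_b2, l):
    """Depths of A and B before and after the motion, Eq. (8)."""
    l_a1, l_b1, l_a2, l_b2 = segment_lengths(m_a1, m_b1, m_a2, m_b2, l)
    # Depth is the Z component of the reconstructed point in each camera frame.
    return (l_a1 * m_a1[2], l_a2 * m_a2[2]), (l_b1 * m_b1[2], l_b2 * m_b2[2])
```

A caller would build the four direction vectors from the detected pixel coordinates, e.g. by normalizing K^{-1}[u, v, 1]^T with the known intrinsic matrix K, and pass the known distance l between the two points.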

From the above depth estimation process, one can see that the proposed method is simple and intuitive. It needs little computation time and is applicable to real-time applications. However, the accuracy of the estimated depths depends heavily on that of the detected image coordinates. To enhance the precision of the estimated results and the robustness of the proposed technique, the computed depths should be optimized according to the given conditions. For example, if n feature points can be selected arbitrarily from an object and their pairwise distances in space are all known, then to enhance the depth estimation precision of a feature point P, we can compute its n-1 estimated depths according to (8) from the n-1 feature point pairs formed by P and the other n-1 feature points, and then obtain the optimum estimate of the depth of P by various methods such as taking their mean value. In the following section, a search algorithm is presented that requires the two selected feature points to have the same height above the ground plane. It can enhance the accuracy of the detected image coordinates and thus increase the depth estimation precision.

3 Search algorithm

For convenience of discussion, we take mobile robot navigation as an example to introduce this search algorithm. Assume a mobile robot is moving toward an object on which there are two feature points with the same height above the floor, and that their distance in space is known.

As shown in Fig. 2, points A and B are the two feature points with the same height. Points O_{c1} and O_{c2} are the projection centers of the camera before and after the robot moves. O_{r1}x_{r1}y_{r1}z_{r1} is the robot coordinate system and also represents the ground plane containing A and B; its Z_{r1} axis is parallel to the translation direction of the robot. Points O_{r1} and O_{r2} are the orthogonal projections of points O_{c1} and O_{c2}, respectively. h is the altitude difference between the feature points and the projection center of the camera; it may be zero or nonzero, and only the latter case is discussed in this paper. l is the length of the line segment AB. r is the displacement of the robot.

Fig. 2  Geometric relationship for the search algorithm

Obviously, the above depth estimation process is also applicable to this situation. In addition, since the two feature points A and B are at the same height above the floor, we can obtain the constraint equations

    O_{r1}A = [0, h, 0]^T + l_{a1} (R m_{a1})
    O_{r1}B = [0, h, 0]^T + l_{b1} (R m_{b1})
    O_{r2}A = [0, h, 0]^T + l_{a2} (R m_{a2})                                     (9)
    O_{r2}B = [0, h, 0]^T + l_{b2} (R m_{b2})

Here, R is the rotation matrix of the robot coordinate system relative to the camera coordinate system, which can be computed from the pan and tilt angles of the camera.
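As a small illustration of how (9) could be evaluated in code, the sketch below (again not from the paper) builds R from the pan and tilt angles and returns the vectors O_{r}A and O_{r}B for one camera position. The rotation convention in rotation_from_pan_tilt is an assumption, since the paper does not specify the order or sign of the pan/tilt rotations.

```python
import numpy as np

def rotation_from_pan_tilt(pan, tilt):
    # One possible convention (an assumption, not specified in the paper):
    # pan about the Y axis followed by tilt about the X axis, angles in radians.
    cp, sp = np.cos(pan), np.sin(pan)
    ct, st = np.cos(tilt), np.sin(tilt)
    R_pan = np.array([[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]])
    R_tilt = np.array([[1.0, 0.0, 0.0], [0.0, ct, -st], [0.0, st, ct]])
    return R_pan @ R_tilt

def constraint_vectors(R, h, l_a, l_b, m_a, m_b):
    """Eq. (9): vectors from the projected camera center to A and B in robot
    coordinates. Their Y components should be (near) zero when the image
    coordinates are correct, since A and B lie in the X-Z plane."""
    offset = np.array([0.0, h, 0.0])
    OA = offset + l_a * (R @ m_a)
    OB = offset + l_b * (R @ m_b)
    return OA, OB
```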

These constraint equations determine the geometric relationships that the image coordinates of the two feature points in the two consecutive images must satisfy. Therefore, if the detected image coordinates are not very accurate, it is possible to find more accurate image coordinates in the neighborhood of the detected ones and thus obtain more accurate depth estimates.

Assume that the detected image coordinates in the two consecutive images are (u_{a1}, v_{a1}), (u_{b1}, v_{b1}), (u_{a2}, v_{a2}) and (u_{b2}, v_{b2}), respectively. We first keep u_{a1}, u_{b1}, u_{a2} and u_{b2} constant and search the N-pixel neighborhoods of v_{a1}, v_{b1}, v_{a2} and v_{b2}. The specific process is as follows:

1) For each quadruple [v_{a1}, v_{b1}, v_{a2}, v_{b2}] formed by coordinates within the neighborhoods of v_{a1}, v_{b1}, v_{a2} and v_{b2}, the vectors O_{r1}A, O_{r1}B, O_{r2}A and O_{r2}B can be computed from (9). If |O_{r1}B(2) - O_{r1}A(2)| (the difference between the Y elements of O_{r1}B and O_{r1}A) and |O_{r2}B(2) - O_{r2}A(2)| are both less than the threshold value v, then this quadruple is considered relatively accurate image coordinates. The reason is that if [v_{a1}, v_{b1}, v_{a2}, v_{b2}] are correct, then the computed vectors should lie in the ground plane containing points A and B, that is, their Y_{r1} elements should be zero. Each such quadruple is recorded until the four neighborhoods are completely searched.

2) If more than one quadruple satisfies 1), then we compute V = |v_{b1} - v_{a1}| + |v_{b2} - v_{a2}| for each recorded quadruple. The one that makes V smallest is considered the more accurate image coordinates. The reason is that generally h is much smaller than the distance between the feature points and the camera.

3) If more than one quadruple minimizes V, then the one that makes |O_{r1}A(2)| and |O_{r2}A(2)| smallest is selected as the optimum image coordinates. The reason is the same as that in 1).

It should be noted that more accurate image coordinates may not be found when this search step is over. The main cause is that the detected image coordinates deviate heavily from their actual values. In that case, we can expand the search range N or increase the threshold value v. Obviously, the former leads to more computation time while the latter results in lower estimation precision, so the two have to be traded off in applications.

After the above search step, the depths of the feature points can generally be estimated accurately. To obtain still better estimates, we keep the found optimum quadruple [v_{a1}, v_{b1}, v_{a2}, v_{b2}] constant and search the M-pixel neighborhoods of u_{a1}, u_{b1}, u_{a2} and u_{b2} (M is the set value of the search range). For each quadruple [u_{a1}, u_{b1}, u_{a2}, u_{b2}] formed by coordinates within the neighborhoods of u_{a1}, u_{b1}, u_{a2} and u_{b2}, the vectors O_{r1}A, O_{r1}B, O_{r2}A and O_{r2}B can again be computed from (9). If O_{r1}A(2) and O_{r2}A(2) have opposite signs, this quadruple is considered relatively inaccurate. Otherwise, we compute U = |O_{r1}A(2)| + |O_{r2}A(2)| + |O_{r1}B(2) - O_{r1}A(2)| + |O_{r2}B(2) - O_{r2}A(2)|. The quadruple [u_{a1}, u_{b1}, u_{a2}, u_{b2}] that makes U smallest is considered the most accurate image coordinates. These constraint conditions make the second (i.e., Y) elements of O_{r1}A, O_{r1}B, O_{r2}A and O_{r2}B as close to zero, and hence to one another, as possible.

Since the presented depth estimation method needs only two feature points, the constraint conditions that can be used by the proposed search algorithm are very limited. Therefore, the search results are not always the best. However, they are always much better than the initial detections.
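The first search stage can be sketched as an exhaustive scan over the v-neighborhoods, following steps 1)-3) above and reusing segment_lengths and constraint_vectors from the earlier sketches. The names search_v and direction, and the way ties are broken with the sum |O_{r1}A(2)| + |O_{r2}A(2)|, are illustrative assumptions; the defaults N = 1 and threshold 0.0005 are the values used in the experiments of Section 4. The second stage over the u coordinates is analogous, with U from the text as the ranking criterion.

```python
import itertools
import numpy as np
# Reuses segment_lengths() and constraint_vectors() from the sketches above.

def direction(K_inv, u, v):
    # Unit viewing direction of pixel (u, v); K_inv is the inverse of the
    # camera intrinsic matrix (assumed known).
    m = K_inv @ np.array([u, v, 1.0])
    return m / np.linalg.norm(m)

def search_v(K_inv, R, h, l, dets, N=1, thresh=5e-4):
    """First search stage: vary only the v coordinates within +/-N pixels and
    keep the quadruple most consistent with the same-height constraint.

    dets = ((u_a1, v_a1), (u_b1, v_b1), (u_a2, v_a2), (u_b2, v_b2)).
    """
    (ua1, va1), (ub1, vb1), (ua2, va2), (ub2, vb2) = dets
    best = None
    for da1, db1, da2, db2 in itertools.product(range(-N, N + 1), repeat=4):
        q = (va1 + da1, vb1 + db1, va2 + da2, vb2 + db2)
        m_a1, m_b1 = direction(K_inv, ua1, q[0]), direction(K_inv, ub1, q[1])
        m_a2, m_b2 = direction(K_inv, ua2, q[2]), direction(K_inv, ub2, q[3])
        try:
            la1, lb1, la2, lb2 = segment_lengths(m_a1, m_b1, m_a2, m_b2, l)
        except ZeroDivisionError:            # degenerate (coplanar) configuration
            continue
        OA1, OB1 = constraint_vectors(R, h, la1, lb1, m_a1, m_b1)  # Eq. (9), view 1
        OA2, OB2 = constraint_vectors(R, h, la2, lb2, m_a2, m_b2)  # Eq. (9), view 2
        if not np.all(np.isfinite([OA1[1], OB1[1], OA2[1], OB2[1]])):
            continue
        # Step 1): both Y-differences must be below the threshold.
        if abs(OB1[1] - OA1[1]) >= thresh or abs(OB2[1] - OA2[1]) >= thresh:
            continue
        # Steps 2) and 3): rank by V, break ties by the |Y| of O_r1A and O_r2A.
        key = (abs(q[1] - q[0]) + abs(q[3] - q[2]), abs(OA1[1]) + abs(OA2[1]))
        if best is None or key < best[0]:
            best = (key, q)
    return best[1] if best else (va1, vb1, va2, vb2)
```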
4 Experiments and analysis

Some experiments are performed on real images to test the efficiency of the proposed depth estimation method and the search algorithm. To measure the actual depths of the feature points conveniently and accurately, a calibration reference is selected as the object and the camera is constrained to translate on a desk with 4 × 4 grids. Fig. 3 shows a typical image from these experiments. Points E and F, G and H, A and B are three pairs of feature points, each pair at the same height above the floor. Their distances l in space are all 16.0 cm, while the height differences h are 13.1 cm, 5.1 cm and 0.0 cm (this last case is not discussed in this paper), respectively. Part of the experimental results are listed in Table 1, Table 2 and Table 3. All these results are obtained with the two-step search algorithm described above. Since the focus of this paper is not image processing, the image coordinates of the feature points are detected by hand. The search algorithm is operated in the 1-pixel neighborhoods of the detected image coordinates, and the threshold value v is set to 0.0005. In all these tables, (d_1, d_2) (unit: cm) represent the depths of the first feature point before and after the camera moves.

The estimation results of the second feature point are similar to those of the first feature point and are not listed here. The data in Table 1 and Table 2 are obtained when E and F are selected as the feature points. The depth estimation results of the first feature point shown in Table 1 are obtained when the tilt angle is nonzero before the camera moves and the camera translates 4 cm each time along the Z-axis of the object coordinate system (i.e., the world coordinate system). The estimation results shown in the first two rows of Table 2 are obtained when the pan angle is 45 degrees, the tilt angle is 8 or 12 degrees, and the camera translates 4 cm each time simultaneously along the X-axis and the Z-axis. The estimated depths listed in the last two rows of Table 2 are obtained when the pan angle is 63.4 degrees, the tilt angle is 8 or 12 degrees, and the camera translates 4 cm and 2 cm each time along the X-axis and Z-axis, respectively. The estimated depths listed in Table 3 are obtained when both the pan angle and the tilt angle are zero, the height difference h varies, and the camera translates 4 cm each time along the Z-axis. When h equals 5.1 cm, points G and H are selected as the feature points. If l varies, the depth estimation results are similar to those obtained when h varies, so they are not listed here.

Fig. 3  A typical image for the real experiments

Table 1  Estimated depths when the tilt angle varies
Tilt angle (°)   (d_1, d_2)
4.0              (89.0, 91.1)  (93.2, 95.5)  (97.9, 100.4)  (103.0, 104.4)  (107.2, 108.7)
8.0              (90.0, 91.6)  (93.8, 96.1)  (98.5, 100.4)  (103.0, 105.1)  (107.9, 109.4)
12.0             (90.0, 92.2)  (94.4, 96.1)  (98.5, 101.0)  (103.7, 105.1)  (107.9, 108.7)
16.0             (90.6, 92.2)  (94.4, 96.1)  (98.5, 100.4)  (103.0, 104.4)  (107.9, 108.7)
Actual values    (88.6, 92.6)  (92.6, 96.6)  (96.6, 100.6)  (100.6, 104.6)  (104.6, 108.6)

Table 2  Estimated depths when the pan angle and the tilt angle vary
Pan angle (°)  Tilt angle (°)   (d_1, d_2)
45.0           8.0              (89.5, 92.7)  (93.2, 96.7)  (98.5, 102.3)  (103.0, 107.2)
45.0           12.0             (89.0, 92.7)  (93.8, 97.3)  (99.1, 102.3)  (102.3, 105.1)
Actual values                   (88.6, 92.6)  (92.6, 96.6)  (96.6, 100.6)  (100.6, 104.6)
63.4           8.0              (89.5, 90.6)  (91.6, 93.2)  (93.8, 96.7)   (97.3, 99.1)
63.4           12.0             (89.5, 91.1)  (92.2, 93.8)  (94.9, 96.1)   (96.7, 97.9)
Actual values                   (88.6, 90.6)  (90.6, 92.6)  (92.6, 94.6)   (94.6, 96.6)

Table 3  Estimated depths when h varies
h (cm)           (d_1, d_2)
13.1             (88.0, 92.2)  (92.2, 96.1)  (96.7, 100.4)  (101.0, 103.7)  (105.1, 107.9)
5.1              (88.5, 91.1)  (91.6, 96.7)  (97.3, 99.1)   (101.7, 103.7)  (105.7, 107.9)
Actual values    (88.6, 92.6)  (92.6, 96.6)  (96.6, 100.6)  (100.6, 104.6)  (104.6, 108.6)

From Table 1, Table 2 and Table 3, it is clear that the estimated depths obtained by the proposed method have fairly high accuracy. Under the different experimental conditions, the relative errors of most estimation results are less than 2%. The estimation errors mainly stem from measurement errors and image distortion, which is not rectified in these experiments. If the search ranges are expanded or the threshold value is increased, the estimation precision can be further enhanced.

5 Conclusions

This paper proposes a novel depth estimation method based on two feature points and two images. It can estimate simultaneously the depths of the two feature points before and after a camera moves, while the camera motion can be unknown. In addition, the presented search algorithm can automatically find more accurate image coordinates of the feature points, starting from their detected ones. Experimental results indicate that it can greatly enhance the precision of the estimation results. The proposed approach requires little computation time and is well suited to real-time applications such as mobile robot navigation.

References
1  Toshiharu M, Noboru O. Motion and structure from perspectively projected optical flow by solving linear simultaneous equations. In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Piscataway, NJ, USA: IEEE, 1997. 740-745
2  Tagawa N, Inagaki A, Minagawa A. Parametric estimation of optical flow from two perspective views. IEICE Transactions on Information and Systems, 2001, E84-D(4): 485-494
3  Hung Y S, Ho H T. A Kalman filter approach to direct depth estimation incorporating surface structure. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1999, 21(6): 570-575
4  Park S K, Kweon I S. Robust and direct estimation of 3-D motion and scene depth from stereo image sequences. Pattern Recognition, 2001, 34(9): 1713-1728
5  Hanna K J, Okamoto N E. Combining stereo and motion analysis for direct estimation of scene structure. In: Proceedings of the 4th IEEE International Conference on Computer Vision, Los Alamitos, CA, USA: IEEE, 1993. 357-365
6  Bergen J R, Anandan P, Hanna K J, Hingorani J. Hierarchical model-based motion estimation. In: Proceedings of the European Conference on Computer Vision, London, UK: Springer-Verlag, 1992. 237-252
7  Or S H, Luk W S, Wong K H, et al. An effective iterative pose estimation algorithm. Image and Vision Computing, 1998, 16(5): 353-362
8  Oberkampf D, Dementhon D F, Davis L S. Iterative pose estimation using coplanar feature points. Computer Vision and Image Understanding, 1998, 63(3): 495-511
9  Fujii Y, Wehe D K, Weymouth T E. Robust monocular depth perception using feature pairs and approximate motion. In: Proceedings of the IEEE International Conference on Robotics and Automation, Piscataway, NJ, USA: IEEE, 1992. 33-39
10 Huber J, Graefe V. Motion stereo for mobile robots. IEEE Transactions on Industrial Electronics, 1994, 41(4): 378-383

ZHONG Zhi-Guang  Received his bachelor degree from Hunan University in 1998 and his master degree from the Institute of Automation, Chinese Academy of Sciences in 2002. He is now a Ph. D. candidate at the Institute of Automation, Chinese Academy of Sciences. His research interests include intelligent control, mobile robots, and computer vision.

YI Jian-Qiang  Received his Ph. D. degree from Kyushu Institute of Technology, Japan, in 1992. He is currently a professor at the Institute of Automation, Chinese Academy of Sciences. His research interests include intelligent control, robotics, and mechatronics.

ZHAO Dong-Bin  Received his Ph. D. degree from Harbin University of Technology in 2000. He is now an associate professor at the Institute of Automation, Chinese Academy of Sciences. His research interests include intelligent control, robotics, and mechatronics.