Proceedings of the 6th Int. Conf. on Computer Analysis of Images and Patterns, CAIP'95, pp. 874-879, Prague, Czech Republic, Sep 1995

Direct Obstacle Detection and Motion from Spatio-Temporal Derivatives

Pär Fornland
Computational Vision and Active Perception Laboratory (CVAP)
Department of Numerical Analysis and Computing Science
Royal Institute of Technology, S-100 44 Stockholm, Sweden

Abstract. Autonomous vehicles need a means of detecting obstructions in their path, to avoid collision. In this paper, a novel approach to obstacle detection is presented. A camera moves over a visible ground plane with the optical axis parallel to the ground. The camera motion parameters are linearly related to the first-order spatio-temporal derivatives of the acquired image sequence; image flow is not needed. The motion is robustly estimated using RANSAC. An error measure for each image point corresponds to the likelihood of an obstacle at that point.

1 Introduction

Visual systems must detect when there is a risk of colliding with obstructions. Many such situations can be conceptually reduced to obstacles protruding above a flat ground; the task of finding these protrusions is referred to as obstacle detection, for which various schemes using a moving monocular camera have been presented in the literature. Obstacle detection is performed by studying image motion, which differs between regions viewing obstacles and regions viewing the ground plane.

In [3] a calibrated visual system was used, assuming translational motion. A reference flow calculated from the calibration information was compared to the estimated flow. But the calibration parameters may not be known, so not all algorithms can rely on this assumption. It was indeed suggested [7] to find the reference flow by moving the camera over a ground plane without obstacles; the motion and geometry are then assumed constant during the obstacle detection. Approaches not requiring a reference flow have also been suggested, e.g. examining the divergence [6] and other qualities [8] of the flow. Both detected obstacles without specific assumptions about the camera motion or the scene structure, but both required dense flow estimates, and flow estimation is not yet robust [1] despite considerable efforts by the computer vision community. In [2], a representation of a translational motion and the equation of a plane were continuously updated over long sequences. The estimated parameters enabled a prediction of the image intensities of points viewing the plane; obstacles cause bad predictions.

This paper presents a direct framework for obstacle detection using first-order spatio-temporal derivatives of an image sequence. World structure is neither required nor reconstructed from the images.

Similar to [2], a short circuit is introduced into the previously suggested procedures for obstacle detection, since no optical flow is required. Care has been taken not to restrict the motion to pure translation; instead a general framework is developed. The optical axis is parallel to the ground, which is an often avoided case. From the gradient constraint equation [1] and the second-order flow equation [5], together with geometry assumptions, a linear equation relating motion parameters to image derivatives is derived. A version of RANSAC [4] presented here robustly estimates the motion from the over-constrained system of equations by disregarding a number of observations as outliers. Each observation is defined by the coordinates and spatio-temporal derivatives of one point in the image plane.

Two approaches are suggested. One globally estimates the motion from all available observations, forming a histogram from the error measure. The other locally estimates the motion parameters in a number of small windows in the image; a multi-dimensional histogram is formed from the estimated parameters. In both cases, a peak is normally found at the correct motion, corresponding to ground plane points. This segments the image into obstacles and ground plane.

2 Theories and Algorithms

Assume that a camera is positioned above a ground plane, as in Fig. 1, in which the X-axis is orthogonal to the paper. A 3D point P = (X, Y, Z) projects on the image at p = (fX/Z, fY/Z) = (x, y). If length measures are expressed in terms of the focal length, we can set f = 1. An image point may be projected from a point P on the ground plane or from an obstacle Q. The projected 3D velocities also differ between obstacles [3] and ground plane points.

[Fig. 1. The camera, the ground plane, and an obstacle.]

A camera moving through a static 3D world is equivalent to a rigid world moving in front of a static camera, a view which is used in this discussion. A constant rigid 3D motion can be resolved into translational (V_X, V_Y, V_Z) and rotational (\Omega_X, \Omega_Y, \Omega_Z) components, the latter with angular velocities around an axis through the origin. This forms an affine transformation:

\dot{P} = \begin{pmatrix} \dot{X} \\ \dot{Y} \\ \dot{Z} \end{pmatrix} = \begin{pmatrix} V_X \\ V_Y \\ V_Z \end{pmatrix} + \begin{pmatrix} \Omega_Y Z - \Omega_Z Y \\ \Omega_Z X - \Omega_X Z \\ \Omega_X Y - \Omega_Y X \end{pmatrix} \qquad (1)
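As a concrete check of eq. (1), a minimal Python/NumPy sketch (ours, not from the paper; the numeric values are purely illustrative) evaluates the rigid-motion velocity \dot{P} = V + \Omega \times P:

```python
import numpy as np

def rigid_velocity(P, V, Omega):
    """3D velocity of a world point P under constant rigid motion, eq. (1):
    Pdot = V + Omega x P, with translation V and angular velocity Omega."""
    return np.asarray(V) + np.cross(Omega, P)

# Illustrative values: a ground plane point under forward translation and
# a small rotation about the Y-axis.
P = np.array([0.5, -1.0, 4.0])        # world point (X, Y, Z)
V = np.array([0.0, 0.0, -0.2])        # translational component (V_X, V_Y, V_Z)
Omega = np.array([0.0, 0.05, 0.0])    # rotational component (Omega_X, Omega_Y, Omega_Z)
Pdot = rigid_velocity(P, V, Omega)    # the cross product expands to the last vector in (1)
```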

Projecting \dot{P} on the image leads to the second-order [5] image flow, characterizing the motion of image structures with respect to the three-dimensional motion:

\begin{pmatrix} \dot{x} \\ \dot{y} \end{pmatrix} = \begin{pmatrix} Z^{-1}(V_X - x V_Z) - xy\,\Omega_X + (1 + x^2)\,\Omega_Y - y\,\Omega_Z \\ Z^{-1}(V_Y - y V_Z) - (1 + y^2)\,\Omega_X + xy\,\Omega_Y + x\,\Omega_Z \end{pmatrix} \qquad (2)

Frequently in motion analysis, the camera is tilted as in Fig. 1 in order to avoid image points projected from close to the horizon, which cause high and difficult spatial frequencies. In this research the tilt angle is zero, and the difficulties must therefore be handled. Ground plane points satisfy the relation Z = Y_0/y, and since the translational velocities can only be retrieved up to a scale factor, no restriction is imposed by scaling them with 1/Y_0, forming the translational parameters C_X, C_Y and C_Z. The ground plane image flow then becomes

\begin{pmatrix} \dot{x} \\ \dot{y} \end{pmatrix} = \begin{pmatrix} y & 0 & -xy \\ 0 & y & -y^2 \end{pmatrix} \begin{pmatrix} C_X \\ C_Y \\ C_Z \end{pmatrix} + \begin{pmatrix} -xy & 1 + x^2 & -y \\ -(1 + y^2) & xy & x \end{pmatrix} \begin{pmatrix} \Omega_X \\ \Omega_Y \\ \Omega_Z \end{pmatrix} \qquad (3)

and Figure 2 shows how this flow depends on single motion parameters of (3).

[Fig. 2. The V_X, V_Z and \Omega_Y components, respectively.]

2.1 Motion Estimation

A common assumption in computer vision research is that the intensity of any moving image structure, I(x(t), y(t), t), is constant over time, dI/dt = 0. The chain rule gives dI/dt = \nabla I \cdot \dot{p} + I_t, where \dot{p} = (\dot{x}, \dot{y})^t is the image flow. The resulting equation,

\nabla I \cdot \dot{p} + I_t = 0 \qquad (4)

is referred to as the gradient constraint equation (GCE), and is the foundation of a wide variety of techniques [1] for estimating image flow. The flow function (3) is now inserted into the GCE (4), resulting in

y I_x C_X + y I_y C_Y - (xy I_x + y^2 I_y) C_Z - \bigl(xy I_x + (1 + y^2) I_y\bigr) \Omega_X + \bigl((1 + x^2) I_x + xy I_y\bigr) \Omega_Y + (x I_y - y I_x) \Omega_Z + I_t = 0 \qquad (5)

where the coefficients are the camera motion parameters, and the observations are taken from the image. The assumption that the optical axis and the X-axis are always parallel to the ground plane is violated by X- and Z-rotations, respectively, and a Y-translation violates the assumption of constant height. These arguments lead to the constraints C_Y = \Omega_X = \Omega_Z = 0.
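With these constraints, each image point yields one linear equation in the remaining unknowns (C_X, C_Z, \Omega_Y). The following is a minimal Python/NumPy sketch of building this system (ours; the array names are assumptions, and the plain least squares solve shown is only the non-robust baseline that Sec. 2.2 replaces with RANSAC):

```python
import numpy as np

def motion_observations(x, y, Ix, Iy, It):
    """One linear observation per image point from eq. (5) under the
    constraints C_Y = Omega_X = Omega_Z = 0, i.e.
      y*Ix*C_X - (x*y*Ix + y**2*Iy)*C_Z + ((1+x**2)*Ix + x*y*Iy)*Omega_Y = -It.
    x, y are image coordinates; Ix, Iy, It are spatio-temporal derivatives."""
    A = np.stack([y * Ix,
                  -(x * y * Ix + y ** 2 * Iy),
                  (1 + x ** 2) * Ix + x * y * Iy], axis=-1)
    b = -It
    return A, b

# Plain least squares estimate of (C_X, C_Z, Omega_Y) from all observations;
# outliers (points viewing obstacles) are handled by RANSAC in Sec. 2.2.
# A, b = motion_observations(x, y, Ix, Iy, It)
# params, *_ = np.linalg.lstsq(A, b, rcond=None)
```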

2.2 The RANSAC Method

The RANSAC method [4] has been applied in computer vision before, but not to this problem. Given a set of observations of a parametrised function, it finds a subset of observations fitting the function, disregarding the other observations. A simple version fits a line to a set of points (x, y). Figure 3 (a) exemplifies a difficult line fitting with synthetic data, where RANSAC performs well; the dashed line is estimated using all observations in a least squares error method.

[Fig. 3. (a) RANSAC line fitting; (b) error histogram for the ground plane; (c) error histogram for the ground plane with an obstacle.]

A generalisation uses M noisy observations x_i \in R^N of a hyperplane A \cdot x = A_1 x_1 + \ldots + A_N x_N = 1, where A is a normal to the hyperplane and x are coordinates. First, an integer index vector i = (i_1, \ldots, i_N) is chosen randomly, with 1 \le i_k \le M, defining a preliminary hyperplane through

(x_{i_1} \; \ldots \; x_{i_N})^t A = (1 \; \ldots \; 1)^t \qquad (6)

The signed orthogonal distance between a point x and a hyperplane is d = (x \cdot A - 1)/\|A\|, i.e. points on opposite sides of the plane have different signs. The set of observation points lying within a predefined distance from the hyperplane is remembered; the remaining points are disregarded as outliers. The procedure is repeated n times, each time with a different i. The final hyperplane is estimated with a least squares fit of the largest remembered set.

2.3 Obstacle Detection

The obstacles are detected in the following manner. The RANSAC method first globally estimates the motion parameter hyperplane. For each observation derived from the ground plane, the signed distance d (see above) is found to have an approximately Gaussian distribution N(0, \sigma), but points viewing obstacles correspond to observation variables with different statistics. The RANSAC method is robust against outliers, which in this case correspond to obstacles. Histogramming techniques can be applied to provide thresholds, indirectly segmenting the image into ground and obstacles. The threshold is delicate to estimate if the obstacles are small, as their histogram peaks might then lie below the noise level, but a large obstacle corresponds to a discernible peak in the histogram. The largest allowed obstacle size, where the method starts breaking down, is investigated in a coming, longer version of this paper.
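The hyperplane version of RANSAC described in Sec. 2.2 might be sketched as follows in Python/NumPy (our reading of the procedure, not the paper's implementation; the parameter names n_trials and tol are ours, and the actual thresholds are scene-dependent):

```python
import numpy as np

def ransac_hyperplane(X, n_trials=500, tol=0.05, rng=None):
    """RANSAC fit of a hyperplane A.x = 1 to M observations X (an M x N array),
    in the spirit of Sec. 2.2. Returns the normal A from a least squares fit
    of the largest remembered inlier set, and the inlier mask."""
    rng = np.random.default_rng(rng)
    M, N = X.shape
    best_inliers = np.zeros(M, dtype=bool)
    for _ in range(n_trials):
        idx = rng.choice(M, size=N, replace=False)   # random index vector i
        try:
            A = np.linalg.solve(X[idx], np.ones(N))  # preliminary hyperplane, eq. (6)
        except np.linalg.LinAlgError:
            continue                                 # degenerate sample; redraw
        d = (X @ A - 1.0) / np.linalg.norm(A)        # signed orthogonal distances
        inliers = np.abs(d) < tol                    # points within the predefined distance
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Final least squares fit on the largest remembered set.
    A, *_ = np.linalg.lstsq(X[best_inliers], np.ones(best_inliers.sum()), rcond=None)
    return A, best_inliers
```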

The motion estimation can also be performed separately in small windows of the image. For windows containing only the ground plane, the motion parameters are expected to be well estimated, but for windows viewing obstacles the estimated motion will be erroneous. Multi-dimensional histograms are formed from the set of estimated motion parameters of the image windows, and a prominent peak should normally be found at the correct motion. Histogram bins in a neighborhood of the peak correspond to image windows viewing the ground plane, and the other bins correspond to potential obstacles.

3 Experiments

Experiments were performed both for the motion estimation and the obstacle detection. Three consecutive, spatially smoothed images were used. The first experiments confirm that direct estimation of the camera motion is possible. Synthetic images with only two motion parameters, and the geometry described in Section 2, were used. Inspired by [2], only the lower part of the image was considered. The randomness of RANSAC causes a small noise; see Table 1.

    Ground truth           Estimates
    C_Z      \Omega_Y      C_Z       \Omega_Y
    0.02     -0.1          0.0196    -0.0931
    -0.03     0           -0.0301    -0.0109
    0        -0.1         -0.0002    -0.1027
    0.03      0.2          0.0295     0.1989

Table 1. Motion estimation experiments without obstacles

Histograms of the distance from each observation point to the estimated motion parameter hyperplane were examined. Synthetic images were produced, and as indicated by Fig. 3 (b), the shape of the histogram formed from a ground plane resembles a Gaussian distribution. Figure 3 (c) shows a histogram formed from a mix of a ground plane and a constant-height obstacle.

Real images were used to evaluate the obstacle detection. The exact motion parameters were unknown, but the motion was approximately translational in the (X, Z)-plane. The images view a box on a ground plane. As real images are noisy due to motion blur, sensor noise etc., and the box has only small motion parallax close to the ground, the segmentation is not expected to be very robust. The stronger local window strategy was therefore employed, and the resulting binary image indicating obstacles was eroded to remove smaller regions. Thresholds and RANSAC parameters are scene-dependent, and were selected accordingly. The obstacles are marked in Fig. 4, where the top of the box is not detected since only the lower part of the image is used, where the ground plane is visible. The detection works well for obstacle parts sufficiently high above the ground.
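For concreteness, the local window strategy can be sketched as follows (our illustration, reusing motion_observations from the sketch in Sec. 2.1; the window size, bin count and distance threshold are arbitrary choices, not the paper's):

```python
import numpy as np

def detect_obstacle_windows(x, y, Ix, Iy, It, win=16, n_bins=10, tol=0.2):
    """Local-window obstacle detection (a sketch, not the paper's exact
    procedure): estimate (C_X, C_Z, Omega_Y) per window by least squares,
    histogram the parameters, and flag windows far from the dominant
    (ground plane) peak as potential obstacles."""
    H, W = Ix.shape
    params, cells = [], []
    for r in range(0, H - win + 1, win):
        for c in range(0, W - win + 1, win):
            s = (slice(r, r + win), slice(c, c + win))
            A, b = motion_observations(x[s].ravel(), y[s].ravel(),
                                       Ix[s].ravel(), Iy[s].ravel(), It[s].ravel())
            p, *_ = np.linalg.lstsq(A, b, rcond=None)
            params.append(p)
            cells.append((r, c))
    params = np.array(params)

    # The dominant peak of the multi-dimensional histogram corresponds to
    # the correct motion, i.e. to windows viewing the ground plane.
    hist, edges = np.histogramdd(params, bins=n_bins)
    peak = np.unravel_index(np.argmax(hist), hist.shape)
    center = np.array([0.5 * (e[i] + e[i + 1]) for e, i in zip(edges, peak)])

    # Windows whose parameters lie far from the peak are potential obstacles.
    dist = np.linalg.norm(params - center, axis=1)
    return [cell for cell, d in zip(cells, dist) if d > tol]
```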

[Fig. 4. Box scenes with the detected obstacles.]

Acknowledgements: The author wishes to thank Dr. Bergholm for providing valuable experience in motion estimation.

References

1. J.L. Barron, D.J. Fleet, and S.S. Beauchemin, "Performance of optical flow techniques", International Journal of Computer Vision, vol. 12, no. 1, pp. 43-77, 1994.
2. S. Carlsson and J.-O. Eklundh, "Object detection using model based prediction and motion parallax", in Proceedings of the First European Conference on Computer Vision, pp. 134-138, Springer-Verlag, Antibes, France, Apr. 1990.
3. W. Enkelmann, "Obstacle detection by evaluation of optical flow fields from image sequences", Image and Vision Computing, vol. 9, no. 3, pp. 160-168, 1991.
4. M.A. Fischler and R.C. Bolles, "Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography", Commun. ACM, vol. 24, pp. 381-395, 1981.
5. H.C. Longuet-Higgins and K. Prazdny, "The interpretation of a moving retinal image", in Proc. Royal Society London B-208, pp. 385-397, 1980.
6. R.C. Nelson and Y. Aloimonos, "Obstacle avoidance using flow field divergence", IEEE Trans. on PAMI, vol. 11, pp. 1102-1106, 1989.
7. M. Tistarelli and G. Sandini, "Dynamic aspects in active vision", CVGIP: Image Understanding, vol. 56, pp. 108-129, July 1992.
8. G.-S. Young, T.-H. Hong, M. Herman, and J.C.S. Yang, "Safe navigation for autonomous vehicles: A purposive and direct solution", in SPIE Int. Conf. on Intelligent Robots and Computer Vision XII, pp. 31-42, 1993.