Active Motion Detection and Object Tracking
Joachim Denzler and Dietrich W. R. Paulus
denzler,paulus@informatik.uni-erlangen.de
The following paper was published in the Proceedings of the 1st International Conference on Image Processing, Austin, Texas, November 13th-16th, 1994, Volume 3.

ACTIVE MOTION DETECTION AND OBJECT TRACKING

Joachim Denzler and Dietrich W. R. Paulus
Universität Erlangen-Nürnberg, Lehrstuhl für Mustererkennung (Informatik 5)
Martensstr. 3, D-91058 Erlangen
Tel.: +49-9131-85-7894, FAX: +49-9131-303811
email: {denzler,paulus}@informatik.uni-erlangen.de

ABSTRACT

In this paper we describe a two-stage active vision system for tracking a moving object. The object is detected in an overview image of the scene; a close-up view is then taken by changing the frame grabber's parameters and by a positional change of the camera mounted on a robot's hand. With a combination of several simple and fast vision modules, a robust system for object tracking is constructed. The main principle is the use of two stages for object tracking: one for the detection of motion and one for the tracking itself. Errors in both stages can be detected in real time; the system then switches back from the tracking stage to the motion detection stage. Standard UNIX interprocess communication mechanisms are used for the communication between control and vision modules. Object-oriented programming hides hardware details.

1. INTRODUCTION

Real time image processing and analysis is essential for robot control based on visual information. Since today's general purpose computers still lack the required transmission rates for continuous image input and processing, special hardware with dedicated software or firmware is usually applied for image processing. Detection and tracking of moving objects in the scene remains one of the basic problems to be solved. Various progress reports on this subject may be found in the literature, e.g. in [1, 11].

In this article we demonstrate how standard Unix workstations and a general software architecture can be utilized for real time motion detection and object tracking in a closed loop of sensor and action. A wide-angle view of the scene is recorded to detect the moving object; the parameters of the frame grabbing device are changed to obtain a detail view in a higher resolution; by a change of the camera position the detail view is kept in the center. The key to success is adaptivity and selectivity of processing in space and time, and the combination of several simple and fast vision modules for the subtasks needed for object tracking [2]; this is commonly referred to as active vision.

For motion detection we present a fast algorithm based on difference images, obtained from a fixed (stationary) camera position. In the second stage, active contour models are applied for object tracking [6, 7, 11]. The snake is initialized automatically from the first image using a real time method presented in [3]. So far, there is no proof in the literature of the robustness of real time tracking based on snakes in a non-synthetic environment. We present snake features which can be used for the detection of errors during the tracking; thus the tracking becomes more robust. Two computers perform the image processing and robot control tasks in parallel; alternatively, all processes can run on a single machine.

In Sect. 2, we will give a short overview of the software and hardware architecture. Sect. 3 summarizes the principles of active vision.

The two-stage object tracking will be described in Sect. 4, where we also present the features which have to be computed for detecting errors during the tracking. After an introduction of the ideas underlying the robot's control, we show in Sect. 6 the robustness of our two-stage approach in a quantitative manner. We conclude with an outlook on future extensions of the system.

2. SOFTWARE AND HARDWARE ARCHITECTURE

In our approach, the general modular software architecture for knowledge based pattern analysis [8] is extended to the active vision paradigm. An object-oriented implementation encapsulates the hardware features of the computers (HP 735) and video equipment (RasterOps Videolive card and Sony CCD RGB cameras) used; flexible segmentation is also provided. Images are assumed to reside in main memory, i.e. not on the frame grabber device. Visual tasks and robot control are separated into two communicating processes. Communication is based on the PVM interprocess communication library [5]; only a few bytes have to be exchanged per cycle. Since continuous image transfer from the frame grabber card to main memory uses up to 40% of the CPU power, experiments show better performance if control and vision are running in parallel on separate computers.

3. VISION MODULES IN ACTIVE VISION

Swain [10] gives a summary of active vision methods for image processing. Among active changes in the image capturing devices, selectivity of the algorithms or the hardware is mentioned. In [2] the use of vision modules is specified: various simple modules, each performing a single, dedicated task, communicate with each other. By integrating the results of all vision modules, a more complicated vision task like object tracking may be executed. Several modules are important for object tracking in a closed loop of sensor and action: the motion detection module, the tracking module itself, the module for robot control, and an attention module for the detection of certain events during tracking (see Figure 1).

Figure 1: The principle of vision modules for object tracking (control, robot control, attention, tracking, and grabber modules).

To reduce the amount of data which must be processed, one principle of active vision is the reduction of data by selectivity in space, time and resolution [10]. In the tracking stage, selectivity is applied in space. In the attention module, selectivity means looking at the whole image at fixed time intervals to recognize events like the entry of a new moving object; this is selectivity in time. Finally, selectivity in resolution is used by the motion detection stage.

4. TWO STAGE ACTIVE OBJECT TRACKING

We will now give a detailed description of our approach for object tracking using a connection of four simple vision modules: motion detection, motion tracking, attention module, and robot control. At present, the first three modules are implemented in one single process. The second process, the robot control, will be described in the next section.

First one has to detect motion, i.e. the moving object in the scene; we assume a static camera, i.e. we keep the position of the robot fixed. A difference image between neighbouring frames is computed. After a threshold operation we get a region of interest (ROI). Small gaps are closed, and then the contour of the ROI is computed and passed to the second module, the object tracking module.
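This first stage can be illustrated by the following minimal sketch (not the authors' original code; a NumPy/OpenCV 4 approximation in which the threshold and kernel size are assumed parameters):

import cv2
import numpy as np

def detect_motion_roi(prev_gray, curr_gray, thresh=25, kernel_size=5):
    """Stage 1: difference image -> threshold -> gap closing -> ROI contour.

    prev_gray, curr_gray: consecutive grayscale frames from a static
    camera, e.g. the 128 x 128 low-resolution overview image.
    Returns the contour of the largest moving region, or None.
    """
    # Difference image between neighbouring frames
    diff = cv2.absdiff(curr_gray, prev_gray)
    # Threshold operation yields a binary motion mask
    _, mask = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
    # Close small gaps in the mask (morphological closing)
    kernel = np.ones((kernel_size, kernel_size), np.uint8)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    # Contour of the ROI: take the largest motion blob
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    return max(contours, key=cv2.contourArea) if contours else None

The resulting contour is exactly what the following paragraphs use as a coarse initialization for the active contour.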

For object tracking we use active contour models. In most other approaches the active contour is interactively initialized on the first image of an image sequence; of course, this is impracticable in closed loop processing. Therefore we assume that the ROI covers the moving object, and additionally make use of a property of non-rigid snakes: in the case of low or zero external energy, that is, when no edges are near the snake, the snake collapses to one point. It is therefore sufficient to do a coarse initialization around the object's contour; the snake will collapse until it reaches the contour of the moving object. Thus, we use the contour of the ROI as an initialization for the active contour. Experiments show that this approach is sufficient for an automatic initialization of the snake. Errors may occur if there are strong background edges near the object, or if the ROI only partially covers the moving object. In combination with error detection by the attention module we get a robust initialization, which is proven by the results of the experiments described in Sect. 6. An overview of the system for object tracking is given in Figure 2. A complete description may be found in [3, 4].

Motion detection is done on a low resolution (128 x 128) version of the complete camera image. After initialization of the snake on the object's contour, we switch to a higher resolution subimage containing the moving object. The single steps can be computed very fast because of their simplicity. On the other hand, simplicity may be the source of errors. For example, if the ROI does not contain the moving object, the snake cannot extract it, or the snake may lose the object during tracking. Thus, another important aspect is the detection of errors during object tracking. The attention module, or any other module in the vision system, should be able to decide whether tracking of a moving object is no longer possible, for example because the object is occluded by background objects or the tracking module has lost the object. Then it has to stop tracking, send a message to the motion detection module (stage 1), and wait for a result to start tracking again. This corresponds to the two paths back from stage 2 to stage 1 in Figure 2.

Figure 2: Overview of the two stages of the object tracking. Left side: stage 1 (motion detection). Right side: stage 2 (motion tracking).

Typically, three main errors occur: the snake collapses to one point if there is no edge near the snake [7]; the snake extracts a static background object; or some snake elements fixate on strong background edges near the object (see Figure 3). To detect such errors, the following features are extracted from the active contour: correlation, regression, moments, and the motion of the snake. In Figure 3 the plots of the x-moment and y-moment of the active contour are shown during a sequence of images. The reason for the rapid change of the x- and y-moments is that three snake elements extract the rail instead of the train itself for 10 images. A more detailed discussion of the feature based detection of errors may be found in [4].

Figure 3: Features for detection of errors during tracking: the x- and y-moment of the active contour.
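The collapse property exploited for the automatic initialization can be made concrete with a minimal gradient-descent snake step (a sketch with assumed elasticity and step-size parameters, not the exact method of [3, 6]). With zero external energy, the elastic term alone contracts the closed contour, which is precisely the behaviour described above; an external edge force would stop the contraction at the object's contour:

import numpy as np

def snake_step(pts, ext_force=None, alpha=0.2, dt=1.0):
    """One explicit gradient-descent step for a closed active contour.

    pts: (N, 2) array of snake control points.
    ext_force: optional (N, 2) external force, e.g. the image gradient
    sampled at the control points; None models zero external energy.
    """
    # Discrete elastic force v_{i-1} - 2 v_i + v_{i+1}; with no external
    # force this is curve shortening, so the contour shrinks to a point.
    elastic = np.roll(pts, 1, axis=0) - 2.0 * pts + np.roll(pts, -1, axis=0)
    force = alpha * elastic
    if ext_force is not None:
        force = force + ext_force
    return pts + dt * force

# Coarse circular initialization around a (hypothetical) ROI; with zero
# external energy the contour contracts towards its centroid.
theta = np.linspace(0.0, 2.0 * np.pi, 32, endpoint=False)
snake = np.stack([64 + 30 * np.cos(theta), 64 + 30 * np.sin(theta)], axis=1)
for _ in range(500):
    snake = snake_step(snake)
print(np.ptp(snake, axis=0))  # extent has shrunk to a few pixels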

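The moment features used by the attention module can be sketched in the same spirit (an illustrative approximation; the exact feature definitions appear in [4], and the thresholds here are assumed values). Monitoring frame-to-frame jumps of the contour's first moments catches events like snake elements jumping onto the rail, which caused the rapid moment changes in Figure 3, while a vanishing contour extent signals a collapsed snake:

import numpy as np

def contour_moments(pts):
    """First moments (x-moment, y-moment) of the snake's control points."""
    return pts.mean(axis=0)

def tracking_error(prev_pts, curr_pts, max_jump=15.0, min_extent=3.0):
    """Attention-module check: flag an error if the x-/y-moments jump
    abruptly between frames, or if the snake has collapsed to a point.
    max_jump and min_extent are assumed thresholds in pixels."""
    jump = np.linalg.norm(contour_moments(curr_pts)
                          - contour_moments(prev_pts))
    extent = np.ptp(curr_pts, axis=0).max()  # bounding box of the snake
    return jump > max_jump or extent < min_extent

When such a check fires, the system stops the tracking stage and hands control back to motion detection (stage 1), as described above.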
Figure 4: Two methods used for changing the view of the scene: dashed lines, changing the parameters of the grabbing device; solid lines, motion of the robot. Only the grey area is digitized on the board.

5. CLOSED LOOP ROBOT CONTROL

The fourth part of our closed loop object tracking system is the module for robot control. Many approaches to robot control exist in the literature [9]. Since in a closed loop system the slowest module constrains the overall system performance, robot control should be kept as simple as possible. Therefore we use only four signals for each direction in which the robot can move: UP or DOWN, and FASTER or SLOWER. To achieve a smooth robot motion we use two different methods for changing the grabbed image. First, we move the robot itself. Second, we change the parameters of the grabbing device such that the tracked object is always kept in the middle of the subimage (see Figure 4). The motion indicated by the dashed lines is achieved by changing the area which the grabbing device digitizes; the motion of the whole image, indicated by the solid lines, is achieved by the robot's motion.

6. EXPERIMENTS

Various experiments show the stability and speed of our system. Presently, tracking of a moving toy locomotive with an approximate speed of 2 cm/sec is possible in real time; the distance between the object and the camera is approximately 1.5 m. Figure 5 shows, in the first row, the subimage captured by the robot's camera (compare Figure 4). The corresponding snakes are printed below. The third row gives an overview of the whole experiment from a background camera.

Figure 5: Images 13, 84, 207, 329, 424 of one experiment. First row: the subimage containing the tracked object. Second row: the corresponding snakes. Third row: overview of the whole experiment.

In order to give quantitative results, we have tested and evaluated the system in five test runs, which is equivalent to 3000 images and a time period of 14 minutes. Three times the object was lost and the system had to switch back to the motion detection stage. 66% of all initializations of the snake were sufficient for the following tracking. To judge the quality of the tracking algorithm, we have measured the distance of the object's center of gravity (COG) from the middle of the image. We define that the object is in the center of the image, i.e. the active contour has accurately extracted the object, if this distance is less than 20 pixels. Using this criterion, the object is in the middle of the image in 83% of the images. Presently, no prediction is done for the position of the object in the next image; a prediction step should further improve the performance of the system [11].
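A toy version of the four-signal control rule, combined with the 20-pixel centering criterion from the evaluation, might look as follows. This is a sketch: the paper does not give the actual thresholds or the exact mapping from image offset to FASTER/SLOWER, so the dead band and the speed rule below are assumptions:

import numpy as np

def control_signals(cog, image_center, dead_band=20.0):
    """Map the snake's center of gravity (COG) to the four robot signals.

    The sign of the vertical offset selects UP or DOWN; its magnitude
    selects FASTER or SLOWER (an assumed rule). Within the dead band
    the object counts as centered and no correction is sent.
    """
    dx, dy = np.subtract(cog, image_center)
    if np.hypot(dx, dy) < dead_band:
        return None  # object centered: nothing to transmit this cycle
    direction = "UP" if dy < 0 else "DOWN"
    speed = "FASTER" if abs(dy) > 2 * dead_band else "SLOWER"
    return direction, speed

Only such a small message would have to cross the link between the vision process and the control process each cycle, which matches the few bytes per cycle exchanged via PVM (Sect. 2).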

7. CONCLUSION

The motion detection and tracking modules are allowed to make mistakes, because the detection of such mistakes is possible in real time and the errors can then be fixed; thus the system operates in a fault tolerant manner. A simple and fast initialization of the snakes can therefore be used. The results show that the combination of simple vision modules results in a very fast and robust system. In future work, we will extend the tracking stage by a prediction module. Robustness of the attention module and an extension of the automatic initialization are subject to further improvements. Furthermore, the robot control has to be extended to allow the investigation of the robot's environment. This will result in an active detection of moving objects by a closed loop saccade control.

8. REFERENCES

[1] P. K. Allen, A. Timcenko, B. Yoshimi, and P. Michelman. Trajectory filtering and prediction for automated tracking and grasping of a moving object. In Image Understanding Workshop, pages 1019-1033, San Diego, 1993.

[2] J. Aloimonos and D. Shulman. Integration of Visual Modules. Academic Press, Boston, San Diego, New York, 1989.

[3] J. Denzler, R. Beß, J. Hornegger, H. Niemann, and D. Paulus. Learning, tracking and recognition of 3D objects. In V. Graefe, editor, International Conference on Intelligent Robots and Systems (Advanced Robotic Systems and the Real World), to appear September 1994.

[4] J. Denzler and H. Niemann. A two-stage real time object tracking system. In German-Slovenian Workshop on Image Processing, to appear 1994.

[5] A. Geist, A. Beguelin, J. Dongarra, W. Jiang, R. Manchek, and V. Sunderam. PVM 3.0 user's guide and reference manual. Technical Report ORNL/TM-12187, Engineering Physics and Mathematics Division, Oak Ridge National Laboratory, Tennessee, 1993.

[6] M. Kass, A. Witkin, and D. Terzopoulos. Snakes: Active contour models. International Journal of Computer Vision, 2(3):321-331, 1988.

[7] F. Leymarie and M. D. Levine. Tracking deformable objects in the plane using an active contour model. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(6):617-634, 1993.

[8] H. Niemann. Pattern Analysis and Understanding. Springer, Berlin, 1990.

[9] N. P. Papanikolopoulos, P. K. Khosla, and T. Kanade. Visual tracking of a moving target by a camera mounted on a robot: A combination of control and vision. IEEE Transactions on Robotics and Automation, 9(1):14-34, 1993.

[10] M. J. Swain and M. Stricker. Promising directions in active vision. Technical Report CS 91-27, University of Chicago, 1991.

[11] D. Terzopoulos and R. Szeliski. Tracking with Kalman snakes. In A. Blake and A. Yuille, editors, Active Vision, pages 3-20. MIT Press, 1992.