Department of Electrical Engineering, Keio University Hiyoshi Kouhoku-ku Yokohama 223, Japan

Similar documents
Object Modeling from Multiple Images Using Genetic Algorithms. Hideo SAITO and Masayuki MORI. Department of Electrical Engineering, Keio University

Factorization Method Using Interpolated Feature Tracking via Projective Geometry

Proceedings of the 6th Int. Conf. on Computer Analysis of Images and Patterns. Direct Obstacle Detection and Motion. from Spatio-Temporal Derivatives

3D Corner Detection from Room Environment Using the Handy Video Camera

Recognizing Buildings in Urban Scene of Distant View ABSTRACT

VISION-BASED HANDLING WITH A MOBILE ROBOT

2 ATTILA FAZEKAS The tracking model of the robot car The schematic picture of the robot car can be seen on Fig.1. Figure 1. The main controlling task

Task Driven Perceptual Organization for Extraction of Rooftop Polygons. Christopher Jaynes, Frank Stolle and Robert Collins

Dense 3-D Reconstruction of an Outdoor Scene by Hundreds-baseline Stereo Using a Hand-held Video Camera

Segmentation and Tracking of Partial Planar Templates

where g(x; y) is the the extracted local features or called observation, s(x; y) is the surface process and represents a smooth surface which is const

Augmented Reality VU. Computer Vision 3D Registration (2) Prof. Vincent Lepetit

A Hierarchical Statistical Framework for the Segmentation of Deformable Objects in Image Sequences Charles Kervrann and Fabrice Heitz IRISA / INRIA -

Image-Based Memory of Environment. homing uses a similar idea that the agent memorizes. [Hong 91]. However, the agent nds diculties in arranging its

Transactions on Information and Communications Technologies vol 19, 1997 WIT Press, ISSN

Face Tracking : An implementation of the Kanade-Lucas-Tomasi Tracking algorithm

Visual Tracking (1) Feature Point Tracking and Block Matching

Frontier Pareto-optimum

Bitangent 3. Bitangent 1. dist = max Region A. Region B. Bitangent 2. Bitangent 4

Genetic Fourier Descriptor for the Detection of Rotational Symmetry

Structured Light II. Thanks to Ronen Gvili, Szymon Rusinkiewicz and Maks Ovsjanikov

thresholded flow field structure element filled result

Approach to Minimize Errors in Synthesized. Abstract. A new paradigm, the minimization of errors in synthesized images, is

SUMMARY: DISTINCTIVE IMAGE FEATURES FROM SCALE- INVARIANT KEYPOINTS

Malaysian License Plate Recognition Artificial Neural Networks and Evolu Computation. The original publication is availabl

Optical Flow-Based Person Tracking by Multiple Cameras

Textureless Layers CMU-RI-TR Qifa Ke, Simon Baker, and Takeo Kanade

Measurement of Pedestrian Groups Using Subtraction Stereo

Grasp Recognition using a 3D Articulated Model and Infrared Images

Matching. Compare region of image to region of image. Today, simplest kind of matching. Intensities similar.

3D Computer Vision. Structured Light II. Prof. Didier Stricker. Kaiserlautern University.

Visual Tracking (1) Tracking of Feature Points and Planar Rigid Objects

Lecture 16: Computer Vision

TEXTURE OVERLAY ONTO NON-RIGID SURFACE USING COMMODITY DEPTH CAMERA

N. Hitschfeld. Blanco Encalada 2120, Santiago, CHILE.

Capturing, Modeling, Rendering 3D Structures

Genetic Algorithm Based Template Optimization for a Vision System: Obstacle Detection

British Machine Vision Conference 2 The established approach for automatic model construction begins by taking surface measurements from a number of v

An Implementation of Spatial Algorithm to Estimate the Focus Map from a Single Image

Exploration of Unknown or Partially Known. Prof. Dr. -Ing. G. Farber. the current step. It can be derived from a third camera

2 Algorithm Description Active contours are initialized using the output of the SUSAN edge detector [10]. Edge runs that contain a reasonable number (

Improvement of Accuracy for 2D Marker-Based Tracking Using Particle Filter

1998 IEEE International Conference on Intelligent Vehicles 587

Mapping textures on 3D geometric model using reflectance image

Using temporal seeding to constrain the disparity search range in stereo matching

Z (cm) Y (cm) X (cm)

Outdoor Scene Reconstruction from Multiple Image Sequences Captured by a Hand-held Video Camera

AN EFFICIENT BINARY CORNER DETECTOR. P. Saeedi, P. Lawrence and D. Lowe

Genetic Algorithms for Free-Form Surface Matching

Segmentation and Grouping

Subpixel Corner Detection Using Spatial Moment 1)

Scale Invariant Segment Detection and Tracking

SIFT: SCALE INVARIANT FEATURE TRANSFORM SURF: SPEEDED UP ROBUST FEATURES BASHAR ALSADIK EOS DEPT. TOPMAP M13 3D GEOINFORMATION FROM IMAGES 2014

Introduction to Computer Vision

CS664 Lecture #18: Motion

Motion Estimation. There are three main types (or applications) of motion estimation:

Recognition. Clark F. Olson. Cornell University. work on separate feature sets can be performed in

The ring of place cells in the rodent hippocampus is partly under the control

Lecture 16: Computer Vision

Real-time Generation and Presentation of View-dependent Binocular Stereo Images Using a Sequence of Omnidirectional Images

Depth. Common Classification Tasks. Example: AlexNet. Another Example: Inception. Another Example: Inception. Depth

(c) (b) Recognition Accuracy. Recognition Accuracy. Recognition Accuracy. PCA-matching Noise Rate σ (a) PCA-matching

Preprocessing of Stream Data using Attribute Selection based on Survival of the Fittest

X y. f(x,y,d) f(x,y,d) Peak. Motion stereo space. parameter space. (x,y,d) Motion stereo space. Parameter space. Motion stereo space.

1998 IEEE International Conference on Intelligent Vehicles 213

TEXTURE OVERLAY ONTO NON-RIGID SURFACE USING COMMODITY DEPTH CAMERA

Leow Wee Kheng CS4243 Computer Vision and Pattern Recognition. Motion Tracking. CS4243 Motion Tracking 1

Three-Dimensional Computer Vision

Stereo vision. Many slides adapted from Steve Seitz

3D Modeling of Objects Using Laser Scanning

Visual Tracking (1) Pixel-intensity-based methods

Shape Modeling of A String And Recognition Using Distance Sensor

Laboratory for Computational Intelligence Main Mall. The University of British Columbia. Canada. Abstract

336 THE STATISTICAL SOFTWARE NEWSLETTER where z is one (randomly taken) pole of the simplex S, g the centroid of the remaining d poles of the simplex

A New Algorithm for Shape Detection

Model Based Pose Estimation. from Uncertain Data. Thesis submitted for the degree \Doctor of Philosophy" Yacov Hel-Or

Task analysis based on observing hands and objects by vision

Specular 3D Object Tracking by View Generative Learning

Multi-view Stereo. Ivo Boyadzhiev CS7670: September 13, 2011

3D Environment Measurement Using Binocular Stereo and Motion Stereo by Mobile Robot with Omnidirectional Stereo Camera

Coarse-to-fine image registration

EE368 Project Report CD Cover Recognition Using Modified SIFT Algorithm

MOTION. Feature Matching/Tracking. Control Signal Generation REFERENCE IMAGE

In Proc of 4th Int'l Conf on Parallel Problem Solving from Nature New Crossover Methods for Sequencing Problems 1 Tolga Asveren and Paul Molito

Local qualitative shape from stereo. without detailed correspondence. Extended Abstract. Shimon Edelman. Internet:

n m-dimensional data points K Clusters KP Data Points (Cluster centers) K Clusters

June Wesley, Cambridge Univ. Press,

x L d +W/2 -W/2 W x R (d k+1/2, o k+1/2 ) d (x k, d k ) (x k-1, d k-1 ) (a) (b)

Experiments with Edge Detection using One-dimensional Surface Fitting

larly shaped triangles and is preferred over alternative triangulations as the geometric data structure in the split and merge scheme. The incremental

Ultrasonic Robot Eye for Shape Recognition Employing a Genetic Algorithm

A Robust Two Feature Points Based Depth Estimation Method 1)

Towards Cognitive Agents: Embodiment based Object Recognition for Vision-Based Mobile Agents Kazunori Terada 31y, Takayuki Nakamura 31, Hideaki Takeda

Classifier C-Net. 2D Projected Images of 3D Objects. 2D Projected Images of 3D Objects. Model I. Model II

A GA-based fuzzy adaptive learning control network

Simultaneous Registration of Multiple Range Views For Use. In Reverse Engineering of CAD Models 1. Robert B. Fisher

st chromosome. 2nd n-th ... 1': 1: 2: n: 2': n': part 1 part 2 part i part p. One part is randomly chosenn

Overview. Related Work Tensor Voting in 2-D Tensor Voting in 3-D Tensor Voting in N-D Application to Vision Problems Stereo Visual Motion

Hand-Eye Calibration from Image Derivatives

Motion. 1 Introduction. 2 Optical Flow. Sohaib A Khan. 2.1 Brightness Constancy Equation

Transcription:

Shape Modeling from Multiple View Images Using GAs Satoshi KIRIHARA and Hideo SAITO Department of Electrical Engineering, Keio University 3-14-1 Hiyoshi Kouhoku-ku Yokohama 223, Japan TEL +81-45-563-1141 (ext.3310), FAX +81-45-563-2773 E-mail: kiri@ozawa.elec.keio.ac.jp, saito@ozawa.elec.keio.ac.jp Abstract Shape modeling is a very important issue for many study, for example, object recognition for robot vision, virtual environment construction, and so on. In this paper, a new method of object modeling from multiple view images using genetic algorithms (GAs) is proposed. In this method, a similarity between model and every input image is calculated, and then the model which has the maximum similarity is found. For nding the model of maximum similarity, genetic algorithms are used as the optimization method. In the genetic algorithm, the sharing scheme is employed for ecient detection of multiple solution, because some shape may be represented by multiple shape models. Some results of modeling experiments from real multiple images demonstrate that the proposed method can robustly generate model by using the GA. 1 Introduction Many methods have recently been studied for 3-D object modeling. These methods can be applied to object recognition for robot vision [1], construction of virtual world [2][3], and so on. Especially, virtual world is signicant subject for communication using computer networks. Many previous modeling method use range data of the object shape which can generally be obtained by range nders [4] [5].However, range nders are not popular tool, and they can not obtain the surface texture data which can be used for synthesizing images presented in virtual space (image-based rendering). Therefore, many methods for recovering range information from 2-D images taken with CCD camera have been studied [6][7]. Generally, because these methods use some multiple images, some corresponding points between each image must be detected. If the corresponding points are detected exactly, the range information can be recovered by triangulation. However, the detection of corresponding points is known as a dicult problem and thus have been studied hardly. It is especially dicult to make correspondence between multi-view images because of the occlusion and the inconsistency of background scene. If we consider that the object problem is not estimating accurate range information but generating accurate object models, we don't have to recover the range data. For example, we can obtain the object model by the use of interactive operation system [8] in which generation of hypothesis of models and verication of the hypothesis on the multiple view images are repeated interactively. Such kind of modeling system does not require the range data of the object but verication of the generated hypothesis of models by the human. In this study, we intend to make automatic modeling system based on the repeated operation of generation and evaluation of hypothesis. For searching the best hypothesis of model eciently, we employ genetic algorithms (GAs) [9]. In our method, the model matching to every input images are found by applying GAs which repeat evaluation

of hypotheses of the models. For the evaluation, similarity between the model and the input image at each view point is calculated, and then the model having the maximum evaluation is found by GAs. In our previous article [10], the concept and the early results are presented, but the quality of the results are not sucient. In this paper, we have made signicant improvement for the algorithms and obtained high quality results, which are shown in the following parts of this paper. 2 Proposed method In this study, we assumed that the object is the polyhedral such as an articial building. Under this assumption, the object can be expressed by multiple model shapes of triangular prisms and rectangular prisms. Input images are taken by CCD camera from multiple view. The problem is estimation of shape, position, and pose of the models from the input images. Fig. 1 shows the scheme of the shape modeling assumed in this paper. Input Images Multiple Primitives Object Models Fig. 1. Scheme of the shape modeling. In this method, modeling is performed by maximizing an evaluation function using GAs. The evaluation function is described later. 2.1 Model parameters As described above, we employ the model shapes of triangular prisms and rectangular prisms for representing the object surface. We dene the model parameters as the following: 1. model category (rectangles or triangles) (A), 2. length to x, y, z axis (lx, ly, lz), 3. rotation angle around y axis (bt), 4. position of center of gravity (xc, yc, zc). 2.2 Modeling by GAs Fig. 2 shows the ow of the modeling process by GAs. First, a group of strings is produced at random. These strings represent hypothesizes of the model. All strings are evaluated, then the genetic operations are applied to them. The string which has the best evaluation is searched by repeating the genetic operation. The model parameters represented by the best strings is regarded as the nal solution. Denition of string For using GAs, the object model must be expressed as a string. As described in the previous section, the object model can be represented by some simple parameters in this method. Each parameters has 8 bits binary except for the parameter of model category. The parameter of model category has only 1 bit, because it is used for only distinguishing between rectangles and triangles. Then, the strings

consist of 57 bits binary. For corresponding the hamming distance of the string to the distance of the parameter value, Initialization of string Camera 1 Camera 2 Evaluation of string Input Images Sharing Elite? No Yes Blurred Gradient Images Reproduction Criterion 1 + Crossover Local Search Similarity Mutation No End? Yes Selection of Solutions Wire Frame Images Parameters String Fig. 2. The ow of the modeling process by GAs Fig. 3. Similarity criterion Evaluation function We dene evaluation function which represents similarity between input images and the model hypothesis. The following two criterion are used in this method. 1. Similarity between the input multiple view images and the synthetic wire frame images as shown in Fig. 3 (f1). 2. Consistency of texture patterns on the model plane which are given by multi-view images as shown in Fig. 4 (f2). First criterion (f 1)is dened as the following equation. f1 = X i P Bi(x; y)si(x; y) x;y (1) qpx;y qpx;y B2i (x; y) S 2i (x; y) where S i represents wire frame image of the model at ith view, and B i(x; y) is blurred gradient image of the input image at ith view. B i(x; y) is calculated as B i(x; y) =G(x; y) 3 r f @I i(x; y) @x g 2 + f @I i(x; y) g 2 ; (2) @y where G(x,y) is Gaussian operator and I i(x; y) is input images at ith view. The second criterion (f 2 ) is dened as the following equation. f 2 = 0 XX x;y i ft i(x; y) 0 P T j j(x; y) g 2 (3) p

where T i(x; y) is represents texture pattern back-projected from ith input image, and p is the number of the images back-projected on the model plane. If the model parameters are exact, every back-projected texture must be the same, then f 2 have to be 0 (maximum value). As these two criterion, the total evaluation function f is f = f 1 + f 1f 2; (4) where is a weighting constant. The second criterion (f2) supports the rst criterion (f 1). We previously used the following equation f = f 1 + f 2; (5) Camera1 Camera2 which sometimes gives high evaluation f even if the model does not correspond to the object input images, because f 2 can have the high evaluation value when the object surface and the background have the smooth texture. Therefore, the second criterion (f2) support the rst criterion (f1), then the second term (f1f2) cannot have high evaluation until f 1 has high evaluation to some degree. terion Fig. 4. Consistency of texture in cri- 2 Sharing Although GAs are generally used for nding an single solution having maximum evaluation, they can be used for nding multiple solutions simultaneously by some modications of the algorithms. Sharing method [9] is one of the popular modication for nding multiple solutions eciently. We employ the sharing method for nding multiple models which are included in the input images. Sharing is an operation that decreases evaluation of strings if the strings are similar to other strings in the population of strings. By this operation, strings tend to have various values, which means that multiple solutions can be simultaneously found. Original evaluation of a string is modied according to the number of the neighbor strings. The relationship between the modied evaluation and the original evaluation is shown in the following equation. E s(x i)= P n E(x i) j =1 s(d(x i;x j )) ; (6) where s(d) = max(1 0 d=; 0), x i;x j : ith and jth strings, E(x i) : original evaluation of x i E s(x i) : modied evaluation of x i, d(x i;x j) : distance between x i and x j s(d) : sharing function of d, : constant determining eect of sharing n : number of the strings in a population In general, is a constant parameter. In our method, however, is changed in proportional to the number of generations in GAs. In the earlier generation, sharing operates eectively and many solutions can be held in the string population by the use of higher. In the later generation, has lower value, then, local optimization can be realized around the multiple solutions. Genetic operations The ow of the modeling process by GAs is already shown in Fig. 1. First, the strings are initialized at random. The number of strings are 256.

Next, evaluation value is calculated for each string by the use of equation 4. The original evaluation is modied by sharing function as described in the previous section. Some elite strings are then selected according to modied evaluation. The selected elite strings are improved by a local search method. In this way, the strings having higher evaluation are selected as the elite strings. Other strings which are not selected as elites are applied to genetic operations. Genetic operations are reproduction, crossover and mutation. First, the parent strings are selected according to modied evaluation. Then, the ospring strings are generated by one-point crossover. Some bits selected at random and reversed by mutation. This process is repeated. After repeated this process for the number of generations dened in advance, some strings are selected as the object models. In this method, the number of the models in the input images is not given to the algorithm. Therefore the object models is selected by following criterion. 1. Selecting the string having the texture evaluation (f 2) under a threshold level. 2. Selecting the string having enough distance to the object models which are already selected. 3 Experiments We performed some experiments for demonstrating that the proposed method is ecient for object modeling. In this experiments, some real images taken by CCD camera. The objects are a locker, a model of a roof of a house, and two tissue boxes. These objects are located in natural environment, so the taken multi-view images have various background scene. Input images is 256 2 256, 8bit gray-scale. The camera parameters of every image are previously measured by initial experiment. (a) First generation (b) 20th generation Fig. 5. Input images (c) Final generation (d) Other views Fig. 6. Result of rst experiment First, the experimental results for the locker are shown. Fig. 5 shows the input multiview images. In the rst generation, candidates of solution are selected at random as shown in Fig. 6(a). They are getting to be close in the twenties generation as shown

in Fig. 6(b). In the nal generation, the solutions are optimized completely as shown in Fig. 6(c). Then, we obtain the object model. Using the object model, we calculate the image form other view points (Fig. 6(d)). As shown in the edge images (Fig. 7), it is very dicult to make correspondence of these edges or points between the multi-view images, because many background edges in one input image cannot exist in other input images. Since the proposed method does not require the correspondence between the multi-view images, the exact modeling result can be obtained. Next, we show the experimental results for a model of a roof of a house in Fig. 8. In this experiment, the object is expressed by the one triangular prism model. In this case, only small texture variation is included in the background in the input images, that is dierent from the rst experiment for the locker. As described in the section of the evaluation function, if we use the simple evaluation function of eq.(5), the strings converge some fail solutions as shown in the object in Fig. 9(b). However,by using the evaluation function of eq.(4), the exact model can be obtained as shown in Fig. 8(b) although there are little texture variation. Fig. 7. Edge image (a) First generation (b) Final generation Fig. 8. Result of second experiment (a) Edge image (b) Example of fail solution Fig. 9. Result of simple evaluation function (a) First generation (b) Final generation Fig. 10. Result of third experiment Third, we show the experimental results for two tissue boxes. In this experiment, the object in the input images is needed to be represented two rectangle models. In the rst generation, candidates of solution are selected at random as shown in Fig. 10(a). The result of modeling is shown in Fig. 10(b). The object consists of two tissues in this case, such as an articial building is generally expressed by combination of some rectangle and triangle. The modeling results for two models have a larger modeling error than the modeling results for one model. This error is caused by the some diculty in obtaining the multiple solution via GAs with sharing method. This will be solved by the future study. In this study, GAs play the important role for obtaining the shape model eciently

because of performance of optimization of GAs. Fig. 11 shows the distribution of the evaluation function for valuable parameters. Because the distribution for all parameters can not be displayed, we show the distribution in respect to the parameters lx and ly. It is obviously recognized that the evaluation distribution has a lot of peaks. For optimizing in such evaluation distribution, the conventional methods are not suitable because they tend to give wrong solutions of local peaks. As described in papers on GA's applications, GAs are powerful algorithm in such situations. 4 Conclusion Evaluation value 0.4 0.35 0.3 0.25 0.2 0.15 0.1 0.05 0 255 127 ly 0 0 Fig. 11. The distribution of the e- valuation function lx 127 255 In this paper, we propose a method for object modeling from multiple view images using genetic algorithms (GAs). In the proposed method, generation and verication of model hypothesis are repeated, and the hypothesis which has the best evaluation is eciently searched by the use of GAs. Some results of object modeling experiments from real multiple images demonstrate that the proposed method can robustly generate model by using GAs. References 1. Mark D. Wheeler and Katsushi Ikeucti : \Sensor Modeling, Probabilistic Hypothesis Generation, and Robust Localization for Object Recognition". IEEE trans. PAMI, Vol. 17, No.3, pp.252-265 (1995) 2. Carlo Tomasi and Takao Kanade : \Shape and Motion from Image Stream : a Factorization Method". Technical Report in CMU, CMU-CS-92-104 (1992) 3. Camillo J. Taylor and David J. Kriegman : \Structure and Motion from Line Segments in Multiple Image". IEEE trans. on PAMI, Vol.17, No.11, pp.1021-1032 (1995) 4. B.Parvin and G.Medioni : \B-rep from Unregistered Multiple Range Image". in Proc.of the 1992 IEEE International Conf. on Robotics and Automation, Nice, France, May, 1992, pp.1602-1607 (1992) 5. T.Masuda, K.Sakaue and N.Yokoya : \Registration and integration of multiple range images for 3-D model construction". Proceedings of the 13th ICPR, vol.1, pp.879-883 (1996) 6. S.D.Cochran, G.Medioni: \3-D Surface Description from Binocular Stereo". IEEE Trans. on PAMI, Vol.14, No.10, pp.981-994 (1992) 7. W.B.Seales, O.D.Faugeras: \Building Three-Dimensional Object Models from Image Sequence". COMPUTER VISION AND IMAGE UNDERSTANDINDING, Vol.61, No.3, pp.308-324 (1995) 8. URL: http://www.wbs.or.jp/bt/sel/skv/index.htm 9. D.Goldberg: Genetic Algorithms in search, Optimization and Machine Learning, Addison-Wesley (1989) 10. H. Saito and M. Mori: \Object Modeling from Multiple Images Using Genetic Algorithms". Proceedings of the 13th ICPR, Vol. IV, pp.669-673, (1996)