A Neural Network Model for Storing and Retrieving 2D Images of Rotated 3D Object Using Principal Components

Tsukasa AMANO, Shuichi KUROGI, Ayako EGUCHI, Takeshi NISHIDA, Yasuhiro FUCHIKAWA
Department of Control Engineering, Kyushu Institute of Technology, Kitakyushu, Fukuoka 804-8550, Japan

and

Toshihiro IDA
Department of Electronics and Control Engineering, Kitakyushu National College of Technology, Kitakyushu, Fukuoka 802-0985, Japan

ABSTRACT

A neural network model for storing and retrieving two-dimensional (2D) images of a rotated three-dimensional (3D) object is presented, where principal components of the 2D images are used for data compression. The network is intended for examining how we can store and retrieve a huge amount of images, and how we can construct a neural model of mental rotation, which is supposed to play important roles in human perception. Numerical experiments with the present model show that we can store and retrieve a huge amount of 2D images owing to the eigenspace method utilizing principal components, while keeping the calculation time for retrieving rotated images proportional to the rotation angle.

Keywords: Neural Network Model, Storing and Retrieving 2D Images of Rotated 3D Object, Mental Rotation, Principal Components of 2D Images, Data Compression.

1. INTRODUCTION

A neural network model for storing and retrieving two-dimensional (2D) images of a rotated three-dimensional (3D) object is presented. The present model, like our other models [1], [2], has been developed from the following two points of view.

One is the engineering point of view, where an efficient method for retrieving rotated images, as well as images transformed by translation, magnification, projection, etc., is useful in the fields of image processing and computer vision for measuring rotation angles, invariant pattern matching, and so forth. Although we here consider only 2D images of a rotated 3D object, the problem is difficult to deal with because there is a huge amount of 2D images.

The other is the psychological point of view, where we would like to model mental rotation and mental transformation [3], [4], because mental transformation seems to play an important role in human perception. Here, we focus on the finding that the response time for identifying a rotated object is proportional to the rotation angle.

To overcome the problem that there is a huge amount of 2D images, we use the eigenspace method, which is widely used for data approximation and compression in various fields such as computer vision, communication, statistics, and so on. The eigenspace method uses the principal components derived by principal component analysis (PCA). So far, a number of PCA artificial neural networks have been developed that can learn data and output the principal components [8]. Here, we do not try to model a PCA neural network but use the principal components which are supposed to be output by a PCA neural network. For retrieving memorized data, we try to embed the psychological finding that the response time is proportional to the rotation angle. This response time, however, depends on whether the image is rotated in the picture plane or in depth [5], [6], which suggests that 2D images for different rotation axes had better be processed differently, so we apply the eigenspace method to the images of each rotation axis separately.

In the following sections, we first show the eigenspace method for compressing rotated images, then explain storing and retrieving compressed images while introducing a neural network model, and last examine the present method by means of numerical experiments.

2. IMAGE COMPRESSION AND STORING

Principal Components of Images

Suppose there is a 3D (three-dimensional) object as used in [3] which is projected onto a 2D (two-dimensional) image consisting of $N \times N$ pixels. Let $x_0$ be an $m\,(= N^2)$ dimensional vector representing the original 2D image of the 3D object, and let $x(a_{lX}, a_{lY}, a_{lZ}, \theta_i)$, written $x(a_l, \theta_i)$ or $x_{li}$ for short, be the zero-mean vector representing the 2D image of the 3D object rotated by the angle $\theta_i$ around the rotation axis $a_l \triangleq (a_{lX}, a_{lY}, a_{lZ})^T$, where $i$ and $l$ are indices for rotation angles and axes, and $X$, $Y$ and $Z$ denote the coordinate axes. In this article, we use the images with $N = 128$, $\theta_i = i - 1$ [degree] for $i = 1, 2, \ldots, n$ and $n = 360$ (see Fig. 1).

Fig. 1. Examples of 2D images; $x_0$ indicates the original image, and the images with the expression $x(a_X, a_Y, a_Z, \theta_i)$ are the images rotated by the angle $\theta_i$ [degree] around the axis $a = (a_X, a_Y, a_Z)$ with respect to the $XYZ$ coordinate system.

The singular value decomposition (SVD) of a matrix $X_l \triangleq [x_{l1}, x_{l2}, \ldots, x_{ln}] \in R^{m \times n}$ $(m \ge n \ge 1)$ is given by

$$X_l = U_l D_l V_l^T, \tag{1}$$

where the matrices $U_l \triangleq [u_{l1}, u_{l2}, \ldots, u_{ln}] \in R^{m \times n}$ and $V_l \triangleq [v_{l1}, v_{l2}, \ldots, v_{ln}] \in R^{n \times n}$ are orthogonal matrices, $D_l = \mathrm{diag}[d_{l1}, d_{l2}, \ldots, d_{ln}] \in R^{n \times n}$ is a diagonal matrix, and $d_{l1} \ge d_{l2} \ge \cdots \ge d_{ln}$. The vectors $u_{lj}$ and $v_{lj}$ are called the left and right singular vectors, respectively, and $d_{lj}$ are called the singular values. The $j$th left singular vector $u_{lj}$ is also the $j$th eigenvector of the covariance matrix $X_l X_l^T$, and $d_{lj}^2$ are the eigenvalues of $X_l X_l^T$. Further, $u_{lj}$ is called the $j$th principal component, especially for statistical data, and the PCA neural networks described above can learn to output the principal components.

The eigenspace method utilizes the partial Karhunen-Loève (KL) expansion of the order $K\,(< n)$ given by

$$\hat{x}^{(K)}_{li} \triangleq U^{(K)}_l p^{(K)}_{li} = \sum_{j=1}^{K} p_{lij}\, u_{lj}, \tag{2}$$

for approximating given data $x_{li}$, where the matrix $U^{(K)}_l$ is given by

$$U^{(K)}_l \triangleq [u_{l1}, u_{l2}, \ldots, u_{lK}, 0, \ldots, 0] \tag{3}$$

and the coefficient vector $p^{(K)}_{li}$ is given by

$$p^{(K)}_{li} \triangleq (p_{li1}, p_{li2}, \ldots, p_{liK}, 0, \ldots, 0)^T \triangleq U^{(K)T}_l x_{li}. \tag{4}$$

The mean approximation error is derived as

$$e^{(K)}_l \triangleq \frac{1}{n} \sum_{i=1}^{n} \left\| x_{li} - \hat{x}^{(K)}_{li} \right\|^2 = \frac{1}{n} \sum_{j=K+1}^{n} d_{lj}^2, \tag{5}$$

which indicates that the approximation error becomes smaller as $K$ becomes larger, and the cumulative proportion given by

$$\mu^{(K)}_l \triangleq \frac{\sum_{j=1}^{K} d_{lj}^2}{\sum_{j=1}^{n} d_{lj}^2} \tag{6}$$

indicates how much the reconstructed vector $\hat{x}^{(K)}_{li}$ resembles the original vector $x_{li}$.

For two reconstructed image vectors $\hat{x}^{(K)}_{li}$ and $\hat{x}^{(K)}_{lj}$ with the coefficient vectors $p^{(K)}_{li} = U^{(K)T}_l x_{li}$ and $p^{(K)}_{lj} = U^{(K)T}_l x_{lj}$, we have

$$\left\| \hat{x}^{(K)}_{li} - \hat{x}^{(K)}_{lj} \right\|^2 = \left\| p^{(K)}_{li} - p^{(K)}_{lj} \right\|^2. \tag{7}$$

Thus, we can use the square distance between $p^{(K)}_{li}$ and $p^{(K)}_{lj}$ for evaluating the square distance between $\hat{x}^{(K)}_{li}$ and $\hat{x}^{(K)}_{lj}$. This reduces the computational space and time because $p^{(K)}_{li} = (p_{li1}, p_{li2}, \ldots, p_{liK}, 0, \ldots, 0)^T$ is identical to a $K$ dimensional vector, while $\hat{x}^{(K)}_{li}$ is an $m = N^2\,(> K)$ dimensional vector.
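As a small illustration of Eqs. (2), (4) and (7), the following Python sketch projects two vectors onto an orthonormal basis, reconstructs them, and compares the two square distances. It is only a sketch under stated assumptions: the basis `U_K` is hand-picked in $R^4$ with $K = 2$ rather than obtained from an SVD of image data, and the names `project`, `reconstruct` and `sq_dist` are hypothetical helpers introduced here.

```python
import math

# Hand-picked orthonormal basis (K = 2) in R^4, standing in for u_1 and u_2.
# Assumption: in the paper these vectors come from the SVD of X_l, not from here.
S = 1.0 / math.sqrt(2.0)
U_K = [
    [S, S, 0.0, 0.0],   # u_1
    [0.0, 0.0, S, S],   # u_2
]

def project(x):
    """KL coefficients p = U^T x (Eq. 4): one dot product per basis vector."""
    return [sum(u_k * x_k for u_k, x_k in zip(u, x)) for u in U_K]

def reconstruct(p):
    """Approximation x_hat = sum_j p_j u_j (Eq. 2)."""
    m = len(U_K[0])
    return [sum(p_j * u[k] for p_j, u in zip(p, U_K)) for k in range(m)]

def sq_dist(a, b):
    """Square Euclidean distance between two equal-length vectors."""
    return sum((a_k - b_k) ** 2 for a_k, b_k in zip(a, b))

# Two arbitrary example vectors (illustrative values only).
x_i = [1.0, 3.0, 2.0, 0.0]
x_j = [0.0, 2.0, 5.0, 1.0]

p_i, p_j = project(x_i), project(x_j)
d_coeff = sq_dist(p_i, p_j)                            # distance in K dimensions
d_recon = sq_dist(reconstruct(p_i), reconstruct(p_j))  # distance in m dimensions
```

Both distances agree up to rounding, so matching can work on the $K$ coefficients alone instead of the $m = N^2$ pixel values.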
Storing and Matching

Suppose we have stored or memorized the coefficient vectors $p^{(K)}_{li[M]} = U^{(K)T}_{l} x_{li[M]}$ and the matrix $U^{(K)}_l$ for the angles $\theta_{i[M]}$ for $i = 1, 2, \ldots, n$ and the axes $a_l$ for $l = 1, 2, \ldots$, and then we have an input or perceived vector $x_{[P]}$. We have to decide whether $x_{[P]}$ is one of the memorized $x_{li[M]}$, and identify the angle $\theta_{j[M]}$ and the axis $a_{s[M]}$ if $x_{[P]} = x_{sj[M]}$, where the subscripts $[M]$ and $[P]$ are used for discriminating memorized and perceived (or presented) data, respectively. To solve this retrieving problem, let $p^{(K)}_{l[P]}$ denote the KL coefficient of $x_{[P]}$ with respect to $U^{(K)}_l$ as follows,

$$p^{(K)}_{l[P]} \triangleq U^{(K)T}_l x_{[P]}. \tag{8}$$

Then, we have the square distance to the memorized vectors as

$$d^{(K)}_{li[M]}(x_{[P]}) \triangleq \left\| p^{(K)}_{l[P]} - p^{(K)}_{li[M]} \right\|^2 \tag{9}$$

$$= \left\| U^{(K)T}_l \left( x_{[P]} - x_{li[M]} \right) \right\|^2, \tag{10}$$

and when $x_{[P]} = x_{sj[P]}$ we have

$$d^{(K)}_{li[M]}(x_{sj[P]}) = \left\| U^{(K)T}_l U^{(n)}_{s[P]}\, p^{(n)}_{sj[P]} - p^{(K)}_{li[M]} \right\|^2. \tag{11}$$

This equation shows that when $K\,(< n)$ is sufficiently large and $x_{sj[P]} = x_{li[M]}$ (or $s = l$ and $j = i$), the square distance $d^{(K)}_{li[M]}(x_{sj[P]})$ takes the global minimum value 0 (or a small value near 0 when there is noise), although when $K$ is too small it may take small values near 0 even when $x_{sj[P]}$ and $x_{li[M]}$ are not the same. Thus, for a sufficiently large $K$ we are supposed to be able to identify the angle $\theta_{j[P]} = \theta_{i[M]}$ and the axis $a_{s[P]} = a_{l[M]}$ which achieve $x_{sj[P]} = x_{li[M]}$ by means of obtaining the smallest $d^{(K)}_{li[M]}(x_{sj[P]})$ over all memorized $l$ and $i$. Here, with as small a $K$ as possible, we can take advantage of the partial KL coefficients, such that (1) the calculation time and space can be reduced as described above, (2) the KL coefficients are robust to noise since the noise is averaged by means of the PCA scheme, and (3) the square distance $d^{(K)}_{li[M]}(x_{sj[P]})$ for smaller $K$ changes more smoothly with the increase of the angle $\theta_{j[P]}$, thus we can search the minimum of the distance more stably, as shown in the next section.

3. PROCESS TO RETRIEVE ROTATION ANGLES AND AXES

Property of Square Distance vs. Angle

Before describing the retrieving process, we would like to show how $d^{(K)}_{li[M]}(x_{sj[P]})$ changes with the memorized angles $\theta_{i[M]}$. Namely, we first show the result of an experiment where, for the original image $x_0$ shown in Fig. 1, we use the images rotated by the angles $\theta_{i[M]} = 0, 1, 2, \ldots, 359$ [degree] around the axis $a_l = (0, 0, 1)^T$ for memorized images, and the two images shown in Fig. 2 for perceived images.

Fig. 2. Images with different rotation angles and axes to be discriminated: $x(0, 0, 1, 90)$ and $x(1, 0, 0, 270)$.

The result is shown in Fig. 3, where (a) shows the result for $K = n = 360$ using non-approximated images, and (b) shows the result for $K = 10$ with the cumulative proportion $\mu^{(K)}_l = 21.6\%$, which does not seem so big but is big enough for discriminating the perceived images by means of the process to retrieve rotation angles shown in the next section. Further, the range of the angles $\theta_{i[M]}$ whose distance $d^{(K)}_{li[M]}(x_{sj[P]})$ for $x(0, 0, 1, 90)$ is less than the minimum $d^{(K)}_{li[M]}(x_{sj[P]})$ for $x(1, 0, 0, 270)$ is 6 degrees (from 88 to 93 [degree]) for $K = 360$ and 8 degrees (from 87 to 94 [degree]) for $K = 10$, which indicates the robustness of the images using $K = 10$ to the change of rotation angle, although the range is not widened much.

Fig. 3. The square distance $d^{(K)}_{li[M]}(x_{sj[P]})$ of the similar images $x_{sj[P]} = x(0, 0, 1, 90)$ and $x_{sj[P]} = x(1, 0, 0, 270)$ shown in Fig. 2 to the images $x_{li[M]}$ rotated by the angles $\theta_{i[M]} = 0, 1, 2, \ldots, 359$ [degree] around the axis $a_l = (0, 0, 1)^T$. Panels: (a) $K = n = 360$, (b) $K = 10$. Horizontal axis: memorized rotation angle $\theta_{i[M]}$ [degree]; vertical axis: square distance $d^{(K)}_{li[M]}(x_{sj[P]})$.

Process to Retrieve Rotation Angles

For both (a) and (b) in Fig. 3, when we select the angle $\theta_{i[M]}$ with $d^{(K)}_{li[M]}(x_{sj[P]}) = 0$, we can reject $x(1, 0, 0, 270)$, accept $x(0, 0, 1, 90)$ and identify the angle $\theta_{i[M]} = 90$, which is correct for this memorized axis $a_l = (0, 0, 1)$. Further, for robustness, we had better select $\theta_{i[M]}$ with the minimum of $d^{(K)}_{li[M]}(x_{sj[P]})$ less than a threshold $d_\theta$, where $d_\theta$ can be determined as a value less than the minimum $d^{(K)}_{li[M]}(x_{sj[P]})$ for the image $x_{sj[P]} = x(1, 0, 0, 270)$ to be discriminated. Actually, the minimum $d^{(K)}_{li[M]}(x_{sj[P]})$ for $x_{sj[P]} = x(1, 0, 0, 270)$ was 64.9 for $K = 360$, and 35.5 for $K = 10$.

Instead of looking through all $\theta_{i[M]}$, for faster search and for constructing the model for mental rotation, we can search successively as $\theta_{i[M]} = 0, \pm 1, \pm 2, \ldots, \pm 180$, where a negative value $-\theta$ stands for the angle $360 - \theta$ [degree]. Then we can find the angle $\theta_i$ satisfying

$$d^{(K)}_{li[M]} \le \min\left\{ d^{(K)}_{l(i-1)[M]},\; d^{(K)}_{l(i+1)[M]},\; d_\theta \right\}. \tag{12}$$

Since this equation indicates that $d^{(K)}_{li[M]}$ is smaller than or equal to the threshold $d_\theta$ as well as being a local minimum with respect to the rotation angle $\theta_i$ $(i = 1, 2, \ldots)$, it can be equal to or very near to the global minimum. Since $d^{(K)}_{li[M]}$ for $K = 10$ looks much smoother than that for $K = n$, the global minimum may be searched much more stably for $K = 10$.

Process to Retrieve Rotation Axes

The above process to search rotation angles around the rotation axis $a_l$ can be carried out by the network NET$_l$ shown in Fig. 4, where the layer of principal components should learn and execute the PCA, which is supposed to be realized by a PCA neural network [8]. Further, on the topological layer the KL coefficient vectors $p_{li}$ are mapped topologically with respect to the rotation angle $\theta_i$, where this kind of topology preserving map is well known as the SOM (self-organizing map) [7]. Although more investigation on the network implementation may be interesting, we leave it for future research.

Fig. 4. Schematic diagram of the network NET$_l$ = NET$(a_l)$ for the rotation axis $a_l$ for storing and retrieving rotation angles $\theta_i$.

By running the networks NET$_l$ for all $l$ in parallel, the net NET$_l$ which first outputs the result $\theta_i$ is supposed to indicate the correct rotation axis $a_l$, because the net NET$_l$ is supposed to output only when the rotation axis is correct, as shown in the previous section.

4. NUMERICAL EXPERIMENTS

Fig. 5. Examples of images and parameter values used for perceived images (left) and those recognized (right). Namely, on the left-hand side, the images are generated as perceived ones with the parameter values and fed to the retrieval algorithm, while on the right-hand side, the images are generated from the parameter values retrieved.

Fig. 6. Relation between the calculation time and the angle.

For memorized images, we made 250 random axes $a_l = (a_{lX}, a_{lY}, a_{lZ})$ for $l = 1, 2, \ldots, 250$ chosen from the region satisfying $a_{lX} \ge 0$, $a_{lY} \ge 0$, $a_{lZ} \ge 0$ and $0 < a_{lX}^2 + a_{lY}^2 + a_{lZ}^2 \le 1$, and normalized each $a_l$ so that $\|a_l\| = 1$. For each axis $a_l$, the original image $x_0$ is rotated by $\theta_i = i - 1$ [degree] for $i = 1, 2, \ldots, 360$,
and we apply the partial KL transform with the order $K = 10$ to $X_l = [x_{l1}, x_{l2}, \ldots, x_{l360}]$ and obtain $U^{(10)}_l = [u_{l1}, u_{l2}, \ldots, u_{l10}]$ and $p^{(10)}_{li} = U^{(10)T}_l x_{li}$ for $i = 1, 2, \ldots, 360$. We generated 500 images randomly for perceived images, ran the retrieval process, and obtained the rotation angles and the axes. Some examples are shown in Fig. 5, which shows that there are some errors in the retrieved parameter values, but they are supposed to be retrieved correctly. For all perceived images, the correct recognition rate was 88.6%.

Here note that the axes for perceived images were generated from $a_X \ge 0$, $a_Y \ge 0$, $a_Z \ge 0$ as for the memorized ones, and 250 memorized axes were necessary for this recognition rate. Accordingly, $1000 = 250 \times 4$ axes are supposed to be necessary for all 3D axes, since the axes with $a_X \ge 0$, $a_Y \ge 0$, $a_Z \ge 0$ are a quarter of all axes ($a_X \in R$, $a_Y \in R$ and $a_Z \ge 0$), where the axis $a = (a_X, a_Y, a_Z)$ with an angle $\theta$ is equivalent to the axis $-a = (-a_X, -a_Y, -a_Z)$ with an angle $360 - \theta$.

The calculation time is shown in Fig. 6, where the preprocessing time for obtaining the principal components via the SVD, about 2 minutes, is not included. Here, we used a personal computer with an Athlon XP 1800+ (1.53 GHz) CPU and Vine Linux 2.5. From the figure, we can see that the calculation time was proportional to the rotation angle, which is the same property as the psychological finding [3].

5. CONCLUDING REMARKS

We have presented a method for storing and retrieving 2D images of a rotated 3D object. As a result of numerical experiments, we could store a huge amount of 2D images owing to the eigenspace method, and achieve the property that the computational time to retrieve a presented image is proportional to the rotation angle. However, there are a number of problems we would like to solve in the future: from the engineering point of view, we could not memorize the images for all rotation axes because of the memory capacity; from the psychological point of view, the present model has to be refined to explain other psychological findings as well as the present one; and so on.

6. REFERENCES

[1] S. Kurogi, T. Nishida, K. Yamamoto, Image transformation of local features for rotation invariant pattern matching, Proc. of ICONIP 2001, vol. 2, pp. 693-698, 2001.
[2] S. Kurogi, T. Amano, Invariant pattern matching using 3D image transformation of local features, Proc. of JNNS 2002, pp. 117-120, 2002.
[3] R. N. Shepard and J. Metzler, Mental rotation of three-dimensional objects, Science, vol. 171, pp. 701-703, 1971.
[4] R. N. Shepard and L. A. Cooper, Mental images and their transformations, MIT Press: Cambridge, MA, 1982.
[5] M. Parsons, Visual discrimination of abstract mirror-reflected three-dimensional objects at many orientations, Perception & Psychophysics, vol. 42, pp. 49-59, 1987.
[6] N. Kanamori and Y. Takeda, The difference of mental processes between depth and plane rotation in natural objects, Technical Report on Attention and Cognition, No. 24, 2003.
[7] T. Kohonen, Self-organization and associative memory, Springer-Verlag, Berlin, 1984.
[8] K. I. Diamantaras, S. Y. Kung, Principal component neural networks, John Wiley & Sons, Inc., 1996.
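
The retrieval procedure of Section 3 can be illustrated with a short Python sketch. It is not the authors' implementation: the function names `search_order`, `retrieve_angle` and `retrieve_axis`, the step counter used as a stand-in for calculation time, and the synthetic V-shaped distance profile are all assumptions made for illustration. The sketch visits the angles in the order 0, ±1, ±2, ..., ±180, accepts the first angle that satisfies the condition of Eq. (12), and picks the axis whose search answers first.

```python
def search_order(n=360):
    """Yield angle indices in the order 0, +1, -1, +2, -2, ... (mod n)."""
    yield 0
    for k in range(1, n // 2 + 1):
        yield k % n
        if k != n - k:              # do not visit n/2 twice when n is even
            yield (-k) % n          # a negative angle -k stands for n - k

def retrieve_angle(d, d_theta):
    """Return (angle, steps) for the first angle meeting Eq. (12).

    d[i] is the square distance to the image memorized at angle i [degree];
    steps counts visited angles (an assumed proxy for calculation time).
    Returns (None, steps) when no angle is accepted.
    """
    n = len(d)
    steps = 0
    for i in search_order(n):
        steps += 1
        if d[i] <= min(d[(i - 1) % n], d[(i + 1) % n], d_theta):
            return i, steps
    return None, steps

def retrieve_axis(profiles, d_theta):
    """Model the parallel networks: the axis whose search answers first wins.

    profiles[l] is the distance profile for axis l. Returns
    (axis index, angle, steps), or None if every search rejects the input.
    """
    best = None
    for l, d in enumerate(profiles):
        angle, steps = retrieve_angle(d, d_theta)
        if angle is not None and (best is None or steps < best[2]):
            best = (l, angle, steps)
    return best

# Synthetic V-shaped profile with its minimum at 90 degrees (illustrative only).
profile = [min(abs(i - 90), 360 - abs(i - 90)) ** 2 for i in range(360)]
print(retrieve_angle(profile, d_theta=0.5))  # (90, 180)
```

In this sketch the number of visited angles grows linearly with the magnitude of the rotation angle, which mirrors the proportionality between calculation time and angle reported in Fig. 6. A profile whose minimum stays above `d_theta` is rejected after all 360 angles have been visited.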