International Journal of Applied Mathematics and Computer Sciences Volume 2 Number 1

Mining Image Features in an Automatic Two-Dimensional Shape Recognition System

R. A. Salam, M. A. Rodrigues

Abstract

The number of features required to represent an image can be very large. Using all available features to recognize objects can suffer from the curse of dimensionality. Feature selection and extraction is the pre-processing step of image mining. The main issues in analyzing images are the effective identification of features and their extraction. The mining problem addressed here is the grouping of features for different shapes. Experiments were conducted using the shape outline as the feature. Shape outline readings are put through a normalization and dimensionality reduction process, using an eigenvector-based method, to produce a new set of readings. After this pre-processing step the data are grouped by shape. Through statistical analysis of these readings, together with peak measures, a robust classification and recognition process is achieved. Tests showed that the suggested methods are able to recognize objects automatically through their shapes. Finally, experiments also demonstrate the system's invariance to rotation, translation, scale, reflection and a small degree of distortion.

Keywords: Image mining, feature selection, shape recognition, peak measures.

I. INTRODUCTION

It has become extremely easy to obtain and store large quantities of data. However, research in image mining still has considerable room for improvement, particularly for multimedia images. One of the greatest challenges is devising effective automatic recognition and categorization. In today's industrial and ever more automated world, there is a strong need for robust and reliable recognition in areas such as medical, manufacturing, autonomous navigation and multimedia applications. Robust automatic recognition without a priori information has been a primary concern of many researchers in image mining [1].
The core issue is tackling the problem without any human intervention in feeding information from the beginning to the end of the recognition process. In this research paper, an automatic shape recognition system is presented.

Manuscript received April 21, 2004. M. F. Omar is with Universiti Utara Malaysia, Perlis, Malaysia (jatz@yahoo.com). R. A. Salam is with Universiti Sains Malaysia, 11800 Penang, Malaysia (Tel: +60-4-6532486; Fax: +60-4-6573335; email: rosalna@cs.usm.my). R. Abdullah is with Universiti Sains Malaysia, 11800 Penang, Malaysia (email: rosn@cs.usm.my). N. A. Rashid is with Universiti Sains Malaysia, 11800 Penang, Malaysia (email: nuran@cs.usm.my).

Choosing the right features for object representation is another main issue in this area. Too much information can lead to a slow and inefficient system, whereas too little information can result in misclassification. Choosing the right features is therefore one of the main problems, especially in developing a robust system. The selection must be made carefully, particularly because, once information about an object is discarded, it normally cannot be restored later. This is more challenging when the selected features need to be used for different tasks. Another aim of feature selection is to reduce computational cost. Using all available features to recognize objects can suffer from the curse of dimensionality. Feature selection and extraction is the pre-processing step of image mining. A lot of information can be discarded if only the shape outline of an object is considered. The use of the shape outline is not a new idea, and it has shown significant results for recognition systems [2]-[7]. Others have used shape, color and texture [8], [9].

The first part of this research is based on our early vision system. The early vision system plays an important part in the earlier stage of perception. One of its functions at this stage is edge and bar detection. At this level it processes the visual information necessary for perception and then passes it to the higher-order sensory areas of the brain.
In relation to our early vision system, the shape outline was used as the feature for this recognition system. Eigenvector-based methods are dimensionality reduction schemes and have been investigated for the image mining process in this research. They were used to reduce the amount of data that needs to be processed, with the aim of producing an effective and robust automatic recognition system.

The mining problem addressed here is the grouping of features for different shapes. Grouping objects according to their shapes can provide meaningful categories. It provides a hierarchical model for the recognition and classification of objects that are defined purely through their shapes. The approach assumes no prior geometrical knowledge of the shape in terms of scale, rotation, location or particular features. Through statistical analysis of these readings, together with peak measures, a robust classification and recognition process is achieved. Tests showed that the suggested methods are able to recognize objects automatically through their shapes. Finally, experiments also demonstrate the system's invariance to rotation, translation, scale, reflection and a small degree of distortion.
II. MOTIVATION, METHODS AND ASSUMPTIONS

A. Vision System

The first area of visual processing is the retina of the eye. The retina not only collects light through the photoreceptors but also serves as a filter. Information from the retina is transmitted through the optic nerve to the lateral geniculate nucleus (LGN), where the information necessary for perception is processed. From the LGN, neurons carrying the visual information send it to the primary visual cortex. The primary visual cortex contains neurons which respond to various features of the image. These neurons respond most strongly to edges of a particular orientation [10]. This edge detection process arises through the connection from the LGN to the primary cortex.

Our study was inspired by the front-end visual system. At this stage the basic visual information, that is, the edges, is available for perception. The visual information is then carried for further processing to the extrastriate visual cortex, where recognition and motion processing happen. Zenon Pylyshyn [11] concluded that the output of the early vision system consists of shape representations involving at least surface layouts and occluding edges, where these are parsed into objects, and other details that allow parts to be looked up in a shape-indexed memory in order to identify known objects.

In the research conducted, the visual information at the front-end visual system, that is, the shape representation, namely the shape outline, is used as the input to the vision system. Using the visual information from the shape outline together with the knowledge already held about an object, the vision system is expected to recognize a given object if it has seen it before; otherwise it will start to learn new views of new objects.

B. Shape Outlines

The shape outline is the main feature extracted from the image, a choice based on the human vision system. An edge following technique was used for acquiring shape outline readings and storing them in a list format.
This technique is used assuming that there is no background information in the image. It is a new method for outline detection based on edge following. A new method was developed, instead of using an existing one, because the outline readings used in the prototype system need to be in an ordered, sequential list format. Another reason was the need for an automatic boundary detection method. This cannot be obtained from the Snakes [12] active contour model, even though it has been used in a number of vision applications: in Snakes, the initial point needs to be chosen externally. Brownian Strings [13] appears to be an automatic boundary detection method, but its use is limited to a number of applications and its output is not an ordered, sequential list of points taken at regular intervals.

The first stage is data acquisition, where two-dimensional images in raster format were used. Initially the starting point of an object in the image needs to be identified. Once the initial position was found, the edge was followed around the object until the initial point was reached again. The initial point of the outline is determined by firing a number of simulated range-finder sensors from random positions at the border of the display window, pointing towards its centre, until a point on the outline is encountered. This can be seen in Fig. 1. As soon as an object is encountered by at least two nearby simulated sensors, the pixel co-ordinates (x, y) are returned. Only one of these co-ordinate points is used as the initial point.

FIGURE 1 INITIAL POINT OF THE SHAPE OUTLINE

In the next stage, three virtual sensors were configured. Three sensors were used because the average of their readings can be used to reduce noise and produce better results. The three sensors are configured to be at least two pixels apart. They follow the object's outline, recording a list of (x, y) positions. The first thing the three sensors do is rotate until the next reading is obtained.
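As a rough illustration of the initial-point search described above, the following Python sketch fires a simulated sensor from a random border pixel towards the centre of a binary raster image and returns the first object pixel it meets. The names `cast_ray`, `border_points` and `find_initial_point`, the 0/1 image representation and the stepping rule are assumptions made for this example, not the authors' implementation. The confirmation by at least two nearby sensors is omitted for brevity.

```python
import random


def cast_ray(image, start, centre):
    """Step from `start` towards `centre`; return the first object pixel (x, y) or None.

    `image` is a list of rows holding 0 (background) or 1 (object), indexed image[y][x].
    """
    x0, y0 = start
    cx, cy = centre
    steps = max(abs(cx - x0), abs(cy - y0))
    for k in range(steps + 1):
        t = k / steps if steps else 0.0
        x = round(x0 + t * (cx - x0))
        y = round(y0 + t * (cy - y0))
        if image[y][x]:
            return (x, y)
    return None


def border_points(width, height):
    """All pixel positions on the border of the display window."""
    top = [(x, 0) for x in range(width)]
    bottom = [(x, height - 1) for x in range(width)]
    left = [(0, y) for y in range(1, height - 1)]
    right = [(width - 1, y) for y in range(1, height - 1)]
    return top + bottom + left + right


def find_initial_point(image, rng=None, max_tries=1000):
    """Fire sensors from random border positions until one of them hits the object."""
    rng = rng or random.Random(0)  # fixed seed only to keep the sketch repeatable
    height, width = len(image), len(image[0])
    centre = (width // 2, height // 2)
    border = border_points(width, height)
    for _ in range(max_tries):
        hit = cast_ray(image, rng.choice(border), centre)
        if hit is not None:
            return hit
    return None
```

For a filled square that covers the window centre, every ray fired from the border stops on the square's outline, and that pixel could then serve as the starting point for edge following.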
All three sensors must hit the shape for a reading to be valid. The rotation angles of all three sensors are recorded. The rotation angle can be computed as:

$$\bar{\theta} = \frac{1}{3}\sum_{i=1}^{3}\theta_i \tag{1}$$

where

$$\theta_i = \tan^{-1}\left(\frac{\Delta y}{\Delta x}\right), \qquad \Delta y = y_0 - y_1, \qquad \Delta x = x_0 - x_1$$

$(x_0, y_0)$ is the initial point, $i$ ranges from 1 to 3, and $\bar{\theta}$ is the average of the three angles. The above procedure is repeated until the initial point $(x_0, y_0)$ is reached again, at which point the process stops automatically. The ordered list of all the averaged rotation angles, that is, the $n$ measurements $\theta$, is constructed as:

$$\theta = [\bar{\theta}_1, \bar{\theta}_2, \ldots, \bar{\theta}_n] \tag{2}$$

with $d\theta$ the difference from one angle to the next one. The list of differential angles $d\theta$ forms the pre-processing stage of the image mining process. These data go through a process of dimensionality reduction and normalization before being used for training and testing in the recognition stage.

C. Normalization and Dimensionality Reduction

Outline readings went through a transformation which involved normalization and dimensionality reduction. This transformation used eigenvector-based methods, which can reduce the computational burden of pattern recognition algorithms and of the image mining process. To increase the statistical significance of the samples used, random noise was added to each outline reading, creating new equivalent views of the same object.

The list of $d\theta$ described earlier, computed during the feature extraction process, is further filtered by calculating the average of every three readings. Each current value is replaced by this average. The reason is that outline readings are sometimes affected by noise, and averaging reduces the error that the noise creates. The new list of $d\theta$ is computed as:

$$d\theta_i = \frac{1}{3}\sum_{j=i-1}^{i+1} d\theta_j \tag{3}$$

where $i = 2, 3, \ldots, n-1$. A new set of $d\theta$ is thereby obtained. Let the list of $d\theta$ be transformed into a list of vectors $V$, where $V = \{v_1, v_2, \ldots, v_n\}$. A vector $v = (x, y)^T$ is defined as:

$$x = m\cos d\theta \tag{4}$$

$$y = m\sin d\theta \tag{5}$$

The new co-ordinates after the transformation can be constructed as follows:

$$C = E^T V \tag{6}$$

where $C = \{c_1, c_2, \ldots, c_n\}$ is the new set of co-ordinates after the transformation, $V = \{v_1, v_2, \ldots, v_n\}$ is the set of vectors computed from the rotation angles and $E = \{e_1, e_2, \ldots, e_n\}$ is the set of eigenvectors. The eigenvector $e_i$ is computed as:

$$e_i = \begin{pmatrix} (x_0 + i)\,k_x \\ \dfrac{d\theta_i}{d\theta_{max}}\,k_y \end{pmatrix} \tag{7}$$

where $x_0$ is 0 and $d\theta_{max}$ is the largest absolute value in the list. $k_x$ and $k_y$ are arbitrary constant factors along the x and y axes.
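The pre-processing of the angle list described in this subsection can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function names are hypothetical, angles are plain numbers in degrees, the averaging uses the unsmoothed neighbouring values, the first and last readings are left unchanged, and angle wrap-around is not handled.

```python
def differential_angles(thetas):
    """Return the list of differences from one averaged angle to the next."""
    return [b - a for a, b in zip(thetas, thetas[1:])]


def smooth_three_point(d_theta):
    """Replace each interior reading by the mean of itself and its two neighbours.

    Corresponds to averaging every three readings for i = 2 .. n-1 (1-based);
    the first and last readings are kept as they are (an assumption).
    """
    n = len(d_theta)
    smoothed = list(d_theta)
    for i in range(1, n - 1):
        smoothed[i] = (d_theta[i - 1] + d_theta[i] + d_theta[i + 1]) / 3.0
    return smoothed
```

For example, an outline that runs straight, turns through one sharp 90 degree corner and runs straight again gives a single non-zero differential angle, which the averaging spreads over three neighbouring readings of 30 degrees each.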
These constants are determined experimentally and play a very important role in determining the new co-ordinates. The value chosen for $k_y$ was 50 and the value chosen for $k_x$ was 120. Normalization is carried out on $C$, where three of the values are added up and the average obtained. The new set of readings after normalization is $Z$, that is, $Z = \{z_1, z_2, \ldots, z_n\}$, and represents the new set of vectors. This new data set is the data produced by the mining process. The data then go through the next stage, the shape categorization process. Data that have been mined can be visualized in graphical format. An example of the graphical representation of a rectangular shape can be seen in Fig. 2. Peaks in the graph correspond to changes in shape, such as sharp corners.

FIGURE 2 A RECTANGLE INITIAL AND ITS GRAPHICAL DATA REPRESENTATION OF THE SHAPE

D. Pattern Recognition

Pattern matching during the classification stage represented another major task in this research. Since there is no prior information about any new image, any new data have to be trained and saved in the database. This is done using unsupervised classification. The first set of data goes through the statistical process and is trained and saved in a database. The following sets of data undergo the same process and are also saved in the database. The research concentrates on shape recognition, and each different shape has its own representation. Similar shapes are put in the same category and grouped accordingly. This is similar to how the human brain works, where there is a shape-indexed memory [10]. Data obtained from the earlier stage were subjected to
statistical analysis, through the use of the z-scores method for the classification of each point in the list. Matching was accomplished together with the peak and distance measures for more accurate results. The list of points of each signature is assumed to be normally distributed [14]:

$$f(y) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{y-\mu}{\sigma}\right)^2} \tag{8}$$

where $y$ can assume all values from $-\infty$ to $+\infty$ and the parameters $\mu$ and $\sigma$ represent respectively the mean and the standard deviation of the distribution. Since it is a continuous probability density function, the probability that a point $y$ lies between two specified values $a$ and $b$ of a point in the database is given by an integration:

$$\Pr(a \le y \le b) = \frac{1}{\sigma\sqrt{2\pi}} \int_a^b e^{-\frac{1}{2}\left(\frac{y-\mu}{\sigma}\right)^2}\, dy \tag{9}$$

The above equation can be simplified [14] by carrying out the transformation:

$$z = \frac{y-\mu}{\sigma} \tag{10}$$

where $z$ is the z-score of observation $y$ and is the answer to "how many standard deviations away from the mean is the observation?". The greater the number of standard deviations between the observation and the mean, the less likely it is to have occurred by random chance. The most suitable value for $z$ was determined based on the results of the experiments. The value of $z$ was required to lie between -1.96 and 1.96, so that 5 percent of the distribution falls outside this range (2.5 percent on each side). If $z$ lies outside this range, the point is rejected and does not belong to the list stored in the database. At least 120 comparisons with the stored lists were made for each of the test signatures. This is repeated for the other lists stored in the database. The result may be that all points belong to a list stored in the database, or not. If 85 percent of the points belong to the list (also called the confidence interval), that is, at least 100 points are correctly matched with values stored in the database, or equivalently no more than 15 percent are errors that do not belong or are misclassified, then we conclude that the test object is the same as the stored object. If the correspondence is less than 85 percent, the object does not belong to that particular set of lists. If the result is 85 percent or higher, further tests are carried out to determine whether the object is a complex object or a simple-shaped object with straight lines. If the latter is the case, the number of peaks is taken into consideration. The number of peaks can roughly determine the type of shape. As an example, a square or rectangle will have four peaks, while for a circle almost all readings are peaks. Complex objects can have any number of peaks. Straight lines result in the value of $y$ becoming zero. The distances between peaks also provide the internal relationship of a particular shape.

III. EXPERIMENTAL RESULTS

Experiments were conducted to test the shape outline reading on a set of objects. Raster images were used, with sizes between 300 x 300 pixels and 400 x 400 pixels. Shapes vary from simple to complex objects. Fig. 3 and Fig. 4 show some examples of the objects used. Each shape was replicated into 100 images by adding noise before the training process began. This allows for a more flexible and robust shape recognition system.

FIGURE 3 EXAMPLE OF SIMPLE SHAPES USED IN THE TRAINING PROCESS

FIGURE 4 EXAMPLE OF COMPLEX SHAPES USED IN THE TRAINING PROCESS

Figs. 5-8 show the graphical representation of different shapes. It can be seen that the peaks correspond to the sharp corners of each shape. Straight lines will produce zero
readings.

FIGURE 5 GRAPHICAL REPRESENTATION OF THE SHAPE TRIANGLE

FIGURE 6 GRAPHICAL REPRESENTATION OF THE SHAPE MOON

FIGURE 7 GRAPHICAL REPRESENTATION OF THE SHAPE CIRCLE

FIGURE 8 GRAPHICAL REPRESENTATION OF THE SHAPE STAR

The first set of experiments tested the extraction of the shape outline from an object. Further experiments were conducted to investigate whether the system is invariant to rotation, translation, size and reflection, and to a certain degree of distortion. Fig. 9 and Fig. 10 show the results for rotated objects and for objects of different sizes respectively.

Mirror effect, or reflection, is another important aspect of object recognition. Readings from the shape outline are stored in a list, so the mirror effect can be obtained easily by using the reversed list. Each view of an object was tested using the following list:

$$view = [y_{120}, y_{119}, \ldots, y_1] \tag{11}$$

An object will not be classified as the same object when it is reflected; with the use of the reversed list, however, the object is easily classified. Without it, that is, using only the non-reversed list, the reflected object will be treated as a totally new object.

Results obtained from the test for 15 objects rotated by 30 degrees show that the method used is invariant to rotation. This can be seen in Fig. 9. The accuracy levels for all objects are above 95%. Translation of objects is similar to rotation: because the differences in angles are used, there is no problem in identifying a translated object. Results for 15 different shapes that were translated in the x, y and xy directions are shown in Table 1.

TABLE 1 RESULTS ON SHAPES TRANSLATED IN XY DIRECTION

FIGURE 9 TEST RESULTS ON SHAPE ROTATED BY 30 DEGREES

Experiments conducted to test invariance to size show that the accuracy for the 15 objects was above 95%. This can be seen in Fig. 10.
FIGURE 10 TEST RESULTS ON SHAPES INCREASED BY 10 PERCENT AND DECREASED BY 20 PERCENT

Experiments were carried out on translation and on a small degree of distortion. Objects were translated in the x, y and xy directions to test the accuracy level for translated objects. The accuracy levels are all above 95% for the 15 objects. Objects were distorted by applying the distortion facility provided by Corel Draw, using the displacement map. This was done by altering the displacement map by 20% horizontally and vertically. Results for these tests were above the accuracy level. Further tests were conducted by distorting all of the objects further. Results show that the accuracy level was achieved up to a maximum of 30% distortion. When the distortion on the displacement map was increased above 30%, the method failed to classify these objects.

Further experiments were conducted on new objects, to further test the system on shape categorization. Different shapes are classified differently. Examples of five test shapes that were used can be seen in Fig. 11. Results for these five test shapes are shown in Table 2. If a new object does not belong to any existing group of shapes, a new group is created automatically and the new shape is stored in this new category. Results showed that recognition based only on matching points was not accurate. Peaks and the distances between peaks are essential for deciding whether a shape should be categorized in the same group or not.

TABLE 2 RESULTS OF THE NEW SHAPES TOWARDS THE TRAINED SHAPES

FIGURE 11 EXAMPLE OF FIVE TESTS OBJECTS

Simple objects normally have more straight lines than complex objects. When there are straight lines or curves, the system can easily tell two different objects apart. Straight lines have zero angle difference, and it is therefore much easier to identify objects with the same shape.
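The role of peaks and peak distances can be illustrated with a small Python sketch. It is only a sketch under assumptions: the source does not give a formal definition of a peak, so `find_peaks` here treats any reading that is non-zero and at least as large in magnitude as both neighbours as a peak, and the outline is treated as closed. The function names, the threshold and the idealized 40-reading signatures are hypothetical.

```python
def find_peaks(signature, threshold=1e-9):
    """Indices of readings whose magnitude exceeds `threshold` and is a local maximum.

    The signature is treated as cyclic because the outline is a closed curve.
    """
    n = len(signature)
    peaks = []
    for i, value in enumerate(signature):
        magnitude = abs(value)
        if magnitude <= threshold:
            continue  # straight segments give zero readings
        left = abs(signature[i - 1])
        right = abs(signature[(i + 1) % n])
        if magnitude >= left and magnitude >= right:
            peaks.append(i)
    return peaks


def peak_distances(peaks, length):
    """Number of readings from each peak to the next one around the closed outline."""
    distances = []
    for k, p in enumerate(peaks):
        nxt = peaks[(k + 1) % len(peaks)]
        gap = (nxt - p) % length
        distances.append(gap if gap else length)
    return distances


# Idealized signatures: zero along straight sides, 90 at each corner.
square = [0.0] * 40
for corner in (0, 10, 20, 30):
    square[corner] = 90.0

rectangle = [0.0] * 40
for corner in (0, 5, 20, 25):
    rectangle[corner] = 90.0
```

With these inputs both shapes yield four peaks, which would place them in the same category, while the peak distances (equal gaps for the square, alternating short and long gaps for the rectangle) keep them apart as different objects.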
The matching and recognition process becomes harder when there are too many straight lines or curves in an object. The system will try to classify two objects as the same even when they are different, such as a rectangle and a square. Using only the data from the earlier stage will classify them as the same object. However, grouping the data obtained with the number of peaks and the distance from one peak to another can solve this recognition problem. The distance between peaks shows how close one sharp corner is to the next. As an example, the distances between the peaks of a rectangle differ from those of a square. These two objects will therefore be classified differently, although they will be put in the same category since both have four peaks. The rectangle in Fig. 2, compared to the rectangle that is shape 5 in Fig. 11, will not be classified as the same object, since the peak match is only 50%. However, they will be put under the same category because of the number of peaks that they have. Every time a new object is introduced, the system automatically calculates the shape outline, the number of peaks and the distance measure. In most cases the system managed to identify the object when it closely matched existing objects.

V. CONCLUSIONS

The methods used in this system have shown the capability for automatic recognition and categorization of shapes. Images were put through the pre-processing stage of image mining, and the data produced were grouped by shape. Results can be visualized in graphical format, and different shapes can be seen clearly from this graphical representation. Results, which were stored in lists, went through a statistical method using z-scores and peak measures, and were
used to classify and recognize simple and complex objects. Experiments showed that the system is invariant to rotation, translation, size and mirror effect, and to a certain degree of distortion. Peak measures are essential for recognizing objects of different shapes, as the shape outline by itself is insufficient. A larger number of peaks or higher readings occur where there is a significant change in shape, such as a sharp corner or a curve. Different shapes are grouped accordingly. The system is capable of grouping new objects, that is, objects that do not belong to any existing category, by putting them into a new category. This is achieved without any human interference.

The research was carried out to test the capability of producing an automatic shape recognition system by mining relevant image features. The experiments and results showed that the method is capable of producing a generic automatic shape recognition system that is invariant to rotation, translation and size, and to a certain degree of distortion. The method can be extended to three-dimensional objects, which is currently under investigation. Color, depth and texture can be grouped together to form a set of new features. Selecting and grouping these data can be another part of the data mining process. The current method will be tested on a much larger set of images. The current system is limited in classifying and recognizing objects with greater distortion. This will be looked into in the next project. As for comparison with other methods such as neural networks, the next stage of the research could carry out a real comparison using the same data for both methods. Another possibility is the combination of both methods, which would be a very useful area of investigation.

REFERENCES

[1] J. Zhang, W. Hsu, and M. L. Lee, An Information-driven Framework for Image Mining, in Proceedings of the 12th International Conference on Database and Expert Systems Applications (DEXA), Munich, Germany, 2001.
[2] I. Biederman and G. Ju, Surface vs. Edge-based Determinants of Visual Recognition. Cognitive Psychology, 20, 38-64, 1988.
[3] W. G. Hayward, Effects of Outline Shape in Object Recognition. Journal of Experimental Psychology: Human Perception and Performance, 24(2), 427-440, 1988.
[4] I. Taylor and M. M. Taylor, The Psychology of Reading. London and New York: Academic Press, 1983.
[5] I. Rock, F. Halper, and T. Clayton, The Perception and Recognition of Complex Figures. Cognitive Psychology, 3, 655-673, 1972.
[6] R. N. Haber and R. Haber, Visual Components of the Reading Process. Visible Language, 15, 147-182, 1981.
[7] R. G. Crowder, The Psychology of Reading. Oxford University Press, 1982.
[8] A. Jain and A. Vailaya, Image Retrieval using Color and Shape. Pattern Recognition, 29(8), 1233-1244, 1996.
[9] W. Ma, Y. Deng, and B. S. Manjunath, Tools for Texture/Color Based Search of Images, SPIE International Conference, Human Vision and Electronic Imaging, 497-507, 1997.
[10] K. Schulten, The Development of the Primary Visual Cortex. Theoretical Biophysics Group, Beckman Institute, University of Illinois, USA. Available: http://www.ks.uuc.edu/research/neural/development.html (16th September 2002).
[11] Z. Pylyshyn, Is Vision Continuous with Cognition? The Case for Cognitive Impenetrability of Visual Perception. Technical Report TR-38, 1998, Rutgers Center for Cognitive Science, Rutgers University, New Brunswick, NJ. Available: http://ruccs.rutgers.edu/publcatonsreports.html
[12] M. Kass, A. Witkin, and D. Terzopoulos, Snakes: Active Contour Models. International Journal of Computer Vision, 321-331, 1988.
[13] R. P. Grzeszczuk and D. N. Levin, Brownian Strings: Segmenting Images with Stochastically Deformable Contours. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19, 1100-1114, 1997.
[14] H. Mulholland and C. R. Jones, Fundamentals of Statistics. London: Butterworths, 1968.

Rosalina Abdul Salam was born in Penang, Malaysia on 13th of January 1968. She received her Bachelor's degree in Computer Science in 1992 from Leeds Metropolitan University, United Kingdom. She received a scholarship to pursue her Master's degree and PhD. She received her Master's degree in Software Engineering from Sheffield University, United Kingdom in 1997. She completed her PhD in 2001 at Hull University in the area of artificial intelligence and image processing. She was a system analyst at Intel Penang from 1992 to 1995. She is currently a lecturer in the School of Computer Sciences, Universiti Sains Malaysia and a member of the Artificial Intelligence Research Group. She has published more than 25 papers in journals and conferences. Her current research is in the area of artificial intelligence, image processing and bioinformatics applications.

Marcos Aurelio Rodrigues received his BEng in Mechanical Engineering from the Federal University of Santa Catarina (Brazil) in 1983. He was awarded an MSc in Computer Science in 1989 and a PhD in Computer Science in 1991, both from the University of Wales, Aberystwyth. He was appointed a Reader in Intelligent Systems within the School of Computing and Management Sciences at Sheffield Hallam University in January 2000 and awarded a Personal Chair in Computer Science in February 2003. He has published over 100 technical papers in international journals and conferences on the subjects of robotics, computer vision, pattern recognition, systems modelling and artificial intelligence. His main current research interests include 2D and 3D machine vision, machine learning, and pattern recognition.