Content-Based Bird Retrieval using Shape context, Color moments and Bag of Features

www.ijcsi.org 101

Bahri Abdelkhalak 1 and Hamid Zouaki 2

1 Faculty of Sciences, University Chouaib Doukkali, Equipe: Modélisation mathématique et informatique décisionnelle, El Jadida, Morocco
2 Faculty of Sciences, University Chouaib Doukkali, Equipe: Modélisation mathématique et informatique décisionnelle, El Jadida, Morocco

Abstract

In this paper we propose a new descriptor for bird retrieval. Our work first addressed the choice of a descriptor. This choice is usually driven by application requirements such as robustness to noise, stability with respect to bias, invariance to geometrical transformations, or tolerance to occlusions. In this context, we introduce a descriptor that combines shape and color descriptors to obtain an effective description of birds. The proposed descriptor is an adaptation of the contour-based descriptor defined by Belongie et al. [5], combined with color moments [19]. Specifically, points of interest are extracted from each image, and the information in the region around each of these points is represented by a shape context descriptor concatenated with color moments. The bag of visual words approach is then applied to these combined descriptors. The experimental results show the effectiveness of our descriptor for content-based bird retrieval.

Keywords: Shape context, Interest point, Color moments, Visual word, Bird.

1. Introduction

Image search in general, and content-based image retrieval (CBIR) in particular, are research fields of information management in which a large number of methods have been proposed and studied, but generally satisfactory solutions do not yet exist. The need for adequate solutions is growing because of the increasing number of images produced digitally in various fields, which requires new ways to access them. In this paper we discuss a special case of content-based image search: the content-based retrieval of birds. This type of image was chosen because of the diversity and the large number of bird species in the world.
For effective retrieval, the quality of the description is an important criterion. In this paper we combine shape and color descriptors. The choice of these descriptors is motivated by the nature of the images being searched: two birds may have the same shape and yet be completely different in color. For this reason we propose to combine color and shape to obtain an effective description. For the shape descriptor we use the shape context [5], a choice motivated by its effectiveness in several applications [16, 17]. For the color descriptor we use color moments [19], which have been combined with different descriptors to produce efficient descriptions [7, 18, 20].

To search for similar birds, a similarity measure between descriptors must be defined. However, comparing two birds is not always easy and sometimes requires invariance to certain transformations. In addition, the size of the descriptor is often large, which increases the complexity of searching for similar objects in a large collection. To address this, the descriptors are grouped into classes to construct a visual vocabulary [4], which results in a process of abstraction. Each class is considered a visual word, and the shape contexts [5] combined with the color moments belonging to the same class share similar information regardless of the image from which the point of interest was extracted. Finally, each bird is described by visual words and compared with the query bird.

Several methods have been proposed to extract characteristic features of birds. For example, Lin et al. [1] propose a shape ontology framework that integrates visual and domain information, applied to bird classification based on both domain and visual knowledge. In [2, 3], the authors use automated methods for bird species identification. This paper proposes a descriptor that integrates shape context [5] and color moments, applied to bird retrieval with the bag of features (BoF) approach [4].

This paper is organized as follows.
In Section 2, we present the shape context descriptor used to describe a bird. Our contribution to indexing birds is discussed in Section 3. We then present an evaluation of our approach in Section 4. Finally, in Section 5, we conclude and give perspectives for our work.

2. Background of shape context

The shape context of a contour point p of a shape is determined by the distribution of the contour points in the region around p [5]. It is a histogram of the coordinates of the contour points relative to p, which serves as the reference point. A shape is represented by the set of points sampled on its external and internal boundaries, $C = \{p_1, p_2, \dots, p_n\}$, $p_i \in \mathbb{R}^2$, where n is the number of contour points. For a point p, the relative coordinates of the n-1 other points are determined. These are the coordinates of each point in a log-polar coordinate system using p as the origin:

$$ q \mapsto (\log(r_q), \theta_q), \quad q \neq p, \; q \in C \tag{1} $$

where $r_q$ is the distance between p and q, and $\theta_q$ is the angle between the vector pq and the horizontal axis. The shape context $h_p$ of point p is defined as:

$$ h_p(l) = \#\{\, q \neq p : (q - p) \in \mathrm{bin}(l) \,\}, \quad l = 1, \dots, L \tag{2} $$

where $h_p(l)$ is the number of contour points belonging to the l-th bin of the histogram and $\mathrm{bin}(l) = \{ (r_q, \theta_q) : r_q \in [r_l, r_l + \Delta r_l] \text{ and } \theta_q \in [\theta_l, \theta_l + \Delta\theta_l] \}$. The shape of the object O is described as the set of shape contexts of its contour points:

$$ O = \{\, h_p : p \in C \,\} \tag{3} $$

However, the shape context described above is not invariant to rotation and scaling. For invariance to scaling, the radial distances are normalized by the average distance $\alpha$ over the $n^2$ pairs of points of the shape [5]. The authors also suggested using the tangent vector associated with each point, rather than the absolute horizontal axis, as the angular reference, which makes the shape context invariant to rotation.

A description method called CFPI [6] has been proposed in the literature. This method is an adaptation of the shape context descriptor [5] to graphic symbols. Specifically, points of interest are extracted from each symbol, and the information in the region around these points is represented by shape context descriptors. In our case, we extend the method proposed in [6] by adding the color information of each region, and then apply the BoF approach [4] to the resulting descriptors.
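The log-polar histogram of Eqs. (1) and (2), with the scale normalization by the mean pairwise distance, can be illustrated with a minimal sketch. The function name `shape_context`, the bin counts (5 radial, 12 angular), and the radius limits `r_inner` and `r_outer` are illustrative assumptions and are not taken from this paper; points falling outside the radius limits are clamped into the nearest radial bin, and the tangent-based rotation normalization is omitted.

```python
import math

def shape_context(points, ref_index, n_radial=5, n_angular=12,
                  r_inner=0.125, r_outer=2.0):
    """Log-polar histogram of the contour points around points[ref_index].

    Bin counts and radius limits are assumed values, not the paper's.
    """
    n = len(points)
    # Mean distance over all n^2 ordered pairs, used for scale normalization.
    alpha = sum(math.dist(a, b) for a in points for b in points) / (n * n)
    px, py = points[ref_index]
    log_lo, log_hi = math.log(r_inner), math.log(r_outer)
    hist = [0] * (n_radial * n_angular)
    for j, (qx, qy) in enumerate(points):
        if j == ref_index:
            continue  # Eq. (2) counts only the points q != p
        r = math.hypot(qx - px, qy - py) / alpha
        if r == 0.0:
            continue  # a duplicate of the reference point has no direction
        theta = math.atan2(qy - py, qx - px) % (2 * math.pi)
        # Radial bin on a logarithmic scale, clamped to the valid range.
        frac = (math.log(r) - log_lo) / (log_hi - log_lo)
        r_bin = min(max(int(frac * n_radial), 0), n_radial - 1)
        # Angular bin measured from the horizontal axis.
        a_bin = min(int(theta / (2 * math.pi) * n_angular), n_angular - 1)
        hist[r_bin * n_angular + a_bin] += 1
    return hist

# Usage: shape context of the first corner of a unit square.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
h = shape_context(square, 0)
```

Because the distances are divided by the mean pairwise distance, scaling the whole contour leaves the histogram unchanged, which is the scale invariance described above.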
The details of our approach are presented in the next section.

3. Proposed descriptor

According to [5], the distance between two shapes is measured as the sum of the costs of the symmetric best matches between shape contexts. This causes a complexity problem when searching for similar shapes (objects) among many candidates. To reduce the complexity of the matching computation between objects, we use the bag of visual words approach [4]. The details of our approach are as follows.

In the first step, we detect the interest points of the bird. The work described in [8, 9, 10, 11] showed that an object can be efficiently located from its interest points. Many methods have been proposed to detect interest points in an image [12, 13, 14]. We chose the DoG (Difference-of-Gaussians) detector presented in [8] for our experiments.

In the second step, we compute the shape context of each interest point. Each interest point is taken as the reference point for computing its corresponding shape context. So that the object is well represented regardless of its orientation and size, the descriptor should be invariant to rotation and scale; the relative coordinates of the contour points are therefore normalized. For an effective description of each region of interest, color moments are used to extract the color features from the region of 5x5 pixels around the interest point. Since most of the color information is concentrated in the low-order moments, only the first moment (mean) and the second moment (variance) are used as color features.

In the final step, the algorithm for building visual words [4] is applied to the set of shape contexts concatenated with the first and second color moments, producing a bag of visual words that includes both shape and color information.

4. Experimental results

We first introduce the bird dataset used in our experiments and the performance evaluation measures. We then present the experimental results.
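The descriptor construction of Section 3 (color moments over the 5x5 patch, concatenation with the shape context, and quantization into visual words) can be sketched as follows. This is a minimal illustration under stated assumptions: all function names are hypothetical, pixels are assumed to be (r, g, b) tuples, the shape and color parts are concatenated without any weighting (the paper does not specify one), and a plain k-means with random initialization stands in for the vocabulary-building algorithm of [4].

```python
import random

def color_moments(patch):
    """Mean and variance of each channel of a patch of (r, g, b) pixels."""
    n = len(patch)
    feats = []
    for c in range(3):
        values = [px[c] for px in patch]
        mean = sum(values) / n
        var = sum((v - mean) ** 2 for v in values) / n
        feats += [mean, var]
    return feats

def combined_descriptor(shape_ctx, patch):
    """Shape context concatenated with the color moments (no weighting, assumed)."""
    return list(shape_ctx) + color_moments(patch)

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(vec, centers):
    """Index of the closest center, i.e. the visual word of vec."""
    return min(range(len(centers)), key=lambda k: sq_dist(vec, centers[k]))

def kmeans(vectors, k, iters=20, seed=0):
    """Plain k-means; the cluster centers play the role of the visual vocabulary."""
    rng = random.Random(seed)
    centers = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest(v, centers)].append(v)
        for idx, group in enumerate(groups):
            if group:  # an empty cluster keeps its previous center
                centers[idx] = [sum(col) / len(group) for col in zip(*group)]
    return centers

def bag_of_words(descriptors, centers):
    """Normalized histogram of visual-word occurrences for one image."""
    hist = [0] * len(centers)
    for d in descriptors:
        hist[nearest(d, centers)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist
```

In this sketch, each image would be indexed by the histogram returned from `bag_of_words`, and two images would then be compared through their histograms rather than through pairwise descriptor matching, which is where the reduction in matching complexity comes from.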
4.1 The Birds Dataset and Performance Evaluation

For our experiments, we use the CUB-200 dataset [15]. This dataset contains 11788 images of 200 bird species. In our experiments we use 30 images for each of the 200 bird species.

4.2 Results

As a baseline, before indexing the birds with our method, we computed the set of shape contexts on the bird database. We used the k-means algorithm to define the visual vocabulary, and then indexed each bird according to this vocabulary. In a second training process, the shape context and the first and second color moments of each interest point are extracted from each bird in the database and concatenated. K-means clustering is then used to build another visual vocabulary. Finally, each bird in the database is indexed according to the new vocabulary.

To assess the performance of the proposed approach, precision and recall are used. The standard definitions of these two measures are given by the following equations:

$$ \text{Precision} = \frac{\text{Number of relevant images retrieved}}{\text{Number of images retrieved}} \tag{4} $$

$$ \text{Recall} = \frac{\text{Number of relevant images retrieved}}{\text{Total number of relevant images in the database}} \tag{5} $$

We first look at the qualitative results. Figure 2 shows some precision/recall curves.

Fig. 1. Sample of WANG Image Database
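Eqs. (4) and (5) can be computed for a single query as in the short sketch below. The function name `precision_recall` and the image identifiers are hypothetical; an image is assumed to be relevant when it shows the same species as the query.

```python
def precision_recall(retrieved, relevant):
    """Precision (Eq. 4) and recall (Eq. 5) for one query.

    retrieved: list of image identifiers returned by the search
    relevant:  identifiers of all relevant images in the database
    """
    relevant = set(relevant)
    hits = sum(1 for item in retrieved if item in relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Usage with made-up identifiers: 4 images retrieved, 3 relevant in the database.
p, r = precision_recall(["a", "b", "c", "d"], ["a", "c", "e"])
```

Evaluating these two values at increasing numbers of retrieved images gives the points of a precision/recall curve.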

Fig. 2. Precision vs. recall curves for our descriptor and the bag of shape contexts, for three vocabulary sizes.

Several parameters affect the quality of the search results. The first is the number of points sampled on the contour. The second is the size of the visual vocabulary. Table 1 summarizes some results of 10-KNN for different visual vocabulary sizes and different numbers of points sampled from the bird contour.

Table 1. Average precision by vocabulary size (k) and number of sampled points (n) on the bird contours.

Vocabulary size     k=200                          k=300                          k=400
Sampled points      Our descriptor  Shape context  Our descriptor  Shape context  Our descriptor  Shape context
n=200               0.7573          0.597          0.7655          0.674          0.77            0.699
n=300               0.8073          0.647          0.8155          0.724          0.82            0.749
n=400               0.8573          0.697          0.8655          0.774          0.87            0.799
n=500               0.9073          0.747          0.9155          0.824          0.92            0.849
n=600               0.911           0.797          0.921           0.874          0.953           0.899
n=700               0.922           0.847          0.98            0.924          0.964           0.949
Average precision   0.8603          0.722          0.8771          0.799          0.8828          0.824

5. Conclusion and future work

In this paper we have developed and tested a method for image retrieval based on color and shape descriptors. The proposed descriptor exploits the shape context and the color moments, together with the bag of visual words approach, to describe birds. The experimental results are promising. We plan to extend this work to avoid problems that occur in most existing methods, such as slow training (e.g., in k-means). We also aim to avoid setting the size of the visual vocabulary manually by exploiting learning techniques.

References

[1] H. Lin, H. Lin, and W. Chen, Study on recognition of bird species in Minjiang river estuary wetland, Procedia Environmental Sciences, vol. 10, Part C, pp. 2478-2483, 2011.
[2] R. Bardeli, D. Wolff, F. Kurth, M. Koch, K.-H. Tauchert, and K.-H. Frommolt, Detecting bird songs in a complex acoustic environment and application to bioacoustic monitoring, Pattern Recognition Letters, vol. 31, no. 12, pp. 1524-1534, 2010.
[3] T. S. Brandes, Automated sound recording and analysis techniques for bird surveys and conservation, Bird Conservation International, vol. 18, pp. 163-173, 2008.
[4] J. Sivic and A. Zisserman, Video Google: A text retrieval approach to object matching in videos, in Proc. ICCV, vol. 2, pp. 1470-1477, 2003.
[5] S. Belongie, J. Malik, and J. Puzicha, Shape Matching and Object Recognition Using Shape Contexts, PAMI, vol. 24, no. 4, pp. 509-522, 2002.
[6] O. Nguyen, S. Tabbone, and A. Boucher, Une approche de localisation de symboles non-segmentés dans des documents graphiques, Traitement du signal, 26(5), 2009/2010.
[7] S. Mangijao Singh and K. Hemachandran, Content-Based Image Retrieval using Color Moment and Gabor Texture Feature, IJCSI International Journal of Computer Science Issues, vol. 9, issue 5, no. 1, September 2012, ISSN (Online): 1694-0814.

[8] D. G. Lowe, Distinctive image features from scale-invariant keypoints, IJCV, vol. 60, no. 2, pp. 91-110, 2004.
[9] J. Sivic and A. Zisserman, Video Google: Efficient Visual Search of Videos, in Toward Category-Level Object Recognition, vol. 4170/2006, pp. 127-144, Springer Berlin / Heidelberg, 2006.
[10] S. Agarwal, A. Awan, and D. Roth, Learning to detect objects in images via a sparse, part-based representation, PAMI, vol. 26, no. 11, pp. 1475-1490, 2004.
[11] A. Bosch, A. Zisserman, and X. Munoz, Scene Classification via pLSA, in Computer Vision, ECCV 2006, vol. 3954/2006, pp. 517-530, Springer Berlin / Heidelberg, May 2006.
[12] C. Schmid, R. Mohr, and C. Bauckhage, Comparing and evaluating interest points, ICCV, Bombay, India, pp. 230-235, 1998.
[13] K. Mikolajczyk and C. Schmid, A performance evaluation of local descriptors, PAMI, vol. 27, no. 10, pp. 1615-1630, 2005.
[14] S. Tabbone, L. Alonso, and D. Ziou, Behavior of the Laplacian of Gaussian Extrema, Journal of Mathematical Imaging and Vision, vol. 23, no. 1, pp. 107-128, 2005.
[15] C. Wah, S. Branson, P. Welinder, P. Perona, and S. Belongie, The Caltech-UCSD Birds-200-2011 Dataset, Technical Report CNS-TR-2011-001, California Inst. Tech., 2011.
[16] M. Rusiñol and J. Lladós, Efficient Logo Retrieval Through Hashing Shape Context Descriptors, DAS '10, June 9-11, 2010, Boston, MA, USA.
[17] S. Marinai, B. Miotti, and G. Soda, Mathematical symbol indexing, in Proc. Int'l Conf. of the Italian Association for Artificial Intelligence, Berlin, Heidelberg: Springer-Verlag, 2009, pp. 102-111.
[18] K. Velmurugan and Lt. Dr. S. Santhosh Baboo, Content-Based Image Retrieval using SURF and Color Moments, Global Journal of Computer Science and Technology, vol. 11, issue 10, version 1.0, May 2011.
[19] M. A. Stricker and M. Orengo, Similarity of color images, in SPIE, Storage and Retrieval for Image and Video Databases, pp. 381-392, 1995.
[20] A. Bahri and H. Zouaki, A SURF-Color Moments for Images Retrieval Based on Bag-of-Features, European Journal of Computer Science and Information Technology, vol. 1, no. 1, pp. 11-22, June 2013.