A Refined Hybrid Image Retrieval System using Text and Color

www.ijcsi.org 48

Nidhi Goel 1 and Priti Sehgal 2
1 Ph.D. Research Scholar, University of Delhi, New Delhi, India
2 Associate Professor, Department of Computer Science, Keshav Mahavidyalaya, University of Delhi, Pitampura, New Delhi - 114, India.

Abstract

Image retrieval (IR) continues to be one of the most exciting and fastest growing research areas due to significant progress in data storage and image acquisition techniques. Broadly, image retrieval can be text based or content based. Text-based Image Retrieval (TBIR) is proficient in named-entity queries (e.g. searching images of the Taj Mahal). Content Based Image Retrieval (CBIR) shows its proficiency in querying by visual content. Both techniques have their own advantages and disadvantages, and neither has been very successful in uncovering the hidden meanings/semantics of the image. In this paper, we propose a hybrid approach that improves the quality of image retrieval and overcomes the limitations of the individual approaches. For text retrieval, term frequency-inverse document frequency (tf-idf) weightings and cosine similarity are used, whereas for content matching the search space is narrowed down using color moments; the two results are then combined to show better results than the individual approaches. Further refinement using the color histogram technique improves the performance of the system significantly.

Keywords: CBIR, TBIR, Color moments, Color Histogram

1. Introduction and Related Work

An image retrieval system is a computer system for browsing, searching and retrieving images from a large database of digital images [1, 5]. Progress in various domains such as video, biometrics, medical applications, surveillance, GIS, remote sensing and journalism has resulted in the creation of large image datasets, owing to advances in data storage and acquisition techniques [4]. As processing has become increasingly powerful and memory cheaper, deploying large image datasets for various applications has become relatively easier and more efficient.
In this scenario, it is necessary to develop an appropriate retrieval system to manage such large collections of images effectively and efficiently. Over the past two decades image retrieval has evolved from text based retrieval (1980s) to content based retrieval (1990s) to fuzzy image retrieval (4) []. In TBIR systems, images are retrieved from the database based on the text annotations (or metadata) associated with them [6]. Apache Lucene is among the early systems used for TBIR and relies on full-text search [7]. Chang et al. [8] have used an approach in which they first annotate the image with text and then use a text-based database management system for IR. Research in areas such as data modeling, multidimensional indexing, and query evaluation has generated many new systems. One representative system is QPE [9], in which queries are specified only in terms of the defined vocabulary of the database. The user selects from those predefined queries and the corresponding results are displayed. PICQUERY [10] is another text-based IR system, which provides a high-level query language for specifying the textual query for a pictorial database management system (PDBMS).

TBIR is beneficial for two reasons. First, it is user-friendly, as users can easily compose queries in their natural language. Second, it provides better results and is useful in applications where more semantic relationships are involved, thus helping to bridge the semantic gap [11]. However, TBIR depends on manual annotation, which is a laborious and time-consuming task [1]. Text annotations are also user-dependent: the context in which an image is annotated differs from user to user. For example, given an image of a red car, UserA may annotate it as car.jpg, UserB as redcar.jpg, and UserC may name it DSC_1.jpg. This may lead to inconsistency [1]. Another disadvantage of TBIR is that it provides hit-or-miss searching of images: an image is retrieved only if the keyword provided matches the annotated text; otherwise the desired image is unreachable.
On the other hand, if an image is erroneously annotated, it may result in garbage data. A method was therefore required to produce more accurate results, and CBIR systems came into existence. CBIR is an IR technique that uses visual content (color, shape, texture and spatial layout) [, 4] to search images from large image databases. Xu et al. [14] state that image retrieval based on similarity matching of features extracted from the rich content of an image performs better than a text query for the same image. It has been observed that most CBIR approaches are based on color. However, the selection of visual content, and the criteria for choosing it, change with the application. The test results indicate that the color histogram performs well compared to other descriptors when images have a mostly uniform color distribution. In most image categories, color moments also show better performance [4]. A CBIR system for iris matching uses color, texture and shape as feature descriptors [15]. Another system, Sketch4Match, uses color as the feature descriptor for matching a sketch with the corresponding image [16].

CBIR gives better results where the application involves more visual content than semantic content. For example, CBIR is suitable for queries in medical diagnosis that involve comparing an X-ray picture with previous cases. But suppose a query demands retrieval of images of cancerous tissues/cells, or of "a man holding a shooting camera": in such examples text outweighs visual content [17] and it is not clear what kind of image should be used as the query. Here CBIR fails because visual features cannot fully represent the semantic concepts.

As discussed above, both TBIR and CBIR have their own characteristics, advantages and disadvantages. Low-level visual features of an image represent the more detailed perceptual aspects, while text addresses the high-level semantics underlying the more general conceptual aspects of an image. Researchers have made efforts to combine these two approaches to provide satisfactory results [18]. In 1999, important research work was done on content-based retrieval inspired by text retrieval [19]. Abbas et al. [20] suggest that a combination of text and content could improve the performance of search systems, benefiting both approaches. Their work is based on the idea that if the ability to examine the image content does not exist, then search depends on metadata such as captions/keywords. It says that TBIR is as fast as CBIR. An effort has also been made to combine content and semantics in the medical domain [21].
That paper proposes a scheme to combine CBIR and semantics using grid computing. The MedImGrid used in the proposed CSBIR extracts semantic context information and uses it as a clue to improve efficiency. The shape, color and texture features of every medical image for thorax CR based CSBIR are extracted, and the Euclidean distance metric is used for similarity matching of the visual content of the image. A recent study argues that content and context together are required to bridge the gap, rather than content vs. context [11]. It gives four reasons to support this and presents approaches that appropriately combine content with context to bridge the semantic gap. The authors suggest a new direction, based on philosophy, cognitive science and modern search engines, that can make it easier to bridge the semantic gap. C. Hartvedt [17] discusses how combining existing techniques may help improve the understanding of user intentions in IR. The underlying hypothesis is that such an approach will make it easier for an IR system to understand the user's intention behind a search. Yet finding the correlation between low-level features and high-level concepts, so as to bridge the gap between visual features and semantic content, remains a major challenge in this research field.

In this paper, an approach is proposed to combine these two kinds of features. We also show that the combined approach produces better results than the individual approaches. The proposed system further filters the results based on visual content before displaying the final results, which substantially decreases the number of non-relevant images retrieved. The rest of the paper is organized as follows: Section 2 discusses the architecture of the proposed system in detail, Section 3 evaluates and compares the performance of the combined system with the individual systems, and Section 4 concludes the paper.

2. Proposed Algorithm

The proposed IR system uses both the visual content and the text (metadata) associated with the image. The system has a user interface where the user can enter the query in the form of text and/or a query image.
Text and image feature descriptor vectors are generated for the query text and query image as well as for the images in the database. The similarity matching algorithm is executed on the descriptor vectors, and two independent lists of score vectors are generated, one for the query text and one for the query image. These two lists are then combined to give the user a combined score list of images. The default weight given to each method is 0.5 (50%). The relevant set of images is retrieved based on the threshold value set, and the images are arranged in decreasing order of relevance. The results obtained are filtered again using the color histogram to remove unwanted images. Fig. 1 shows the flowchart for the proposed system.

2.1 Pre-processing the text descriptions

Documents (images) are read one by one from the database collection, and metadata such as title, description, notes and location is extracted from them. Pre-processing functions for tokenization and stopword removal are applied to the extracted metadata. The words left after filtering out markers that are not part of the text, punctuation tokens, numbers, stopwords, etc. are stemmed using the Porter Stemmer algorithm [22]. The resulting terms are referred to as index terms or tokens. These tokens are then used for indexing the documents. We call the set of these tokens the vocabulary. The same pre-processing is applied to the text of the query given by the user.
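The pre-processing steps described above can be sketched as follows. This is only an illustrative sketch: the stopword list is a small hypothetical one (the paper does not give its list), the function name `preprocess` is invented for the example, and the Porter stemming step is left out to keep the block free of third-party dependencies.

```python
import re

# Assumption: a tiny illustrative stopword list, not the one used in the paper.
STOPWORDS = {"a", "an", "and", "in", "is", "of", "on", "the", "to", "with"}


def preprocess(text):
    """Tokenize metadata text and drop punctuation, numbers and stopwords.

    Stemming (Porter's algorithm in the paper) is intentionally omitted here.
    """
    # Keeping only alphabetic runs removes punctuation tokens and numbers.
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


tokens = preprocess("Map of the city, drawn in 1850 with a red border.")
print(tokens)  # ['map', 'city', 'drawn', 'red', 'border']
```

The same function would be applied to the query text, so that query tokens and index tokens come from the same vocabulary.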

[Fig. 1 Flowchart for the proposed system. Text branch: extract metadata from the XML files of all database images; tokenization and stopword removal; stemming using Porter's algorithm; assign weights using tf-idf weighting; text feature vector; text matching using cosine similarity; resultant text vector. Image branch (database image and query image): extract color feature; calculate color moments; image feature vector; image color matching using distance between color moments; resultant image vector. The two resultant vectors are combined, the images are displayed, and the results are filtered using the color histogram to give the final refined result.]

Indexing the text descriptions

An inverted index is built with an entry for each token in the vocabulary obtained in the previous step. The data structure used for this purpose is a hash table. An inverted index is used for fast access to the data [23]. After the inverted index is created, weights are assigned to the index terms/tokens. The term frequency-inverse document frequency (tf-idf) weight of a term is the product of its term frequency (tf) weight and its inverse document frequency (idf) weight. The term frequency tf(t,d) of term t in document d is defined as the number of times t occurs in d, and idf is a measure of whether the term is common or rare across all documents. The weights assigned to the tokens are calculated using Eq. (1):

$$ w_{t,d} = \log(1 + tf_{t,d}) \times \log_{10}(N / df_t) \tag{1} $$

where N is the total number of documents, $tf_{t,d}$ is the term frequency, and $df_t$ is the document frequency of term t, i.e. the number of documents that contain t.

Text retrieval and ranking

The inverted index created during indexing is used to find the limited set of documents that contain at least one of the query words. The documents and the query are represented as vectors. The key idea behind this representation is that documents can easily be ranked according to their proximity (similarity of vectors) to the query in the vector space. We use angles rather than distances to measure similarity between vectors, because the Euclidean distance between two documents can be quite large for vectors of different lengths, whereas the angle between two maximally similar documents is 0. Hence the cosine similarity scores between the query and each document are calculated using Eq. (2) [4]:

$$ \cos(\vec{q}, \vec{d}) = \frac{\vec{q} \cdot \vec{d}}{|\vec{q}|\,|\vec{d}|} = \frac{\sum_{i=1}^{|V|} q_i d_i}{\sqrt{\sum_{i=1}^{|V|} q_i^2}\,\sqrt{\sum_{i=1}^{|V|} d_i^2}} \tag{2} $$

where $q_i$ is the tf-idf weight of term i in the query, $d_i$ is the tf-idf weight of term i in the document, and cos(q,d) is the cosine similarity of q and d or, equivalently, the cosine of the angle between q and d. The documents are ranked in decreasing order of similarity score, since cosine is a monotonically decreasing function of the angle between 0° and 180°. A value of 1 represents a perfect match and 0 means the worst match.

Image feature extraction and feature vector creation

One of the most important features a human uses is color. Color is a property that depends on the reflection of light to the eye and the processing of that information in the brain [5]. Colors are usually defined in three-dimensional color spaces. These could be RGB (Red, Green, and Blue), HSV (Hue, Saturation, and Value) or HSB (Hue, Saturation, and Brightness). RGB is the simplest color space in terms of computation, but it is not used here because its values change with changes of illumination in the image. The last two depend on the human perception of hue, saturation, and brightness. The color space model used in the proposed system is HSV, since it is illumination and camera-direction invariant. Moreover, it is a more intuitive way of describing colors []. Color moments have proven [6] to be a successful technique for indexing images based on color. Their accuracy outweighs [7] that of classic color indexing techniques such as cumulative color indexing. The three color moments, namely the first order (mean), the second order (variance) and the third order (skewness), are calculated for each of the channels (Hue, Saturation, and Value). They are defined as follows.

1st moment (mean):

$$ E_i = \frac{1}{n} \sum_{j=1}^{n} p_{ij} \tag{3} $$

2nd moment (standard deviation):

$$ \sigma_i = \sqrt{\frac{1}{n} \sum_{j=1}^{n} (p_{ij} - E_i)^2} \tag{4} $$

3rd moment (skewness):

$$ S_i = \sqrt[3]{\frac{1}{n} \sum_{j=1}^{n} (p_{ij} - E_i)^3} \tag{5} $$

where $p_{ij}$ is the value of the i-th color channel for the j-th pixel of the image, n is the number of pixels in the image, $E_i$ is the mean of the image for the i-th color channel, $\sigma_i$ is the standard deviation of the image for the i-th color channel, $S_i$ is the skewness of the image for the i-th color channel, and i is the color channel index from 1 to 3 (i.e. 1 = H, 2 = S, 3 = V).

Hence the feature vector for the image contains 9 values in the form of a 3 x 3 matrix of the following format:

$$ \begin{pmatrix} E_{11} & E_{12} & E_{13} \\ \sigma_{11} & \sigma_{12} & \sigma_{13} \\ S_{11} & S_{12} & S_{13} \end{pmatrix} $$

where $E_{11}, E_{12}, E_{13}$ represent the mean values for the H, S and V components, $\sigma_{11}, \sigma_{12}, \sigma_{13}$ the standard deviation values, and $S_{11}, S_{12}, S_{13}$ the skewness values. The feature vector is created for the query image as well as for the database images.

2.4 Image similarity matching and ranking

In the proposed technique, the image matching distance between the query image (Q) and a database image (I) is defined as the sum of the weighted differences between the moment vectors of the two images. Formally:

$$ d(Q, I) = \sum_{j=1}^{3} \left( w_{j1} \left| E_{jQ} - E_{jI} \right| + w_{j2} \left| \sigma_{jQ} - \sigma_{jI} \right| + w_{j3} \left| S_{jQ} - S_{jI} \right| \right) \tag{6} $$

where Q and I are the two image distributions being compared; j is the current channel index (i.e. 1 = H, 2 = S, 3 = V); $E_{jQ}$ and $E_{jI}$ are the first moments of the two images; $\sigma_{jQ}$ and $\sigma_{jI}$ are the second moments; $S_{jQ}$ and $S_{jI}$ are the third moments; and $w_{j1}, w_{j2}, w_{j3}$ are the weights for each moment.

The weights are user specified. Depending on the application and the datasets used, they can be tuned so that different preferences are given to different features of an image. Since the database we use is a general-purpose database of color images, we use a weight matrix that weights hue slightly higher than saturation and value. The reason is that hue describes a dimension of color we readily experience when we look at color. The distance d between the query image and a database image is calculated using the above metric. If, for two database images db1 and db2, d(query, db1) < d(query, db2), we say that db1 is more similar to the query image than db2 is, based on color moments. The values obtained are in the range 0 to 1. The value 0 represents a perfect match, whereas 1 represents the worst match. The images are ranked in decreasing order of their relevance.

2.5 Combined Score

After calculating the individual similarity scores for text and color, we need to combine them to provide a final similarity score for matching a query with a document/image. We cannot simply add the weights of each image in the two lists to get the combined weight, because the two lists use opposite weighting schemes:

For image: 0 (perfect) to 1 (worst)
For text: 1 (perfect) to 0 (worst)

The solution is to reverse the weights of one of the lists so that the two schemes agree. Moreover, 50% weightage is given to each scheme. The similarities are combined using Eq. (7), in which the text score is reversed, so a lower combined score indicates a better match:

$$ \text{Combined Score} = 0.5 \times (\text{image score}) + 0.5 \times (1 - \text{text score}) \tag{7} $$

2.6 Filter Results

The conventional color histogram (CCH) is the approach most frequently adopted in CBIR systems [8]. The attraction of the CCH is its simplicity and ease of calculation [9]. A color histogram describes the frequency of colors in an image. It does not change with variations in the picture's geometry, so it is a widely used feature for image indexing, although it has some shortcomings []. The distance metric used for matching the CCHs of images is the quadratic form distance (QFD). QFD can lead to perceptually more desirable results than the Euclidean distance and the histogram intersection method, as it considers the cross similarity between colors. If a histogram has n bins, cross-similarity matching requires n x n calculations, so the complexity is O(n^2) and the method becomes computationally very expensive. Therefore, to narrow down the search results, color moments are used as a first pass because of their compactness and ease of calculation, although they may return some unwanted results. A larger set of images is obtained from filtering based on color moments. The CCH and the QFD metric are then used to filter the results from this set. This reduces the computational complexity, gives visually more similar results and also removes unwanted results [9]. The CCH is calculated for the set of results obtained in the previous step and for the query image. QFD is used to find the distance between the CCH of the query and those of the database images. The formula for QFD is given by Eq. (8):

$$ D_{QI} = (H_Q - H_I)^T A \, (H_Q - H_I) \tag{8} $$

where $A = [a_{ij}]$ is a similarity matrix in which $a_{ij}$ denotes the similarity between bins i and j, $H_Q$ is the vector of histogram bins of the query image Q, and $H_I$ is the vector of histogram bins of the database image I. Using the QFD metric, the similarity distance is calculated and a refined set of results is shown to the user.

3. Experimental Results

The proposed algorithm is tested using the WikipediaMM Image Collection. This collection consists of 151,519 images that cover diverse topics of interest. Each image is associated with user-generated, alphanumeric, unstructured metadata in English. These metadata usually contain a brief caption or description of the image, the Wikipedia user who uploaded the image, and the copyright information. The descriptions are highly heterogeneous and of varying length. For the experiments we have used 15 images and their metadata. All the images are in RGB color space and jpeg/png format. Metadata is available in the form of XML files. Matlab 7. and VB.NET are used for developing the proposed system. Out of 15 images there are 5 map images, 5 icon images, 5 car images and more. We have conducted tests for the categories map and car. The performance is measured using two retrieval statistics, precision and recall. The values are calculated using Eq. (9) and Eq. (10) for three types of queries: text-based, image-based and combined.

$$ \text{Precision} = \frac{\text{relevant images retrieved}}{\text{images retrieved}} \tag{9} $$

$$ \text{Recall} = \frac{\text{relevant images retrieved}}{\text{relevant images in database}} \tag{10} $$

Table 1 shows the performance for the map category. From the statistics in Table 1 we can observe that for the text-based query 4 images are retrieved whereas the database has only 5 relevant images corresponding to the query. This implies that text-based retrieval may return garbage results. Combining it with content-based retrieval helps remove those garbage results. The experimental results show that the combined technique performs better than the individual ones and helps retrieve more meaningful results, thus reducing the semantic gap. The experimental results are shown in the graph in Fig. 2, which clearly shows that the statistics are better for the combined approach.
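As an illustration of the text-based query evaluated above, the tf-idf weighting of Eq. (1) and the cosine ranking of Eq. (2) can be sketched in a few lines. The three toy documents, their names and the helper names `tfidf_vector` and `cosine` are assumptions made for this example; they are not the paper's data or code.

```python
import math
from collections import Counter


def tfidf_vector(tokens, df, n_docs):
    """Weight each term by log(1 + tf) * log10(N / df), following Eq. (1)."""
    tf = Counter(tokens)
    # Terms unseen in the collection (df == 0) are skipped.
    return {t: math.log(1 + c) * math.log10(n_docs / df[t])
            for t, c in tf.items() if df.get(t, 0) > 0}


def cosine(q, d):
    """Cosine similarity of two sparse vectors stored as dicts, following Eq. (2)."""
    dot = sum(w * d.get(t, 0.0) for t, w in q.items())
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    norm_d = math.sqrt(sum(w * w for w in d.values()))
    if norm_q == 0 or norm_d == 0:
        return 0.0
    return dot / (norm_q * norm_d)


# Hypothetical pre-processed metadata for three images.
docs = {"img1": ["red", "car"], "img2": ["city", "map"], "img3": ["red", "map"]}
df = Counter(t for toks in docs.values() for t in set(toks))
vectors = {name: tfidf_vector(toks, df, len(docs)) for name, toks in docs.items()}

query = tfidf_vector(["map"], df, len(docs))
ranking = sorted(docs, key=lambda name: cosine(query, vectors[name]), reverse=True)
print(ranking)  # ['img3', 'img2', 'img1']
```

In this toy collection the document that contains only "red" and "map" ranks above the one that pairs "map" with the rarer term "city", because the rarer term carries more weight and pulls that vector further from the query.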

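For the image-based and combined queries, the color moments of Eqs. (3)-(5), the moment distance of Eq. (6) and the combined score of Eq. (7) might be computed roughly as below. This is a sketch under stated assumptions: images are plain lists of RGB tuples in [0, 1], the `WEIGHTS` matrix is hypothetical (the paper only says hue is weighted slightly higher), and a signed cube root is used for the third moment.

```python
import colorsys


def color_moments(pixels_rgb):
    """Return [(mean, std, skew)] for the H, S and V channels, in that order."""
    hsv = [colorsys.rgb_to_hsv(*p) for p in pixels_rgb]
    n = len(hsv)
    moments = []
    for ch in range(3):
        values = [p[ch] for p in hsv]
        mean = sum(values) / n
        std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
        third = sum((v - mean) ** 3 for v in values) / n
        # Signed cube root, so a negative third moment stays a real number.
        skew = abs(third) ** (1 / 3) * (1 if third >= 0 else -1)
        moments.append((mean, std, skew))
    return moments


def moment_distance(mq, mi, weights):
    """Weighted sum of absolute moment differences, following Eq. (6)."""
    return sum(w * abs(a - b)
               for ch_q, ch_i, ch_w in zip(mq, mi, weights)
               for a, b, w in zip(ch_q, ch_i, ch_w))


def combined_score(image_score, text_score):
    """Eq. (7): the text similarity is reversed, so lower means a better match."""
    return 0.5 * image_score + 0.5 * (1 - text_score)


# Hypothetical weights (rows: H, S, V; columns: mean, std, skew), hue weighted higher.
WEIGHTS = [(0.2, 0.1, 0.1), (0.15, 0.075, 0.075), (0.15, 0.075, 0.075)]

red = color_moments([(1.0, 0.0, 0.0)] * 4)   # uniformly red toy image
gray = color_moments([(0.5, 0.5, 0.5)] * 4)  # uniformly gray toy image
d = moment_distance(red, gray, WEIGHTS)
print(d, combined_score(d, 0.5))
```

Because both terms of the combined score are distances after the reversal, candidate images would be sorted in ascending order of `combined_score`.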
Table 1: Precision and recall values (in %) for the map category
Columns: relevant images in DB, images retrieved, relevant images retrieved, precision, recall
Text: 5 4 78.57 94.8
Image: 5 6 78.8 74.8
Hybrid: 5 8 5 9.1 1

[Fig. 2 Bar graph showing precision and recall values (in %) for map, for the text, image and combined queries.]

The second aspect of the proposed system is to improve performance by removing unwanted images and displaying relevant results. Table 2 shows the fall-out before and after filtration. Fall-out is the probability that a non-relevant document is retrieved by the query. It is given by Eq. (11):

$$ \text{Fall-out} = \frac{\text{non-relevant images retrieved}}{\text{non-relevant images in database}} \tag{11} $$

Table 2(a) shows the fall-out before the filtration process. Table 2(b) shows the improvement in fall-out for CBIR and the combined technique after filtering using the color histogram. The experimental results are shown in the graph in Fig. 3, which clearly shows that performance improves.

Table 2(a): Fall-out before filtration for map
Columns: total non-relevant images in database, non-relevant images retrieved, fall-out
Text: 115 9 7.8
Image: 115 7 5.8
Hybrid: 115 .71

Table 2(b): Fall-out after filtration for map
Columns: total non-relevant images in database, non-relevant images retrieved, fall-out
Text: 115 9 7.8
Image: 115.
Hybrid: 115.

[Fig. 3 Graph showing fall-out values (in %) for map, before and after filtration, for the text, image and hybrid queries.]

Similarly, the performance of the proposed system is measured for another category of images, car. Table 3 shows the precision and recall values for the text query "car" and an image of a car. From Table 3 we can observe that for text-based retrieval 1 images are retrieved, whereas the database contains 5 such images. This implies that text retrieval may lose some relevant images that are not annotated properly. Combining the two approaches helps in retrieving relevant images. The experimental results are shown in the graph in Fig. 4, which clearly shows that the statistics are better for the combined approach.

Table 3: Precision and recall values (in %) for the car category
Columns: relevant images in DB, images retrieved, relevant images retrieved, precision, recall
Text: 5 1 1 76.9 8.57
Image: 5 14 1 71.4 8.57
Hybrid: 5 18 9. 51.4
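The three statistics used in these tables, Eqs. (9)-(11), can be computed from a retrieved set and a relevant set as in the sketch below. The image identifiers and counts are toy values chosen for the example, not the paper's results, and the function name `retrieval_stats` is hypothetical.

```python
def retrieval_stats(retrieved, relevant, collection_size):
    """Return precision, recall and fall-out in percent (Eqs. 9, 10 and 11)."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)               # relevant images retrieved
    non_relevant_total = collection_size - len(relevant)
    false_hits = len(retrieved) - hits             # non-relevant images retrieved
    return {
        "precision": 100.0 * hits / len(retrieved) if retrieved else 0.0,
        "recall": 100.0 * hits / len(relevant) if relevant else 0.0,
        "fall_out": 100.0 * false_hits / non_relevant_total if non_relevant_total else 0.0,
    }


# Toy example: 10 images in the collection, 4 of them relevant to the query.
stats = retrieval_stats(retrieved=[1, 2, 3, 7, 8], relevant=[1, 2, 3, 4], collection_size=10)
print(stats)
```

Here 3 of the 5 retrieved images are relevant, so precision is 60%, recall is 75%, and 2 of the 6 non-relevant images are returned, giving a fall-out of about 33.3%.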

[Fig. 4 Bar graph showing precision and recall values (in %) for car, for the text, image and hybrid queries.]

Table 4(a) shows the fall-out for the query image car before the filtration process. Table 4(b) shows that the fall-out is much lower after filtration. The experimental results are shown in the graph in Fig. 5, which clearly shows that performance improves.

Table 4(a): Fall-out before filtration for car
Columns: total non-relevant images in database, non-relevant images retrieved, fall-out
Text: 115.6
Image: 115 4.47
Hybrid: 115 1.7

Table 4(b): Fall-out after filtration for car
Columns: total non-relevant images in database, non-relevant images retrieved, fall-out
Text: 115.6
Image: 115 1.86
Hybrid: 115.

[Fig. 5 Graph showing fall-out values (in %) for car, before and after filtration, for the text, image and hybrid queries.]

Fig. 6(a) shows the results obtained without filtering for the query image provided by the user. After the filtering process, the results displayed are visually much more similar to the query image, as shown in Fig. 6(b).

[Fig. 6(a): Results showing images before filtration.]
[Fig. 6(b): Results showing images after filtration.]

4. Conclusions

In the proposed system, tf-idf weightings are used for feature vector generation for text, and cosine similarity is used for text matching. The feature vectors for images are extracted in terms of three moments: 1st moment (mean), 2nd moment (variance) and 3rd moment (skewness). The color space model used for feature vector extraction is HSV. The performance is evaluated in terms of precision, recall and fall-out values. The statistics show that the performance of the combined approach is better than that of the text-based or image-based approach alone. The system further filters the results using the CCH feature and the quadratic form distance to show visually better results. The screenshots and the fall-out values obtained show that the performance of the system is improved, with a decrease in computational complexity.
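The quadratic form distance used in the filtering step can be illustrated with a small sketch of Eq. (8). The similarity matrix below, in which neighbouring bins are treated as more similar than distant ones, is a hypothetical choice made for the example; the paper does not specify how its matrix A is built, and the three-bin histograms are toy data.

```python
def quadratic_form_distance(hq, hi, a):
    """D = (Hq - Hi)^T A (Hq - Hi), following Eq. (8); costs O(n^2) for n bins."""
    diff = [x - y for x, y in zip(hq, hi)]
    n = len(diff)
    return sum(diff[i] * a[i][j] * diff[j] for i in range(n) for j in range(n))


def neighbour_similarity(n):
    """Hypothetical similarity matrix: a[i][j] falls linearly with bin distance."""
    return [[1 - abs(i - j) / (n - 1) for j in range(n)] for i in range(n)]


A = neighbour_similarity(3)
query_hist = [1.0, 0.0, 0.0]
near_hist = [0.0, 1.0, 0.0]   # all mass in the bin next to the query's bin
far_hist = [0.0, 0.0, 1.0]    # all mass in the most distant bin

d_near = quadratic_form_distance(query_hist, near_hist, A)
d_far = quadratic_form_distance(query_hist, far_hist, A)
print(d_near, d_far)
```

A plain Euclidean distance would rate `near_hist` and `far_hist` as equally far from the query, whereas the cross-bin term makes the neighbouring histogram closer, which is the property the paper attributes to QFD.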

References

[1] A.W.M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, Content-Based Image Retrieval at the End of the Early Years, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.,, pp. 149-18.
[2] N. Singhai, and Prof. S.K. Shandilya, A Survey On: Content Based Image Retrieval Systems, International Journal of Computer Applications, Vol. 4 No., July 1, pp. -6.
[3] F. Long, H.J. Zhang, and D. Feng, "Fundamentals of content-based image retrieval", in Multimedia Information Retrieval and Management - Technological Fundamentals and Applications, Springer,.
[4] V. N. Gudivada, and V. V. Raghavan, Content-Based Image Retrieval systems, IEEE Computers, vol. 8, no. 9, 1995, pp. 18-.
[5] R. Datta, D. Joshi, J. Li, and J. Z. Wang, Image Retrieval: Ideas, Influences, and Trends of the New Age Addendum, ACM Computing Surveys, vol. 4, no., article 5, 8, pp. 1-6.
[6] H. Zhang, M. Jiang, and X. Zhang, Exploring image context for semantic understanding and retrieval, in International Conference on Computational Intelligence and Software Engineering, 9, pp. 1 4.
[7] A. Jakarta, Apache Lucene - a high-performance, full-featured text search engine library, http://lucene.apache.org/.
[8] N. S. Chang and K. S. Fu, A Relational Database System for Images, Technical Report TR-EE 79-8, Purdue University, May 1979.
[9] N. S. Chang and K. S. Fu, Query-by pictorial-example, IEEE Trans. on Software Engineering SE-6(6), 198, pp. 519-54.
[10] T. Joseph, A.F. Cardenas, "PICQUERY: A High Level Query Language for Pictorial Database Management," IEEE Transactions on Software Engineering, vol. 14, no. 5, 1988, pp. 6-68.
[11] R. Jain, and P. Sinha, Content without context is meaningless, Proceedings of the international conference on Multimedia, ACM, 1, pp. 159-168.
[12] Y. Alemu, J. Koh, M. Ikram, D. Kim, Image Retrieval in Multimedia Databases: A Survey, in Fifth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 9, pp. 681-689.
[13] T. Pavlidis, Limitations of CBIR, in ICPR, 8.
[14] J. Xu, B. Xu, S. Men, Feature-based Similarity Retrieval in Content-based Image Retrieval, in Web Information Systems and Applications Conference, IEEE, 1, pp. 15 19.
[15] R.S. Choras, Image Feature Extraction Techniques and Their Applications for CBIR and Biometrics Systems, International Journal of Biology and Biomedical Engineering, 7 pp. 6 16.
[16] B. Szanto, P. Pozsegovics, Z. Vamossy, Sz. Sergyan, Sketch4Match Content Based Image Retrieval System Using Sketches, in 9th IEEE International Symposium on Applied Machine Intelligence and Informatics, IEEE, January 11, pp. 18-188.
[17] C. Hartvedt, Using context to understand user intentions in Image retrieval, in second IEEE International conferences on advances in multimedia, 1, pp. 1-1.
[18] N. Zhang and Y. Song, An Image Indexing and Searching System Based Both on Keyword and Content, in Proceedings of the 4th International Conference on Intelligent Computing (ICIC8), ser. LNCS 56. Springer-Verlag Berlin Heidelberg, Sep. 8, pp. 159 166.
[19] D. McG. Squire, W. Müller, H. Müller, and J. Raki, Content-Based Query of Image Databases, Inspirations From Text Retrieval: Inverted Files, Frequency-Based Weights and Relevance Feedback, Pattern Recognition Letters, 1999.
[20] J. Abbas, S. Qadri, M. Idrees, S. Awan, and N. A. Khan, Frame Work For Content Based Image Retrieval (Textual Based) System, Journal of American Science, Vol. 6(9), 1, pp. 74 77.
[21] H. Jin, A. Sun, R. Zheng, R. He, Q. Zhang, Y. Shi, and W. Yang, Content and Semantic Context Based Image Retrieval for Medical Image Grid, in Proceedings of the 8th IEEE/ACM International Conference on Grid Computing, 7, pp. 15-11.
[22] C.J. van Rijsbergen, S.E. Robertson and M.F. Porter, New models in probabilistic information retrieval, Report, no. 5587, British Library, London, 198.
[23] R. Stata, K. Bharat, F. Maghoul, The Term Vector Database: fast access to indexing terms for Web pages, Computer Networks, Volume, Issues 1 6, June, pp. 47-55.
[24] G. Salton, A. Wong, C. S. Yang, A vector space model for automatic indexing, Communications of the ACM, v.18 n.11, Nov. 1975, pp.61-6.
[25] S. Wang, A Robust CBIR Approach Using Local Color Histograms, Tech. Rep. TR 1-1, Department of Computer Science, University of Alberta, Edmonton, Alberta, Canada, 1.
[26] S. R. Kodituwakku, and S. Selvarajah, Comparison of Color Features for Image Retrieval, Indian Journal of Computer Science and Engineering, Vol. 1 No., 1, pp.7-11.
[27] M. Stricker and M. Orengo, Similarity of color images, SPIE Conference on Storage and Retrieval for Image and Video Databases III, vol. 4, Feb. 1995, pp. 81-9.
[28] Zur Erlangung des Doktorgrades der Fakult, Angewandte Wissenschaften, Dissertation, Feature Histograms for Content-Based Image Retrieval,.
[29] C. Patil, V. Dalal, Content Based Image Retrieval Using Combined Features, in Proceedings of the International Conference & Workshop on Emerging Trends in Technology, ACM, 11, pp. 1-15.

First Author Nidhi Goel graduated in Physics from University of Delhi in 6, received MCA degree from GGS Indraprastha University, Delhi in 9 and is currently pursuing Ph.D. in Computer Science from the Department of Computer Science, University of Delhi. She was lecturer in Delhi Institute of Advanced Studies, GGS Indraprastha University.

Second Author Dr. Priti Sehgal received her Ph.D. in Computer Science from the Department of Computer Science, University of Delhi, India in 6 and her M.Sc. in Computer Science from DAVV, Indore, India in 1994. She is an Associate Professor in the Department of Computer Science, Keshav Mahavidyalaya, University of Delhi. She has about 17 years of teaching and research experience and has published papers in National/International Journals/Conferences. Dr. Sehgal has been

www.ijcsi.org 56 a member of the program commttee of the CGIV Internatonal Conference and s a lfe member of Computer Socety of Inda. Her research nterests nclude Computer Graphcs, Image Processng, Bometrcs, Vsualzaton and Image Retreval.