Content Based Image Retrieval system with a combination of Rough Set and Support Vector Machine


Shahabi Lotfabadi, M., Shiratuddin, M.F. and Wong, K.W. (2013) Content Based Image Retrieval system with a combination of rough set and support vector machine. In: 9th Annual International Joint Conferences on Computer, Information, Systems Sciences, & Engineering (CISSE), 12-14 December

Maryam Shahabi Lotfabadi, Mohd Fairuz Shiratuddin, Kok Wai Wong

Maryam Shahabi Lotfabadi is a PhD student at the School of Engineering & Information Technology, Murdoch University, 90 South Street, Murdoch, Western Australia 6150, Australia (e-mail: m.shahabi@murdoch.edu.au)
Mohd Fairuz Shiratuddin is a Senior Lecturer at the School of Engineering & Information Technology, Murdoch University, 90 South Street, Murdoch, Western Australia 6150, Australia (e-mail: f.shiratuddin@murdoch.edu.au)
Kok Wai Wong is an Associate Professor at the School of Engineering & Information Technology, Murdoch University, 90 South Street, Murdoch, Western Australia 6150, Australia (e-mail: k.wong@murdoch.edu.au)

Abstract

In this paper, a classifier based on a combination of Rough Set and 1-v-1 (one-versus-one) Support Vector Machine for a Content Based Image Retrieval system is presented. Some problems of the 1-v-1 Support Vector Machine can be reduced using Rough Set. With Rough Set, a 1-v-1 Support Vector Machine can provide good results when dealing with incomplete and uncertain data and features. In addition, the boundary region in Rough Set can reduce the error rate. Storage requirements are reduced compared with the conventional 1-v-1 Support Vector Machine. This classifier also has a better semantic interpretation of the classification process. We compare our Content Based Image Retrieval system with other image retrieval systems that use neural network, K-nearest neighbour and Support Vector Machine as the classifier in their methodology. Experiments are carried out using a standard Corel dataset to test the accuracy and robustness of the proposed system. The experiment results show that the proposed method can retrieve images more efficiently than the other methods in the comparison.

Index Terms: Classifier, Content Based Image Retrieval system, Rough Set, Support Vector Machine

I. INTRODUCTION

Image classification is one of the most important aspects of Content Based Image Retrieval (CBIR) systems [1, 2]; using an appropriate classifier for a CBIR system is therefore critical. Many past papers used different classifiers for their CBIR systems [3-5]; however, some problems remain. Drawbacks of current classifiers include lengthy training time [6], high storage requirements [7], failure to achieve the required semantic results [7], and an inability to deal with incomplete and uncertain data and features [8].

In this paper, we use a combination of Rough Set and 1-v-1 Support Vector Machine (SVM), which was proposed in [9], as the classifier for our CBIR system. In the 1-v-1 SVM, one SVM is constructed for each pair of classes [10]. The proposed Rough Set with SVM classifier gives better results than the conventional SVM classifier because the conventional SVM has high storage requirements and lacks a semantic interpretation of the classification process [10, 11]. Rough Set can reduce the storage requirements by using upper and lower approximations: for $N$ classes, a conventional 1-v-1 SVM must store $N(N-1)/2$ rules, whereas the classifier that combines Rough Set and 1-v-1 SVM stores only $2N$ rules [11]. In addition, the combination of Rough Set and 1-v-1 SVM can provide a better semantic interpretation of the classification process by using the properties of the Rough Set boundary region [9].

This paper is organised as follows. Section II presents the rough set method for the 1-v-1 support vector machine classifier. The experiment setup is given in Section III. Section IV shows the experimental results, and Section V concludes the paper.

II. ROUGH SET METHOD TO SUPPORT VECTOR MACHINE 1-V-1 MULTI CLASSIFIERS

This section describes the rough set method for SVM 1-v-1 multi classifiers as proposed in [9]. First, a nonlinearly separable feature space is transformed to a linearly separable feature space using a Radial Basis Function kernel (RBF kernel). This kernel is chosen because the RBF kernel gives better results in CBIR systems [12]. In the ideal situation, i.e. after the nonlinear feature space has been transformed to a linear feature space, the SVM finds the hyperplane that maximizes the margin between the two classes, and no examples lie in the margin [13, 14] (see Fig. 1).

Fig. 1. Maximizing the margin between two classes
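To make the two ingredients above concrete, here is a minimal sketch of an RBF kernel evaluation and of the class pairs for which a 1-v-1 scheme builds one binary SVM each. It is an illustration only, not the authors' implementation: the function names and the value of `gamma` are assumptions, and the class names are taken from the three-class example used in this paper.

```python
import math
from itertools import combinations


def rbf_kernel(x, z, gamma=0.5):
    """RBF kernel exp(-gamma * ||x - z||^2); gamma is an illustrative value."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def one_vs_one_pairs(classes):
    """Every unordered pair of classes; 1-v-1 trains one binary SVM per pair."""
    return list(combinations(classes, 2))


pairs = one_vs_one_pairs(["Flower", "Elephant", "African people"])
print(pairs)
print(rbf_kernel([1.0, 2.0], [1.0, 2.0]))  # identical vectors give 1.0
```

The kernel value decays from 1 towards 0 as the two feature vectors move apart, which is what lets a linear separator in the kernel-induced space act as a nonlinear one in the original feature space.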

However, when some examples lie inside the margin, it is essential to apply a method that can deal with vague and uncertain spaces, such as Rough Set. The margin can be used as the Rough Set boundary region. Using the formulas below, Rough Set is applied to SVM; $b_1$ and $b_2$ correspond to the boundaries of the margin in Figs. 3, 4 and 5 (red lines).

$b_1$ is defined as follows: $y \times [\langle x, w \rangle + b_1] \geq 0$ for all $(x, y) \in S$, and there exists at least one training example $(x, y) \in S$ such that $y = 1$ and $y \times [\langle x, w \rangle + b_1] = 0$.

$b_2$ is defined as follows: $y \times [\langle x, w \rangle + b_2] \geq 0$ for all $(x, y) \in S$, and there exists at least one training example $(x, y) \in S$ such that $y = -1$ and $y \times [\langle x, w \rangle + b_2] = 0$.

The above variables are defined as follows. Assume $x$ is an input vector in the input space $X$ and $y$ is the output in $Y = \{-1, +1\}$. The training set used for supervised classification is $S = \{(x_1, y_1), (x_2, y_2), \ldots, (x_i, y_i), \ldots\}$. $\langle x, w \rangle = \sum_j x_j \times w_j$ is the inner product, and $x_j$ and $w_j$ are the components of the two vectors $x$ and $w$.

According to rules R1, R2 and R3, a Rough Set based SVM binary classifier can be defined as follows:
[R1] If $\langle x, w \rangle + b_1 \geq 0$, classification of $x$ is positive (+1).
[R2] If $\langle x, w \rangle + b_2 \leq 0$, classification of $x$ is negative (-1).
[R3] Otherwise, classification of $x$ is uncertain.

In the SVM 1-v-1 multi classifier, one binary SVM is constructed for each pair $(i, j)$ of classes. According to rules R1, R2 and R3, three equivalence classes can be defined for each pair: the sets (or regions) of $x$ that follow rules R1, R2 and R3 respectively. The lower and upper approximations and the boundary region for classes $i$ and $j$ are summarised in Table 1.

A classification problem with three classes, i.e. Flower, Elephant and African people, is shown in Fig. 2. Fig. 3, Fig. 4 and Fig. 5 show the Rough Set method for SVM 1-v-1 classification for the class pairs Flower and Elephant, Flower and African people, and Elephant and African people respectively.

Fig. 2. A classification problem including three classes Flower, Elephant and African people
Fig. 3. A rough set method to SVM 1-v-1 classification for classes Flower and Elephant
Fig. 4. A rough set method to SVM 1-v-1 classification for classes Flower and African people
Fig. 5. A rough set method to SVM 1-v-1 classification for classes Elephant and African people

III. EXPERIMENT SETUP

Our proposed CBIR system has a training phase and a testing phase. Two rules are extracted for each semantic group; since we used 10 semantic image groups in our experiments, 20 rules are extracted in the training phase ($2N = 2 \times 10 = 20$). In the testing phase, the user feeds a query image to the CBIR system. According to the rules from the training phase, if the query image falls in the positive region, the images in that positive region are shown to the user as the retrieval results. However, if the query image falls in the boundary region, a threshold is used to show the user the images that are most likely to be similar to the query image. A threshold was defined for each semantic image group on the basis of many experiments. For each semantic group, if the distance between an image in the boundary region and the images in the positive region is less than the threshold, the image in the boundary region is categorized in the positive region group.

IV. EXPERIMENT RESULTS

In this section, the proposed retrieval system is compared with three other retrieval systems, which use SVM [7], neural network (NN) [8] and K-nearest neighbour (KNN) [6] as the classifier in their methodology. To investigate the behaviour of the image retrieval systems based on these methods, we used the COREL image database containing one thousand images. In this database, images are classified into ten semantic groups: African people, beach, bus, flower, mountains, elephant, horse, food, dinosaur, and building. We refer to each group by a number; for example, number 1 represents African people and number 5 represents mountains.

A. Precision-Recall Graph

Recall is the ratio of the number of relevant retrieved images to the number of relevant images available in the image database. Precision is the ratio of the number of relevant retrieved images to the total number of retrieved images [2]. Fig. 6 shows the precision-recall graph for the ten semantic groups, which is used to measure the efficiency of the proposed retrieval system. In the experiment results, the proposed retrieval system is shown as RSSVM.
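The definitions of precision and recall given in this subsection can be illustrated with a small sketch. The image identifiers below are made up for illustration and are not taken from the COREL experiments; the function name is likewise an assumption.

```python
def precision_recall(retrieved, relevant):
    """Precision and recall of one retrieval result.

    retrieved: identifiers of the images returned for a query
    relevant:  identifiers of all images in the database relevant to it
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)  # relevant images that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: 5 images returned, 8 relevant images in the database.
retrieved_ids = [1, 2, 3, 4, 5]
relevant_ids = [1, 2, 3, 6, 7, 8, 9, 10]
p, r = precision_recall(retrieved_ids, relevant_ids)
print(p, r)  # 3 of 5 retrieved are relevant; 3 of 8 relevant are retrieved
```

A precision-recall graph is then obtained by repeating this computation while varying the number of retrieved images.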
From the graph, we observed that the proposed retrieval system achieved better results than the other three systems. The reason is that the proposed method handles the uncertain boundaries better and can therefore classify the images in that region more accurately.

B. The Investigation of the Retrieval Precision

To investigate the total precision of the above-mentioned retrieval systems, 1000 images are fed into each system as query images, and the average retrieval precision is calculated for each class. Fig. 7 shows the results for the different classifiers. As anticipated, the results are better with the proposed system. The average retrieval precision is 55.9%, 59.4% and 68.1% for SVM, NN and KNN respectively. It increases to 73.8% for RSSVM.

Fig. 7. Precision of retrieval

The reasons behind the superiority of RSSVM are:
1) The overlapped region in the classification problem can be described more accurately using the boundary region in Rough Set.
2) The optimal separating hyperplane is constructed effectively by the SVM through maximizing the margin.
3) SVM has very good generalization ability; however, it cannot deal with imprecise or incomplete data.
4) The most important property of Rough Set is that it can deal with vague and incomplete data efficiently.

In addition, the RSSVM classifier has some advantages over the conventional SVM. One advantage is reduced storage requirements. RSSVM needs to store only two rules for each class (one rule for the lower approximation and one rule for the upper approximation), i.e. $2N$ rules for $N$ classes, compared with the $N(N-1)/2$ rules that a conventional 1-v-1 SVM needs to store [10]. Another advantage of RSSVM is that it has a better semantic interpretation of the classification process than the conventional SVM [11].

C. The Image Comparison of the Retrieval Systems

In the final test, we present the retrieval results for the queried flower image (Fig. 8). The first to fourth rows in Fig. 9 correspond to RSSVM, KNN, NN and SVM respectively. Referring to Fig. 9, the retrieval system with the RSSVM classifier returned more relevant output images to the user. The first image on the left in Fig. 9 closely matches the query image.

Fig. 6. Precision-recall graph
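The storage advantage discussed in Section IV-B can be checked with simple arithmetic. The sketch below assumes the usual counts for these schemes: one rule per pair of classes for a conventional 1-v-1 SVM, and two rules per class (lower and upper approximation) for the rough set variant. The function names are hypothetical.

```python
def rules_conventional_1v1(n_classes):
    """One rule per unordered pair of classes: N * (N - 1) / 2."""
    return n_classes * (n_classes - 1) // 2


def rules_rough_set_1v1(n_classes):
    """Two rules per class (lower and upper approximation): 2 * N."""
    return 2 * n_classes


for n in (3, 10, 50):
    print(n, rules_conventional_1v1(n), rules_rough_set_1v1(n))
```

For the ten semantic groups used here this gives 45 rules against 20. Under these counts the rough set variant stores fewer rules only when there are more than five classes, since $2N < N(N-1)/2$ exactly when $N > 5$.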

Fig. 8. Query image

The proposed method gives better results than the other retrieval systems because the rules extracted from the RSSVM classifier are semantic and classify images better. Consequently, the RSSVM classifier can show more relevant images to the user.

V. CONCLUSION

This paper proposed a classifier based on a combination of Rough Set and a 1-v-1 Support Vector Machine (SVM) for a Content Based Image Retrieval (CBIR) system. The proposed image retrieval system was compared with other image retrieval systems that use other classifiers, namely neural network, K-nearest neighbour and Support Vector Machine. Based on the experimental results, it can be concluded that a CBIR system with the Rough Set and SVM classifier gives good results and has a better semantic interpretation of the classification process than the conventional SVM. In addition, this classifier reduces the storage requirements because it only needs to store two rules per class. This paper focused on the 1-v-1 (one-versus-one) support vector machine. Future work could investigate the 1-v-r (one-versus-rest) support vector machine and evaluate the combination of rough set and 1-v-r support vector machine as a classifier in content based image retrieval systems.

Fig. 9. Retrieved images according to: first row RSSVM, second row KNN, third row NN, fourth row SVM.

Table 1. Lower and upper approximations and boundary region for classes i and j
Columns: Class i; Class j
Rows: Lower approximation; Upper approximation; Overall lower approximation for class; Overall boundary region for class; Overall upper approximation for class; Some rules extracted from the above formulas
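Rules R1 to R3 of Section II and the threshold step of Section III can be sketched together as a three-way decision. This is a hedged illustration rather than the authors' code: the inequality forms used for R1 and R2, the names `w`, `b1` and `b2` for the weight vector and the two margin offsets, and all numeric values are assumptions of this sketch, and the distance between a boundary image and the positive region is taken as a given input because the paper does not specify how it is computed.

```python
def rough_svm_decision(x, w, b1, b2):
    """Three-way decision following rules R1-R3 (inequality forms assumed)."""
    score = sum(xj * wj for xj, wj in zip(x, w))  # inner product <x, w>
    if score + b1 >= 0:
        return 1   # R1: positive region
    if score + b2 <= 0:
        return -1  # R2: negative region
    return 0       # R3: boundary region, classification uncertain


def joins_positive_region(distance_to_positive, threshold):
    """Section III step: a boundary image is moved to the positive region
    when its distance to the positive-region images is below the threshold."""
    return distance_to_positive < threshold


# Made-up values for illustration only.
w, b1, b2 = [1.0, 0.0], -1.0, 1.0
for x in ([2.0, 0.0], [-2.0, 0.0], [0.2, 0.0]):
    print(x, rough_svm_decision(x, w, b1, b2))
print(joins_positive_region(distance_to_positive=0.3, threshold=0.5))
```

Returning a separate value for the boundary region, instead of forcing every input into one of the two classes, is what allows the retrieval step to treat uncertain query images differently from clearly classified ones.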

REFERENCES

[1] R. Datta, D. Joshi, J. Li, and J. Z. Wang, "Image Retrieval: Ideas, Influences, and Trends of the New Age," ACM Computing Surveys, vol. 40, pp. 1-60, 2008.
[2] T. Dharani and I. L. Aroquiaraj, "A survey on content based image retrieval," in Pattern Recognition, Informatics and Medical Engineering (PRIME), 2013 International Conference on, 2013, pp. 485-490.
[3] R. C. Veltkamp and M. Tanase, "Content-Based Image Retrieval Systems: A Survey," UU-CS2002.
[4] Y. Xun, H. Xian-Sheng, W. Meng, G. J. Qi, and W. Xiu-Qing, "A Novel Multiple Instance Learning Approach for Image Retrieval Based on Adaboost Feature Selection," in Multimedia and Expo, 2007 IEEE International Conference on, 2007, pp. 1491-1494.
[5] E. Yildizer, A. M. Balci, M. Hassan, and R. Alhajj, "Efficient content-based image retrieval using Multiple Support Vector Machines Ensemble," Expert Systems with Applications, vol. 39, pp. 2385-2396, 2012.
[6] Z.-M. Lu, H. Burkhardt, and S. Boehmer, "Fast Content-Based Image Retrieval Based on Equal-Average K-Nearest-Neighbor Search Schemes," in Advances in Multimedia Information Processing - PCM 2006, vol. 4261, Y. Zhuang, S.-Q. Yang, Y. Rui, and Q. He, Eds. Springer Berlin Heidelberg, 2006, pp. 167-174.
[7] X. Yu and H. Liu, "Image Semantic Classification Using SVM In Image Retrieval," Proceedings of the Second Symposium International Computer Science and Computational Technology (ISCSCT 09), pp. 458-461, 2009.
[8] B. V. Jyothi and K. Eswaran, "Comparative Study of Neural Networks for Image Retrieval," in Intelligent Systems, Modelling and Simulation (ISMS), 2010 International Conference on, 2010, pp. 199-203.
[9] P. Lingras and C. Butz, "Reducing the Storage Requirements of 1-v-1 Support Vector Machine Multi-classifiers," in Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing, vol. 3642, D. Ślęzak, J. Yao, J. Peters, W. Ziarko, and X. Hu, Eds. Springer Berlin Heidelberg, 2005, pp. 166-173.
[10] X. Zhang, J. Zhou, J. Guo, Q. Zou, and Z. Huang, "Vibrant fault diagnosis for hydroelectric generator units with a new combination of rough sets and support vector machine," Expert Systems with Applications, vol. 39, pp. 2621-2628, 2012.
[11] P. Lingras and C. Butz, "Rough set based 1-v-1 and 1-v-r approaches to support vector machine multi-classification," Information Sciences, vol. 177, pp. 3782-3798, 2007.
[12] M. S. Lotfabadi and R. Mahmoudie, "The Comparison of Different Classifiers for Precision Improvement in Image Retrieval," in Signal-Image Technology and Internet-Based Systems (SITIS), 2010 Sixth International Conference on, 2010, pp. 176-178.
[13] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, pp. 273-297, 1995.
[14] O. Chapelle, V. Vapnik, O. Bousquet, and S. Mukherjee, "Choosing Multiple Parameters for Support Vector Machines," Machine Learning, vol. 46, pp. 131-159, 2002.