Shahabi Lotfabadi, M., Shiratuddin, M.F. and Wong, K.W. (2013) Content Based Image Retrieval system with a combination of rough set and support vector machine. In: 9th Annual International Joint Conferences on Computer, Information, Systems Sciences, & Engineering (CISSE), 12-14 December Content Based Image Retrieval system with a combination of Rough Set and Support Vector Machine Maryam Shahabi lotfabadi, Mohd Fairuz Shiratuddin, Kok Wai Wong Abstract In this paper, a classifier based on a combination of Rough Set and 1-v-1 (one-versus-one) Support Vector Machine for Content Based Image Retrieval system is presented. Some problems of 1-v-1 Support Vector Machine can be reduced using Rough Set. With Rough Set, a 1-v-1 Support Vector Machine can provide good results when dealing with incomplete and uncertain data and features. In addition, boundary region in Rough Set can reduce the error rate. Storage requirements are reduced when compared to the conventional 1-v-1 Support Vector Machine. This classifier has better semantic interpretation of the classification process. We compare our Content Based Image Retrieval system with other image retrieval systems that uses neural network, K-nearest neighbour and Support Vector Machine as the classifier in their methodology. Experiments are carried out using a standard Corel dataset to test the accuracy and robustness of the proposed system. The experiment results show the proposed method can retrieve images more efficiently than other methods in comparison. Index Terms Classifier, Content Based Image Retrieval system, Rough Set, Support Vector Machine I I. INTRODUCTION mage Classification is one of the most important aspects in Content Based Image Retrieval (CBIR) systems [1, 2], therefore using the appropriate classifier for CBIR systems is critical. Many past papers used different classifiers for their CBIR systems [3-5], however some problems still remain. Some of the drawback of current classifiers is the lengthy training time [6], high storage requirements [7], did not achieve the required semantic results [7] and cannot deal with incomplete and uncertain data and features [8]. In this paper, we used a combination of Rough Set and 1-v-1 Support Vector Machine (SVM) as the classifier for our CBIR system (which was proposed in [9]). In the 1-v- 1 SVM, one SVM is constructed for each pair of classes [10]. The reasons why the proposed Rough Set with SVM as a classifier has better result compared to the conventional SVM classifier due to the fact that conventional SVM has high storage requirements and lack of semantic interpretation of classification process [10, 11]. Rough set can reduce the storage requirements by using upper and lower approximations. This means that the rules should store in conventional 1-v-1 SVM, however this amount is reduced to for classifier include the Rough Set and 1-v-1 SVM [11]. In addition, combination of Rough Set and 1-v-1 SVM can provide a better semantic interpretation of the classification process using properties of the Rough Set boundary region [9]. This paper is organised as follows: section 2 presents rough set to 1-v-1 support vector machine classifier. Experiment setup is given in section 3. In section 4, we show experimental results. Conclusion is made in section 5. II. ROUGH SET METHOD TO SUPPORT VECTOR MACHINE 1-V-1 MULTI CLASSIFIERS This section describes a rough set method to SVM 1-v- 1 multi classifiers as proposed in [9]. First, a nonlinear separable feature space is transformed to a linear separable feature space using a Radial Basis Function Kernel (RBF Kernel). The reason for choosing this kernel is RBF kernel has better results in CBIR systems [12]. The perfect situation is that the SVM can find the hyperplane by maximizing the margin between two classes and no example are in the margin i.e. after transforming nonlinear feature space to linear feature space [13, 14] (see Fig. 1). Maryam Shahabi Lotfabadi. Author is a PhD student at the School of Engineering & Information Technology, Murdoch University, 90 South Street, Murdoch, Western Australia 6150, Australia (e-mail: m.shahabi@murdoch.edu.au) Mohd Fairuz Shiratuddin. Author is a Senior Lecturer at the School of Engineering & Information Technology, Murdoch University, 90 South Street, Murdoch, Western Australia 6150, Australia (e-mail: f.shiratuddin@murdoch.edu.au) Kok Wai Wong. Author is an Associate Professor at the School of Engineering & Information Technology, Murdoch University, 90 South Street, Murdoch, Western Australia 6150, Australia (e-mail: k.wong@murdoch.edu.au) Fig. 1. Maximizing the margin between two classes
However when there are some examples between the margin, applying a method which can deal with vague and uncertain spaces like Rough Set is essential. The margin can be used as the Rough Set boundary region. Using the formulas shown below, Rough Set is applied to SVM, and and correspond to the boundaries of the margin in Fig. 3,4,5 (red lines). is defined as follows:, for all, and there exists at least one training example such that and. all is defined as follows:, for, and there exist at least one training example such that and. The above variables are defined as follows: Assume is an input vector in the input space and is the output in. Training set used for supervised classification is. is inner product and and are components of two vector and. Fig. 3. A rough set method to SVM 1-v-1 classification for classes Flower and Elephant According to R1, R2 and R3 rules, a Rough Set based SVM binary classifier can be defined when: [R1] If, classification of x is positive (+1). [R2] If, classification of x is negative (-1). [R3] Otherwise, classification of is uncertain. In the SVM 1-v-1 multi classifier, one binary SVM is constructed for each pair,, of classes. According to the rules of R1, R2 and R3, three equivalence classes can be defined for each pair., and are the set of (or region) that follows the rule R1, R2 and R3 respectively. Lower and upper approximations and boundary region for class and are summarised in Table 1. Classification problem with three classes i.e. Flower, Elephant, and African people is shown in Fig. 2. Fig. 3, Fig. 4 and Fig. 5 show the Rough Set method to SVM 1- v-1 classification for classes Flower and Elephant, Flower and African people and Elephant and African people respectively. Fig. 4. A rough set method to SVM 1-v-1 classification for classes Flower and African people Fig. 5. A rough set method to SVM 1-v-1 classification for classes Elephant and African people Fig. 2. A classification problem including three classes Flower, Elephant and African people III. EXPERIMENT SETUP Our proposed CBIR system in this paper has training and testing phases. For each semantic group, two rules are extracted as we used 10 semantic image groups in our experimental results so that 20 rules are extracted in the training phase ( ). In the testing phase, the user feed
a query image to CBIR system. According to rules from the training phase, if the query image is related to the positive region, images in that positive region are shown to user as the retrieval results. However, if the query image is related to the boundary region using a threshold, the images which are most likely similar to query image is then shown to user. From many experimental results, a threshold was defined for each of the semantic image groups. For each semantic group, if distance between image in boundary region and images in the positive region is less than the threshold, the image in the boundary region will be categorized in the positive region group. IV. EXPERIMENT RESULTS In this section, the results that compare the three retrieval systems with the proposed retrieval system are presented. These three retrieval systems used SVM [7], neural network (NN) [8] and K-nearest neighbour (KNN) [6] as the classifier in their methodology. To investigate the function of the image retrieval system based on the above mentioned methods, we used the COREL image database containing one thousand images. In this database, images are classified into ten semantic groups. The groups are African people, beach, bus, flower, mountains, elephant, horse, food, dinosaur, and building. We expressed the results of each group with a number e.g. number 1 represents African people; number 5 represents mountains, and etc. A. Precision-Recall Graph Recall equals to the number of the related retrieval images to the number of the related images available in the images database. The precision equals to the number of the related retrieval images to all of the retrieval images [2]. Fig. 6 shows the precision-recall graph for ten semantic groups that is used for measuring the efficiency of the proposed retrieval system. In the experiment results, the proposed retrieval system is shown as RSSVM. From the graph, we observed that the proposed retrieval system achieved better results than the other three systems. The reason for this is the proposed method is the ability to handle the uncertain boundaries better, thus able to classify those in the region more accurately. B. The Investigation of the Retrieval Precision To investigate the total precision of the above mentioned retrieval systems, 1000 images are fed into the system as the queried images. The average of the retrieval precision is calculated for each class. Fig. 7 shows the results using different classifiers. As anticipated, the results are better using the proposed system. The average of the retrieval precision is 55.9%, 59.4% and 68.1% for SVM, NN and KNN respectively. It increases to 73.8% for RSSVM. Fig. 7. Precision of retrieval The reasons behind the superiority of RSSVM are the: 1) Overlapped region in the classification problem can be described using boundary region in Rough Set more accurately. 2) Optimal separating hyper-plane by maximizing the margin is constructed using SVM effectively. 3) Perfect generation ability is the SVM s properties, however cannot deal with imprecise or incomplete data. 4) Most important properties of rough set is that it can deal with vague and incomplete data efficiently. In addition, the RSSVM classifier has some advantages compared to the conventional SVM. One of the advantages is RSSVM reduced storage requirements. RSSVM requires to store just rules for each class (one rule for lower approximation and one rule for upper approximation), compared to conventional SVM that needs to store rules [10]. Another advantage of RSSVM is that it has better sematic interpretation of the classification process compared to the conventional SVM [11]. C. The Image Comparison of the Retrieval Systems In the final test, we presented the retrieval results for the queried flower image (Fig. 8). The first, second, and up to the fourth row in Fig. 9 is related to RSSVM, KNN, NN and SVM respectively. Referring to Fig. 9, the retrieval system with the RSSVM classifier represented a more related output images to the user. The first left image in Fig. 9 matched closely to the queried image. Fig. 6. Precision- recall graph
Fig. 8. Query image The reason why the proposed method has better results than those in other retrieval systems is that the rules extracted from the RSSVM classifier are semantic and can better classifies images. Consequently, the RSSVM classifier can show more relevant images to user. V. CONCLUSION This paper proposed a classifier based on a combination of Rough Set and a 1-v-1 Support Vector Machine (SVM) in a Content Based Image Retrieval (CBIR) system. The proposed image retrieval system is compared with other image retrieval systems which used other classifiers such as neural network, K-nearest neighbour and Support Vector Machine in their systems. Based on the experimental results, it can be concluded that CBIR system with Rough Set and SVM classifier has good results and better semantic interpretation of the classification process compared to conventional SVM. In addition, this classifier reduced the storage requirements because it only requires to store a rules. In this paper we focused on a 1-v-1 (one-versus-one) support vector machine for future it is good idea will do some research on 1-v-r (one-versus-rest) support vector machine, and evaluated results using combination of rough set and 1-v-r support vector machine as a classifier in content based image retrieval systems. Fig. 9. Retrieved images according to: first raw- RSSVM, second raw- KNN, third raw- NN, fourth raw- SVM. Table 1. Lower and upper approximations and boundary region for class and Class Class Lower approximation Upper approximation Over all lower approximation for class Over all Boundary region for class Over all upper approximation for class Some rules extract from above formula ( ) ( ) ( ) ( ) ( ) ( )
REFERENCES [1] Ritendra Datta, Dhiraj Joshi, Jia Li, and J. Z. Wang, "Image Retrieval: Ideas, Influences, and Trends of the New Age," ACM Computing Surveys, vol. 40, pp. 1-60, 2008. [2] T. Dharani and I. L. Aroquiaraj, "A survey on content based image retrieval," in Pattern Recognition, Informatics and Medical Engineering (PRIME), 2013 International Conference on, 2013, pp. 485-490. [3] Remco C. Veltkamp and M. Tanase, "Content-Based Image Retrieval Systems: A Survey," UU-CS2002. [4] Y. Xun, H. Xian-Sheng, W. Meng, G. J. Qi, and W. Xiu- Qing, "A Novel Multiple Instance Learning Approach for Image Retrieval Based on Adaboost Feature Selection," in Multimedia and Expo, 2007 IEEE International Conference on, 2007, pp. 1491-1494. [5] E. Yildizer, A. M. Balci, M. Hassan, and R. Alhajj, "Efficient content-based image retrieval using Multiple Support Vector Machines Ensemble," Expert Systems with Applications, vol. 39, pp. 2385-2396, 2012. [6] Z.-M. Lu, H. Burkhardt, and S. Boehmer, "Fast Content- Based Image Retrieval Based on Equal-Average K-Nearest- Neighbor Search Schemes," in Advances in Multimedia Information Processing - PCM 2006. vol. 4261, Y. Zhuang, S.-Q. Yang, Y. Rui, and Q. He, Eds., ed: Springer Berlin Heidelberg, 2006, pp. 167-174. [7] Xiaohong Yu and H. Liu, "Image Semantic Classification Using SVM In Image Retrieval " Proceedings of the Second Symposium International Computer Science and Computational Technology(ISCSCT 09), pp. 458-461, 2009. [8] B. V. Jyothi and K. Eswaran, "Comparative Study of Neural Networks for Image Retrieval," in Intelligent Systems, Modelling and Simulation (ISMS), 2010 International Conference on, 2010, pp. 199-203. [9] P. Lingras and C. Butz, "Reducing the Storage Requirements of 1-v-1 Support Vector Machine Multi-classifiers," in Rough Sets, Fuzzy Sets, Data Mining, and Granular Computing. vol. 3642, D. Ślęzak, J. Yao, J. Peters, W. Ziarko, and X. Hu, Eds., ed: Springer Berlin Heidelberg, 2005, pp. 166-173. [10] X. Zhang, J. Zhou, J. Guo, Q. Zou, and Z. Huang, "Vibrant fault diagnosis for hydroelectric generator units with a new combination of rough sets and support vector machine," Expert Systems with Applications, vol. 39, pp. 2621-2628, 2012. [11] P. Lingras and C. Butz, "Rough set based 1-v-1 and 1-v-r approaches to support vector machine multi-classification," Information Sciences, vol. 177, pp. 3782-3798, 9/15/ 2007. [12] M. S. Lotfabadi and R. Mahmoudie, "The Comparison of Different Classifiers for Precision Improvement in Image Retrieval," in Signal-Image Technology and Internet-Based Systems (SITIS), 2010 Sixth International Conference on, 2010, pp. 176-178. [13] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, pp. 273-297, 1995/09/01 1995. [14] O. Chapelle, V. Vapnik, O. Bousquet, and S. Mukherjee, "Choosing Multiple Parameters for Support Vector Machines," Machine Learning, vol. 46, pp. 131-159, 2002/01/01 2002.