International Journal of Computer Engineering and Applications, Volume XII, Issue XII, Dec. 18, www.ijcea.com ISSN 2321-3469

A SURVEY ON THE METHODS USED FOR CONTENT BASED IMAGE RETRIEVAL

T.Ezhilarasan 1, N.Sathya 2

1 Department of Computer Science Engineering, Government College of Technology, Coimbatore, Tamil Nadu, India. share2ezhil@gmail.com
2 Assistant Professor, Department of Information Technology, Sri Shakthi Institute of Engineering and Technology, Coimbatore, Tamil Nadu, India. sathyasivajan@gmail.com

ABSTRACT: Content Based Image Retrieval (CBIR) is an image retrieval approach that retrieves similar images based on the visual contents of the images. The visual contents of an image are determined by extracting the features present in it. Low level features such as color, texture and shape can be extracted from images, and various methods are available to extract them. This paper surveys several methods used to extract the features present in an image.

Keywords: Content Based Image Retrieval, Color Features, Texture Features, Feature Extraction, Color Histogram.

[1] INTRODUCTION

In today's world, data are created, shared, stored and managed on the internet. Most of these data are in image and video format. Retrieving data poses two difficulties: finding similar data and retrieving it within a limited time. Here we consider the image retrieval process, which retrieves images from a large collection of images stored in a database. This area has developed widely since 1992. The traditional methods of finding similar images rely on captions, keywords and textual descriptions of the images. The main problem with them is locating the desired image in a large database. In CBIR, we instead use visual contents such as the color, shape and texture features of an image to retrieve similar images
from the database. These features can be extracted using several methods and descriptors. Retrieval is based on features that are automatically extracted from the images themselves. This survey covers the methods available to extract color and texture features.

[2] BASIC CBIR SYSTEM

A Content Based Image Retrieval framework retrieves images from a large database by examining the features of the images, both low level and high level. The most widely recognized low level features are color, texture and shape. A typical CBIR framework lets the user present a query image; the system then extracts feature vectors for the query image and for the collection of images in the database. The extracted feature vectors are matched using similarity measurement techniques. The system returns the images with the highest similarity values and presents their thumbnails on the screen.

Figure: 1. Architecture of Basic CBIR System

2.1 COLOR FEATURE:

Color is the most basic quality of visual content. Images are examined based on the colors they contain. Color is one of the most widely used features because it can be computed without regard to image size or orientation. Color features can be obtained by quantizing a color space such as RGB or HSV and computing distance measures based on color similarity. This is commonly done by computing a color histogram for each image, which records the proportion of pixels in the image holding specific values. Various descriptors, such as the Dominant Color Descriptor (DCD), Scalable Color Descriptor (SCD), Color Structure Descriptor (CSD) and Color Layout Descriptor (CLD), are used to extract color features.
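As a minimal sketch of the color histogram idea described above, the following Python code quantizes pixels in the HSV color space and records the proportion of pixels falling into each bin. The function names, the bin counts (8 hue, 3 saturation, 3 value) and the use of histogram intersection as the similarity measure are assumptions made for illustration; they are not taken from any of the surveyed papers.

```python
import colorsys


def hsv_histogram(pixels, bins=(8, 3, 3)):
    """Return a normalized, quantized HSV histogram.

    pixels: iterable of (r, g, b) tuples with channel values in 0..255.
    bins:   hypothetical number of bins for hue, saturation and value.
    """
    h_bins, s_bins, v_bins = bins
    hist = [0.0] * (h_bins * s_bins * v_bins)
    count = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Map each channel from [0, 1] to a bin index; min() keeps 1.0 in the last bin.
        hi = min(int(h * h_bins), h_bins - 1)
        si = min(int(s * s_bins), s_bins - 1)
        vi = min(int(v * v_bins), v_bins - 1)
        hist[(hi * s_bins + si) * v_bins + vi] += 1
        count += 1
    if count == 0:
        return hist
    # Normalizing by the pixel count makes the histogram independent of image size.
    return [x / count for x in hist]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1] between two normalized histograms (assumed measure)."""
    return sum(min(a, b) for a, b in zip(h1, h2))
```

Because the histogram only counts pixels per bin, it does not change when the image is rotated or translated, and the normalization step removes the dependence on image size. It also discards all spatial information, which is the limitation noted for histogram-based methods.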
Several methods, such as Color Moments, Color Histogram, Color Correlogram and Color Autocorrelogram, are also used to extract color features.

2.2 TEXTURE FEATURE:

Texture is an important quality for describing an image. Texture descriptors characterize image textures or regions. They observe the region homogeneity and the histograms of the region borders. These features capture the patterns present in the image and their spatial locations, and they are quite difficult to represent. They can, however, be represented through relative pixel brightness, contrast, directionality and similar properties. Other methods to
classify image texture include the co-occurrence matrix, Laws' texture energy, the wavelet transform and other transforms. The region homogeneity and the histograms of the region borders can be obtained using descriptors such as the Homogeneous Texture Descriptor (HTD), Texture Browsing Descriptor (TBD) and Edge Histogram Descriptor (EHD).

2.3 SHAPE FEATURE:

A shape feature describes a region within the image, not the shape of the image as a whole. It can be obtained by segmentation and edge detection of an image. Commonly used global shape features include aspect ratio, circularity and moment invariants. Shape filters are used for extracting the shape features. Shape descriptors can also be used, but they should be invariant to scaling, translation and rotation of images. Some shape descriptors include moment invariant functions and Fourier transform functions.

[3] METHODS USED IN CBIR SYSTEM

The Content Based Image Retrieval system proposed by Chun et al. [1] is based on the combination of multiresolution color and texture features. The color features are extracted using the color autocorrelogram, and the texture features are extracted using Block Difference of Inverse Probabilities (BDIP) and Block Variation of Local Correlation Coefficients (BVLC). The color autocorrelogram describes the probability of finding pixels of identical color at a given distance from a given pixel. It is extracted in HSV rather than RGB, because HSV provides a better correspondence between similar colors. BDIP effectively extracts the edges and valleys in the image, where it expresses the maximum intensity variation in a block. BVLC measures the texture smoothness level according to four orientations of a block. A query feature vector is created for both color and texture features. The similarity between the query and target feature vectors is then computed.
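The color autocorrelogram described above can be illustrated with a simplified sketch. The code below assumes the image has already been quantized to color indices (for example in HSV) and, as a simplification, only compares each pixel with the four pixels at horizontal and vertical offset d rather than with every pixel at that distance. The function name and the default distances are hypothetical, and the multiresolution processing used in [1] is not reproduced.

```python
def autocorrelogram(image, n_colors, distances=(1, 3)):
    """Simplified color autocorrelogram.

    image:     2D list of quantized color indices in range(n_colors).
    distances: hypothetical set of pixel distances d to evaluate.
    Returns {d: [p_0, ..., p_(n_colors-1)]}, where p_c estimates the
    probability that a pixel at horizontal or vertical offset d from a
    pixel of color c also has color c.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    result = {}
    for d in distances:
        same = [0] * n_colors   # neighbor pairs sharing the same color
        total = [0] * n_colors  # all in-bounds neighbor pairs, per color
        for y in range(height):
            for x in range(width):
                c = image[y][x]
                # Simplification: only the four axis-aligned neighbors at offset d.
                for dy, dx in ((0, d), (0, -d), (d, 0), (-d, 0)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        total[c] += 1
                        if image[ny][nx] == c:
                            same[c] += 1
        result[d] = [s / t if t else 0.0 for s, t in zip(same, total)]
    return result
```

Unlike a plain histogram, the resulting vector depends on how pixels of the same color are arranged, so two images with identical color proportions but different layouts generally yield different autocorrelograms.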
According to the similarity ranks, the system finally retrieves the requested number of target images from the image database.

In [2], Yue et al. design a CBIR system using fused color and texture features. The color feature is based on the color histogram and the texture feature is based on co-occurrence matrices; both are extracted to form feature vectors, and weights are then constructed for these feature vectors. The color feature neglects spatial locations and is difficult to segment. Color feature extraction methods include the Global Color Histogram and the Block Color Histogram. In the Global Color Histogram, the HSV color space is used and the feature value is calculated. It is invariant to rotation and translation of images. In the Block Color Histogram, images are divided into n x n blocks. For each block, color quantization is performed and weight coefficients are assigned. Texture features are extracted by converting images from RGB to grey scale, calculating feature values on the grey scale images and performing internal normalization. The texture features are then compared using an appropriate similarity measure.

The CBIR system proposed by Wang et al. [3] is based on combining color, texture and shape feature information. To extract color features, the RGB color space is divided into 8 coarse partitions. The quantized color of each partition is its centroid, computed as the average value of the color distribution in that partition. The mutual distances between these centers are then calculated, and similar color bins are merged to obtain the dominant colors. For texture feature
extraction, steerable filters are used: a steerable filter decomposition of the given image is performed, and the energy distribution of the filtered image is obtained. The mean and standard deviation are also calculated. The shape feature is extracted using pseudo-Zernike moments. These moments are not scale or translation invariant by themselves; this invariance is obtained by normalizing the image. Finally, the similarity for each feature is calculated.

In [4], Singha et al. propose a CBIR system based on the combination of the color histogram and the fast wavelet transformation. Three methods are used. The first method retrieves images using the color histogram: the RGB image is converted to the HSV color space, color quantization is carried out using the color histogram, and the normalized histogram is computed. The second method is the wavelet based color histogram, in which the image is decomposed using the Haar wavelet transform, producing approximation, vertical, horizontal and diagonal coefficients. Images are retrieved based on combinations of these coefficients. The third method is the lifting wavelet based color histogram, which uses the lifting scheme in the Haar wavelet transform. Images are decomposed with the lifting scheme in a first level wavelet transform to obtain the same coefficients. Weights are assigned to the approximation and horizontal coefficients, which are converted to the HSV plane, and the normalized histogram is then calculated. Finally, the similarity matrix is calculated for each method to retrieve the images.

In [5], Yildizer et al. propose an effective CBIR system that integrates wavelets with clustering and indexing. They use the K-means clustering algorithm and the B+ tree database indexing structure to retrieve relevant images. Three steps are discussed in [5]: feature extraction, model construction and the query phase for image retrieval.
The first step in feature extraction is to pre-process the image by resizing it and transforming it into RGB values. A wavelet transformation is then applied to divide the image into high and low frequency bands. Model construction finds a model that reduces the search space without losing relevant images. K-means clustering is an effective method for reducing the search space. K-means is integrated with the B+ tree to exclude outliers.

The CBIR system created by Gonde et al. [6] collects features by combining the Modified Curvelet Transform (MCT) with a vocabulary tree. Components such as the Gabor transform, the Ridgelet transform, the energy histogram and the vocabulary tree are discussed in [6]. The Ridgelet transform represents the image through edges and curvilinear structures, and it gives good results for straight lines. The vocabulary tree handles large descriptor vectors. MCT gives more detailed sub-band images and improves the results.

Talib et al. [7] propose a new semantic color feature, extracted from the Dominant Color (DC), for the image retrieval process. A weighted Dominant Color Descriptor (DCD) is used to retrieve the dominant colors from the image. The weight of a dominant color is computed based on its location in the image. The weighted dominant color is also called the border weight because the weight of a DC is equal to its frequency on the border. A color with high frequency at the image border receives a high weight, and the remaining colors are assigned low values. DCD will improve the background effects. Salient Object Detection is effective for single object images.

In [8], Dubey et al. describe the encoding of the color and texture features of an image from the local neighborhood of each pixel. The local neighborhood contains more texture and shape information and plays an important role in the human visual system.
Color features are extracted by converting the RGB color space into a single channel. The texture feature is encoded with
structuring patterns generated from structuring elements. A Rotation and Scale invariant Hybrid image Descriptor (RSHD) is constructed by fusing the color and texture features. Color images have been analysed using the Color Difference Histogram (CDH). CDH represents the image using the color difference between two pixels for each color and edge orientation of the image. The RSHD descriptor can be used in problems where image description is needed. Using this descriptor, images similar to the query image are retrieved. The results are evaluated using the precision and recall measures.

Dubey et al. [9] propose the Local neighborhood-based robust Color Occurrence Descriptor (LCOD), which encodes the color information present in the image. The color information is processed by reducing the number of colors to a smaller number of shades. The reduced color shade information of the local neighborhood is used to compute the descriptor. LCOD is constructed by generating a local color occurrence binary pattern for each pixel in the image. The binary patterns of all pixels in the image are aggregated into a single pattern. LCOD is more robust towards scale and rotation. LCOD is not well suited to planar images because the quantization step fails to produce an image with more information.

Rashno et al. [10] retrieve the most similar images by selecting the most relevant features from the complete feature set, using feature selection based on Ant Colony Optimization (ACO). The nodes in ACO represent features and the edges represent the selection of the next feature. The main objective of ACO is to find a path with minimum cost in the graph. A new ant population is generated, a random feature is assigned to each ant, and that feature is marked as visited.
Then the feature with the highest probability among the unvisited nodes is selected and marked as visited. Ant colony optimization removes irrelevant and redundant features, but it is a time consuming process.

Table: 1. Benefits and Limitations of Existing Methods

S.NO. 1
TITLE OF THE PAPER: Content-Based Image Retrieval Using Multiresolution Color and Texture Features. Young Deok Chun, Nam Chul Kim, Ick Hoon Jang
BENEFITS: 1) Extracts and retrieves images with any resolution
LIMITATIONS: 1) Performance degrades for the multiresolution database

S.NO. 2
TITLE OF THE PAPER: Content-based image retrieval using color and texture fused features. Yue, Jun, Zhenbo Li, Lu Liu, and Zetian Fu
BENEFITS: 1) Color histogram is used to describe the distribution of color in an image 2) Global Color Histogram (GCH) is useful for calculating and matching image similarity
LIMITATIONS: 1) Color histogram does not describe the local distribution of the image in the color space or the spatial position of each color 2) GCH calculates only the frequency of each color, not its spatial distribution

S.NO. 3
TITLE OF THE PAPER: An effective image retrieval scheme using color, texture and shape features. Xiang-Yang Wang, Yong-Jian Yu, Hong-Ying Yang
BENEFITS: 1) Dominant Color Descriptor (DCD) represents the greater part of the color with a smaller color distance
LIMITATIONS: 1) DCD causes incorrect ranks for images with similar color distribution

S.NO. 4
TITLE OF THE PAPER: Content-based image retrieval using the combination of the fast wavelet transformation and the colour histogram. Singha, M., Hemachandran, K. and Paul, A.
BENEFITS: 1) Lifting scheme reduces the processing time to retrieve images 2) Wavelets are robust to colour intensity and can capture both texture and shape features
LIMITATIONS: 1) Increasing the number of sub areas by CH leads to an increase in memory use and computational time

S.NO. 5
TITLE OF THE PAPER: Integrating wavelets with clustering and indexing for effective content-based image retrieval. Yildizer E., Balci A. M., Jarada T. N., & Alhajj R
BENEFITS: 1) Indexing schema of the B+ tree reduces the total cost 2) B+ tree also reduces the total cost of the query phase
LIMITATIONS: 1) Image segmentation has not been done for better clustering of images

S.NO. 6
TITLE OF THE PAPER: Modified curvelet transform with vocabulary tree for content based image retrieval. Anil Balaji Gonde, R.P. Maheshwari and R. Balasubramanian
BENEFITS: 1) MCT gives more detailed sub-band images than the curvelet transform and thus improves the results
LIMITATIONS: 1) Ridgelet transform gives good results for straight lines, but images mostly contain curved edges rather than straight lines

S.NO. 7
TITLE OF THE PAPER: A weighted dominant color descriptor for content-based image retrieval. Talib, Ahmed, Massudi Mahmuddin, Husniza Husni, and Loay E. George
BENEFITS: 1) Dominant color descriptor can be used for effective object-based image retrieval 2) Dominant color descriptor method will improve the background effects
LIMITATIONS: 1) Weight Detection is effective for a single object only and is difficult to apply with a complicated background

S.NO. 8
TITLE OF THE PAPER: Rotation and scale invariant hybrid image descriptor and Retrieval. Shiv Ram Dubey, Satish Kumar Singh, Rajat Kumar Singh
BENEFITS: 1) Rotation and Scale invariant Hybrid image Descriptor (RSHD) is more robust towards rotation and scaling 2) RSHD performs well because the neighboring structures are less influenced by the scaling of the whole image
LIMITATIONS: 1) RSHD is 100% slower than Structure Element Histogram (SEH)

S.NO. 9
TITLE OF THE PAPER: Local neighbourhood-based robust colour occurrence descriptor for colour image retrieval. Shiv Ram Dubey, Satish Kumar Singh, Rajat Kumar Singh
BENEFITS: 1) LCOD descriptor is more robust towards geometric and photometric transformations, and its time complexity is O(n)
LIMITATIONS: 1) LCOD descriptor is not well suited to planar images because the quantization step fails to provide images with more information

S.NO. 10
TITLE OF THE PAPER: An efficient content-based image retrieval with ant colony optimization feature selection schema based on wavelet and color features. Rashno A., Sadri S., and Sadeghian Nejad H
BENEFITS: 1) All irrelevant and redundant features are dropped by ant colony optimization
LIMITATIONS: 1) ACO feature selection is a time-consuming task

[4] INFERENCE

A CBIR system involves several steps: selection of the image database, extraction of low level features, similarity measurement and performance evaluation. This paper surveys several methods used to extract the low level features of an image. Extracting a single feature, such as color or texture alone, is not very effective, whereas combining color and texture features, or more than two features, is highly effective. From this perspective, the survey helps in understanding the different methods and the descriptors used to extract detailed information about an image.
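The similarity measurement and performance evaluation steps listed above can be sketched as follows. This is only an illustration under stated assumptions: each image is represented by a hypothetical pair of color and texture feature vectors, the two distances are combined with equal weights of 0.5, and Euclidean distance is used as the measure. None of these choices is prescribed by the surveyed papers, and all function names are made up for the example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_images(query, database, w_color=0.5, w_texture=0.5):
    """Rank database images by a weighted sum of color and texture distances.

    query:    (color_vector, texture_vector) of the query image.
    database: {image_id: (color_vector, texture_vector)}.
    w_color, w_texture: assumed weights for the two feature types.
    Returns image ids ordered from most to least similar.
    """
    def combined_distance(features):
        color, texture = features
        return (w_color * euclidean(query[0], color)
                + w_texture * euclidean(query[1], texture))

    return sorted(database, key=lambda image_id: combined_distance(database[image_id]))


def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved id list against the relevant ids."""
    relevant = set(relevant)
    hits = sum(1 for image_id in retrieved if image_id in relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

In practice the weights would be tuned per database, and the feature vectors would first be normalized so that neither feature type dominates the combined distance.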
REFERENCES:

[1] Chun, Young Deok, Nam Chul Kim, and Ick Hoon Jang: Content-based image retrieval using multiresolution color and texture features, IEEE Transactions on Multimedia, 2008, 10, (6), pp. 1073-1084
[2] Yue, Jun, Zhenbo Li, Lu Liu, and Zetian Fu: Content-based image retrieval using color and texture fused features, Mathematical and Computer Modelling, 2011, 54, (3), pp. 1121-1127
[3] Wang, Xiang-Yang, Yong-Jian Yu, and Hong-Ying Yang: An effective image retrieval scheme using color, texture and shape features, Computer Standards & Interfaces, 2011, 33, (1), pp. 59-68
[4] Singha, M., Hemachandran, K. and Paul, A.: Content-based image retrieval using the combination of the fast wavelet transformation and the colour histogram, IET Image Processing, 2012, 6, (9), pp. 1221-1226
[5] Yildizer, E., Balci, A. M., Jarada, T. N., & Alhajj, R.: Integrating wavelets with clustering and indexing for effective content-based image retrieval, Knowledge-Based Systems, 2012, 31, pp. 55-66
[6] Anil Balaji Gonde, R.P. Maheshwari and R. Balasubramanian: Modified curvelet transform with vocabulary tree for content based image retrieval, Digital Signal Processing, 2013, 23, (1), pp. 142-150
[7] Talib, Ahmed, Massudi Mahmuddin, Husniza Husni, and Loay E. George: A weighted dominant color descriptor for content-based image retrieval, Journal of Visual Communication and Image Representation, 2013, 24, (3), pp. 345-360
[8] Shiv Ram Dubey, Satish Kumar Singh, and Rajat Kumar Singh: Rotation and scale invariant hybrid image descriptor and retrieval, Computers & Electrical Engineering, 2015, 46, pp. 288-302
[9] Dubey, Shiv Ram, Satish Kumar Singh, and Rajat Kumar Singh: Local neighbourhood-based robust colour occurrence descriptor for colour image retrieval, IET Image Processing, 2015, 9, (7), pp. 578-586
[10] Rashno, A., Sadri, S., and SadeghianNejad, H.: An efficient content-based image retrieval with ant colony optimization feature selection schema based on wavelet and color features, International Symposium on Artificial Intelligence and Signal Processing (AISP), Mashhad, Iran, 2015, pp. 59-64