Bipartite Graph Partitioning and Content-based Image Clustering

Guoping Qiu
School of Computer Science, The University of Nottingham
qiu@cs.nott.ac.uk

Abstract

This paper presents a method to model the images and their content descriptors in large image databases using bipartite graphs. A graph partitioning algorithm is then developed to cluster the images and their content description features simultaneously, such that each cluster is automatically associated with the set of features that best describes its visual contents. The association of features with image clusters enables semantic-based search of image databases, and the division of the database into visually aligned hierarchical groups facilitates fast content-based image retrieval.

1 Introduction

Managing large image repositories and making image and video items easily searchable is a very challenging problem. Content-based image indexing and retrieval (CBIR) is a popular approach to this problem [1]. In traditional CBIR techniques, low-level image features, such as colour histograms, texture descriptors and others, are used to represent image contents. Naïve approaches derive image features from the entire image, while more sophisticated approaches use advanced image segmentation techniques to divide the image into meaningful regions or possible objects before deriving low-level descriptors [1]. Once a set of content descriptors has been computed, pattern matching techniques are employed to compare the query example with the images in the database. Each image in the database is then given a score indicating its degree of similarity to the query image.

Although many techniques have been proposed in the literature, many computational and user interface issues in CBIR remain very difficult. Most approaches use more than one type of visual feature to represent the image content. Ultimately, a single score has to be computed from the multi-feature descriptors.
In the case of colour histogram based approaches, for example, the most frequently used method for computing the similarity between images is to compute the $L_1$ or $L_2$ norm of the difference between their colour histograms. Statistics have shown that colour histograms of individual images tend to be sparse, with most of the pixels in any one image concentrated in a few colour bins. This means that certain colours are more important than others for certain images. However, the $L_1$ and $L_2$ norms give no information about which colour bins are more important to a given image. Each colour bin can be regarded as a different feature (colour). Knowing which feature is more important to a given image can be very useful in providing the semantics of the image, which can support more effective searching and browsing of the database. For example, if an image's histogram has larger counts in the bluish colour bins, then the colour theme of the image is bluish. "Bluish" can then be regarded as a semantic term of the image, allowing a type of semantic-based searching and browsing.

This paper presents a novel approach to image database indexing and retrieval. We propose to use a bipartite graph to model the images and their contents simultaneously. Of the two sets of vertices of a bipartite graph, we associate one set with the image content descriptors and the other with the images. The edges linking these two sets of vertices measure the degree of association between the content descriptors and the images. We then introduce graph partitioning to cut the graph such that images and the features that are most strongly associated with each other are clustered into the same groups. Each group formed in this way manifests a certain theme which is strongly linked to its associated features, which in turn provide the content semantics for the images, thus facilitating image indexing and retrieval.
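The limitation of the $L_1$ and $L_2$ histogram distances discussed above can be illustrated with a small sketch. The histograms and the helper names (`l1_distance`, `l2_distance`, `dominant_bins`) are hypothetical and chosen only for illustration; the point is that a distance is a single number, whereas the largest bins say which colours characterise each image.

```python
import math

def l1_distance(h1, h2):
    """L1 norm of the difference between two histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def l2_distance(h1, h2):
    """L2 (Euclidean) norm of the difference between two histograms."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))

def dominant_bins(hist, top=2):
    """Indices of the `top` largest bins, largest first."""
    return sorted(range(len(hist)), key=lambda i: hist[i], reverse=True)[:top]

# Hypothetical normalised 4-bin histograms; bin 0 stands for a "bluish" colour.
query = [0.70, 0.20, 0.05, 0.05]
other = [0.10, 0.15, 0.05, 0.70]

print(l1_distance(query, other))  # one score, no hint of which colours matter
print(l2_distance(query, other))
print(dominant_bins(query))       # bins that carry most of the query's pixels
print(dominant_bins(other))
```

The two distance values summarise how different the images are, while only the dominant bins reveal that the first image is mostly "bluish" and the second is dominated by a different colour.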
Graph partitioning has many well-established theoretical results, which makes the approach very attractive. Recent computer vision literature includes the use of graph partitioning for image segmentation [2, 3]. Outside the vision literature, graph theory has been used quite extensively for document clustering [4, 5]. To the knowledge of the author, this work is the first to employ graph partitioning for the co-clustering of content descriptors and images in an image database application. Although the graph partitioning problem is NP-complete, which would make graph-based methods computationally unattractive, recent results have shown that graph partitions can be computed efficiently using spectral algorithms or eigenvector methods [2-5]. A bipartite graph can also be partitioned by a Hopfield network [6]. In this work, we develop a Hopfield network based solution for partitioning the bipartite graph of an image database.

The organisation of the paper is as follows. Section 2 presents the use of bipartite graphs for modelling image databases. Section 3 presents a Hopfield network based approach to partitioning the image database bipartite graph. Section 4 presents experimental results and Section 5 concludes the paper.

2 Modelling Image Database using Bipartite Graph

2.1 Bipartite Graph and Its Partitioning

A graph $G = (V, E)$ is a set of vertices $V = \{1, 2, \ldots, |V|\}$ and a set of edges $\{k, l\}$, each with edge weight $E_{kl}$. A graph $G = (V, E)$ is a bipartite graph if it consists of two classes of vertices, $X$ and $Y$, with $V = X \cup Y$ and $X \cap Y = \emptyset$, and each edge in $E$ has one endpoint in $X$ and the other endpoint in $Y$. We denote an undirected bipartite graph by the triple $G = (X, Y, W)$, where $X = \{x_1, x_2, \ldots, x_m\}$, $Y = \{y_1, y_2, \ldots, y_n\}$ and $W = \{w_{kl}\}$, with $w_{kl} > 0$ the weight between vertices $k$ and $l$, and $w_{kl} = 0$ if there is no edge between vertices $k$ and $l$.

We consider the case of bipartitioning the bipartite graph, where the vertices of $X$ are partitioned into two subsets, $X = X_1 \cup X_2$, and simultaneously the vertices of $Y$ are partitioned into two subsets, $Y = Y_1 \cup Y_2$. Depending on the application, different graph partitioning criteria can be introduced to partition the vertices. In graph terminology, the partitioning is measured by cuts. The criteria that have been used in the literature include Min Cut, Min Max Cut, Ratio Cut and Normalized Cut [2-5]. For these criteria, spectral graph partitioning algorithms have been developed to compute the cuts efficiently. For the sake of concise presentation, we omit the formal definitions and refer readers to these references for details. Another type of approach to graph partitioning is to use neural network models [6].
In this method, a so-called computational energy function is defined and iterative processes are used to optimise it. In this paper, we develop a bipartite partitioning algorithm for image database modelling based on the latter approach.

2.2 Content-based Image Indexing and Retrieval

Content-based image indexing and retrieval is an important area in which computer vision and image processing play a significant role. Unlike classical vision tasks such as object recognition, however, a CBIR system demands less precision from vision techniques. Although many current vision algorithms are either unstable or restricted to very narrow application domains, they are sufficient to provide useful solutions for a CBIR system. Although various approaches have been proposed to represent image contents, techniques based on first-order statistics, or histograms of low-level features, such as the colour histogram, the colour correlogram and the MPEG-7 colour structure histogram [1], are popular and have been found to be effective.

For presentation purposes, we use a colour histogram based content descriptor in our discussion. Extension to other descriptors is straightforward. Let $H = \{h(c_i)\}$, where $h(c_i)$ is the count of occurrences of the $i$th colour $c_i$. Each bin in the histogram is associated with a certain low-level property, in this case a colour. The value of $h(c_i)$ indicates the association of the colour $c_i$ with the image. Roughly speaking, a larger value indicates that more pixels in the image have colours close to that colour, and that the colour is more important to the visual appearance of the image. In traditional content-based image retrieval, image similarity is based on either the $L_1$ or the $L_2$ norm of histogram differences. The relative importance of each colour bin to an image is not explicitly used. Knowing which low-level features (colours) are more strongly associated with an image can provide useful information for image search.
For example, if the bluish bins of an image's histogram have large counts, then it can be reasoned that the image contains bluish colour scenes. This bluish colour theme can be regarded as a semantic term of that image, which will in turn enable semantic-based image retrieval. How the semantics can be used for more effective image database management is not the topic of the current paper. The contribution of this paper is a method to associate low-level features with images based on a novel graph partitioning approach.

2.3 Simultaneous Modelling of Contents and Images

We wish to model the image content descriptors and the images of an image database simultaneously using a bipartite graph $G = (X, Y, W)$. Assume each image in the database is represented by an $m$-bin colour histogram, and let $H_l = (h_l(c_1), h_l(c_2), \ldots, h_l(c_m))$ denote the histogram of the $l$th image, where $l = 1, 2, \ldots, n$. A straightforward mapping of the colours, the histograms and the images to $G = (X, Y, W)$ is as follows:

$X = \{c_1, c_2, \ldots, c_m\}$, i.e., each vertex in $X$ corresponds to a colour.
$Y = \{y_1, y_2, \ldots, y_n\}$, where $y_l$ represents the $l$th image in the database.
$\{w_{kl}\} = \{h_l(c_k)\}$, $k = 1, 2, \ldots, m$, $l = 1, 2, \ldots, n$, i.e., the weight of the edge $(k, l)$ is the $k$th colour bin count of the $l$th image.

In this way, an image database is completely characterised by the bipartite graph $G = (X, Y, W)$. In the next section, we present an algorithm to partition the graph such that images and their most important features are clustered simultaneously.

3 A Bipartite Graph Partitioning Algorithm for Image Database Clustering

Graph partitioning is in general NP-complete. However, recent research using eigenvector or spectral methods for graph partitioning has demonstrated that the problem can be solved quite efficiently. A key factor that affects the quality of the solution is the cut criterion used. This criterion is inevitably task dependent. We introduce criteria that are suitable for image database applications [2-5].

A cut partitions $X = X_1 \cup X_2$ and $Y = Y_1 \cup Y_2$. Let us assume that $X_1$ is paired with $Y_1$ and $X_2$ is paired with $Y_2$. In our current setting, this means that the colours partitioned into $X_1$ are more strongly associated with the images partitioned into $Y_1$, and the relation between $X_2$ and $Y_2$ is similarly defined. The following objective function defines a reasonable criterion for the partitioning:

$$J_1 = AS(X_1, Y_1) + AS(X_2, Y_2) - AS(X_1, Y_2) - AS(X_2, Y_1) \qquad (1)$$

where $AS(X_1, Y_1) = \sum_{k \in X_1,\, l \in Y_1} w_{kl}$ and the other three terms are similarly defined.

Maximising $J_1$ is equivalent to maximising the first two terms and minimising the last two terms. The meaning of the criterion can be easily understood. Maximising $AS(X_1, Y_1)$ means that the images partitioned into subset $Y_1$ are strongly associated with the colours partitioned into subset $X_1$. Maximising $AS(X_2, Y_2)$ has a similar explanation. Minimising $AS(X_1, Y_2)$ means that the images partitioned into subset $Y_2$ are least associated with the colours partitioned into subset $X_1$. Minimising $AS(X_2, Y_1)$ can be understood similarly. Although (1) is a sensible criterion, it could produce an unbalanced cut, in the sense that any of the subsets $X_1$, $X_2$, $Y_1$ and $Y_2$ could be very small or even empty, because it can easily be shown that the criterion of (1) is equivalent to Min Cut in graph theory. Whether the other criteria, Ratio Cut, Min Max Cut and Normalized Cut, suit our current application needs further study.
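As a minimal sketch of the criterion in (1), the following code evaluates it for a toy weight matrix, once from the subsets directly and once from $\pm 1$ vertex labels. The matrix `W` and the function names are hypothetical, introduced only for illustration. The last line shows the imbalance problem noted above: placing every vertex in one pair of subsets scores higher than the visually sensible split.

```python
def association(W, xs, ys):
    """AS(Xi, Yj): total weight of the edges between vertex sets xs and ys."""
    return sum(W[k][l] for k in xs for l in ys)

def j1_from_sets(W, X1, X2, Y1, Y2):
    """Criterion (1) evaluated from the four subsets."""
    return (association(W, X1, Y1) + association(W, X2, Y2)
            - association(W, X1, Y2) - association(W, X2, Y1))

def j1_from_states(W, x, y):
    """The same criterion using +1/-1 labels: sum over k, l of x_k * y_l * w_kl."""
    return sum(x[k] * y[l] * W[k][l]
               for k in range(len(x)) for l in range(len(y)))

# Hypothetical toy database: 4 colour bins (rows) by 3 images (columns),
# W[k][l] = count of colour k in image l.
W = [[8, 7, 0],
     [6, 9, 1],
     [1, 0, 9],
     [0, 2, 7]]

split = j1_from_sets(W, X1=[0, 1], X2=[2, 3], Y1=[0, 1], Y2=[2])
same = j1_from_states(W, x=[1, 1, -1, -1], y=[1, 1, -1])
degenerate = j1_from_sets(W, X1=[0, 1, 2, 3], X2=[], Y1=[0, 1, 2], Y2=[])

print(split, same)   # both forms give 42
print(degenerate)    # 50: the empty-subset partition scores higher
```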
We here introduce another objective function which will produce a more balanced partition. First let us define a weight for each of the vertices in $X$ and $Y$:

$$w_x(k) = \sum_{l} w_{kl}, \qquad w_y(l) = \sum_{k} w_{kl} \qquad (2)$$

If the histograms are normalised, then $w_y(l) = 1$. We then define the following objective function:

$$J_2 = \lambda_1 J_1 - \lambda_2 \Big( \sum_{k \in X_1} w_x(k) - \sum_{k \in X_2} w_x(k) \Big)^2 - \lambda_3 \Big( \sum_{l \in Y_1} w_y(l) - \sum_{l \in Y_2} w_y(l) \Big)^2 \qquad (3)$$

where $\lambda_1$, $\lambda_2$ and $\lambda_3$ are non-negative weighting constants. The new objective function is based on $J_1$ of (1) and two new terms. The physical meaning of the first new term is that we want to partition the colours in such a way that the total number of pixels, accumulated over the whole database, is split equally between the two groups of colours. If the database is large, this condition makes reasonable statistical sense. The second new term favours the two groups $Y_1$ and $Y_2$ having an equal number of images (if all histograms are normalised). This second new term is somewhat artificial. We will investigate and explain its impact on the partition in the next section.

3.1 A Neural Network based Graph Partitioning Algorithm

In order to partition the graph in such a way that $J_2$ is maximised, we here present a solution based on the Hopfield neural computational model [6]. We assign a binary variable to each vertex and, for convenience, use the following notation: $x_k = +1$ if $x_k \in X_1$, $x_k = -1$ if $x_k \in X_2$, $y_l = +1$ if $y_l \in Y_1$, and $y_l = -1$ if $y_l \in Y_2$, for all $k$, $l$. We now rewrite (3) in terms of $x_k$ and $y_l$:

$$J_2 = \lambda_1 \sum_{k=1}^{m} \sum_{l=1}^{n} x_k y_l w_{kl} - \lambda_2 \Big( \sum_{k=1}^{m} x_k w_x(k) \Big)^2 - \lambda_3 \Big( \sum_{l=1}^{n} y_l w_y(l) \Big)^2 \qquad (4)$$

$J_2$ in (4) can then be optimised by a Hopfield neural model. However, one difficulty with (4) is that the three terms differ in importance, which has to be determined a priori through appropriate weighting constants (this is in general a difficult problem in computer vision and pattern recognition, and no systematic solutions are generally available). To avoid this, we decided to optimise each term in (4) in turn.
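The step from (3) to (4) rests on two identities that follow directly from the $\pm 1$ encoding of the vertex states. They are written out here as a worked check, using only the symbols already defined, with $X_1, X_2$ and $Y_1, Y_2$ denoting the two subsets of colours and images:

```latex
% x_k = +1 on X_1 and -1 on X_2; y_l = +1 on Y_1 and -1 on Y_2.
% The product x_k y_l is +1 when k and l lie in paired subsets, -1 otherwise.
\begin{align*}
\sum_{k=1}^{m}\sum_{l=1}^{n} x_k\, y_l\, w_{kl}
  &= \sum_{k \in X_1,\, l \in Y_1} w_{kl} + \sum_{k \in X_2,\, l \in Y_2} w_{kl}
   - \sum_{k \in X_1,\, l \in Y_2} w_{kl} - \sum_{k \in X_2,\, l \in Y_1} w_{kl} \\
  &= AS(X_1, Y_1) + AS(X_2, Y_2) - AS(X_1, Y_2) - AS(X_2, Y_1), \\[4pt]
\sum_{k=1}^{m} x_k\, w_x(k)
  &= \sum_{k \in X_1} w_x(k) - \sum_{k \in X_2} w_x(k), \\[4pt]
\sum_{l=1}^{n} y_l\, w_y(l)
  &= \sum_{l \in Y_1} w_y(l) - \sum_{l \in Y_2} w_y(l).
\end{align*}
```

Substituting the first identity for the association term and the other two for the balance terms of (3) gives the form in (4).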
In order to prevent the algorithm from getting trapped in local minima, we optimise the terms in a stochastic manner. Using the Hopfield network, we have the algorithm described in pseudocode in the algorithm box. Briefly, the algorithm first assigns random numbers to the states of the vertices. It then picks a random vertex from X and updates its state in such a way that the first term in (4) is increased. It then picks a random vertex from Y and updates it in such a way that the first term in (4) is increased. If the second term is used in the partition, the algorithm picks another random vertex from X, and this time the state of the vertex is updated to increase the second term in (4). If the third term is used in the partitioning, a random vertex from Y is picked and its state is updated such that the third term in (4) is increased. This process is repeated until either a preset maximum number of iterations is reached or further changes in the vertex states do not change the value of the objective function.

Proc Hopfield Network Bipartite Graph Bipartitioning Algorithm
    for k = 0 to m-1
        x[k] = random(-1, +1)        // random number between -1 and +1
    for l = 0 to n-1
        y[l] = random(-1, +1)        // random number between -1 and +1
    while (not converged and iterations < Max Iterations) do
        // Pick a random vertex from X and update its state such that the first term in (4) is increased
        k = random(m)                // a random integer between 0 and m-1
        H[k] = 0
        for l = 0 to n-1
            H[k] += -y[l]*w[k][l]    // w[k][l] = w_kl
        if H[k] < 0 then x[k] = +1 else x[k] = -1
        // Pick a random vertex from Y and update its state such that the first term in (4) is increased
        l = random(n)                // a random integer between 0 and n-1
        H[l] = 0
        for k = 0 to m-1
            H[l] += -x[k]*w[k][l]    // w[k][l] = w_kl
        if H[l] < 0 then y[l] = +1 else y[l] = -1
        // If λ2 ≠ 0, pick a random vertex from X and update its state such that the second term is increased
        k = random(m)                // a random integer between 0 and m-1
        H[k] = 0
        for l = 0 to m-1
            H[k] += x[l]*wx[l]       // wx[l] = w_x(l)
        if H[k] < 0 then x[k] = +1 else x[k] = -1
        // If λ3 ≠ 0, pick a random vertex from Y and update its state such that the third term is increased
        l = random(n)                // a random integer between 0 and n-1
        H[l] = 0
        for k = 0 to n-1
            H[l] += y[k]*wy[k]       // wy[k] = w_y(k)
        if H[l] < 0 then y[l] = +1 else y[l] = -1
    End while
End Proc

4 Experimental Results

We have applied the graph partitioning method to image database clustering. In the implementation, we use a simple frequency-classified colour histogram descriptor to represent the content of each image. Each histogram consists of 256 bins divided into 4 bands, and each band consists of the same 64 colours. Each band collects pixels from image regions of different frequencies. The first 64 bins count the colours that occur in the low-frequency (smooth) regions, the second 64 bins count the colours that occur in the next higher frequency band, and so on.
In this way, the counts in the bins reflect not only the colour but also the texture properties of the image. If most of the counts are concentrated in the first 64 bins, the image is mostly smooth; conversely, if the counts are concentrated in the last 64 bins, the image contains very busy textured surfaces.
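The Hopfield network bipartitioning pseudocode above can be sketched in Python as follows. This is an illustrative reading rather than the author's implementation, and it makes several assumptions: states are initialised to random $\pm 1$ values rather than random numbers in $[-1, +1]$, the loop runs for a fixed number of iterations instead of testing convergence, and the function name, flags and toy matrix are hypothetical.

```python
import random

def bipartition(W, use_x_balance=True, use_y_balance=False,
                max_iters=5000, seed=0):
    """Stochastic Hopfield-style bipartitioning of a bipartite graph.

    W[k][l] is the weight between feature (bin) k and image l.
    Returns (x, y): lists of +1/-1 labels for features and images.
    """
    rng = random.Random(seed)
    m, n = len(W), len(W[0])
    wx = [sum(W[k]) for k in range(m)]                       # w_x(k)
    wy = [sum(W[k][l] for k in range(m)) for l in range(n)]  # w_y(l)
    # Assumption: start from random +1/-1 states.
    x = [rng.choice((-1, 1)) for _ in range(m)]
    y = [rng.choice((-1, 1)) for _ in range(n)]

    for _ in range(max_iters):
        # Association term: align a random feature with its images.
        k = rng.randrange(m)
        h = -sum(y[l] * W[k][l] for l in range(n))
        x[k] = 1 if h < 0 else -1
        # Association term: align a random image with its features.
        l = rng.randrange(n)
        h = -sum(x[i] * W[i][l] for i in range(m))
        y[l] = 1 if h < 0 else -1
        if use_x_balance:
            # Balance term on X: move a random feature to the lighter side.
            k = rng.randrange(m)
            h = sum(x[i] * wx[i] for i in range(m))
            x[k] = 1 if h < 0 else -1
        if use_y_balance:
            # Balance term on Y: move a random image to the lighter side.
            l = rng.randrange(n)
            h = sum(y[j] * wy[j] for j in range(n))
            y[l] = 1 if h < 0 else -1
    return x, y

# Hypothetical toy database: 4 colour bins by 3 images.
W = [[8, 7, 0],
     [6, 9, 1],
     [1, 0, 9],
     [0, 2, 7]]
x, y = bipartition(W, max_iters=200, seed=1)
print(x, y)
```

Because the updates are stochastic and applied one term at a time, the labels returned depend on the seed and on where the iteration stops; applying the function recursively to each resulting group would yield a hierarchy of clusters.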

Figure 1. The first-level groups partitioned by the proposed algorithm. Also shown are the associated histogram bins (bin maps). Each row of a bin map consists of the 64 colours in one frequency band. An empty (white) block indicates that the colour is not associated with that group.

For each image in the database, a 256-bin histogram is constructed. We then apply the bipartite graph partitioning algorithm to cluster the histogram bins and the images simultaneously in a recursive manner. The algorithm is first applied to the entire database to divide it into two groups. The resultant groups are then partitioned again. In each subsequent application of the algorithm, all the bin counts of the histograms of the images in each subgroup are used, instead of only those bins that were associated with the group in the previous round of partitioning. In this way, the images in the database can be put into a binary tree data structure in which each node holds information about the images and their associated colour bins. These colour bins contain information about the nature of the contents of the images held in that node, which can in turn be used for content-based image retrieval, either based on the semantics of those images, e.g., their colour themes and texture roughness, or based on the query-by-example paradigm.

Putting the database in a binary tree will help fast search. One possible search strategy is, using the bin partition at each node, to branch to the next level according to which of the two groups' sets of bins contains more pixels of the query image. This is in contrast to traditional full search, both in its use of the full histogram and in its search of the entire database. Since only one branch is followed at each level, this search method should be faster than a full database search. We also expect it to perform better because the features (colour bins) are used in a selective way.
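The tree search strategy described above can be sketched as follows. The node layout (`ClusterNode`, with one bin set and one child per group) and the toy tree are assumptions made for illustration; the text does not specify a data structure, and ties are broken arbitrarily in favour of the first group.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ClusterNode:
    """One node of the binary cluster tree (hypothetical layout)."""
    images: List[str]                         # images held in this node
    bins_a: Optional[List[int]] = None        # bins associated with child_a
    bins_b: Optional[List[int]] = None        # bins associated with child_b
    child_a: Optional["ClusterNode"] = None
    child_b: Optional["ClusterNode"] = None

def search(node: ClusterNode, query_hist: List[float]) -> List[str]:
    """Descend the tree, following the bin set that holds more query pixels."""
    while node.child_a is not None and node.child_b is not None:
        mass_a = sum(query_hist[b] for b in node.bins_a)
        mass_b = sum(query_hist[b] for b in node.bins_b)
        node = node.child_a if mass_a >= mass_b else node.child_b
    return node.images

# Hypothetical one-level tree over a 4-bin histogram.
root = ClusterNode(
    images=["a", "b", "c"],
    bins_a=[0, 1], bins_b=[2, 3],
    child_a=ClusterNode(images=["a", "b"]),
    child_b=ClusterNode(images=["c"]),
)

print(search(root, [0.5, 0.3, 0.1, 0.1]))  # follows group a
print(search(root, [0.1, 0.1, 0.6, 0.2]))  # follows group b
```

Only the images in the leaf that is reached need to be compared with the query, which is where the saving over a full database search comes from.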
Work is currently underway to evaluate the potential of this technique for content-based image retrieval, based on both a semantic approach and a query-by-example approach. Here we present results on the classification of two colour texture databases using the bipartite graph partitioning algorithm presented in this paper. Figures 1 and 2 show the results of partitioning a 70-image colour texture database by a one-side balanced cut ($\lambda_1 = \lambda_2 = 1$, $\lambda_3 = 0$; note that the actual values of the weighting constants are irrelevant if non-zero). Also shown are the colours associated with each cluster. It is seen that the images within each group are visually similar and that the bin maps have a strong association with the appearance of the images in the groups. We mentioned previously that the third term in (3) and (4) is somewhat artificial, because it forces the partition to put an equal number of images into each group without regard to their actual contents. Figure 3 shows such a partitioning using all three terms in $J_2$. It is clearly seen that the images in each group are somewhat similar; however, each group is less homogeneous and has more visual clutter than cuts made without the third term. Figure 4 shows 8 groups of images partitioned from another colour texture database consisting of 088 images ($\lambda_3 = 0$). It is again seen that each group contains images with similar visual attributes. From these results, it can be said that the algorithm has succeeded in clustering the features and images simultaneously. In our simulations, we set the iteration number to 5000, and it took less than minute on a Pentium 4 PC to cluster the 088-image database into a 3-level binary tree (14 hierarchical clusters).

5 Concluding Remarks

We have presented a method to model images and their content description features in a large image database using bipartite graphs. We have also presented a method to partition the graph in a balanced and meaningful manner. Such a model enables the simultaneous classification of images and their features, which in turn automatically associates images with their most important visual attributes. Such an association can facilitate both semantic-based and query-by-example-based image database search.

References

[1] A. W. M. Smeulders et al., "Content-based image retrieval at the end of the early years", IEEE Trans. PAMI, vol. 22, pp. 1349-1380, 2000
[2] Y. Weiss, "Segmentation using eigenvectors: a unifying view", ICCV 1999
[3] J. Shi and J. Malik, "Normalized cuts and image segmentation", IEEE Trans. PAMI, vol. 22, pp. 888-905, 2000
[4] I. Dhillon, "Co-clustering documents and words using bipartite spectral graph partitioning", ACM Knowledge Discovery and Data Mining, KDD 2001, pp. 269-274
[5] C. Ding et al., "A min-max cut algorithm for graph partitioning and data clustering", IEEE 1st Conference on Data Mining, 2001, pp. 107-114
[6] J. Hertz, A. Krogh and R. G. Palmer, Introduction to the Theory of Neural Computation, Perseus Publishing, 1991

Figure 2. The second-level 4 groups of images and their associated colour bin maps. For an explanation of the colour bin map, see the caption of Figure 1.

Figure 3. Two groups at the second level of partitioning based on all three terms of the objective function. For an explanation of the colour bin map, see the caption of Figure 1.

Figure 4 (part A). 2 of the 8 visual groups at the 3rd level of the partitioning hierarchy (continued).

Figure 4 (part B). 6 of the 8 visual groups at the 3rd level of the partitioning hierarchy.