Accurate Approximation of the Earth Mover's Distance in Linear Time


Jang MH, Kim SW, Faloutsos C et al. Accurate approximation of the earth mover's distance in linear time. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 29(1): Jan DOI /s

Min-Hee Jang 1, Sang-Wook Kim 2, Member, ACM, IEEE, Christos Faloutsos 1, Fellow, ACM, Member, IEEE, and Sunju Park 3

1 Computer Science Department, Carnegie Mellon University, Pittsburgh, PA 15213, U.S.A.
2 Department of Electronics and Computer Engineering, Hanyang University, Seoul, Korea
3 School of Business, Yonsei University, Seoul, Korea

zzmini@cs.cmu.edu; wook@hanyang.ac.kr; christos@cs.cmu.edu; boxenju@yonsei.ac.kr

Received January 16, 2013; revised October 28,

Abstract Color descriptors are among the important features used in content-based image retrieval. The dominant color descriptor (DCD) represents a few perceptually dominant colors in an image through color quantization. For image retrieval based on the DCD, the earth mover's distance (EMD) and the optimal color composition distance were proposed to measure the dissimilarity between two images. Although they provide good retrieval results, both methods are too time-consuming to be used in a large image database. To solve the problem, we propose a new distance function that calculates an approximate earth mover's distance in linear time. To calculate the dissimilarity in linear time, the proposed approach employs a space-filling curve over the multidimensional color space. To improve the accuracy, the proposed approach uses multiple curves and adjusts the color positions. As a result, our approach achieves an order-of-magnitude time improvement while incurring only small errors. We have performed extensive experiments to show the effectiveness and efficiency of the proposed approach. The results reveal that our approach achieves almost the same results as the EMD in linear time.
Keywords earth mover's distance, approximation, content-based image retrieval

1 Introduction

As interest in applications using multimedia content grows rapidly, the importance of image retrieval has also increased [1]. Many studies have worked on techniques for accurately retrieving the image(s) the human user wants [2-4]. Content-based image retrieval (CBIR) is the technique of searching for images in large databases [2,5-6]. The human user defines a query image as an example, and the CBIR system finds images similar to the query image based on the color distribution, shapes, and textures of the image. This paper deals with the color-based image retrieval problem in CBIR.

How to represent the distribution of colors in an image has significant implications for color-based image retrieval. The method traditionally used is the color histogram, which stores the number of pixels for each color within a fixed set of color ranges. Although it can represent the entire range of colors (by counting the number of pixels for each possible color), the color histogram may not be a suitable representation for image retrieval for two reasons [7-8]. First, representing and using all possible colors may slow down the search. Second, since only a fraction of the color ranges in a histogram may be useful, using the entire range of colors in an image does not necessarily improve the accuracy of the search.

Since it is humans who evaluate the accuracy of the retrieved images, we need a color description that is tailored to human perception. Humans, when looking at an image, analyze it based on dominant colors [9]. An image with blue sky and desert, for example, would be analyzed by humans as an image composed of two main colors: blue (the sky) and yellow (the sand). That is, humans perceive the shades of yellow as a single color.
To represent colors the way humans interpret them, the dominant color descriptor (DCD) was proposed for describing color features [4]. By color quantization, the DCD represents an image using a small number of dominant colors, their weights in the image, and several optional parameters. Image retrieval based on the DCD has shown better performance than retrieval based on the color histogram [8].

Regular Paper. This research was supported by the MSIP (Ministry of Science, ICT, and Future Planning), Korea, under the IT-CRSP (IT Convergence Research Support Program) with No. NIPA-2013-H supervised by the NIPA (National IT Industry Promotion Agency) and by the NRF (National Research Foundation) of Korea Grant funded by the Korean Government with No. NRF B This research was also supported by the Basic Science Research Program through the NRF funded by the Ministry of Education, Science and Technology of Korea under Grant Nos. 2012R1A1A and 2013R1A6A3A
Corresponding Author
©2014 Springer Science + Business Media, LLC & Science Press, China

Min-Hee Jang et al.: Approximate EMD in Linear Time

To measure the dissimilarity between a given image and the images in a database, one needs a distance function. The distance function should be able to measure the following: 1) the dissimilarity between the dominant colors in two images, and 2) the dissimilarity between the weights of the dominant colors in two images [9]. As a distance function that satisfies these criteria, one may use the minimum work, that is, the minimum amount of work needed to transform the color and weight distribution of one image into the distribution of another. The earth mover's distance (EMD) computes the optimal minimum work needed to transform one distribution into the other [8]. The work is defined as the product of the weight transferred and the distance over which it is transferred. Since a dominant color in one distribution can be transformed into a color in the other distribution in many different ways, finding the minimum work requires a large amount of processing time [8].

Several efforts have been made to reduce the computational cost. The EMD finds the minimum work by solving the transportation problem [8]. The complexity of the EMD is $O(n^3 \log n)$, where $n$ is the number of dominant colors. The optimal color composition distance (OCCD) is an approximate method that reduces the computational cost of finding the minimum work [9]. The OCCD quantizes the dominant colors into a set of $m$ color units, each with the same weight, and finds an approximate minimum work by solving the minimum-cost graph-matching problem. The time complexity of the OCCD is $O(m^3)$.

Color-based image retrieval in large databases requires a large number of image comparisons. Neither the EMD nor the OCCD is suitable for large databases, because each image comparison has a high computational cost.
Although a multidimensional index such as the M-tree [10] can reduce the number of image comparisons, a significant number of image comparisons would still be required. For example, 500 to 700 comparisons are required for the database of images when the M-tree is used [10]. For large databases, we need a distance function that calculates the minimum work between two images in reasonable time.

In this paper, we propose a distance function that computes an approximate EMD in time linear in the number of dominant colors. The proposed method works as follows. First, it creates a linear order of the dominant colors in each distribution. Second, it calculates the work between the first remaining colors of the two distributions in that order. The second step is repeated until all dominant colors have been considered. The result is the minimum work between the two distributions, computed in linear time. In Section 3, we prove that this process calculates the optimal minimum work between two distributions in the one-dimensional (1D) color space.

To linearize the distribution of dominant colors in the multidimensional space, we use a space-filling curve, which maps multidimensional space onto a line [11]. Among the various space-filling curves, we use the Hilbert curve, one of the most widely used linearization methods [12]. After the linearization using the Hilbert curve, we repeatedly select the two colors nearest to the starting point, one from each distribution, and compute the work between them. While the selection of the two colors to compare is based on the Hilbert order, the distance used in computing the work is the actual distance between them in the multidimensional space. The minimum work between two images is computed in $O(n)$, because the dominant colors are scanned sequentially only once.

However, there are two potential problems in the proposed method.
First, two dominant colors that are close to each other in the multidimensional space may be far apart along the Hilbert curve. As a result, the order of the dominant colors may not correctly represent the actual distance between them, and the minimum work may be calculated inaccurately. To solve the problem, we create multiple Hilbert curves by changing the starting point and the direction of the Hilbert curve. We calculate the minimum work for each of the multiple Hilbert orders and select the minimum value as the minimum work between two images.

Second, even when multiple Hilbert curves are used, there is a chance that the linearized order of dominant colors may not reflect the actual distance in multidimensional space. In particular, if two colors are on opposite sides of the center of the multidimensional space, they are far apart along the Hilbert curve. To solve this problem, we adjust the position of the dominant colors toward one of the vertices of the multidimensional space. The position adjustment may decrease the discrepancy between the actual distance of colors in multidimensional space and their distance along the Hilbert curve. After the position adjustment, we create multiple Hilbert curves, compute the minimum work for each Hilbert order, and select the minimum value as the minimum work between two images. When adjusting the positions of dominant colors, we preserve the distance between the colors, and thus no additional error is introduced by the position adjustment.

J. Comput. Sci. & Technol., Jan. 2014, Vol.29, No.1

The proposed distance function calculates, in linear time, a minimum work that is almost the same as the EMD. To demonstrate the advantages of our method, we conduct an extensive set of experiments. The experimental results show that our method provides a significant performance improvement over the EMD with small errors. It is also observed that the color-based image retrieval results of our method are comparable with those of the EMD.

This paper is organized as follows. Section 2 defines the problem and reviews two existing methods. Section 3 describes our method in detail. Section 4 evaluates the performance of our method. Finally, Section 5 concludes this paper.

2 Computation of Minimum Work

2.1 Problem Definition

Let $P = \{(p_1, w_{p_1}), \ldots, (p_m, w_{p_m})\}$ be the color-weight distribution of an image with $m$ dominant colors, where $p_i$ is a dominant color and $w_{p_i}$ is the weight of $p_i$. Let $Q = \{(q_1, w_{q_1}), \ldots, (q_n, w_{q_n})\}$ be the distribution of another image with $n$ dominant colors. We assume the total weights of $P$ and $Q$ are the same. Let $D = [d_{ij}]$ be the ground distance matrix, where $d_{ij}$ is the distance between $p_i$ and $q_j$. The ground distance measures the dissimilarity between the dominant colors of two images and can be defined by any standard method, such as the Euclidean distance or the $L_1$ distance [8]. Let $F = [f_{ij}]$ be the transferred weight, where $f_{ij}$ is the amount of weight transferred from $p_i$ to $q_j$. The work between two colors, $p_i$ and $q_j$, is defined as the product of the transferred weight $f_{ij}$ and the ground distance $d_{ij}$. The work between distributions $P$ and $Q$ is defined as:

$$\mathrm{WORK}(P, Q, F) = \sum_{i=1}^{m} \sum_{j=1}^{n} d_{ij} f_{ij}.$$

The minimum work between $P$ and $Q$ is defined as the minimum amount of work required to transform distribution $P$ into distribution $Q$.
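As a small illustration of the definition of WORK(P, Q, F), the following Python sketch evaluates the double sum for a given flow. The function name `work`, the dictionary representation of the flow, the sample data, and the choice of the Euclidean ground distance are assumptions made for this example only.

```python
import math


def work(p_colors, q_colors, flow):
    """Return the sum of d_ij * f_ij for a flow given as {(i, j): f_ij}.

    Assumption: colors are coordinate tuples and the ground distance d_ij
    is Euclidean; any other standard ground distance could be substituted.
    """
    total = 0.0
    for (i, j), f_ij in flow.items():
        d_ij = math.dist(p_colors[i], q_colors[j])  # ground distance
        total += d_ij * f_ij
    return total


# Hypothetical data: two colors in P, one in Q, equal total weight (1.0).
p_colors = [(0, 0, 0), (3, 4, 0)]
q_colors = [(0, 0, 0)]
flow = {(0, 0): 0.5, (1, 0): 0.5}
example_work = work(p_colors, q_colors, flow)
```

The minimum work is the smallest value this function can take over all flows that move every unit of weight in P to Q; the sketch only scores one candidate flow and does not search for the best one.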
That is, we want to find the flow, $F = [f_{ij}]$, that minimizes the amount of work required to transform $P$ into $Q$ [8].

We use Fig.1 as an example to describe the computation of the minimum work between $P$ and $Q$. In Fig.1, $P$ represents the distribution of an image composed of three dominant colors, and $Q$ represents the distribution of an image with two dominant colors. When transforming $P$ into $Q$, the dominant colors in $P$ can be transferred into the positions in $Q$ in many different ways, but the amount of work is minimal when each dominant color of $P$ is transferred into the position denoted by the same letter in $Q$. That is, the minimum work is the sum of the work for moving A, B, C, and D.

Fig.1. Computation of the minimum work between two distributions. (a) Color-weight distribution P. (b) Color-weight distribution Q.

The EMD computes the optimal minimum work, but requires a significant amount of processing time [8]. The processing time is as important as the accuracy in dissimilarity measures, especially for large databases. In this paper, we propose a new distance function that calculates an approximate minimum work in linear time.

2.2 Existing Methods

2.2.1 Optimal Color Composition Distance

The optimal color composition distance (OCCD) is an approximate method that reduces the computational cost of the minimum work. To compute the OCCD, dominant colors are quantized into a set of $m$ color units, each with the same weight $w$, where $m \times w = 100$. By solving the minimum-cost graph-matching problem on the undirected graph with $m$ nodes (which represent the $m$ color units), the OCCD finds an approximate minimum work. Given an undirected graph, the minimum-cost matching problem finds the set of disjoint edges with the minimum cost [13]. The time complexity of the OCCD is $O(m^3)$.

The accuracy and time complexity of the OCCD are influenced by the weight of the color unit. The weight of the color unit has a fixed size, while the weight of each dominant color varies. When dominant colors are quantized into a set of color units, therefore, color loss occurs due to the discrepancy between the weight of a dominant color and the weight of the color unit. With larger unit weights, the computation takes less time but the accuracy suffers. When the weight of the color unit is reduced to the minimum unit, no color loss occurs and the OCCD is able to calculate the optimal minimum work, but the complexity is higher. It is recommended to use $m = 20$, $w = 5$ [9].

2.2.2 Earth Mover's Distance

In the EMD, the minimum work is computed based on the solution to the transportation problem. Suppose that suppliers need to provide goods to several consumers. Each supplier has a given amount of goods, each consumer has a given demand, and the cost of transporting a single unit of goods is given for each supplier-consumer pair. The transportation problem is then to find the least expensive flow of goods from suppliers to consumers that satisfies the consumers' demands [14].

The EMD treats the dominant colors of one image as suppliers and those of the other image as consumers. The cost of a supplier-consumer pair is defined as the ground distance between the dominant colors. The minimum work can be calculated by solving the transportation problem. The complexity of the EMD is $O(n^3 \log n)$. Due to their high computational complexity, both the EMD and the OCCD require long retrieval times in content-based image retrieval.

2.3 EMD Variants

For large databases, several efforts have been made to calculate the EMD between two data items in reasonable time. In previous work, the EMD-$L_1$ and the wavelet EMD were proposed. The EMD-$L_1$ reduces the time complexity of the original EMD to $O(n^2)$ [15]. However, it requires the $L_1$ distance instead of the usual Euclidean distance.
The wavelet EMD was proposed as a linear-time approximate EMD computation [16]. The wavelet EMD, however, is applied to the histogram descriptor and not to the DCD. The wavelet EMD requires the dual function of the EMD, which cannot be created in the DCD.

In the computer vision field, several studies on the EMD have been conducted [17-18]. Ling and Okada proposed a new distance function for fast and accurate shape matching [17]. In shape matching, this distance function outperforms the EMD in terms of both accuracy and performance. Similar to the EMD-$L_1$, however, it requires the $L_1$ distance instead of the standard Euclidean distance. Pele and Werman significantly reduced the EMD computation time between two histogram descriptors by using a thresholded ground distance [18]. This method requires more than $O(n^2)$ preprocessing time in each EMD computation when applied to the DCD.

3 Proposed Method

3.1 Basic Strategy

In this section, we propose a new distance function that calculates an approximate EMD in linear time. In the following, we assume that dominant colors are ordered in a 1D color space. The actual algorithm works in multidimensional space.

First, the proposed method selects, from each distribution, the color nearest to the starting point of the 1D space and computes the work between the two colors by multiplying the ground distance between them by the maximum common weight (MCW). The MCW is defined as the smaller of the weights of the two colors compared. Second, the proposed method removes the amount of the MCW from the two colors. By doing so, the weights used to compute the work are removed from the distributions. Third, the above two steps are repeated until all dominant colors of the two images have been used to compute the work. Finally, by adding all the work, the proposed method calculates the minimum work between the two images.

We use Fig.1 in Section 2 as an example. Fig.1 shows the color-weight distributions of two images in the 1D color space.
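The four steps above can be sketched in Python as follows. This is a minimal sketch for the 1D case only; the function name `greedy_work_1d`, the `(position, weight)` tuple layout, the tolerance `eps`, and the sample values are assumptions of this example and are not taken from Fig.1.

```python
def greedy_work_1d(p, q, eps=1e-12):
    """Sweep two 1D color-weight distributions from the starting point.

    p and q are lists of (position, weight) pairs whose total weights are
    assumed to be equal. At each step the maximum common weight (MCW) of
    the two front colors is moved and then removed from both sides.
    """
    p = sorted(p)
    q = sorted(q)
    wp = [w for _, w in p]  # remaining weight of each color in P
    wq = [w for _, w in q]  # remaining weight of each color in Q
    i = j = 0
    total = 0.0
    while i < len(p) and j < len(q):
        mcw = min(wp[i], wq[j])
        total += abs(p[i][0] - q[j][0]) * mcw  # ground distance x MCW
        wp[i] -= mcw
        wq[j] -= mcw
        if wp[i] <= eps:  # color in P exhausted, move to the next one
            i += 1
        if wq[j] <= eps:  # color in Q exhausted, move to the next one
            j += 1
    return total


# Hypothetical distributions (positions and weights chosen for illustration).
P = [(2, 0.3), (3, 0.5), (7, 0.2)]
Q = [(4, 0.6), (8, 0.4)]
example = greedy_work_1d(P, Q)  # 0.3*2 + 0.3*1 + 0.2*5 + 0.2*1
```

Each loop iteration exhausts at least one color, so the sweep itself takes at most one step per dominant color; the sort is included here only so that the sketch accepts unordered input.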
The weight of the first dominant color in $P$ is 0.3, and the weight of the first dominant color in $Q$ is 0.6. The MCW between these two colors is 0.3. After computing the work ($0.3 \times 2$), the MCW is removed from the compared colors. That is, the first dominant color at position 2 in $P$ is removed, and half of the weight of the first dominant color at position 4 in $Q$ is removed. At the next step, the colors nearest to the starting point are at position 3 in $P$ (with a weight of 0.5) and at position 4 in $Q$ (with a remaining weight of 0.3). The MCW between these two colors is 0.3. The work is $0.3 \times 1$. By repeating these steps in Fig.1, the minimum work is computed.

In the 1D space, our method can compute the optimal minimum work between two images in $O(n)$, where $n$ is the number of dominant colors. The following is the proof.

Lemma 1. Given two 1D distributions, $P$ and $Q$, if we repeatedly compute the work of the maximum common weight between the two colors nearest to the starting point, $p_{first}$ and $q_{first}$, we calculate the optimal minimum work between $P$ and $Q$.

Proof. Suppose that, instead of the color-weight distribution, an image is represented as a multiset of $n$ unit-weight colors. That is, if the weight of a color in the image is larger than a single unit, it is divided into multiple units of the same color. Then the two distributions are represented as follows: $P = (p_1, p_2, \ldots, p_n)$, $Q = (q_1, q_2, \ldots, q_n)$, where $p_i$ and $q_j$ represent single-unit colors. If the repeated computation of the work between $p_{first}$ and $q_{first}$ could not calculate the optimal minimum work, $p_1$ should use $q_j$ (where $q_j \neq q_1$) and $q_1$ should use $p_i$ (where $p_i \neq p_1$) to compute the work. In other words, the sum of the ground distances of $(p_1, q_j)$ and $(p_i, q_1)$ would be less than or equal to the sum of the ground distances of $(p_1, q_1)$ and $(p_i, q_j)$. That is,

$$D(p_1, q_1) + D(p_i, q_j) \geq D(p_1, q_j) + D(p_i, q_1). \quad (1)$$

When $p_1 \leq q_1 \leq q_j \leq p_i$, (2) and (3) should be satisfied.

$$D(p_1, q_j) = D(p_1, q_1) + D(q_1, q_j), \quad (2)$$
$$D(p_i, q_1) = D(p_i, q_j) + D(q_1, q_j). \quad (3)$$

Adding (2) and (3) gives $D(p_1, q_j) + D(p_i, q_1) = D(p_1, q_1) + D(p_i, q_j) + 2 D(q_1, q_j)$, so (1) and (2), (3) are contradictory unless $D(q_1, q_j) = 0$, in which case both pairings cost the same. Therefore (4) should be satisfied.

$$D(p_1, q_1) + D(p_i, q_j) \leq D(p_1, q_j) + D(p_i, q_1). \quad (4)$$

As the lemma suggests, we calculate the EMD over multidimensional space in linear time. Approaches based on Lemma 1 have been used in graph matching and histogram matching; however, they have never been applied to calculate the EMD for dominant color descriptors [19-20].

To apply this approach to image retrieval, we need to establish the order of dominant colors by linearizing the three-dimensional (3D) color space. In this paper, we use the Hilbert curve, one of the most widely used space-filling curves [11]. Fig.2 shows several examples of the Hilbert curve in two-dimensional (2D) space. As shown in the figure, the Hilbert curve fills the 2D space with a continuous curve. The Hilbert order of the dominant colors is created by this curve. With finer granularity, multidimensional space is filled with a more complicated curve.

Fig.2. Examples of the Hilbert curve. (a) Level 1. (b) Level 2. (c) Level 3.

Because color spaces such as RGB and HSV are 3D, we use the 3D Hilbert curve [11]. After the linearization, we compute the minimum work by the above-mentioned approach. Note that the order of dominant colors is created based on the Hilbert curve, but the actual ground distance between the dominant colors in 3D space is used when computing the work.

3.2 Improvements

Note that our method may fail to calculate the optimal minimum work when the Hilbert order does not reflect the true proximity of the dominant colors in multidimensional space. That is, two dominant colors that are neighbors in multidimensional space might be far apart along the Hilbert curve. Fig.3 shows such an example. Suppose $P$ and $Q$ are composed of two colors each, $(p_1, p_2)$ and $(q_1, q_2)$, respectively, and the weights of all colors are the same. The optimal minimum work between $P$ and $Q$ is the sum of the work between $p_1$ and $q_1$ and the work between $p_2$ and $q_2$, since the sum of $D(p_1, q_1)$ and $D(p_2, q_2)$ is shorter than the sum of $D(p_1, q_2)$ and $D(p_2, q_1)$. The Hilbert order of dominant colors in Fig.3(a), however, is $(p_1, p_2, q_2, q_1)$ because $D(p_1, q_1)$ is measured as longer than $D(p_1, q_2)$ along the Hilbert curve.

Fig.3. Hilbert curves in 2D space. (a) Hilbert curve from the upper-left corner. (b) Hilbert curve from the upper-right corner.

This example shows a case where the Hilbert order fails to reflect the proximity of the dominant colors in multidimensional space. The order $(p_1, p_2, q_2, q_1)$ in Fig.3(a) does not reflect the proximity of $(p_1, q_1)$ and $(p_2, q_2)$. As a result, the minimum work would be calculated as the sum of the work between $p_1$ and $q_2$ and the work between $p_2$ and $q_1$, and our method cannot calculate the optimal minimum work between $P$ and $Q$. This problem depends on the position of the dominant colors.
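To make this failure mode concrete, the sketch below maps 2D grid cells to Hilbert positions with the widely used bitwise conversion and shows two touching cells whose Hilbert positions are far apart, while a differently oriented curve keeps them adjacent. It is a simplified 2D illustration only: the paper works with a 3D curve, and the names `hilbert_index_2d` and `transposed_index` and the 4 x 4 grid are assumptions of this example.

```python
def hilbert_index_2d(n, x, y):
    """Position of cell (x, y) along a Hilbert curve over an n x n grid.

    n must be a power of two. At each scale s the quadrant containing the
    cell contributes s*s times its rank, and the coordinates are rotated
    or flipped so that the sub-curve has the standard orientation.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def transposed_index(n, x, y):
    """A differently oriented curve, obtained by swapping the two axes."""
    return hilbert_index_2d(n, y, x)


n = 4
left, right = (1, 0), (2, 0)  # neighboring cells across the vertical midline
gap = abs(hilbert_index_2d(n, *left) - hilbert_index_2d(n, *right))
gap_transposed = abs(transposed_index(n, *left) - transposed_index(n, *right))
```

On this grid the two neighboring cells are 13 positions apart on the first curve but consecutive on the transposed one, which is the kind of discrepancy that using more than one curve is meant to reduce.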
If we change the starting point and the direction to create another Hilbert curve as in Fig.3(b), the order $(q_1, p_1, p_2, q_2)$ is created. In this order, the optimal minimum work between $P$ and $Q$ is calculated as the sum of the work between $p_1$ and $q_1$ and the work between $p_2$ and $q_2$. With the Hilbert curve in Fig.3(b), we are able to extract an order that reflects the proximity in multidimensional space.

Depending on the position of the dominant colors in multidimensional space, the Hilbert order may or may not reflect the proximity between them. Therefore, we propose to use multiple Hilbert curves. In $k$-dimensional space, $(2^k k!)/2$ Hilbert curves can be created. This is proved in the following.

Lemma 2. In a linear $k$-dimensional space with finite granularity where each cell is a grid cell of the same size, $(2^k k!)/2$ Hilbert curves can be created that satisfy the following conditions:
1) The Hilbert curve has to start from a vertex of the $k$-dimensional space.
2) The Hilbert curve should be contained in the $k$-dimensional space.
3) The Hilbert curve has to fill all grid cells in the $k$-dimensional space.

Proof. In $k$-dimensional space, $2^k$ vertices exist. From each vertex, $k!$ Hilbert curves with different directions can be created. Consequently, $2^k k!$ Hilbert curves can be created in $k$-dimensional space. Among those, half are exact duplicates in which the starting and ending vertices of the Hilbert curve are flipped. When the duplicates are excluded, $(2^k k!)/2$ Hilbert curves can be created in $k$-dimensional space.

The conditions in Lemma 2 are required for the following reasons. First, the Hilbert curve needs to reflect the true proximity in multidimensional space. If the Hilbert curve starts from the middle of the multidimensional space, or extends beyond it, it is difficult to reflect the proximity. Second, the Hilbert curve has to fill all the cells in multidimensional space because the position of the dominant colors varies according to the given images. In the 3D color space, 24 Hilbert curves can be created.
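The count in Lemma 2 can be checked with a few lines of Python. The sketch also enumerates the axis permutations and per-axis reflections of a grid, which is one hypothetical way to derive differently oriented orders from a single base curve; the text does not state that the orders are generated this way, and the names `count_hilbert_curves`, `orientations`, and `apply_orientation` are assumptions of this example.

```python
from itertools import permutations, product
from math import factorial


def count_hilbert_curves(k):
    """Number of distinct curves claimed by Lemma 2: (2^k * k!) / 2."""
    return (2 ** k) * factorial(k) // 2


def orientations(k):
    """All (axis permutation, per-axis reflection) pairs: 2^k * k! in total."""
    return [
        (perm, flips)
        for perm in permutations(range(k))
        for flips in product((False, True), repeat=k)
    ]


def apply_orientation(point, orientation, n):
    """Permute the axes of a grid point and reflect the flagged axes.

    n is the number of cells per axis, so reflecting maps c to n - 1 - c.
    """
    perm, flips = orientation
    return tuple(
        (n - 1 - point[axis]) if flip else point[axis]
        for axis, flip in zip(perm, flips)
    )


curves_3d = count_hilbert_curves(3)       # 24 curves in the 3D color space
orientations_3d = len(orientations(3))    # 48 before removing reversed duplicates
```

Transforming a color's grid coordinates with `apply_orientation` before looking up its Hilbert position would yield one order per orientation under this assumption.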
We compute 24 minimum works based on the 24 orders and select the minimum value among them as the minimum work between two images.

Even when multiple Hilbert curves are used, the orders of the dominant colors might still be misarranged. Fig.4 shows an example where the proximity is not correctly reflected in the multiple Hilbert orders. In Fig.4, the 2D space is divided into quadrants, and the dominant colors of $P$ and $Q$ are on opposite sides of the boundary between quadrants I and II. In Fig.4(a), the work is calculated between $p_1$ and $q_2$ and between $p_2$ and $q_1$, which results in an inaccurate minimum work. This is because, in most cases, two colors that are on opposite sides of a quadrant boundary are far apart along the Hilbert curve. In Figs. 4(a)-4(d), all Hilbert curves calculate an inaccurate minimum work. That is, even if multiple Hilbert curves are used, the distance in the Hilbert order between colors near the boundary may not reflect their true proximity.

Fig.4. Example where multiple Hilbert curves fail to reflect the proximity of dominant colors. (a) Heading south from the upper-left corner. (b) Heading west from the upper-left corner. (c) Heading east from the bottom-left corner. (d) Heading north from the bottom-right corner.

To solve the problem, we adjust the position of the dominant colors in multidimensional space. When calculating the minimum work between two images, what we need is the distance between dominant colors, not their precise position. As long as we preserve the distance between the colors, the minimum work between two images is not affected even when we adjust their position.

In this paper, we propose to adjust the position of the dominant colors using the minimum bounding box (MBB) shift method. The MBB is defined as the smallest hyper-rectangle that contains all dominant colors in $k$-dimensional space. The position adjustment method finds the MBB and shifts it to one of the vertices of the multidimensional space.
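A minimal Python sketch of the MBB shift toward the origin is given below. The function name `mbb_shift`, the tuple representation of colors, and the sample RGB values are assumptions of this example.

```python
import math


def mbb_shift(p_colors, q_colors):
    """Translate all dominant colors so that their MBB touches the origin.

    The lower corner of the minimum bounding box over both images is
    subtracted from every color. A pure translation leaves all pairwise
    distances unchanged.
    """
    points = list(p_colors) + list(q_colors)
    k = len(points[0])
    lower = [min(pt[axis] for pt in points) for axis in range(k)]

    def shift(pt):
        return tuple(c - lo for c, lo in zip(pt, lower))

    return [shift(pt) for pt in p_colors], [shift(pt) for pt in q_colors]


# Hypothetical RGB dominant colors of two images.
p_colors = [(120, 130, 40), (125, 140, 60)]
q_colors = [(130, 128, 50)]
p_shifted, q_shifted = mbb_shift(p_colors, q_colors)

# The ground distance between a pair of colors is the same before and after.
before = math.dist(p_colors[0], q_colors[0])
after = math.dist(p_shifted[0], q_shifted[0])
```

Shifting to a different vertex of the color space would subtract from the upper corner on the chosen axes instead; the sketch covers only the origin case.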
In Fig.5(a), the Hilbert order, $(q_2, q_1, p_1, p_2)$, does not reflect the proximity of the dominant colors. Fig.5(b) shows the position adjustment. As shown in the figure, the MBB shift puts the dominant colors into one quadrant (and makes them avoid the quadrant boundary). After the MBB shift, the Hilbert curve is able to find the correct order $(q_1, p_1, q_2, p_2)$, which reflects the proximity in multidimensional space.

Fig.5. Example of MBB shift. (a) Original color distribution. (b) Color distribution after the MBB shift.

Note that the difference between the distance in multidimensional space and the distance along the Hilbert curve increases as the granularity of the Hilbert curve becomes finer. In Fig.2(a) in Subsection 3.1, for example, the distance along the Hilbert curve between the upper-left cell and the upper-right cell is the same as the distance in multidimensional space. In Figs. 2(b) and 2(c), on the other hand, the difference between the two distances grows with finer granularity. Also note that the Hilbert curve in a quadrant of a particular level is the same as the Hilbert curve of the previous level, as shown in Fig.2 in Subsection 3.1. These two facts indicate that if the MBB fits into a single quadrant (i.e., all dominant colors can be represented in a single quadrant), the level of the Hilbert curve decreases and the difference between the two distances may decrease, so that the proximity is reflected correctly in the Hilbert curve in many cases. Even when the MBB is bigger than a quadrant, we can adjust the position of the MBB and obtain different Hilbert orders, which makes it possible to reflect the proximity of colors.

In this paper, we shift the MBB to the origin (e.g., RGB(0, 0, 0)). The Hilbert order may differ depending on which vertex the MBB is shifted to, and thus the approximate EMDs computed with different origins may differ. Since multiple Hilbert orders are created after the MBB shift, however, the experimental results (not reported in this paper) show that the impact of different origins is negligible.

After the position adjustment, we create $(2^k k!)/2$ Hilbert orders. Consequently, $2^k k!$ orders (48 orders in 3D space) are created, including the orders created using the original position of the dominant colors.
We compute the approximate EMD of each order, and select the minimum value as the minimum work.

Because generating $2^k k!$ orders requires considerable time, we use mappingtable(origincoord[h][i][j], HilberPosition) for creating multiple Hilbert curves. The mappingtable receives as input the position of a dominant color in $k$-dimensional space as an array (origincoord[h][i][j]) and maps it to its position in the Hilbert curve. Using the mapping table, the orders of the dominant colors are created with short memory access time. Also, if we use parallel processing, the overhead of generating $2^k k!$ orders is small.

Fig.6 shows the algorithm of the proposed method. Suppose that $2^k k!$ orders have already been created by mappingtable. The algorithm CalculateAEMD calculates the approximate EMD of each order. The dominant color nearest to the starting point from each distribution is selected (line 1), and the maximum common weight (MCW) between the two colors is computed (line 3). If $w_{q_j}$ is the MCW, the work is computed by multiplying the distance between the two colors by $w_{q_j}$ (line 4). The weights used to compute the work are removed from the two distributions (lines 5 and 6). The algorithm selects the next dominant color $q_{j+1}$ because $w_{q_j} = 0$ (line 7). If $w_{p_i}$ is the MCW in line 3, the algorithm proceeds through line 9 to line 13. The algorithm repeats the above process until all dominant colors of the two distributions are visited. By adding all the work, the algorithm calculates the approximate EMD between $P$ and $Q$.

Algorithm. CalculateAEMD(P, Q, AEMD)
Input: P: distribution P = {(p_1, w_p1, o_p1), ..., (p_n, w_pn, o_pn)}
       Q: distribution Q = {(q_1, w_q1, o_q1), ..., (q_m, w_qm, o_qm)}
       /* p_i and q_i are dominant colors, w_i is the weight of the dominant color, o_i is the order in the Hilbert curve */
Output: AEMD: the approximate EMD between P and Q
1: IF o_pi = o_pfirst and o_qj = o_qfirst; i = 1
2:   WHILE (i < n + m)
3:     IF w_pi > w_qj
4:       AEMD += |p_i q_j| × w_qj
5:       w_pi = w_pi − w_qj
6:       w_qj = 0
7:       q_j++
8:     ELSE
9:       AEMD += |p_i q_j| × w_pi
10:      w_qj = w_qj − w_pi
11:      w_pi = 0
12:      p_i++
13:    END IF
14:    i++
15:  END WHILE
16: END IF
17: RETURN AEMD
END CalculateAEMD

Fig.6. CalculateAEMD algorithm.

Fig.7 shows the algorithm PositionAdjustment. This algorithm shifts the position of the dominant colors to the origin, top_left. In lines 1-3, the algorithm searches for top_left_x, top_left_y, and top_left_z, the x, y, and z coordinates of the dominant colors nearest to the origin of the x, y, and z axes, respectively. In lines 4-6, the dominant colors are shifted to the origin by subtracting top_left_x, top_left_y, and top_left_z from the x, y, and z coordinates of each dominant color. In line 7, the Hilbert order is created. In line 9, the algorithm calculates the approximate EMD by calling the algorithm CalculateAEMD.

Algorithm. PositionAdjustment(P, Q, AEMD)
Input: P: distribution P = {(p_1, w_p1), ..., (p_n, w_pn)}
       Q: distribution Q = {(q_1, w_q1), ..., (q_m, w_qm)}
Output: AEMD: the approximate EMD between P and Q
1: FOR each dominant color in P and Q
2:   Search(top_left_x, top_left_y, top_left_z)
3: END FOR
4: FOR each dominant color in P and Q
5:   p_i = (p_ix − top_left_x, p_iy − top_left_y, p_iz − top_left_z)
6:   q_j = (q_jx − top_left_x, q_jy − top_left_y, q_jz − top_left_z)
7:   Map p_i and q_j to positions o_pi and o_qj in Hilbert curve
8: END FOR
9: CALL CalculateAEMD()
10: RETURN AEMD
END PositionAdjustment

Fig.7. PositionAdjustment algorithm.

4 Performance Evaluation

4.1 Experimental Setup

We perform our color-based image retrieval on the SIMPLIcity data [21] and the shader data [22]. Some examples of these two kinds of data are shown in Fig.8 and Fig.9, respectively.

Fig.8. Examples of SIMPLIcity images.

The SIMPLIcity data set is composed of color images in 10 categories. Each category has 100 images. The size of the data set is sufficient, since our experiments are designed to test the time complexity of the EMD for each image comparison (and not necessarily the scalability). As mentioned in Section 1, 500 to 700 comparisons are needed for the image database when the M-tree is used. Therefore, even if a larger data set were used for scalability, we would have a similar number of image comparisons and would have drawn conclusions similar to the following experimental results. To extract the dominant colors of the SIMPLIcity images, we apply the MPEG-7 dominant color descriptor, which extracts a maximum of 8 dominant colors from an image [23].
For the SIMPLIcity, the MPEG7 dominant color descriptor extracts the dominant colors in the RGB space. The shader data, used in computer graphics, represent the features of a 3D object, such as colors, textures, and materials[22]. A shader has exact color information as attributes, and these attributes represent several tens to hundreds of colors in a single shader[22]. The colors of a shader can be represented by a small number of dominant colors and their weights, similar to the DCD. The number of dominant colors extracted from a shader can be adjusted. In our experiments, the shader data are used to measure how the processing time and the accuracy of the proposed method depend on the number of dominant colors. We extract 5, 10, 15, 20, 25, and 30 dominant colors from the shader data.

Fig.9. Examples of shader data.

We perform extensive experiments on the SIMPLIcity and the shader data to compare the processing time and the accuracy of the proposed method with those of the EMD, OCCD(20), and OCCD(100). The number in parentheses indicates the number of color units in OCCD. The 20 color units of OCCD(20) is the number recommended in [9]. When the number of color units is fixed at 20, the accuracy of the OCCD may suffer as the number of dominant colors increases. To measure the accuracy of the OCCD method fairly, we use both OCCD(20) and OCCD(100) in our experiments.

As an accuracy measure, we use the mean absolute percentage error (MAPE)[24]:

$$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \frac{|a_i - f_i|}{a_i}$$

In the MAPE, $a_i$ is the EMD value, and $f_i$ is the approximate value computed by the proposed method or the OCCD method. The MAPE represents the accuracy as a percentage. As a processing time measure, we use the time elapsed during query processing, excluding the time for file access. For accurate experimental results, we use the average of 100 queries, where each query calculates the dissimilarity of a query image to images. The experiments are performed on a 2.80 GHz PC running Windows 7 with 2 GB of main memory.

4.2 Experimental Results

In the first set of experiments, we evaluate the preprocessing time of the proposed method. The preprocessing time in Table 1 is the time needed to create the mapping table. The preprocessing time increases as the granularity of the multidimensional space becomes finer, because the size of the mapping table grows with finer granularity. Note that, once created, the mapping table can be used without additional overhead, since preprocessing can be done offline. Also, the mapping table maps the dominant colors directly to their order on the Hilbert curve, working as a hash table. Therefore, the performance of the proposed method does not degrade with finer granularity.

Table 1. Preprocessing Time of the Proposed Method (elapsed time in seconds)

In the second set of experiments, we measure the processing time and the accuracy of the proposed method while varying the number of orders used. We use the SIMPLIcity images. Table 2 shows the experimental results. In 3D color space, at most 24 Hilbert curves can be created, by creating 3 Hilbert curves from each vertex of the space. In Table 2, MHC(8) uses 8 orders by creating one Hilbert curve from each vertex. Likewise, MHC(16) and MHC(24) use 16 and 24 orders by creating 2 and 3 Hilbert curves from each vertex, respectively. After creating 24 orders from the original positions of the dominant colors, MHC(shift) shifts the positions of the dominant colors to the origin and uses the additional 24 orders created by position adjustment.

Table 2. Performance of the Proposed Method While Varying the Number of Orders Used. Columns: MAPE (%), Time (s). Rows: MHC(1), MHC(8), MHC(16), MHC(24), MHC(shift).

As shown in Table 2, when a small number of Hilbert curves are used, the accuracy of the proposed method is low. The accuracy improves as the number of Hilbert orders used increases. This is because whether a Hilbert order reflects the true proximity of the multidimensional space depends on the positions of the dominant colors.
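The idea of precomputing mapping tables for several curves can be illustrated with a small sketch. It is a simplification under stated assumptions, not the paper's mapping table: it uses a 2D grid instead of the 3D color space, the well-known iterative conversion from cell coordinates to a Hilbert position, and it obtains the additional curves from the axis symmetries of the grid. The names `hilbert_index_2d` and `build_mapping_tables` are hypothetical.

```python
from itertools import product


def hilbert_index_2d(n, x, y):
    """Position of cell (x, y) on a 2D Hilbert curve over an n x n grid.

    n must be a power of two.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the next level sees the base orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


def build_mapping_tables(n):
    """One dict per axis symmetry of the grid: (x, y) -> curve position."""
    tables = []
    for swap, flip_x, flip_y in product((False, True), repeat=3):
        table = {}
        for x, y in product(range(n), repeat=2):
            u, v = (y, x) if swap else (x, y)
            if flip_x:
                u = n - 1 - u
            if flip_y:
                v = n - 1 - v
            table[(x, y)] = hilbert_index_2d(n, u, v)
        tables.append(table)
    return tables


# Built once offline; afterwards each lookup is a single dictionary access.
tables = build_mapping_tables(4)
positions_of_cell = [t[(2, 1)] for t in tables]
```

Because every table is filled ahead of time, query processing only performs dictionary lookups, which matches the hash-table behavior described for the mapping table. A finer grid enlarges the tables and the time to build them, but not the cost of a lookup.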
Therefore, the true proximity is more likely to be captured as the number of Hilbert orders used increases. The accuracy does not improve much from MHC(16) to MHC(24) for two reasons. First, the problem near the boundary of the quadrant is more prominent with finer granularity. Second, most of the 24 orders have similar sequences, since they are created from the same positions of the dominant colors. These problems can be solved by the position adjustment: the accuracy of MHC(shift) is better than that of MHC(24). The elapsed time of the proposed method, although it increases linearly with the number of orders, is very short. The image retrieval can be done in a short time even when MHC(shift) is used. In the following, we use MHC(shift).

In the third set of experiments, we compare the processing time and the accuracy of the proposed method with those of the EMD, OCCD(20), and OCCD(100). In these experiments, the images from SIMPLIcity are used. As shown in Table 3, the proposed method shows significantly better performance with small errors. The proposed method exhibits a MAPE of 4.2% while shortening the image-retrieval time by a factor of 34 compared with the EMD. The proposed method calculates the approximate minimum work with an acceptable error in a very short time. OCCD(20) has a MAPE of 4.3%, similar to that of MHC(shift), but its processing time is similar to that of the EMD. The error in the OCCD occurs because the weight of a color unit differs from the weight of a dominant color. Also, the OCCD requires considerable processing time to find the optimal pairing of color units that yields the approximate minimum work. As the weight of the color unit gets smaller, the accuracy increases, but the performance suffers. OCCD(100), although its MAPE is only 1.2%, has an elapsed time (see Table 3) that is too long for any practical image retrieval or search application.

Table 3. Performance Comparison. Columns: MAPE (%), Time (s). Rows: MHC(shift), EMD, OCCD(20), OCCD(100).

Fig.10 shows the correlation between the processing time and the accuracy. We calculate the dissimilarities of 30 randomly generated query images to 100 SIMPLIcity images. Each point in Fig.10 is the result of one query. The x-axis is the elapsed time to calculate the dissimilarity with 100 images, and the y-axis is the average of the 100 accuracy results. As shown in Fig.10, the accuracy of the EMD is 100% at all points, and the elapsed time of the EMD is between 67 ms and 91 ms. The accuracy of OCCD(20) is similar to that of MHC(shift), but its elapsed time is similar to that of the EMD. OCCD(100) shows a high accuracy between 98.1% and 99.6%, but its elapsed time is much longer (see Fig.10). In comparison, MHC(shift) shows both good performance and high accuracy. The accuracy of MHC(shift) is 95.8% to 98.3%, and its elapsed time is between 1.7 ms and 3.1 ms.

Fig.10. Time vs accuracy.

In Fig.11, we examine the processing time and the accuracy with respect to the number of dominant colors. Fig.11(a) shows the accuracy. The EMD is excluded from Fig.11(a), because it always finds the optimal minimum work. The accuracy of all methods decreases as the number of dominant colors increases. In particular, the accuracy of OCCD(20) decreases rapidly. In OCCD, the weight of each dominant color gets smaller as the number of dominant colors increases, but the weight of the color unit is fixed. Therefore, the error in OCCD(20) increases rapidly with the number of dominant colors. OCCD(100) has a small error compared with OCCD(20), but its error also increases as the number of dominant colors increases. The error of MHC(shift) increases with the number of dominant colors as well, but the rate of error growth decreases. The MAPEs of the proposed method are 0.9%, 4.4%, and 5.3% when 5, 10, and 15 dominant colors are used, respectively. When 20, 25, and 30 dominant colors are used, however, the MAPEs of the proposed method are 5.9%, 6.4%, and 6.8%. That is, the rate of error growth decreases. The results indicate that the proposed method reflects the proximity in multidimensional space well even when the number of dominant colors increases.

Fig.11(b) shows the retrieval time with respect to the number of dominant colors. The processing time of the EMD increases rapidly with the number of dominant colors.
The elapsed times of the EMD when 5 and 30 dominant colors are used are shown in Fig.11(b). Both OCCD(20) and OCCD(100) show a constant elapsed time, because they use the same number of color units in all conditions. The elapsed time of MHC(shift) increases with the number of dominant colors. The rate of increase, however, decreases, similar to its rate of error growth. The elapsed time of MHC(shift) reaches 0.09 seconds when 30 dominant colors are used. Compared with the EMD, MHC(shift) improves the search time by a factor of about 31 to 65.

Fig.11. Experimental results with respect to the varying number of dominant colors. (a) Accuracy. (b) Elapsed time.

In the next set of experiments, we examine the precision and the recall of the search on SIMPLIcity to measure the impact of errors on real-world image retrieval. We use the average of 100 queries, where each query retrieves 10, 20, 30, 50, and 100 images, and calculate the precision and the recall. As shown in Table 4, all methods exhibit similar results. This is because errors are rare in the approximate methods. Both the proposed method and the OCCD show almost the same results as the EMD. Fig.12 shows the precision versus recall curves of all methods. As shown in the figure, all methods have a similar precision versus recall tendency.
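For reference, precision and recall for one query at a given number of retrieved images can be computed as in the sketch below. It assumes that an image counts as relevant when it belongs to the same category as the query; the function name `precision_recall_at_k` and the identifiers in the example are hypothetical and are not results from the experiments.

```python
def precision_recall_at_k(ranked_ids, relevant_ids, k):
    """Precision and recall over the first k retrieved images.

    ranked_ids: image identifiers sorted by increasing dissimilarity.
    relevant_ids: set of identifiers treated as correct answers (assumed to be
        the images in the same category as the query).
    """
    top_k = ranked_ids[:k]
    hits = sum(1 for image_id in top_k if image_id in relevant_ids)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant_ids) if relevant_ids else 0.0
    return precision, recall


# Made-up identifiers for illustration only.
ranked = [1, 2, 3, 4, 5]
relevant = {1, 2, 4, 9}
precision, recall = precision_recall_at_k(ranked, relevant, 5)
```

Averaging these two values over all queries, separately for each number of retrieved images, gives one precision and one recall figure per method and retrieval size.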

Table 4. Precision and Recall. Columns: number of retrieved images. Rows: precision (%) and recall (%) for MHC(shift), EMD, OCCD(20), and OCCD(100).

Fig.12. Precision vs recall distribution.

Fig.13 shows an example of the search results of each method. Each figure displays the eight images closest to the query image 302.jpg, retrieved from the SIMPLIcity. The number below each image name is the minimum work computed by the method. The images are ordered by similarity. All methods retrieve bus images similar to the query image. MHC(shift) retrieves 55.jpg, which differs from the query image. The EMD retrieves 269.jpg, which differs from the query image. OCCD(20) retrieves 269.jpg incorrectly, and OCCD(100) retrieves 269.jpg and 55.jpg incorrectly. This is because the compositions of dominant colors in 269.jpg and 55.jpg are similar to that of the query image. The overall tendency of the retrieved images is similar in all methods.

5 Conclusions and Future Work

In this paper, we proposed a new distance function that calculates the approximate EMD in linear time. For image retrieval based on the dominant color descriptor, the earth mover's distance and the optimal color composition distance have been used as dissimilarity measures. They provide high-quality results in image retrieval but incur a high computational cost. Our distance function linearizes the multidimensional color space using the Hilbert curve and thereby calculates the approximate minimum work in linear time. To improve the accuracy, the distance function adopts two improvements: the use of multiple Hilbert orders and the position adjustment. The extensive experiments reveal that our approach achieves an order-of-magnitude improvement in processing time and that the rate of error growth decreases as the number of dominant colors increases.
The retrieval results of the proposed method on real-world images are almost the same as those of the EMD, which indicates that the error of the proposed method does not influence the quality of image retrieval results. Given its O(n) processing time and small errors, our distance function can be used effectively in image retrieval for large databases.

As future work, we would like to extend our work to indexing in a high-dimensional space. Recently, several methods have been proposed to reduce the number of EMD comparisons in a high-dimensional space[25-27]. Since the post-processing in these methods needs a considerable number of EMD comparisons, they require a long retrieval time. To solve this problem, we are developing an indexing scheme based on our approximate EMD in a high-dimensional space.

Fig.13. Retrieval results of each method. (a) MHC(shift). (b) EMD. (c) OCCD(20). (d) OCCD(100).

References

[1] Lang H, Wang B, Jones G et al. Query performance prediction for information retrieval based on covering topic score. Journal of Computer Science and Technology, 2008, 23(4).
[2] Liu Y, Zhang D, Lu G et al. A survey of content-based image retrieval with high-level semantics. Pattern Recognition, 2007, 40(1).
[3] Yan R, Hsu W. Recent developments in content-based and concept-based image/video retrieval. In Proc. ACM Int. Conf. Multimedia, Oct. 2008.

[4] Schwartz W, Kembhavi A, Harwood D et al. Human detection using partial least squares analysis. In Proc. the 12th IEEE Int. Conf. Computer Vision, Sept. 29-Oct. 2, 2009.
[5] Thang N, Rasheed T, Lee Y et al. Content-based facial image retrieval using constrained independent component analysis. Information Sciences, 2011, 181(15).
[6] Ajorloo H, Lakdashti A. HBIR: Hypercube-based image retrieval. Journal of Computer Science and Technology, 2012, 27(1).
[7] van de Weijer J, Schmid C. Coloring local feature extraction. In Proc. the 9th European Conference on Computer Vision, May 2006.
[8] Rubner Y, Tomasi C, Guibas L. The earth mover's distance as a metric for image retrieval. International Journal of Computer Vision, 2000, 40(2).
[9] Mojsilovic A, Hu J, Soljanin E. Extraction of perceptually important colors and similarity measurement for image matching, retrieval, and analysis. IEEE Transactions on Image Processing, 2002, 11(11).
[10] Ciaccia P, Patella M, Zezula P. M-tree: An efficient access method for similarity search in metric spaces. In Proc. the 23rd Int. Conf. Very Large Data Bases, Aug. 1997.
[11] Moon B, Jagadish H, Faloutsos C et al. Analysis of the clustering properties of the Hilbert space-filling curve. IEEE Transactions on Knowledge and Data Engineering, 2001, 13(1).
[12] Jagadish H. Analysis of the Hilbert curve for representing two-dimensional space. Information Processing Letters, 1997, 62(1).
[13] Lawler E. Combinatorial Optimization: Networks and Matroids. New York: Courier Dover Publications.
[14] Rachev S. The Monge-Kantorovich mass transference problem and its stochastic applications. Theory of Probability and Its Applications, 1984, 29(4).
[15] Ling H, Okada K. An efficient earth mover's distance algorithm for robust histogram comparison. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006, 29(5).
[16] Shirdhonkar S, Jacobs D. Approximate earth mover's distance in linear time. In Proc. IEEE Int. Conf. Computer Vision and Pattern Recognition, June.
[17] Ling H, Okada K. Diffusion distance for histogram comparison. In Proc. the IEEE Conf. Computer Vision and Pattern Recognition, June 2006.
[18] Pele O, Werman M. Fast and robust earth mover's distances. In Proc. the 12th IEEE Conf. Computer Vision, Sept. 29-Oct. 2, 2009.
[19] Werman M, Peleg S, Melter R et al. Bipartite graph matching for points on a line or a circle. Journal of Algorithms, 1986, 7(2).
[20] Werman M, Peleg S, Rosenfeld A. A distance metric for multidimensional histograms. Computer Vision, Graphics, and Image Processing, 1985, 32(3).
[21] Wang J, Li J, Wiederhold G. SIMPLIcity: Semantics-sensitive integrated matching for picture libraries. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2001, 23(9).
[22] AutoDesk Maya Press. Learning Autodesk Maya 2009: The Special Effects Handbook. Wiley.
[23] Martínez J. MPEG-7 overview. standards/mpeg-7, Nov.
[24] Shepperd M, Schofield C. Estimating software project effort using analogies. IEEE Transactions on Software Engineering, 1997, 23(11).
[25] Assent I, Wenning A, Seidl T. Approximation techniques for indexing the earth mover's distance in multimedia databases. In Proc. IEEE Int. Conf. Data Engineering, Apr.
[26] Wichterich M, Assent I, Kranen P et al. Efficient EMD-based similarity search in multimedia databases via flexible dimensionality reduction. In Proc. ACM SIGMOD Int. Conf. Management of Data, June 2008.
[27] Xu J, Zhang Z, Tung A et al. Efficient and effective similarity search over probabilistic data based on earth mover's distance. Proc. of the VLDB Endowment, 2010, 3(1/2).

Min-Hee Jang earned the M.S. and Ph.D. degrees in electronics and computer engineering from Hanyang University, Korea, in 2006 and 2012, respectively. In 2012, he worked with the Embedded Software Research Center, Hanyang University, as a senior engineer.
In 2013, he joined Carnegie Mellon University (CMU), Pittsburgh, USA, where he is currently a postdoctoral researcher at the Computer Science Department. He also visited the Computer Science Department of CMU as a visiting researcher. His research interests include data mining, multimedia information retrieval, and social network analysis.

Sang-Wook Kim received the B.S. degree in computer engineering from Seoul National University, Korea, in 1989, and earned the M.S. and Ph.D. degrees in computer science from Korea Advanced Institute of Science and Technology (KAIST), in 1991 and 1994, respectively. From 1994 to 1995, he worked with the Information and Electronics Research Center, KAIST, as a senior engineer. From 1995 to 2003, he served as an associate professor of the Division of Computer, Information, and Communications Engineering at Kangwon National University, Korea. In 2003, he joined Hanyang University, Seoul, where he is currently a professor at the Department of Electronics and Computer Engineering. From 2009 to 2010, he visited the Computer Science Department at Carnegie Mellon University as a visiting professor. From 1999 to 2000, he worked with the IBM T. J. Watson Research Center, USA, as a postdoctoral researcher. He also visited the Computer Science Department of Stanford University as a visiting researcher. He is an author of over 100 papers in refereed international journals and international conference proceedings. His research interests include databases, data mining, multimedia information retrieval, social network analysis, recommendation, and web data analysis. He is a member of the ACM and the IEEE.

Christos Faloutsos is a professor at Carnegie Mellon University. He has received the Presidential Young Investigator Award from the National Science Foundation (1989), the Research Contributions Award in ICDM 2006, the SIGKDD Innovations Award (2010), 19 best paper awards (including two test of time awards), and 4 teaching awards. He is an ACM Fellow. He has served as a member of the executive committee of SIGKDD; he has published over 200 refereed articles, 11 book chapters, and one monograph. He holds 6 patents and has given over 30 tutorials and over 10 invited distinguished lectures. His research interests include data mining for graphs and streams, fractals, database performance, and indexing for multimedia and bio-informatics data.

Sunju Park is a professor of operations, decisions and information at the School of Business at Yonsei University, Seoul. Her education includes B.S. and M.S. degrees in computer engineering from Seoul National University and a Ph.D. degree in computer science and engineering from the University of Michigan. Before joining Yonsei University, she served on the faculties of management science and information systems at Rutgers University. Her research interests include analysis of online social networks, multiagent systems for online businesses, and pricing of network resources. Her publications have appeared in Computers and Industrial Engineering, Electronic Commerce Research, Transportation Research, IIE Transactions, European Journal of Operational Research, Journal of Artificial Intelligence Research, Interfaces, Autonomous Agents and Multi-Agent Systems, and other leading journals.


More information

MATRIX BASED INDEXING TECHNIQUE FOR VIDEO DATA

MATRIX BASED INDEXING TECHNIQUE FOR VIDEO DATA Journal of Computer Science, 9 (5): 534-542, 2013 ISSN 1549-3636 2013 doi:10.3844/jcssp.2013.534.542 Published Online 9 (5) 2013 (http://www.thescipub.com/jcs.toc) MATRIX BASED INDEXING TECHNIQUE FOR VIDEO

More information

Quaternion-based color difference measure for removing impulse noise in color images

Quaternion-based color difference measure for removing impulse noise in color images 2014 International Conference on Informative and Cybernetics for Computational Social Systems (ICCSS) Quaternion-based color difference measure for removing impulse noise in color images Lunbo Chen, Yicong

More information

Hierarchical GEMINI - Hierarchical linear subspace indexing method

Hierarchical GEMINI - Hierarchical linear subspace indexing method Hierarchical GEMINI - Hierarchical linear subspace indexing method GEneric Multimedia INdexIng DB in feature space Range Query Linear subspace sequence method DB in subspace Generic constraints Computing

More information

A Miniature-Based Image Retrieval System

A Miniature-Based Image Retrieval System A Miniature-Based Image Retrieval System Md. Saiful Islam 1 and Md. Haider Ali 2 Institute of Information Technology 1, Dept. of Computer Science and Engineering 2, University of Dhaka 1, 2, Dhaka-1000,

More information

Accelerating Pattern Matching or HowMuchCanYouSlide?

Accelerating Pattern Matching or HowMuchCanYouSlide? Accelerating Pattern Matching or HowMuchCanYouSlide? Ofir Pele and Michael Werman School of Computer Science and Engineering The Hebrew University of Jerusalem {ofirpele,werman}@cs.huji.ac.il Abstract.

More information

A Comparison of SIFT, PCA-SIFT and SURF

A Comparison of SIFT, PCA-SIFT and SURF A Comparison of SIFT, PCA-SIFT and SURF Luo Juan Computer Graphics Lab, Chonbuk National University, Jeonju 561-756, South Korea qiuhehappy@hotmail.com Oubong Gwun Computer Graphics Lab, Chonbuk National

More information

A Novel Extreme Point Selection Algorithm in SIFT

A Novel Extreme Point Selection Algorithm in SIFT A Novel Extreme Point Selection Algorithm in SIFT Ding Zuchun School of Electronic and Communication, South China University of Technolog Guangzhou, China zucding@gmail.com Abstract. This paper proposes

More information

Express Letters. A Simple and Efficient Search Algorithm for Block-Matching Motion Estimation. Jianhua Lu and Ming L. Liou

Express Letters. A Simple and Efficient Search Algorithm for Block-Matching Motion Estimation. Jianhua Lu and Ming L. Liou IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 7, NO. 2, APRIL 1997 429 Express Letters A Simple and Efficient Search Algorithm for Block-Matching Motion Estimation Jianhua Lu and

More information

Face Alignment Under Various Poses and Expressions

Face Alignment Under Various Poses and Expressions Face Alignment Under Various Poses and Expressions Shengjun Xin and Haizhou Ai Computer Science and Technology Department, Tsinghua University, Beijing 100084, China ahz@mail.tsinghua.edu.cn Abstract.

More information

CONTENT BASED IMAGE RETRIEVAL SYSTEM USING IMAGE CLASSIFICATION

CONTENT BASED IMAGE RETRIEVAL SYSTEM USING IMAGE CLASSIFICATION International Journal of Research and Reviews in Applied Sciences And Engineering (IJRRASE) Vol 8. No.1 2016 Pp.58-62 gopalax Journals, Singapore available at : www.ijcns.com ISSN: 2231-0061 CONTENT BASED

More information

On Biased Reservoir Sampling in the Presence of Stream Evolution

On Biased Reservoir Sampling in the Presence of Stream Evolution Charu C. Aggarwal T J Watson Research Center IBM Corporation Hawthorne, NY USA On Biased Reservoir Sampling in the Presence of Stream Evolution VLDB Conference, Seoul, South Korea, 2006 Synopsis Construction

More information

SQL-to-MapReduce Translation for Efficient OLAP Query Processing

SQL-to-MapReduce Translation for Efficient OLAP Query Processing , pp.61-70 http://dx.doi.org/10.14257/ijdta.2017.10.6.05 SQL-to-MapReduce Translation for Efficient OLAP Query Processing with MapReduce Hyeon Gyu Kim Department of Computer Engineering, Sahmyook University,

More information

Traffic Signs Recognition using HP and HOG Descriptors Combined to MLP and SVM Classifiers

Traffic Signs Recognition using HP and HOG Descriptors Combined to MLP and SVM Classifiers Traffic Signs Recognition using HP and HOG Descriptors Combined to MLP and SVM Classifiers A. Salhi, B. Minaoui, M. Fakir, H. Chakib, H. Grimech Faculty of science and Technology Sultan Moulay Slimane

More information

Robust PDF Table Locator

Robust PDF Table Locator Robust PDF Table Locator December 17, 2016 1 Introduction Data scientists rely on an abundance of tabular data stored in easy-to-machine-read formats like.csv files. Unfortunately, most government records

More information

Semantics-based Image Retrieval by Region Saliency

Semantics-based Image Retrieval by Region Saliency Semantics-based Image Retrieval by Region Saliency Wei Wang, Yuqing Song and Aidong Zhang Department of Computer Science and Engineering, State University of New York at Buffalo, Buffalo, NY 14260, USA

More information

Page Mapping Scheme to Support Secure File Deletion for NANDbased Block Devices

Page Mapping Scheme to Support Secure File Deletion for NANDbased Block Devices Page Mapping Scheme to Support Secure File Deletion for NANDbased Block Devices Ilhoon Shin Seoul National University of Science & Technology ilhoon.shin@snut.ac.kr Abstract As the amount of digitized

More information

Moving Object Segmentation Method Based on Motion Information Classification by X-means and Spatial Region Segmentation

Moving Object Segmentation Method Based on Motion Information Classification by X-means and Spatial Region Segmentation IJCSNS International Journal of Computer Science and Network Security, VOL.13 No.11, November 2013 1 Moving Object Segmentation Method Based on Motion Information Classification by X-means and Spatial

More information

Non-Linear Masking based Contrast Enhancement via Illumination Estimation

Non-Linear Masking based Contrast Enhancement via Illumination Estimation https://doi.org/10.2352/issn.2470-1173.2018.13.ipas-389 2018, Society for Imaging Science and Technology Non-Linear Masking based Contrast Enhancement via Illumination Estimation Soonyoung Hong, Minsub

More information

Automatic Categorization of Image Regions using Dominant Color based Vector Quantization

Automatic Categorization of Image Regions using Dominant Color based Vector Quantization Automatic Categorization of Image Regions using Dominant Color based Vector Quantization Md Monirul Islam, Dengsheng Zhang, Guojun Lu Gippsland School of Information Technology, Monash University Churchill

More information

A new predictive image compression scheme using histogram analysis and pattern matching

A new predictive image compression scheme using histogram analysis and pattern matching University of Wollongong Research Online University of Wollongong in Dubai - Papers University of Wollongong in Dubai 00 A new predictive image compression scheme using histogram analysis and pattern matching

More information

Top-k Keyword Search Over Graphs Based On Backward Search

Top-k Keyword Search Over Graphs Based On Backward Search Top-k Keyword Search Over Graphs Based On Backward Search Jia-Hui Zeng, Jiu-Ming Huang, Shu-Qiang Yang 1College of Computer National University of Defense Technology, Changsha, China 2College of Computer

More information

The Dynamic Hungarian Algorithm for the Assignment Problem with Changing Costs

The Dynamic Hungarian Algorithm for the Assignment Problem with Changing Costs The Dynamic Hungarian Algorithm for the Assignment Problem with Changing Costs G. Ayorkor Mills-Tettey Anthony Stentz M. Bernardine Dias CMU-RI-TR-07-7 July 007 Robotics Institute Carnegie Mellon University

More information

Algorithm That Mimics Human Perceptual Grouping of Dot Patterns

Algorithm That Mimics Human Perceptual Grouping of Dot Patterns Algorithm That Mimics Human Perceptual Grouping of Dot Patterns G. Papari and N. Petkov Institute of Mathematics and Computing Science, University of Groningen, P.O.Box 800, 9700 AV Groningen, The Netherlands

More information

A Simple and Strong Algorithm for Reconfiguration of Hexagonal Metamorphic Robots

A Simple and Strong Algorithm for Reconfiguration of Hexagonal Metamorphic Robots 50 A Simple and Strong Algorithm for Reconfiguration of Hexagonal Metamorphic Robots KwangEui Lee Department of Multimedia Engineering, Dongeui University, Busan, Korea Summary In this paper, we propose

More information

6. Concluding Remarks

6. Concluding Remarks [8] K. J. Supowit, The relative neighborhood graph with an application to minimum spanning trees, Tech. Rept., Department of Computer Science, University of Illinois, Urbana-Champaign, August 1980, also

More information

Outlier Detection Using Unsupervised and Semi-Supervised Technique on High Dimensional Data

Outlier Detection Using Unsupervised and Semi-Supervised Technique on High Dimensional Data Outlier Detection Using Unsupervised and Semi-Supervised Technique on High Dimensional Data Ms. Gayatri Attarde 1, Prof. Aarti Deshpande 2 M. E Student, Department of Computer Engineering, GHRCCEM, University

More information

Redundancy Resolution by Minimization of Joint Disturbance Torque for Independent Joint Controlled Kinematically Redundant Manipulators

Redundancy Resolution by Minimization of Joint Disturbance Torque for Independent Joint Controlled Kinematically Redundant Manipulators 56 ICASE :The Institute ofcontrol,automation and Systems Engineering,KOREA Vol.,No.1,March,000 Redundancy Resolution by Minimization of Joint Disturbance Torque for Independent Joint Controlled Kinematically

More information

A Novel Image Retrieval Method Using Segmentation and Color Moments

A Novel Image Retrieval Method Using Segmentation and Color Moments A Novel Image Retrieval Method Using Segmentation and Color Moments T.V. Saikrishna 1, Dr.A.Yesubabu 2, Dr.A.Anandarao 3, T.Sudha Rani 4 1 Assoc. Professor, Computer Science Department, QIS College of

More information

QUERY REGION DETERMINATION BASED ON REGION IMPORTANCE INDEX AND RELATIVE POSITION FOR REGION-BASED IMAGE RETRIEVAL

QUERY REGION DETERMINATION BASED ON REGION IMPORTANCE INDEX AND RELATIVE POSITION FOR REGION-BASED IMAGE RETRIEVAL International Journal of Technology (2016) 4: 654-662 ISSN 2086-9614 IJTech 2016 QUERY REGION DETERMINATION BASED ON REGION IMPORTANCE INDEX AND RELATIVE POSITION FOR REGION-BASED IMAGE RETRIEVAL Pasnur

More information

An Improved KNN Classification Algorithm based on Sampling

An Improved KNN Classification Algorithm based on Sampling International Conference on Advances in Materials, Machinery, Electrical Engineering (AMMEE 017) An Improved KNN Classification Algorithm based on Sampling Zhiwei Cheng1, a, Caisen Chen1, b, Xuehuan Qiu1,

More information

COLOR FEATURE EXTRACTION FOR CBIR

COLOR FEATURE EXTRACTION FOR CBIR COLOR FEATURE EXTRACTION FOR CBIR Dr. H.B.KEKRE Senior Professor, Computer Engineering Department, Mukesh Patel School of Technology Management and Engineering, SVKM s NMIMS UniversityMumbai-56, INDIA

More information

Distributed k-nn Query Processing for Location Services

Distributed k-nn Query Processing for Location Services Distributed k-nn Query Processing for Location Services Jonghyeong Han 1, Joonwoo Lee 1, Seungyong Park 1, Jaeil Hwang 1, and Yunmook Nah 1 1 Department of Electronics and Computer Engineering, Dankook

More information

Color-Texture Segmentation of Medical Images Based on Local Contrast Information

Color-Texture Segmentation of Medical Images Based on Local Contrast Information Color-Texture Segmentation of Medical Images Based on Local Contrast Information Yu-Chou Chang Department of ECEn, Brigham Young University, Provo, Utah, 84602 USA ycchang@et.byu.edu Dah-Jye Lee Department

More information

Clustering For Similarity Search And Privacyguaranteed Publishing Of Hi-Dimensional Data Ashwini.R #1, K.Praveen *2, R.V.

Clustering For Similarity Search And Privacyguaranteed Publishing Of Hi-Dimensional Data Ashwini.R #1, K.Praveen *2, R.V. Clustering For Similarity Search And Privacyguaranteed Publishing Of Hi-Dimensional Data Ashwini.R #1, K.Praveen *2, R.V.Krishnaiah *3 #1 M.Tech, Computer Science Engineering, DRKIST, Hyderabad, Andhra

More information

Robust Shape Retrieval Using Maximum Likelihood Theory

Robust Shape Retrieval Using Maximum Likelihood Theory Robust Shape Retrieval Using Maximum Likelihood Theory Naif Alajlan 1, Paul Fieguth 2, and Mohamed Kamel 1 1 PAMI Lab, E & CE Dept., UW, Waterloo, ON, N2L 3G1, Canada. naif, mkamel@pami.uwaterloo.ca 2

More information

An Introduction to Content Based Image Retrieval

An Introduction to Content Based Image Retrieval CHAPTER -1 An Introduction to Content Based Image Retrieval 1.1 Introduction With the advancement in internet and multimedia technologies, a huge amount of multimedia data in the form of audio, video and

More information

Content Based Image Retrieval: Survey and Comparison between RGB and HSV model

Content Based Image Retrieval: Survey and Comparison between RGB and HSV model Content Based Image Retrieval: Survey and Comparison between RGB and HSV model Simardeep Kaur 1 and Dr. Vijay Kumar Banga 2 AMRITSAR COLLEGE OF ENGG & TECHNOLOGY, Amritsar, India Abstract Content based

More information

Data Filtering Using Reverse Dominance Relation in Reverse Skyline Query

Data Filtering Using Reverse Dominance Relation in Reverse Skyline Query Data Filtering Using Reverse Dominance Relation in Reverse Skyline Query Jongwan Kim Smith Liberal Arts College, Sahmyook University, 815 Hwarang-ro, Nowon-gu, Seoul, 01795, Korea. ORCID: 0000-0003-4716-8380

More information

1314. Estimation of mode shapes expanded from incomplete measurements

1314. Estimation of mode shapes expanded from incomplete measurements 34. Estimation of mode shapes expanded from incomplete measurements Sang-Kyu Rim, Hee-Chang Eun, Eun-Taik Lee 3 Department of Architectural Engineering, Kangwon National University, Samcheok, Korea Corresponding

More information

An Edge-Based Algorithm for Spatial Query Processing in Real-Life Road Networks

An Edge-Based Algorithm for Spatial Query Processing in Real-Life Road Networks An Edge-Based Algorithm for Spatial Query Processing in Real-Life Road Networks Ye-In Chang, Meng-Hsuan Tsai, and Xu-Lun Wu Abstract Due to wireless communication technologies, positioning technologies,

More information

III. VERVIEW OF THE METHODS

III. VERVIEW OF THE METHODS An Analytical Study of SIFT and SURF in Image Registration Vivek Kumar Gupta, Kanchan Cecil Department of Electronics & Telecommunication, Jabalpur engineering college, Jabalpur, India comparing the distance

More information

Image Classification Using Wavelet Coefficients in Low-pass Bands

Image Classification Using Wavelet Coefficients in Low-pass Bands Proceedings of International Joint Conference on Neural Networks, Orlando, Florida, USA, August -7, 007 Image Classification Using Wavelet Coefficients in Low-pass Bands Weibao Zou, Member, IEEE, and Yan

More information

Tracking and Recognizing People in Colour using the Earth Mover s Distance

Tracking and Recognizing People in Colour using the Earth Mover s Distance Tracking and Recognizing People in Colour using the Earth Mover s Distance DANIEL WOJTASZEK, ROBERT LAGANIÈRE S.I.T.E. University of Ottawa, Ottawa, Ontario, Canada K1N 6N5 danielw@site.uottawa.ca, laganier@site.uottawa.ca

More information

An Efficient Semantic Image Retrieval based on Color and Texture Features and Data Mining Techniques

An Efficient Semantic Image Retrieval based on Color and Texture Features and Data Mining Techniques An Efficient Semantic Image Retrieval based on Color and Texture Features and Data Mining Techniques Doaa M. Alebiary Department of computer Science, Faculty of computers and informatics Benha University

More information

Content Based Image Retrieval Using Color Quantizes, EDBTC and LBP Features

Content Based Image Retrieval Using Color Quantizes, EDBTC and LBP Features Content Based Image Retrieval Using Color Quantizes, EDBTC and LBP Features 1 Kum Sharanamma, 2 Krishnapriya Sharma 1,2 SIR MVIT Abstract- To describe the image features the Local binary pattern (LBP)

More information

Study on the Signboard Region Detection in Natural Image

Study on the Signboard Region Detection in Natural Image , pp.179-184 http://dx.doi.org/10.14257/astl.2016.140.34 Study on the Signboard Region Detection in Natural Image Daeyeong Lim 1, Youngbaik Kim 2, Incheol Park 1, Jihoon seung 1, Kilto Chong 1,* 1 1567

More information

A Two-phase Distributed Training Algorithm for Linear SVM in WSN

A Two-phase Distributed Training Algorithm for Linear SVM in WSN Proceedings of the World Congress on Electrical Engineering and Computer Systems and Science (EECSS 015) Barcelona, Spain July 13-14, 015 Paper o. 30 A wo-phase Distributed raining Algorithm for Linear

More information

Open Access The Three-dimensional Coding Based on the Cone for XML Under Weaving Multi-documents

Open Access The Three-dimensional Coding Based on the Cone for XML Under Weaving Multi-documents Send Orders for Reprints to reprints@benthamscience.ae 676 The Open Automation and Control Systems Journal, 2014, 6, 676-683 Open Access The Three-dimensional Coding Based on the Cone for XML Under Weaving

More information

An Efficient Methodology for Image Rich Information Retrieval

An Efficient Methodology for Image Rich Information Retrieval An Efficient Methodology for Image Rich Information Retrieval 56 Ashwini Jaid, 2 Komal Savant, 3 Sonali Varma, 4 Pushpa Jat, 5 Prof. Sushama Shinde,2,3,4 Computer Department, Siddhant College of Engineering,

More information

A reversible data hiding based on adaptive prediction technique and histogram shifting

A reversible data hiding based on adaptive prediction technique and histogram shifting A reversible data hiding based on adaptive prediction technique and histogram shifting Rui Liu, Rongrong Ni, Yao Zhao Institute of Information Science Beijing Jiaotong University E-mail: rrni@bjtu.edu.cn

More information

FIR Filter Synthesis Algorithms for Minimizing the Delay and the Number of Adders

FIR Filter Synthesis Algorithms for Minimizing the Delay and the Number of Adders 770 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: ANALOG AND DIGITAL SIGNAL PROCESSING, VOL. 48, NO. 8, AUGUST 2001 FIR Filter Synthesis Algorithms for Minimizing the Delay and the Number of Adders Hyeong-Ju

More information

BAG-OF-VISUAL WORDS (BoVW) MODEL BASED APPROACH FOR CONTENT BASED IMAGE RETRIEVAL (CBIR) IN PEER TO PEER (P2P)NETWORKS.

BAG-OF-VISUAL WORDS (BoVW) MODEL BASED APPROACH FOR CONTENT BASED IMAGE RETRIEVAL (CBIR) IN PEER TO PEER (P2P)NETWORKS. BAG-OF-VISUAL WORDS (BoVW) MODEL BASED APPROACH FOR CONTENT BASED IMAGE RETRIEVAL (CBIR) IN PEER TO PEER (P2P)NETWORKS. 1 R.Lavanya, 2 E.Lavanya, 1 PG Scholar, Dept Of Computer Science Engineering,Mailam

More information

Data Clustering Hierarchical Clustering, Density based clustering Grid based clustering

Data Clustering Hierarchical Clustering, Density based clustering Grid based clustering Data Clustering Hierarchical Clustering, Density based clustering Grid based clustering Team 2 Prof. Anita Wasilewska CSE 634 Data Mining All Sources Used for the Presentation Olson CF. Parallel algorithms

More information

A New Feature Local Binary Patterns (FLBP) Method

A New Feature Local Binary Patterns (FLBP) Method A New Feature Local Binary Patterns (FLBP) Method Jiayu Gu and Chengjun Liu The Department of Computer Science, New Jersey Institute of Technology, Newark, NJ 07102, USA Abstract - This paper presents

More information

Distance-based Outlier Detection: Consolidation and Renewed Bearing

Distance-based Outlier Detection: Consolidation and Renewed Bearing Distance-based Outlier Detection: Consolidation and Renewed Bearing Gustavo. H. Orair, Carlos H. C. Teixeira, Wagner Meira Jr., Ye Wang, Srinivasan Parthasarathy September 15, 2010 Table of contents Introduction

More information

Improving 3D Shape Retrieval Methods based on Bag-of Feature Approach by using Local Codebooks

Improving 3D Shape Retrieval Methods based on Bag-of Feature Approach by using Local Codebooks Improving 3D Shape Retrieval Methods based on Bag-of Feature Approach by using Local Codebooks El Wardani Dadi 1,*, El Mostafa Daoudi 1 and Claude Tadonki 2 1 University Mohammed First, Faculty of Sciences,

More information

A Modified Mean Shift Algorithm for Visual Object Tracking

A Modified Mean Shift Algorithm for Visual Object Tracking A Modified Mean Shift Algorithm for Visual Object Tracking Shu-Wei Chou 1, Chaur-Heh Hsieh 2, Bor-Jiunn Hwang 3, Hown-Wen Chen 4 Department of Computer and Communication Engineering, Ming-Chuan University,

More information

SA-IFIM: Incrementally Mining Frequent Itemsets in Update Distorted Databases

SA-IFIM: Incrementally Mining Frequent Itemsets in Update Distorted Databases SA-IFIM: Incrementally Mining Frequent Itemsets in Update Distorted Databases Jinlong Wang, Congfu Xu, Hongwei Dan, and Yunhe Pan Institute of Artificial Intelligence, Zhejiang University Hangzhou, 310027,

More information

Online algorithms for clustering problems

Online algorithms for clustering problems University of Szeged Department of Computer Algorithms and Artificial Intelligence Online algorithms for clustering problems Summary of the Ph.D. thesis by Gabriella Divéki Supervisor Dr. Csanád Imreh

More information

Using the Kolmogorov-Smirnov Test for Image Segmentation

Using the Kolmogorov-Smirnov Test for Image Segmentation Using the Kolmogorov-Smirnov Test for Image Segmentation Yong Jae Lee CS395T Computational Statistics Final Project Report May 6th, 2009 I. INTRODUCTION Image segmentation is a fundamental task in computer

More information

Quadrant-Based MBR-Tree Indexing Technique for Range Query Over HBase

Quadrant-Based MBR-Tree Indexing Technique for Range Query Over HBase Quadrant-Based MBR-Tree Indexing Technique for Range Query Over HBase Bumjoon Jo and Sungwon Jung (&) Department of Computer Science and Engineering, Sogang University, 35 Baekbeom-ro, Mapo-gu, Seoul 04107,

More information

Schema Matching with Inter-Attribute Dependencies Using VF2 Approach

Schema Matching with Inter-Attribute Dependencies Using VF2 Approach International Journal of Emerging Engineering Research and Technology Volume 2, Issue 3, June 2014, PP 14-20 ISSN 2349-4395 (Print) & ISSN 2349-4409 (Online) Schema Matching with Inter-Attribute Dependencies

More information

Content-based Image Retrieval (CBIR)

Content-based Image Retrieval (CBIR) Content-based Image Retrieval (CBIR) Content-based Image Retrieval (CBIR) Searching a large database for images that match a query: What kinds of databases? What kinds of queries? What constitutes a match?

More information

Automatic Ranking of Images on the Web

Automatic Ranking of Images on the Web Automatic Ranking of Images on the Web HangHang Zhang Electrical Engineering Department Stanford University hhzhang@stanford.edu Zixuan Wang Electrical Engineering Department Stanford University zxwang@stanford.edu

More information

Efficient Content Based Image Retrieval System with Metadata Processing

Efficient Content Based Image Retrieval System with Metadata Processing IJIRST International Journal for Innovative Research in Science & Technology Volume 1 Issue 10 March 2015 ISSN (online): 2349-6010 Efficient Content Based Image Retrieval System with Metadata Processing

More information

A New Approach for Shape Dissimilarity Retrieval Based on Curve Evolution and Ant Colony Optimization

A New Approach for Shape Dissimilarity Retrieval Based on Curve Evolution and Ant Colony Optimization Proc. Int. Conf. on Recent Trends in Information Processing & Computing, IPC A New Approach for Shape Dissimilarity Retrieval Based on Curve Evolution and Ant Colony Optimization Younes Saadi 1, Rathiah

More information

Blind Measurement of Blocking Artifact in Images

Blind Measurement of Blocking Artifact in Images The University of Texas at Austin Department of Electrical and Computer Engineering EE 38K: Multidimensional Digital Signal Processing Course Project Final Report Blind Measurement of Blocking Artifact

More information