Partial-Duplicate Image Retrieval via Saliency-guided Visually Matching


Liang Li, Shuqiang Jiang, Zheng-Jun Zha, Zhipeng Wu, Qingming Huang

Key Lab of Intell. Info. Process., Institute of Computing Technology, Chinese Academy of Sciences, China; Dept. of Information and Communication Eng., The University of Tokyo, Japan; School of Computing, National University of Singapore, Singapore; Graduate University of Chinese Academy of Sciences, China

Abstract: This paper proposes a novel partial-duplicate image retrieval scheme based on saliency-guided visual matching, in which the duplicate is localized simultaneously. An image is abstracted by Visually Salient and Rich Regions (VSRR), regions that exhibit both visual saliency and rich visual content. Further, a relative saliency ordering constraint, which captures the robust relative saliency layout of a VSRR, is constructed to refine the retrieval. Collaborating with this constraint, we propose an efficient algorithm that embeds it into the index system to speed up retrieval. Comparison experiments with state-of-the-art methods on five databases show the efficiency and effectiveness of our approach.

Index Terms: Image representation, saliency analysis, relative constraint, partial-duplicate image retrieval

Fig. 1. Example images from a public partial-duplicate image dataset [1].

I. INTRODUCTION

WITH the rapid development of internet and multimedia technology, plenty of partial-duplicate images are generated for picture sharing, information delivery, photo ornamenting, etc. Fig. 1 shows some typical examples of partial-duplicate images. Unlike traditional image retrieval, the duplicate regions among partial-duplicate images are only parts of the whole images, and they undergo various transformations of scale, viewpoint, illumination and resolution. Although this makes the retrieval task more complicated and challenging, partial-duplicate image retrieval is demanded by various real applications (e.g., fake image detection, copy protection, and landmark search), and has attracted increasing research attention recently.

The problem of partial-duplicate image retrieval is close to that of object-based image retrieval. Traditional object-based image retrieval methods usually use the whole image as the query and imitate text-retrieval systems by using the bag-of-visual-words (BOV) model [2]. Typically, no spatial information about visual words is used in the retrieval stage, so the model is not robust to background noise. Researchers have therefore attacked this problem from different directions: 1) aggregating local descriptors into more discriminative descriptions [3], [4]; 2) constructing weak geometric consistency constraints on the global image representation with local features [1], [5]-[7]; 3) retrieving with a user-specified region of interest through user interaction (e.g., Blobworld, SIMPLIcity).

The above technologies have proved their efficiency in traditional image retrieval, but they have unavoidable limitations: 1) the aggregating approaches are limited for partial-duplicate image retrieval, since the duplicate region is usually small and only a few aggregated features may be extracted from it; 2) consistency constraints alone cannot effectively handle partial-duplicate image retrieval, because the images usually contain plenty of background noise; 3) it is impossible to perform such interactive operations on the millions of images in a dataset.
Due to these limitations, current technologies cannot satisfactorily solve partial-duplicate image retrieval, but they provide some useful insights: 1) removing the background noise from the query image and from the image dataset will facilitate retrieval, and it is preferable that this noise elimination be automatic; 2) an appropriate constraint can improve retrieval performance, but it should not reduce retrieval efficiency. Besides these insights, two observations are worth noticing: 1) the purpose of sharing images on the web is to show the objects/regions of interest in them; similarly, when retrieving partial duplicates, one expects the major regions of the returned results to concern one's objects/regions of interest, and in retrieval the similar regions the user cares about are rarely in the nonsalient regions; 2) from the viewpoint of user experience, the similar region in the returned images that interests the user should ideally also have interested the original image sharers.

To solve partial-duplicate image retrieval, we first introduce visual attention analysis [8], [9] to filter the nonsalient regions out of the image, which eliminates some of the background noise. Computationally removing the nonsalient regions is a useful way to preferentially allocate computational resources in subsequent image analysis. Another characteristic of partial-duplicate regions is their rich visual content. Previous technologies cannot guarantee that the generated saliency regions have rich visual content, so we introduce a visual content analysis algorithm to re-filter the saliency regions.

In this paper, we propose a novel partial-duplicate image retrieval scheme based on saliency-guided visual matching, where the localization of the duplicate is obtained simultaneously. Fig. 2 illustrates the flowchart of our scheme. We abstract Visually Salient and Rich Regions (VSRR) as retrieval units, regions that possess both visual saliency and rich visual content. We represent the VSRR based on the BOV model and take advantage of group sparse coding to encode the visual descriptors, which achieves a lower reconstruction error and yields a sparse representation at the region level. Furthermore, a robust relative constraint based on saliency analysis is introduced to refine the retrieval, capturing the relative saliency layout among the interest points of the VSRRs. To accelerate the retrieval process, we propose an efficient algorithm that embeds this constraint into the index system, saving both computation time and storage space. Finally, experiments on five image databases for partial-duplicate image retrieval show the efficiency and effectiveness of our approach.

The contributions of our work are summarized as follows: A novel partial-duplicate image retrieval scheme is proposed based on saliency-guided visual matching, where the VSRR is used as the retrieval unit and the localization of the duplicate is done simultaneously. The VSRR is represented with a sparse code of visual words via a group sparse coding algorithm, which achieves a more exact reconstruction and a sparse representation over the whole VSRR. A relative saliency ordering constraint is proposed to raise the retrieval precision, capturing the relative saliency layout among the key points of the VSRR; to the best of our knowledge, this is the first work to exploit such a constraint. A fast algorithm is proposed to transform this constraint into the index system, which dramatically reduces both computation time and storage space.

The rest of this paper is organized as follows: Section II introduces the generation of VSRRs. Section III describes the feature representation of the VSRR. Section IV elaborates the relative saliency ordering constraint. Section V presents the index structure and retrieval scheme. Section VI discusses the experimental results. Finally, Section VII concludes this paper.
Fig. 2. The flowchart of our proposed scheme for partial-duplicate image retrieval.

II. VISUALLY SALIENT AND RICH REGION GENERATION

In this paper we detect Visually Salient and Rich Regions (VSRR) as the retrieval units for partial-duplicate image retrieval. We define a VSRR as an image region that has both rich visual content and visual saliency. The generation procedure of a VSRR includes four steps: a) perceptive unit construction; b) saliency map generation; c) original VSRR generation; d) ultimate VSRR selection. Finally, the image is decomposed into a set of VSRRs.

A. Perceptive Unit Construction

A perceptive unit is defined as an image patch which corresponds to the center of the receptive field. Here we utilize a reliable graph-based segmentation algorithm [10], which incrementally merges smaller patches with similar appearances and small minimum spanning tree weights.

B. Saliency Map Generation

Image regions that have strong contrast with their surroundings usually attract much human attention. Besides contrast, spatial relationships also play an important role in visual attention: for a given region, high contrast to nearby regions usually contributes more to its saliency than high contrast to far-away regions. Based on the perceptive units, we compute the saliency map by incorporating spatial relationships with region contrast [8]. This is a bottom-up saliency detection strategy and it can separate objects from their surroundings. Specifically, a color histogram is first built for each perceptive unit in the L*a*b* color space. Then, for a perceptive unit r_k, its saliency value is calculated by measuring its color contrast to all other perceptive units in the image; a spatial weight term increases the contribution of closer regions and decreases that of farther regions (see the sketch below; the precise procedure is given in Eqs. (1) and (2)).
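To make steps a) and b) concrete, the following sketch chains a graph-based segmentation into a region-contrast saliency pass. It is a simplified illustration rather than the paper's exact procedure: scikit-image's felzenszwalb stands in for the segmentation of [10], each unit's color histogram is collapsed to its mean L*a*b* color (a one-bin special case of Eq. (2) below), and sigma_s is an arbitrary setting.

```python
# Simplified sketch of Sections II-A/II-B: perceptive units via graph-based
# segmentation, then region-contrast saliency with spatial weighting.
import numpy as np
from skimage.color import rgb2lab
from skimage.segmentation import felzenszwalb

def region_saliency(image_rgb, sigma_s=0.4):
    """Per-region saliency in [0, 1]; cf. Eq. (1) below."""
    lab = rgb2lab(image_rgb)
    labels = felzenszwalb(image_rgb, scale=100, sigma=0.5, min_size=50)
    n = labels.max() + 1
    h, w = labels.shape
    ys, xs = np.indices((h, w)) / max(h, w)       # normalized pixel coordinates

    mean_lab = np.zeros((n, 3))                   # stand-in for color histograms
    centroid = np.zeros((n, 2))
    weight = np.zeros(n)                          # w(r_i): pixel count of region i
    for r in range(n):
        m = labels == r
        mean_lab[r] = lab[m].mean(axis=0)
        centroid[r] = ys[m].mean(), xs[m].mean()
        weight[r] = m.sum()

    saliency = np.zeros(n)
    for k in range(n):
        d_s = np.linalg.norm(centroid - centroid[k], axis=1)  # D_s: spatial distance
        d_r = np.linalg.norm(mean_lab - mean_lab[k], axis=1)  # D_r: color contrast
        contrib = np.exp(-d_s / sigma_s ** 2) * weight * d_r  # spatially weighted
        contrib[k] = 0.0                                      # exclude r_k itself
        saliency[k] = contrib.sum()
    return labels, saliency / (saliency.max() + 1e-12)
```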

Formally, this procedure is defined as:

S(r_k) = \sum_{r_i \neq r_k} \exp(-D_s(r_k, r_i)/\sigma_s^2) \, w(r_i) \, D_r(r_k, r_i)    (1)

where exp(-D_s(r_k, r_i)/σ_s²) is the spatial weight and D_s(r_k, r_i) is the Euclidean distance between the centroids of perceptive units r_k and r_i. σ_s controls the scale of the spatial weight; a smaller σ_s enlarges the effect of spatial weighting. w(r_i) is the number of pixels in region r_i. D_r(r_k, r_i) is the color distance between r_k and r_i, defined as:

D_r(r_k, r_i) = \sum_{m=1}^{N_k} \sum_{n=1}^{N_i} f(c_{k,m}) f(c_{i,n}) D(c_{k,m}, c_{i,n})    (2)

where D(c_{k,m}, c_{i,n}) is the distance between colors c_{k,m} and c_{i,n}, and f(c_{k,m}) is the probability of the m-th color c_{k,m} among all N_k colors in region r_k.

C. VSRR Generation

After generating the saliency map, we compute the saliency regions by saliency segmentation, and then obtain the original VSRRs by filtering out the regions with inferior saliency. Finally, we select the ultimate VSRRs, which contain abundant visual content. We divide the saliency map into background and initial saliency regions by binarizing it with a fixed threshold. Then we iteratively apply GrabCut [11] to refine the segmentation result: the initial saliency regions automatically initialize GrabCut, which is then iterated until the regions converge. The resulting set of regions is regarded as the original VSRRs. Further, to obtain the ultimate VSRRs with abundant visual content, we re-filter the VSRRs by a visual content analysis algorithm. Firstly, the visual content of a VSRR is represented with the BOV representation of SIFT [12] descriptors. The amount of visual content in the VSRR is measured as:

Score = \sum_{i=1}^{K} \frac{n_i}{N_i}    (3)

where K is the size of the dictionary, and n_i and N_i are the numbers of occurrences of visual word i in this VSRR and in the whole database, respectively. n_i reflects the repeated structure in the VSRR, while 1/N_i captures the informativeness of visual words: words that appear in many different VSRRs are less informative than those that appear rarely. We filter out the VSRRs with Score < ε, where ε is set to 0.01. (This dictionary, obtained by hierarchical k-means clustering, is also used in Section IV. It is different from the dictionary in Section III, which is learnt by the group sparse coding algorithm; the two dictionaries serve different purposes.)

III. FEATURE REPRESENTATION OF VSRR

Having obtained the VSRRs, we turn to designing a discriminative representation for them. The popular image representation in image retrieval is nearest-neighbor vector quantization (VQ) based on the BOV model [2]. However, the discriminative power of the traditional BOV model is limited by quantization error. To address this problem, we improve on BOV by using group sparse coding (GSC) [13]. Let X = [x_1, ..., x_M]^T ∈ R^{M×P} be a set of local descriptors:

\min_{A,D} \; \frac{1}{2} \sum_{m=1}^{M} \left\| x_m - \sum_{j=1}^{|D|} a_j^m d_j \right\|^2 + \gamma \sum_{j=1}^{|D|} \| a_j \|_2, \quad \text{subject to } a_j^m \ge 0, \; \forall j, m    (4)

where D = [d_1, ..., d_K]^T is the dictionary with K visual words and ||·||_2 denotes the l_2-norm of a vector. The reconstruction matrix A = {a_j}_{j=1}^{|D|} consists of nonnegative vectors a_j = (a_j^1, ..., a_j^M), which specify the contribution of d_j to each descriptor. M is the total number of visual descriptors in the VSRR. The first term in Eq. (4) weighs the reconstruction error and the second term weighs the degree of sparsity.
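To make Eq. (4) concrete, here is a minimal sketch of encoding the descriptors of one VSRR against a fixed dictionary with projected proximal-gradient steps. It is an illustrative solver, not the optimizer of [13]: dictionary learning is omitted, the final pooling into a region-level code is one plausible choice, and all names and parameter values are our assumptions.

```python
# Sketch of group sparse coding (Eq. (4)) for a fixed dictionary:
# gradient step on the reconstruction term, row-wise group shrinkage
# for gamma * sum_j ||a_j||_2, and clipping for a_j^m >= 0.
import numpy as np

def group_sparse_code(X, D, gamma=0.1, n_iter=200):
    """X: (M, P) local descriptors of one VSRR; D: (K, P) dictionary."""
    M, K = X.shape[0], D.shape[0]
    A = np.zeros((K, M))                      # a_j^m = A[j, m]
    step = 1.0 / (np.linalg.norm(D, 2) ** 2)  # 1 / Lipschitz constant of gradient
    for _ in range(n_iter):
        R = A.T @ D - X                       # residual, shape (M, P)
        A -= step * (D @ R.T)                 # gradient step on 0.5 * ||R||^2
        A = np.maximum(A, 0.0)                # nonnegativity projection
        norms = np.linalg.norm(A, axis=1, keepdims=True)
        shrink = np.maximum(1.0 - step * gamma / (norms + 1e-12), 0.0)
        A *= shrink                           # group soft-threshold each row a_j
    return A.sum(axis=1)                      # one pooling choice: mass per word
```

Because the group penalty zeroes entire rows of A at once, all descriptors in the region share a small common set of active visual words, which is exactly the region-level sparsity the inverted-file index of Section V exploits.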
The larger the parameter γ is, the sparser the reconstruction coefficients are. The advantages of this approach over the traditional BOV include: 1) GSC achieves a much lower reconstruction error due to its less restrictive constraint; 2) the GSC representation economizes storage space when encoding image descriptors; 3) the sparsity of GSC lets the representation specialize and capture salient properties of images; 4) GSC yields a compact representation at the region level, which facilitates fast indexing and accurate retrieval.

IV. RELATIVE SALIENCY ORDERING CONSTRAINT

Ignoring geometric relationships limits the discriminative power of the BOV model. To address this problem, we propose a novel relative saliency ordering constraint, which acts as a saliency-relative layout constraint among the interest points in a VSRR. We argue that the relative order of saliency at the interest points is well preserved across VSRRs, since duplicate VSRRs share a common visual pattern and their saliency distributions are similar. The first step of constraint verification is to find the matching pairs between VSRRs. Here we employ visual words with a large dictionary for efficient matching; a large dictionary decreases the matching errors caused by SIFT quantization. Suppose one query VSRR q and one candidate VSRR c have n matching visual words: VSRR(q) = {v_{q1}, ..., v_{qn}} and VSRR(c) = {v_{c1}, ..., v_{cn}}, where v_{qi} and v_{ci} are the i-th matching visual words. S(q) = {α_{q1}, ..., α_{qn}} and S(c) = {α_{c1}, ..., α_{cn}} denote the saliency values at the corresponding visual words in q and c, respectively. We construct a Saliency Relative Matrix (SRM) for each VSRR:

SRM = \begin{pmatrix} - & r_{12} & \cdots & r_{1n} \\ r_{21} & - & \cdots & r_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ r_{n1} & r_{n2} & \cdots & - \end{pmatrix}, \qquad r_{ij} = \begin{cases} 0, & \alpha_i > \alpha_j \\ 1, & \text{otherwise} \end{cases}    (5)

Fig. 3. (a) A VSRR with visual words (colored points); (b) the saliency map corresponding to (a); (c) the saliency values at the positions of the visual words; (d) the saliency ordinal vector of the visual words in (a); (e) the Saliency Relative Matrix (SRM) of the visual words in (a); (f) the XOR result of two SRMs as in (e).

The BOOL element r_ij of the SRM is defined by comparing the saliency values α_i and α_j of visual words v_i and v_j in one VSRR; each visual word in the VSRR is compared with every other visual word. The SRM is an anti-symmetric BOOL matrix which preserves the relative saliency order among visual words. Fig. 3-(e) illustrates the SRMs of VSRR q and VSRR c. We measure the inconsistency between the query SRM and the candidate SRM by the Hamming distance:

Dis = \| SRM_q \oplus SRM_c \|_0    (6)

where ||·||_0 (the l_0-norm) counts the non-zero elements and ⊕ denotes element-wise XOR. Intuitively, the SRM is insensitive to small inconsistencies in saliency order, and it avoids the problems of directly comparing saliency ordinal vectors. To extend this constraint to large-scale image retrieval, we transform it into a simple addition operation over an offline inverted-file index. This fast algorithm provides the same constraint function but does not need to store or compute the SRMs of the VSRRs in the image dataset, as detailed in Section V-B.

V. INDEX AND RETRIEVAL

In large-scale image retrieval systems, efficient indexing is critical for fast retrieval. In this section, we first introduce the bilayer inverted-file index structure of our system and then detail the retrieval scheme based on it.

A. Index Structure

The retrieval procedure can be divided into two steps: the first searches the candidate VSRRs in the dataset; the second refines the result via the relative saliency ordering constraint. For efficiency, we use a collaborative index structure with a bilayer inverted file. The 1st inverted file stores the VSRR information to support the first step. The 2nd inverted file stores the saliency order of the visual words in each VSRR for the subsequent constraint verification. The output of the first step is the ID of the candidate VSRR and image, which directly indexes into the 2nd inverted file.

The 1st inverted-file index structure. A VSRR k in an image is represented as a vector VSRR_k with one component per visual word in the dictionary D. For each visual word, this structure stores the list of VSRRs in which the visual word occurs, together with its term weight. Fig. 4-(a) illustrates the structure of the 1st inverted file. VW Freq. is the sum of the weights of visual word i over all VSRRs; visual word weight is the group-sparse-coding coefficient of visual word i in one VSRR. Thanks to the GSC algorithm, this vector representation is very sparse; the 1st inverted file exploits this sparseness to index images and enables fast search for candidate VSRRs.

The 2nd inverted-file index structure. Fig. 4-(b) shows its structure. The 2nd inverted file stores the property information of each VSRR: VW count is the number of visual words in the VSRR; VSRR area is the number of pixels in the VSRR; VW_i is the ID of visual word i. These visual words are stored in ascending order of saliency value. Note that the visual word dictionary used in this structure is different from the dictionary D in the 1st inverted file: D is learnt by GSC, whereas this dictionary is obtained by hierarchical k-means clustering. We use a very large dictionary here to quickly find the matching point pairs between the query VSRR and a candidate VSRR.

B. A Fast Algorithm for the Relative Saliency Ordering Constraint

The critical step of the relative saliency ordering constraint is constructing the Saliency Relative Matrices (SRM) of the query VSRR and the candidate VSRR. In fact, through a simple transformation, we only need to fully compute the SRM of the query VSRR once, and then extract different rows and columns to build new SRMs for different candidate VSRRs. Recalling the structure of the SRM (Eq. (5)), row i encodes the relative saliency relationship between visual word i and the other visual words.

Fig. 4. (a) The 1st inverted-file structure. (b) The 2nd inverted-file structure.

If we reorder the rows and columns of an SRM in ascending order of saliency value, then by the definition in Eq. (5) the SRM is transformed into:

SRM^+ = \begin{pmatrix} - & 1 & \cdots & 1 \\ 0 & - & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & - \end{pmatrix}    (7)

The SRM^+ is essentially equivalent to the primal SRM. Moreover, for any SRM, once its rows and columns are sorted in ascending order of saliency, the result is unique: its upper triangle is all ones and it is an anti-symmetric BOOL matrix. Thus, the SRM^+ does not need to be stored at all. Exploiting this observation, we design the 2nd inverted-file index as in Fig. 4-(b): only the IDs of the visual words of each VSRR are stored, in ascending order of saliency value. This saves a huge amount of memory; the corresponding inverted file occupies 1.5 GB in our experiment on the one-million-image dataset. For each candidate VSRR, we first find the matching pairs of visual words with the query VSRR, taking the visual words from the candidate VSRR sequentially as input to search for their matches in the query VSRR. Suppose there are N matching visual words; the inconsistency between the VSRRs is then 2M/N², where M is the number of zero elements in the upper triangular part of the reordered query SRM defined in Eq. (5). With respect to time complexity, this approach is fast and satisfying: 1) we do not need to compute the SRM of the candidate; 2) the inconsistency is calculated directly by addition, avoiding the XOR operation. In sum, this algorithm for the relative saliency ordering constraint saves both computation time and storage space, and is thus suitable for a large-scale index system.

C. Retrieval Scheme

Given a query VSRR q, the first task is to find its candidate VSRRs. The procedure can be interpreted as a voting scheme: 1) the scores of all VSRRs in the database are initialized to 0; 2) for each visual word j in the query, we retrieve the list of VSRRs containing this visual word through the 1st inverted file, and for each VSRR i in the list we increase its score by the weight of this visual word: score(i) += (weight of visual word j) / (VW_j Freq.). After processing all visual words in the query, the final score of VSRR i is the dot product between the vectors of VSRR i and the query q. 3) We normalize the scores to obtain cosine similarities for ranking. After finding the candidate VSRRs, we compute the inconsistency of the SRMs between the query VSRR q and each candidate VSRR c, as detailed in Section V-B. We define a total matching score M(q; c):

M(q; c) = M_v(q; c) + \lambda M_r(q; c)    (8)

where M_v(q; c) is the visual similarity, i.e., the cosine similarity between q and c; M_r(q; c) is the consistency of the relative saliency ordering constraint, equal to 1 minus the inconsistency of SRM(q; c); and λ is a weight parameter. Having obtained the similarity score between two VSRRs, we define the similarity between the query image I_q and a candidate image I_m as:

Sim(I_q, I_m) = \sum_{q_i \in I_q} \frac{2\sqrt{R_{area}(q_i)}}{1 + R_{area}(q_i)} \, M(q_i; m_i)    (9)

where q_i is the i-th VSRR in the query image I_q, m_i is its matching VSRR in I_m, and R_area(q_i) is the area ratio of the VSRR q_i relative to the query image. The fractional term is the weight of the similarity of the i-th matching VSRR pair: from the viewpoint of user experience, users are more satisfied with returned results whose similar region covers a larger portion of the original query image. Guided by this observation, we re-rank the returned images by the image similarity Sim.
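To make the shortcut of Section V-B concrete, the sketch below computes the SRM inconsistency both ways, via explicit matrices (Eqs. (5)-(6)) and via the stored ascending saliency order, and checks that they agree; the final line combines the result with a visual similarity as in Eq. (8). The saliency values, M_v and λ are made-up numbers, distinct saliency values (no ties) are assumed, and the N² normalization follows the reconstruction in the text above.

```python
import numpy as np

def srm(alpha):
    """Saliency Relative Matrix, Eq. (5): r_ij = 0 if alpha_i > alpha_j else 1."""
    a = np.asarray(alpha)
    return (a[:, None] <= a[None, :]).astype(np.uint8)

def inconsistency_explicit(alpha_q, alpha_c):
    """Eq. (6): Hamming distance between the two SRMs, normalized by N^2."""
    x = srm(alpha_q) ^ srm(alpha_c)                # element-wise XOR
    return x.sum() / float(len(alpha_q) ** 2)

def inconsistency_fast(alpha_q, alpha_c):
    """Section V-B: reorder by the candidate's ascending saliency, so the
    candidate's SRM becomes the canonical SRM+ (upper triangle all ones);
    only zeros in the reordered query SRM's upper triangle can mismatch,
    and anti-symmetry mirrors each mismatch once below the diagonal."""
    order = np.argsort(alpha_c)                    # what the 2nd inverted file stores
    q = srm(np.asarray(alpha_q)[order])
    m = np.triu(q == 0, k=1).sum()                 # M: zeros above the diagonal
    return 2.0 * m / float(len(alpha_q) ** 2)

alpha_q, alpha_c = [0.9, 0.2, 0.5, 0.7], [0.8, 0.1, 0.6, 0.3]   # toy saliencies
assert np.isclose(inconsistency_explicit(alpha_q, alpha_c),
                  inconsistency_fast(alpha_q, alpha_c))
M_v, lam = 0.8, 0.4                                # hypothetical values
total = M_v + lam * (1.0 - inconsistency_fast(alpha_q, alpha_c))  # Eq. (8)
```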
VI. EXPERIMENTS

A. Datasets and Evaluation Metric

To evaluate our system, we compare our method with state-of-the-art approaches on five image datasets. (1) PDID: a public internet partial-duplicate image dataset [1], which consists of 10 image collections and 30K distractors; each collection has 200 images. (2) PDIDM: a large-scale partial-duplicate image dataset, consisting of the PDID dataset and one million distractor images from Flickr. (3) UKbench: this dataset consists of 2550 groups of 4 images each. (4) Mobile: this dataset [14] includes images of 300 objects; it indexes 300 images of the digital copies downloaded from the internet, blended with the 10200 images from UKbench as distractors. (5) Caltech256(50): a subset of Caltech256, consisting of 50 classical object categories with 4988 images. In evaluation, for the PDID, PDIDM and Caltech256 datasets we use mean average precision (mAP) as the metric: for each query image we compute its Average Precision (AP), the area under the precision-recall curve, and then take the mean over all query images. For the UKbench dataset, the performance measure is the average number of relevant images among the query's top-4 retrieved images. For the Mobile dataset, the top-10 hit rate is used, as in [14].
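For reference, the AP/mAP computation described above can be written in a few lines; this is the standard formulation (AP approximated from the precision at each hit, averaged over all queries), and the relevance lists in the example are hypothetical placeholders.

```python
# Standard mAP computation for the evaluation protocol of Section VI-A.
import numpy as np

def average_precision(ranked_relevance, n_relevant):
    """ranked_relevance: 1/0 relevance flags of the returned list, best first."""
    hits, ap = 0, 0.0
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            ap += hits / rank          # precision at each recall step
    return ap / n_relevant if n_relevant else 0.0

def mean_average_precision(runs):
    """runs: list of (ranked_relevance, n_relevant) pairs, one per query."""
    return float(np.mean([average_precision(r, n) for r, n in runs]))

# e.g., two queries over a collection with 3 relevant images each:
print(mean_average_precision([([1, 0, 1, 1], 3), ([0, 1, 1, 0], 3)]))
```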

Fig. 5. Comparison of five approaches (BOV, Soft BOV, bundled feature, bundled+, VLAD, and ours) in mAP for partial-duplicate image retrieval over the ten PDID collections (American Flag, Beijing Olympic, Disney, Google, iPhone, KFC, Monalisa, Rocket, Starbucks, Exit Sign).

B. Performance Evaluation on Technical Improvements and Parameter Setting

1) Comparison with different visual descriptor encoding methods: The representation of the VSRR is a crucial factor for retrieval performance, and we represent the VSRR with a sparse code of visual words. In this subsection, we evaluate its performance on the PDID dataset. Baseline: we use the traditional BOV approach as the baseline, with a dictionary of 5,000 visual words clustered by hierarchical k-means. Comparison: four popular partial-duplicate image retrieval schemes [1], [3], [6], [7] are compared with our approach. (1) We enhance the baseline with soft assignment [7], where the number of nearest neighbors is set to 3 and σ² = 6,250 (referred to as Soft BOV). (2) In [6], SIFT and MSER features are extracted from the images and bundled into groups (bundled feature for short). (3) In [1], the bundled feature is improved by adding an affine-invariant geometric constraint (bundled+ for short). All of these approaches share a common SIFT vocabulary of 5,000 visual words; for [6], the weight parameter of geometric consistency is set to 2, and for [1], the weight parameter of affine consistency is set to 1. (4) VLAD (vector of locally aggregated descriptors) [3], a state-of-the-art descriptor derived from both BOV and the Fisher kernel; its number of cluster centroids k is set to 64, giving a final dimension of 8192. Experimental setting of our approach: each VSRR is encoded with GSC, with a dictionary of 978 visual words learnt by the algorithm; in the relative saliency ordering constraint, we use the 5,000-word hierarchical k-means dictionary; the parameter λ is set to 0.4.

Fig. 5 illustrates the experimental results, leading to four observations. Firstly, our approach clearly improves the mAP compared with the other state-of-the-art methods: the mAP of our approach is 68.82%, while those of BOV, Soft BOV, bundled feature, bundled+ and VLAD are 32.6%, 38.3%, 59.9%, 53.6% and 47.6%, respectively. Secondly, the traditional BOV approach loses its discriminative power on the partial-duplicate image retrieval task, because partial-duplicate images share only a small similar portion. Thirdly, our result on Monalisa is not satisfactory: the faces of many Monalisa images in the ground-truth dataset are detected as VSRRs, and although they are indeed VSRRs, manual Photoshop editing makes these faces very different from one another, which reduces the mAP. Fourthly, as Fig. 5 shows, the performance of bundled feature, bundled+ and our approach is better than that of BOV, Soft BOV and VLAD.

2) Impact of λ: The parameter λ in Eq. (8) determines the weight of the consistency term of the relative saliency ordering constraint. We test the performance with different λ values on the PDID dataset. Experiments show that [0.4, 0.8] is the most effective range for λ; in particular, the mAP reaches its highest value (0.703) at λ = 0.4. The relative saliency ordering constraint plays an important role in improving the retrieval precision. However, relying too much on it will reduce the retrieval recall.

C. Partial-Duplicate Image Retrieval on Different Image Datasets

In this subsection, we validate the efficiency of our approach on the five image datasets: PDID, PDIDM, UKbench, Mobile and Caltech256. Baseline: the traditional BOV approach [2] is used as the baseline, with a dictionary of 0.5M visual words obtained by hierarchical k-means clustering on 50K images. Comparisons: (1) Soft BOV [7], where the number of nearest neighbors is set to 3 and σ² = 6,250; this method uses the same dictionary as the baseline. (2) bundled feature [6], where the weight parameter of geometric consistency is set to 2. (3) bundled+ [1], where the weight parameter of affine consistency is set to 1; this method shares the same dictionary as bundled feature, clustered by hierarchical k-means.

TABLE I. Comparison with the state-of-the-art methods (BOV [2], Soft BOV [7], bundled feature [6], bundled+ [1], VLAD [3], SCSM [15], k-rNN [16], CW [14], and ours) on the five datasets: PDID, PDIDM, UKbench, Mobile, Caltech256(50).

(4) VLAD [3], with k = 64 cluster centroids. (5) SCSM [15], a spatially constrained similarity measure for object retrieval, where the grid size of the voting map is 16×16 and σ² = 2.5 in the Gaussian weight exp(−d/σ²). (6) k-rNN [16], an object retrieval scheme based on an analysis of the k-reciprocal nearest neighbor structure in the image space, with the cutoff parameter set as in [16]. (7) CW [14], a vocabulary-tree-based approach that introduces contextual weighting of local features in both the descriptor and spatial domains, with a vocabulary tree of branch factor K = 8 and depth L = 6. (8) For our approach, we use a dictionary with 10236 visual words for the sparse coding of VSRRs; in the relative saliency ordering constraint, we use the same dictionary as the baseline, and the parameter λ in Eq. (8) is set to 0.4.

Table I shows the comparison with the state-of-the-art methods on the five datasets, most of which use additional techniques such as post-verification and soft assignment. Our approach performs best on PDID, PDIDM and Caltech256, and it is significantly better than the previously best-known work on partial-duplicate image retrieval, bundled feature [6], on four datasets (PDID, UKbench, Mobile and Caltech256; e.g., from 3.5 to 3.58 on UKbench, and up to 0.85 on Mobile). The performance of our method is lower than that of k-rNN on the UKbench dataset, but higher on Caltech256. The reason may be that the variation among similar images in UKbench is slight while that in Caltech256 is very diverse: the k-NN-based method is effective and fast for data with little variation, but its robustness is unsatisfactory under strong variations of scale and viewpoint. On the Mobile dataset, the result of our method is 0.03 below the best-performing method, CW; the query images in this dataset are captured by mobile phones, so their illumination and viewpoint are unfavorable, which adversely affects our scheme.

Fig. 6 details the comparison with four popular image retrieval methods on the PDIDM dataset, leading to three observations. Firstly, our approach significantly improves the mAP: compared with the baseline, it boosts the mAP from 0.26 to 0.523, an improvement of 26.2 percentage points. Secondly, soft assignment of visual words plays an important role in improving the performance (a 10% improvement on average); indeed, sparse coding is itself a form of soft assignment of visual words. Thirdly, the performance of all five methods is unsatisfactory on the Monalisa group: the difference between most Monalisa images in the ground truth lies in Monalisa's face, and the face usually carries rich visual information that strongly influences a global image representation built on local features, so BOV, Soft BOV, VLAD and CW do not work well. On the other hand, most faces in these images are visually salient and rich regions and are detected as VSRRs in our approach, which leads to our method's low mAP on this group.

Fig. 6. Comparison of different methods (BOV, Soft BOV, VLAD, CW, and ours) in mAP on the large-scale PDIDM dataset, per PDID collection.

Besides the obvious improvement in mAP, it is worth pointing out that our approach is also efficient. The average time for retrieving a query image on the large-scale image dataset is 0.695 s (BOV), 0.732 s (Soft BOV), 0.745 s (VLAD), 0.96 s (CW) and 1.20 s (ours); the experiments are executed on an Intel Core i5 2.8 GHz machine with 4 GB RAM. Although our approach is not as fast as the other methods, its computation time is tolerable in practical applications. In short, our approach offers a significant accuracy advantage at a tolerable cost in efficiency.

VII. CONCLUSION

This paper proposes a new perspective for partial-duplicate image retrieval built on the Visually Salient and Rich Region (VSRR), an image region with both visual saliency and rich visual content, as the retrieval unit. A relative saliency ordering constraint is proposed to improve the retrieval performance, and a fast algorithm is further proposed to transform this constraint into the index system for fast retrieval. Comparison experiments on five image datasets validate the effectiveness of our scheme. In the future we plan to study both visual attention analysis and visual content analysis more deeply, and to develop a practical partial-duplicate image retrieval system.

ACKNOWLEDGEMENT

This work was supported in part by the National Basic Research Program of China (973 Program): 2012CB316400, and in part by the National Natural Science Foundation of China: 61025011.

REFERENCES

[1] Z. Wu, Q. Xu, S. Jiang, Q. Huang, P. Cui, and L. Li, "Adding affine invariant geometric constraint for partial-duplicate image retrieval," in IEEE ICPR, 2010.
[2] J. Sivic and A. Zisserman, "Video Google: A text retrieval approach to object matching in videos," in IEEE ICCV, 2003.
[3] H. Jégou, M. Douze, C. Schmid, and P. Pérez, "Aggregating local descriptors into a compact image representation," in IEEE CVPR, 2010.
[4] F. Perronnin, Y. Liu, J. Sánchez, and H. Poirier, "Large-scale image retrieval with compressed Fisher vectors," in IEEE CVPR, 2010.
[5] W. Zhou, Y. Lu, H. Li, Y. Song, and Q. Tian, "Spatial coding for large scale partial-duplicate web image search," in ACM Multimedia, 2010.
[6] Z. Wu, Q. Ke, M. Isard, and J. Sun, "Bundling features for large scale partial-duplicate web image search," in IEEE CVPR, 2009.
[7] J. Philbin, O. Chum, M. Isard, J. Sivic, and A. Zisserman, "Lost in quantization: Improving particular object retrieval in large scale image databases," in IEEE CVPR, 2008.
[8] M.-M. Cheng, G.-X. Zhang, N. J. Mitra, X. Huang, and S.-M. Hu, "Global contrast based salient region detection," in IEEE CVPR, 2011.
[9] H. Wu, Y. Wang, K. Feng, T. Wong, T. Lee, and P. Heng, "Resizing by symmetry-summarization," ACM Trans. Graph., vol. 29, 2010.
[10] P. Felzenszwalb and D. Huttenlocher, "Efficient graph-based image segmentation," IJCV, 2004.
[11] C. Rother, V. Kolmogorov, and A. Blake, "GrabCut: Interactive foreground extraction using iterated graph cuts," ACM Trans. Graph., 2004.
[12] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," IJCV, 2004.
[13] S. Bengio, F. Pereira, Y. Singer, and D. Strelow, "Group sparse coding," in NIPS, 2009.
[14] X. Wang, M. Yang, T. Cour, S. Zhu, K. Yu, and T. X. Han, "Contextual weighting for vocabulary tree based image retrieval," in IEEE ICCV, 2011.
[15] X. Shen, Z. Lin, J. Brandt, S. Avidan, and Y. Wu, "Object retrieval and localization with spatially-constrained similarity measure and k-NN re-ranking," in IEEE CVPR, 2012.
[16] D. Qin, S. Gammeter, L. Bossard, T. Quack, and L. Van Gool, "Hello neighbor: Accurate object retrieval with k-reciprocal nearest neighbors," in IEEE CVPR, 2011.


More information

III. VERVIEW OF THE METHODS

III. VERVIEW OF THE METHODS An Analytical Study of SIFT and SURF in Image Registration Vivek Kumar Gupta, Kanchan Cecil Department of Electronics & Telecommunication, Jabalpur engineering college, Jabalpur, India comparing the distance

More information

Robust Shape Retrieval Using Maximum Likelihood Theory

Robust Shape Retrieval Using Maximum Likelihood Theory Robust Shape Retrieval Using Maximum Likelihood Theory Naif Alajlan 1, Paul Fieguth 2, and Mohamed Kamel 1 1 PAMI Lab, E & CE Dept., UW, Waterloo, ON, N2L 3G1, Canada. naif, mkamel@pami.uwaterloo.ca 2

More information

Fuzzy based Multiple Dictionary Bag of Words for Image Classification

Fuzzy based Multiple Dictionary Bag of Words for Image Classification Available online at www.sciencedirect.com Procedia Engineering 38 (2012 ) 2196 2206 International Conference on Modeling Optimisation and Computing Fuzzy based Multiple Dictionary Bag of Words for Image

More information

A 2-D Histogram Representation of Images for Pooling

A 2-D Histogram Representation of Images for Pooling A 2-D Histogram Representation of Images for Pooling Xinnan YU and Yu-Jin ZHANG Department of Electronic Engineering, Tsinghua University, Beijing, 100084, China ABSTRACT Designing a suitable image representation

More information

2 Proposed Methodology

2 Proposed Methodology 3rd International Conference on Multimedia Technology(ICMT 2013) Object Detection in Image with Complex Background Dong Li, Yali Li, Fei He, Shengjin Wang 1 State Key Laboratory of Intelligent Technology

More information

Efficient Re-ranking in Vocabulary Tree based Image Retrieval

Efficient Re-ranking in Vocabulary Tree based Image Retrieval Efficient Re-ranking in Vocabulary Tree based Image Retrieval Xiaoyu Wang 2, Ming Yang 1, Kai Yu 1 1 NEC Laboratories America, Inc. 2 Dept. of ECE, Univ. of Missouri Cupertino, CA 95014 Columbia, MO 65211

More information

Similar Fragment Retrieval of Animations by a Bag-of-features Approach

Similar Fragment Retrieval of Animations by a Bag-of-features Approach Similar Fragment Retrieval of Animations by a Bag-of-features Approach Weihan Sun, Koichi Kise Dept. Computer Science and Intelligent Systems Osaka Prefecture University Osaka, Japan sunweihan@m.cs.osakafu-u.ac.jp,

More information

Unsupervised Saliency Estimation based on Robust Hypotheses

Unsupervised Saliency Estimation based on Robust Hypotheses Utah State University DigitalCommons@USU Computer Science Faculty and Staff Publications Computer Science 3-2016 Unsupervised Saliency Estimation based on Robust Hypotheses Fei Xu Utah State University,

More information

Total Recall: Automatic Query Expansion with a Generative Feature Model for Object Retrieval O. Chum, et al.

Total Recall: Automatic Query Expansion with a Generative Feature Model for Object Retrieval O. Chum, et al. Total Recall: Automatic Query Expansion with a Generative Feature Model for Object Retrieval O. Chum, et al. Presented by Brandon Smith Computer Vision Fall 2007 Objective Given a query image of an object,

More information

Large Scale 3D Reconstruction by Structure from Motion

Large Scale 3D Reconstruction by Structure from Motion Large Scale 3D Reconstruction by Structure from Motion Devin Guillory Ziang Xie CS 331B 7 October 2013 Overview Rome wasn t built in a day Overview of SfM Building Rome in a Day Building Rome on a Cloudless

More information

Randomized Visual Phrases for Object Search

Randomized Visual Phrases for Object Search Randomized Visual Phrases for Object Search Yuning Jiang Jingjing Meng Junsong Yuan School of EEE, Nanyang Technological University, Singapore {ynjiang, jingjing.meng, jsyuan}@ntu.edu.sg Abstract Accurate

More information

Learning to Match. Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li

Learning to Match. Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li Learning to Match Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li 1. Introduction The main tasks in many applications can be formalized as matching between heterogeneous objects, including search, recommendation,

More information

Adaptive Feature Selection via Boosting-like Sparsity Regularization

Adaptive Feature Selection via Boosting-like Sparsity Regularization Adaptive Feature Selection via Boosting-like Sparsity Regularization Libin Wang, Zhenan Sun, Tieniu Tan Center for Research on Intelligent Perception and Computing NLPR, Beijing, China Email: {lbwang,

More information

Graph Matching Iris Image Blocks with Local Binary Pattern

Graph Matching Iris Image Blocks with Local Binary Pattern Graph Matching Iris Image Blocs with Local Binary Pattern Zhenan Sun, Tieniu Tan, and Xianchao Qiu Center for Biometrics and Security Research, National Laboratory of Pattern Recognition, Institute of

More information

CRF Based Point Cloud Segmentation Jonathan Nation

CRF Based Point Cloud Segmentation Jonathan Nation CRF Based Point Cloud Segmentation Jonathan Nation jsnation@stanford.edu 1. INTRODUCTION The goal of the project is to use the recently proposed fully connected conditional random field (CRF) model to

More information

6.819 / 6.869: Advances in Computer Vision

6.819 / 6.869: Advances in Computer Vision 6.819 / 6.869: Advances in Computer Vision Image Retrieval: Retrieval: Information, images, objects, large-scale Website: http://6.869.csail.mit.edu/fa15/ Instructor: Yusuf Aytar Lecture TR 9:30AM 11:00AM

More information

A Rapid Automatic Image Registration Method Based on Improved SIFT

A Rapid Automatic Image Registration Method Based on Improved SIFT Available online at www.sciencedirect.com Procedia Environmental Sciences 11 (2011) 85 91 A Rapid Automatic Image Registration Method Based on Improved SIFT Zhu Hongbo, Xu Xuejun, Wang Jing, Chen Xuesong,

More information

A REVIEW ON IMAGE RETRIEVAL USING HYPERGRAPH

A REVIEW ON IMAGE RETRIEVAL USING HYPERGRAPH A REVIEW ON IMAGE RETRIEVAL USING HYPERGRAPH Sandhya V. Kawale Prof. Dr. S. M. Kamalapur M.E. Student Associate Professor Deparment of Computer Engineering, Deparment of Computer Engineering, K. K. Wagh

More information

Learning a Fine Vocabulary

Learning a Fine Vocabulary Learning a Fine Vocabulary Andrej Mikulík, Michal Perdoch, Ondřej Chum, and Jiří Matas CMP, Dept. of Cybernetics, Faculty of EE, Czech Technical University in Prague Abstract. We present a novel similarity

More information

Image Analysis & Retrieval. CS/EE 5590 Special Topics (Class Ids: 44873, 44874) Fall 2016, M/W Lec 18.

Image Analysis & Retrieval. CS/EE 5590 Special Topics (Class Ids: 44873, 44874) Fall 2016, M/W Lec 18. Image Analysis & Retrieval CS/EE 5590 Special Topics (Class Ids: 44873, 44874) Fall 2016, M/W 4-5:15pm@Bloch 0012 Lec 18 Image Hashing Zhu Li Dept of CSEE, UMKC Office: FH560E, Email: lizhu@umkc.edu, Ph:

More information

THIS paper considers the task of accurate image retrieval

THIS paper considers the task of accurate image retrieval JOURNAL OF L A TEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2015 1 Coarse2Fine: Two-layer Fusion for Image Retrieval le Dong, Gaipeng Kong, Wenpu Dong, Liang Zheng and Qi Tian arxiv:1607.00719v1 [cs.mm] 4 Jul

More information

Object of interest discovery in video sequences

Object of interest discovery in video sequences Object of interest discovery in video sequences A Design Project Report Presented to Engineering Division of the Graduate School Of Cornell University In Partial Fulfillment of the Requirements for the

More information

Image retrieval based on bag of images

Image retrieval based on bag of images University of Wollongong Research Online Faculty of Informatics - Papers (Archive) Faculty of Engineering and Information Sciences 2009 Image retrieval based on bag of images Jun Zhang University of Wollongong

More information

A Systems View of Large- Scale 3D Reconstruction

A Systems View of Large- Scale 3D Reconstruction Lecture 23: A Systems View of Large- Scale 3D Reconstruction Visual Computing Systems Goals and motivation Construct a detailed 3D model of the world from unstructured photographs (e.g., Flickr, Facebook)

More information

Recognition. Topics that we will try to cover:

Recognition. Topics that we will try to cover: Recognition Topics that we will try to cover: Indexing for fast retrieval (we still owe this one) Object classification (we did this one already) Neural Networks Object class detection Hough-voting techniques

More information

Semantic-aware Co-indexing for Image Retrieval

Semantic-aware Co-indexing for Image Retrieval 2013 IEEE International Conference on Computer Vision Semantic-aware Co-indexing for Image Retrieval Shiliang Zhang 2,MingYang 1,XiaoyuWang 1, Yuanqing Lin 1,QiTian 2 1 NEC Laboratories America, Inc. 2

More information

A novel template matching method for human detection

A novel template matching method for human detection University of Wollongong Research Online Faculty of Informatics - Papers (Archive) Faculty of Engineering and Information Sciences 2009 A novel template matching method for human detection Duc Thanh Nguyen

More information

Large-scale visual recognition Efficient matching

Large-scale visual recognition Efficient matching Large-scale visual recognition Efficient matching Florent Perronnin, XRCE Hervé Jégou, INRIA CVPR tutorial June 16, 2012 Outline!! Preliminary!! Locality Sensitive Hashing: the two modes!! Hashing!! Embedding!!

More information

Object Detection Using Segmented Images

Object Detection Using Segmented Images Object Detection Using Segmented Images Naran Bayanbat Stanford University Palo Alto, CA naranb@stanford.edu Jason Chen Stanford University Palo Alto, CA jasonch@stanford.edu Abstract Object detection

More information

EFFECTIVE FISHER VECTOR AGGREGATION FOR 3D OBJECT RETRIEVAL

EFFECTIVE FISHER VECTOR AGGREGATION FOR 3D OBJECT RETRIEVAL EFFECTIVE FISHER VECTOR AGGREGATION FOR 3D OBJECT RETRIEVAL Jean-Baptiste Boin, André Araujo, Lamberto Ballan, Bernd Girod Department of Electrical Engineering, Stanford University, CA Media Integration

More information

THE matching of local visual features plays a critical role

THE matching of local visual features plays a critical role 1748 IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 24, NO. 6, JUNE 2015 Randomized Spatial Context for Object Search Yuning Jiang, Jingjing Meng, Member, IEEE, Junsong Yuan, Senior Member, IEEE, and Jiebo

More information

Appearance-Based Place Recognition Using Whole-Image BRISK for Collaborative MultiRobot Localization

Appearance-Based Place Recognition Using Whole-Image BRISK for Collaborative MultiRobot Localization Appearance-Based Place Recognition Using Whole-Image BRISK for Collaborative MultiRobot Localization Jung H. Oh, Gyuho Eoh, and Beom H. Lee Electrical and Computer Engineering, Seoul National University,

More information