ROBUST SCENE CLASSIFICATION BY GIST WITH ANGULAR RADIAL PARTITIONING. Wei Liu, Serkan Kiranyaz and Moncef Gabbouj


Proceedings of the 5th International Symposium on Communications, Control and Signal Processing, ISCCSP 2012, Rome, Italy, 2-4 May 2012
Department of Signal Processing, Tampere University of Technology, Tampere, Finland

ABSTRACT

Natural scene recognition and classification have received considerable attention in the computer vision community due to their challenging nature. Significant intra-class variations have largely limited the accuracy of scene categorization tasks: a holistic representation forces matching within strict spatial confinement, whereas a bag-of-features representation ignores the order and spatial layout of the scene completely, resulting in a loss of scene logic. In this paper, we present a novel method, called ARP (Angular Radial Partitioning) Gist, to classify scenes. Experiments show that the proposed method improves recognition accuracy by better representing the structure in a scene and striking a balance between spatial confinement and freedom.

Index Terms: scene classification, angular radial partitioning, scene gist

1. INTRODUCTION

Recent advances in computer vision have split approaches to understanding the semantics of natural scene images into two directions: global representation and the orderless bag-of-features (BOF) model [13]. The former, proposed in [10], attempts to capture the gist of a scene without object segmentation and recognition. The Gist descriptor is a low-dimensional representation of the attributes of a scene, namely naturalness, openness, roughness, expansion and ruggedness.
The scene classification paradigm based on this holistic perspective was later compared to human performance in a rapid scene classification experiment [4], which provided evidence that representing the global attributes of a scene parallels the human visual and cognitive system. Further experiments [5] in psychology and cognitive science suggest that a mere 50 ms on average can be sufficient for scene recognition. These findings explain why the Gist descriptor performs remarkably well on scene recognition tasks (especially on outdoor categories), with applications extending to place recognition [12]. By dividing the image into an N-by-N grid, however, the Gist descriptor imposes strong constraints on spatial layout and yet fails to delineate the spatial structure within each block. Consequently, mismatches occur due to the averaging operation in individual blocks. On the other end of the spectrum is the BOF model. Inspired by the bag-of-words model in text categorization, this paradigm represents each image as an occurrence histogram of visual words, which are local descriptors of regions or patches in the image. The SIFT (Scale Invariant Feature Transform) descriptor [8] has been widely used as a local feature for the BOF model. The first stage in computing SIFT is to detect interest points that are repeatable under moderate local transformations [9]; a descriptor is then generated from an image patch around each interest point. This powerful descriptor is highly discriminative and invariant to scale, clutter, partial occlusion, and changes in illumination and viewpoint. Patch features extracted from training images are clustered with the k-means algorithm to form the codebook, with the cluster centroids serving as the visual vocabulary. An image can then be represented by the occurrence counts of each visual word in the vocabulary, resulting in an orderless representation of the scene.
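The codebook-and-histogram pipeline just described can be sketched in a few lines. This is a toy illustration, with random vectors standing in for SIFT descriptors and a deliberately small hand-rolled k-means; the function names are ours, not from any of the cited implementations:

```python
import numpy as np

def build_codebook(descriptors, k, iters=20, seed=0):
    """Toy k-means: cluster local descriptors; the centroids act as visual words."""
    rng = np.random.default_rng(seed)
    centroids = descriptors[rng.choice(len(descriptors), k, replace=False)]
    for _ in range(iters):
        # assign each descriptor to its nearest centroid
        d = np.linalg.norm(descriptors[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = descriptors[labels == j].mean(axis=0)
    return centroids

def bof_histogram(image_descriptors, codebook):
    """Represent one image as occurrence counts of its nearest visual words."""
    d = np.linalg.norm(image_descriptors[:, None, :] - codebook[None, :, :], axis=2)
    words = d.argmin(axis=1)
    return np.bincount(words, minlength=len(codebook))

# toy data: 200 random 8-D "descriptors" pooled from a training set
train = np.random.default_rng(1).normal(size=(200, 8))
codebook = build_codebook(train, k=10)
hist = bof_histogram(train[:40], codebook)   # orderless histogram for one "image"
```

Note that the histogram discards all spatial arrangement of the patches, which is exactly the loss of scene logic discussed below.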
To ensure classification accuracy, the codebook should be large enough that each image can be properly represented by its histogram. Due to significant intra-class variations, such a requirement is not easily satisfied. Furthermore, the codebook-building process is often computationally intensive, which limits the efficiency of its application. But the most prominent weakness arises from the absence of scene logic due to the complete disregard of spatial layout. We argue that the logic of a scene is essential to its recognition and classification, while the computational cost imposed by the BOF model is highly undesirable. In order to better capture the shape characteristics of objects and the spatial structure within a block of an image, we propose in this paper a novel algorithm that not only delineates the structures within a block but also provides leeway for spatial freedom. The paper is organized as follows. Section 2 briefly summarizes the algorithm of the original Gist descriptor, followed by a complete introduction of the proposed method, with important elements discussed in detail. Parameters used in feature extraction and classification experiments are described in Section 3. Section 4 reports experimental results with comparisons to other well-known implementations. Section 5 concludes the paper.

2. THE PROPOSED ALGORITHM

In this section, we present the proposed ARP (Angular Radial Partitioning) Gist technique in detail, followed by explicit theoretical justifications.

2.1. Implementation Procedure

The proposed algorithm is built upon the implementation of the original Gist descriptor, which is summarized in the following.

Original Gist Descriptor: First, a grayscale image is pre-processed by a whitening filter to preserve dominant structural details and then normalized with respect to local contrast. The pre-processed image is then passed through a cascade of Gabor filters (Figure 1) at S scales with O orientations at each scale. Each of these S×O images (orientation maps), representing the original image at one orientation and one scale, is then divided into an N-by-N grid. Within each block on the grid, the average intensity is calculated to represent the feature of that block. The final output is a concatenated feature vector of S·O·N·N dimensions.

Figure 1: Gabor filters (4 scales, 8 orientations per scale)

Figure 2: Flowchart of the original Gist and ARP Gist

ARP Gist: Instead of taking the average value within each block on the N-by-N grid, we further partition each block into A bins using Angular Radial Partitioning (ARP) [2]. To avoid over-partitioning, only angular partitioning is considered; in other words, the number of radial partitions is set to 1 for all blocks. The average intensity level is then calculated in each angular bin, followed by a 1-D discrete Fourier transform over the angular bins in each block; taking the magnitude of the coefficients achieves positional invariance. Finally, the feature vector is obtained by concatenating all the DFT-transformed bins in the image across all orientations and scales, resulting in an S·O·N·N·A-dimensional feature vector. Figure 2 shows the complete block diagram of the proposed method.
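The per-block computation just described can be sketched as follows. This is a minimal numpy illustration of one orientation-map block; the names and the exact bin-assignment details (including the bin starting boundary) are our own assumptions, not taken from the paper's code:

```python
import numpy as np

def arp_block_feature(block, A):
    """One ARP Gist block feature: average intensity in each of A angular bins
    (radial partitions fixed at 1), then the magnitude of the 1-D DFT over
    the bins to achieve positional invariance."""
    h, w = block.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # angle of each pixel around the block center, mapped into [0, 2*pi)
    ang = np.arctan2(ys - (h - 1) / 2.0, xs - (w - 1) / 2.0) % (2 * np.pi)
    bins = np.minimum((ang / (2 * np.pi / A)).astype(int), A - 1)
    f = np.array([block[bins == i].mean() for i in range(A)])  # per-bin averages
    return np.abs(np.fft.fft(f))  # DFT magnitude of the bin vector

block = np.random.default_rng(0).random((16, 16))
feat = arp_block_feature(block, A=4)   # each block contributes A values

# positional invariance in practice: circularly shifting the bin averages
# changes only the phase of the DFT, never the magnitude
g = np.random.default_rng(1).random(8)
assert np.allclose(np.abs(np.fft.fft(g)), np.abs(np.fft.fft(np.roll(g, 3))))
```

Concatenating such block features over all blocks, orientations and scales gives the S·O·N·N·A-dimensional descriptor.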
Note that the implementation procedure of the original Gist (circled in red) is also included in the figure.

2.2. Angular Radial Partitioning

ARP has been successfully applied in content-based image retrieval (CBIR), sketch-based image retrieval (SBIR) applications [3] and object recognition [1]. It employs both angular and radial partitioning, in a manner similar to the polar coordinate system. One main advantage of ARP is its ability to capture intricate structures in an angular-spatial manner, as opposed to the simple spatial distribution of a rectangular partitioning scheme. Figure 3 shows a typical ARP strategy.

Figure 3: Angular Radial Partitioning

Spatial layout is an important part of a scene image, as it carries essential information regarding its category. In order to preserve the relative spatial layout of a scene image while allowing moderate intra-class variations within each class (e.g., a stove can appear in the middle of the image or at the left center of the image), the Gist

descriptor is computed on an N-by-N grid. Even though this coarse partitioning scheme has yielded significant success in terms of recognition accuracy in scene classification, it fails to represent spatial structures efficiently within a block, as the averaging operator often renders different structures indistinguishable, resulting in mismatches among scene categories. Figure 4 shows an example of this deficiency: even though the spatial structures are visually distinct to human observers, the Gist feature vectors cannot discriminate between the two distinct images. Figure 6 shows the same example as Figure 4 but with additional ARP. Since these two blocks are divided into 4 additional angular bins, the dissimilarity between the two resulting feature vectors becomes significant enough to distinguish the two different structures.

Figure 4: Limitations of the Gist descriptor: a simple spatial structure in a block and the corresponding Gist feature vector; another spatial structure and the corresponding Gist feature vector.

Figure 6: ARP Gist in a block: a simple spatial structure in a block and the corresponding ARP Gist feature vector; another spatial structure and the corresponding ARP Gist feature vector.

For a better representation of the spatial structures of a scene image, we propose a strategy that builds on the success of the original Gist feature. In addition to the N-by-N rectangular partitioning (Figure 5), we further divide each block using ARP into A angular bins, which extracts not only the coarse spatial layout but also the finer angular layout of a scene image. To avoid coinciding with further rectangular partitioning, we use the upper-right diagonal as the starting boundary of ARP in each block, as illustrated in Figure 5.
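The effect illustrated in Figures 4 and 6 can also be shown numerically. In this toy example of our own (not from the paper), two blocks with different structures but the same total intensity are indistinguishable to the plain block average, while their ARP bin averages differ:

```python
import numpy as np

# two 16x16 blocks with equal pixel mass: a vertical line vs. a diagonal line
vert = np.zeros((16, 16)); vert[:, 7] = 1.0
diag = np.eye(16)

# the plain Gist block feature (the block average) cannot tell them apart
assert np.isclose(vert.mean(), diag.mean())

def arp_bins(block, A=4):
    """Average intensity in each of A angular bins around the block center."""
    h, w = block.shape
    ys, xs = np.mgrid[0:h, 0:w]
    ang = np.arctan2(ys - (h - 1) / 2.0, xs - (w - 1) / 2.0) % (2 * np.pi)
    idx = np.minimum((ang / (2 * np.pi / A)).astype(int), A - 1)
    return np.array([block[idx == i].mean() for i in range(A)])

# the ARP bin averages do distinguish the two structures
assert not np.allclose(arp_bins(vert), arp_bins(diag))
```

The vertical line concentrates its mass in two angular bins, the diagonal line in a different pair, so the 4-bin vectors differ even though the block means are identical.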
Figure 5: Demonstration of rectangular partitioning and ARP: an image partitioned on a 4-by-4 grid, with angular partitioning added to the original rectangular partitioning (A=8)

2.3. Positional Invariance

Even though ARP can better delineate the spatial structure in a block, it carries the same risk as any other partitioning scheme: over-partitioning. The idea of dividing an image into blocks is to preserve some spatial layout in the process of recognition or matching. Finer partitioning means stricter layout confinement, which does not hold across different scene images of the same category. This is why the original Gist descriptor is calculated on a 4-by-4 grid instead of an 8-by-8 one. Experiments (see Section 4 for results) have shown that over-partitioning does not improve classification accuracy and may even erode it. The same holds for ARP. Further dividing the 4-by-4 grid can sometimes forfeit the leeway gained by better representing the structure, since the same spatial structures in different scene images within the same category often enjoy spatial freedom within an area of the image; e.g., a computer can be at different positions along the surface of a desk. In light of this dilemma, the proposed method utilizes the discrete Fourier transform to achieve rotational or positional invariance. Let I denote an image block and A the number of angular partitions. The angle spanned by each bin is θ = 2π/A. The i-th element of the feature vector of one block can then be formulated as

f(i) = (1/S) \sum_{(x,y) \in bin_i} I(x, y)    (1)

for i = 0, 1, 2, ..., A−1, where S is the total number of pixels that fall into each bin. If the block is rotated counterclockwise by τ = 2πl/A radians (l = 0, 1, 2, ..., A−1) around the center of the block, then the rotated block, denoted I_τ, can be represented in polar coordinates by the following equation:

I_τ(ρ, φ) = I(ρ, φ − τ)    (2)

Through simple mathematical deduction, we can find the relationship between f_τ(i) and f(i):

f_τ(i) = f((i − l) mod A)    (3)

Clearly f_τ(i) and f(i) are not the same, but a simple 1-D discrete Fourier transform reveals their similarity. Applying the DFT to f(i) and f_τ(i), we obtain:

F(u) = (1/A) \sum_{i=0}^{A−1} f(i) e^{−j2πui/A}    (4)

F_τ(u) = (1/A) \sum_{i=0}^{A−1} f_τ(i) e^{−j2πui/A}    (5)

       = (1/A) \sum_{i=0}^{A−1} f((i − l) mod A) e^{−j2πui/A}    (6)

       = (1/A) \sum_{k=0}^{A−1} f(k) e^{−j2πu(k+l)/A}    (7)

       = F(u) e^{−j2πul/A}    (8)

According to equation (8), the DFT of the rotated feature vector differs from that of the original only by a phase factor, so the magnitudes of the two are identical: |F_τ(u)| = |F(u)|. We therefore use the magnitude of the 1-D DFT coefficients to achieve rotational invariance, so that regardless of the angular position of structural details, the same structures always yield the same feature vectors. Figure 7 shows a simple example of such a transformation. In the figure, (a) shows two identical spatial structures at different positions in two image blocks. Without the DFT, the two feature vectors generated using ARP, shown in (b), are visually distinct. But after a 1-D DFT and taking the magnitude of the coefficients (Figure 7 (c)), the two feature vectors are virtually the same, demonstrating that our descriptor is position invariant.

Figure 7: An example of rotational invariance: (a) the same structure at different locations in two image blocks; (b) feature vectors without DFT; (c) feature vectors with DFT

3. FEATURE EXTRACTION AND TEST SETTINGS

3.1. Image Normalization

Since the algorithm is based on the spatial structures within scene images, we consider only the luminance component, for which we use the mean of the R, G, B channels.
In order to ensure comparability, all images are resized to a common resolution using bilinear interpolation; the aspect ratio of each image is therefore ignored. This is in line with the experimental setup used by Oliva et al. in their implementation.

3.2. Parameter Settings for Feature Extraction

The parameters for image pre-processing (image whitening and local contrast normalization) are kept the same as in the original Gist, and so are the parameters of the Gabor filters. The images are filtered by Gabor filters at 4 scales, with 8 orientation channels at each scale. For the original Gist descriptor, each image is divided into N×N (N=4, 8) blocks and the average is taken in each block. Hence, the total dimension of the feature vector for each image is 4·8·N·N = 32N² for the original Gist descriptor. ARP is applied to each block on a 4-by-4 grid. The number of angular partitions (A) used in our experiments is 3, 4, 5 and 6, respectively, to evaluate the performance of ARP Gist. In each angular bin, we take the average value to represent the feature of that bin, resulting in a feature vector of size 4·8·4·4·A = 512A.
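The dimension bookkeeping above is easy to verify; a quick sanity check of the 32N² and 512A figures, using the paper's S = 4 scales and O = 8 orientations:

```python
S, O = 4, 8                                  # Gabor scales and orientations
for N in (4, 8):
    assert S * O * N * N == 32 * N**2        # original Gist dimension
for A in (3, 4, 5, 6):
    assert S * O * 4 * 4 * A == 512 * A      # ARP Gist on the 4-by-4 grid
# note: A = 4 gives 512 * 4 = 2048 dimensions, the same as the
# 8-by-8 original Gist (32 * 64 = 2048), which matters for fair comparison
assert 512 * 4 == 32 * 8**2
```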

3.3. Classifier Training

SVM training and testing are conducted 1000 times so that generality can be achieved. We randomly select 100 images from each category for training and use the rest for testing. This process is repeated 1000 times to ensure an effective comparison between the proposed algorithm and the original Gist descriptor; the comparison is based on the same 1000 sets of training and testing data. In our experiments, we use the Gaussian radial basis function as the kernel to build one-versus-all classifiers, which has the following form:

K(x, y) = \exp(−γ ‖x − y‖²)    (9)

The scaling factor γ in equation (9) is defined in our experiments as:

γ = 1 / (p · f)    (10)

where p is the kernel parameter, kept fixed in all our experiments, and f is the number of dimensions of the feature vector. The confusion matrix for every training/testing set is recorded during each run. The final classification accuracy is the mean of the confusion-matrix diagonal, averaged over all runs.

4. EXPERIMENTAL RESULTS

In this section, we report the performance of ARP Gist under different configurations in comparison with the original Gist descriptor with varying values of N, based on two publicly available datasets: the MIT spatial envelope dataset [10] and the UIUC 15 scene category dataset [7]. The reason for varying N in the original Gist is to ensure that any gain in the performance of the proposed method is not simply a result of further partitioning.

4.1. On the Spatial Envelope Dataset

The MIT spatial envelope dataset is the testbed for the original Gist descriptor. It consists of 8 outdoor scene categories: coast, mountain, forest, open country, street, inside city, tall buildings and highways. There are 2688 color images in total, with approximately 300 images in each category. Notably, all images share the same resolution. Figure 8 shows some sample images from the dataset, one from each category.

Figure 8: Sample images from the MIT spatial envelope dataset.

Table 1: Comparison of classification accuracy on the MIT dataset.

Method          Configuration    Classification Accuracy
Original Gist   N=4              ±
                N=8              ±
ARP Gist        A=3              ±
                A=4              ±
                A=5              ±
                A=6              ±
                A=               ±

As the results summarized in Table 1 indicate, the average classification accuracy obtained by the original Gist descriptor (N=4) is slightly lower than the 83.7% reported by Oliva et al. because of different training configurations: in our experiment, we have selected 1000 different training/testing configurations in the SVM to evaluate the average performance. In contrast, the proposed ARP Gist shows an improvement over the original, with the best configuration (A=3) yielding the highest classification accuracy. To establish the validity of the improvement, we have also tested the original Gist on an 8-by-8 (N=8) grid (resulting in a feature vector of 2048 dimensions, equivalent to ARP Gist with A set to 4). This is on par with the 4-by-4 grid Gist, with an almost negligible accuracy difference. As observed in Table 1, the proposed algorithm is superior in terms of classification accuracy.

Figure 9: Sample images from the additional categories of the UIUC 15 scene category dataset.
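Looking back at the classifier of Section 3.3, equations (9) and (10) can be sketched directly. This is a minimal illustration of the kernel only (not the full one-versus-all SVM); the function name is ours, and the value of the kernel parameter p is a placeholder, since the paper's setting is not reproduced here:

```python
import numpy as np

def rbf_kernel(x, y, p, f):
    """Gaussian RBF kernel of equation (9), with gamma = 1/(p*f) as in
    equation (10). p is a placeholder value; f is the feature dimension."""
    gamma = 1.0 / (p * f)
    return np.exp(-gamma * np.sum((x - y) ** 2))

f = 512 * 3                                # e.g. ARP Gist with A = 3
x = np.random.default_rng(0).random(f)
k_self = rbf_kernel(x, x, p=1.0, f=f)      # identical vectors give the maximum
```

Dividing by the feature dimension f keeps the kernel bandwidth comparable across descriptors of different length, which is what makes the Gist and ARP Gist runs commensurable.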

4.2. On the 15 Scene Category Dataset

The UIUC 15 scene category dataset is an extension of the original spatial envelope dataset. It contains not only all the outdoor scene images shown in Figure 8 (all of the MIT spatial envelope dataset images are presented in grayscale), but also additional indoor and outdoor categories: suburb, office, kitchen, living room, store, bedroom and industrial. Most of the additional images are in grayscale. The resolution and aspect ratio of the additional images vary both within and among categories. Figure 9 shows sample images from the additional categories.

Table 2: Comparison of classification accuracy on the 15 scene category dataset.

Method          Configuration    Classification Accuracy
Original Gist   N=4              ±
                N=8              ±
ARP Gist        A=3              ±
                A=4              ±
                A=5              ±
                A=6              ±
                A=               ±
BOF             M=200            ±0.6
                M=400            ±0.3

On this dataset, the classification accuracy achieved by the original Gist is markedly lower than on the MIT dataset. In contrast to the previous dataset, the over-partitioned original Gist (8-by-8 grid) suffers a slight accuracy erosion relative to the 4-by-4 grid. The proposed ARP Gist, on the other hand, yields classification rates above 74%. The best result is obtained when the number of angular partitions is set to 4; note that the feature vector dimension in this configuration is the same as that of the 8-by-8 grid original Gist. To show the significance of the performance improvement obtained by ARP Gist, we summarize the classification rates of the BOF algorithm [6] in Table 2, along with the standard deviations. The BOF feature is based on image patches on a densely sampled grid, without the usual process of interest point detection. The SIFT descriptor is calculated on each image patch. The experiment is conducted with vocabulary sizes of 200 (M=200) and 400 (M=400). It is evident that even without building a codebook, saving significant computational cost, ARP Gist is still superior to the BOF model.
(Note that the images used in the BOF model are not normalized to a common resolution; if normalized, the model suffers significant accuracy degradation [11].)

5. CONCLUSION

This paper presents a novel approach for scene representation. Built on the original Gist descriptor, the proposed ARP Gist descriptor utilizes the effectiveness of angular partitioning to capture the finer details of scene images. Using the DFT and the magnitude of its coefficients, ARP Gist achieves positional invariance of scene structures within a rectangular block. The proposed method not only preserves the rough spatial layout, but also provides flexibility within each block, achieving a balance between spatial constraints and freedom. Experiments on two datasets have shown that the proposed method is superior to the original Gist and rivals the state-of-the-art BOF model in terms of classification accuracy and computational cost.

6. REFERENCES

[1] S. Belongie and J. Malik, Shape Matching and Object Recognition Using Shape Contexts, IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(4).
[2] A. Chalechale, A. Mertins and G. Naghdy, Edge Image Description Using Angular Radial Partitioning, IEE Proceedings - Vision, Image and Signal Processing, 151(2):93-101.
[3] A. Chalechale, G. Naghdy and A. Mertins, Sketch-based Image Matching Using Angular Partitioning, IEEE Transactions on Systems, Man, and Cybernetics, 35(1):28-41.
[4] M. R. Greene and A. Oliva, Recognition of Natural Scenes from Global Properties: Seeing the Forest without Representing the Trees, Cognitive Psychology, 58(2), 2009.
[5] M. R. Greene and A. Oliva, The Briefest of Glances: The Time Course of Natural Scene Understanding, Psychological Science, 20, 2009.
[6] S. Lazebnik, C. Schmid and J. Ponce, Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories, IEEE Conference on Computer Vision and Pattern Recognition, vol. 2.
[7] L. Fei-Fei and P. Perona, A Bayesian Hierarchical Model for Learning Natural Scene Categories, IEEE Conference on Computer Vision and Pattern Recognition, vol. 2.
[8] D. G. Lowe, Distinctive Image Features from Scale-invariant Keypoints, International Journal of Computer Vision, 60(2):91-110.
[9] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir and L. V. Gool, A Comparison of Affine Region Detectors, International Journal of Computer Vision, 65(1-2):43-72.
[10] A. Oliva and A. Torralba, Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope, International Journal of Computer Vision, 42(3).
[11] A. Quattoni and A. Torralba, Recognizing Indoor Scenes, IEEE Conference on Computer Vision and Pattern Recognition.
[12] C. Siagian and L. Itti, Rapid Biologically-inspired Scene Classification Using Features Shared with Visual Attention, IEEE Transactions on Pattern Analysis and Machine Intelligence, 29(2).
[13] J. Sivic and A. Zisserman, Video Google: A Text Retrieval Approach to Object Matching in Videos, International Conference on Computer Vision, vol. 2, 2003.


More information

CS 231A Computer Vision (Fall 2012) Problem Set 3

CS 231A Computer Vision (Fall 2012) Problem Set 3 CS 231A Computer Vision (Fall 2012) Problem Set 3 Due: Nov. 13 th, 2012 (2:15pm) 1 Probabilistic Recursion for Tracking (20 points) In this problem you will derive a method for tracking a point of interest

More information

Wei Liu SCENE IMAGE CLASSIFICATION AND RETRIEVAL. Master of Science Thesis. Examiners: Prof. Moncef Gabbouj and Prof.

Wei Liu SCENE IMAGE CLASSIFICATION AND RETRIEVAL. Master of Science Thesis. Examiners: Prof. Moncef Gabbouj and Prof. Wei Liu SCENE IMAGE CLASSIFICATION AND RETRIEVAL Master of Science Thesis Examiners: Prof. Moncef Gabbouj and Prof. Serkan Kiranyaz Examiners and topic approved in the Faculty of Computing and Electrical

More information

Selection of Scale-Invariant Parts for Object Class Recognition

Selection of Scale-Invariant Parts for Object Class Recognition Selection of Scale-Invariant Parts for Object Class Recognition Gy. Dorkó and C. Schmid INRIA Rhône-Alpes, GRAVIR-CNRS 655, av. de l Europe, 3833 Montbonnot, France fdorko,schmidg@inrialpes.fr Abstract

More information

Ensemble of Bayesian Filters for Loop Closure Detection

Ensemble of Bayesian Filters for Loop Closure Detection Ensemble of Bayesian Filters for Loop Closure Detection Mohammad Omar Salameh, Azizi Abdullah, Shahnorbanun Sahran Pattern Recognition Research Group Center for Artificial Intelligence Faculty of Information

More information

Tensor Decomposition of Dense SIFT Descriptors in Object Recognition

Tensor Decomposition of Dense SIFT Descriptors in Object Recognition Tensor Decomposition of Dense SIFT Descriptors in Object Recognition Tan Vo 1 and Dat Tran 1 and Wanli Ma 1 1- Faculty of Education, Science, Technology and Mathematics University of Canberra, Australia

More information

Evaluation of GIST descriptors for web scale image search

Evaluation of GIST descriptors for web scale image search Evaluation of GIST descriptors for web scale image search Matthijs Douze Hervé Jégou, Harsimrat Sandhawalia, Laurent Amsaleg and Cordelia Schmid INRIA Grenoble, France July 9, 2009 Evaluation of GIST for

More information

Announcements. Recognition. Recognition. Recognition. Recognition. Homework 3 is due May 18, 11:59 PM Reading: Computer Vision I CSE 152 Lecture 14

Announcements. Recognition. Recognition. Recognition. Recognition. Homework 3 is due May 18, 11:59 PM Reading: Computer Vision I CSE 152 Lecture 14 Announcements Computer Vision I CSE 152 Lecture 14 Homework 3 is due May 18, 11:59 PM Reading: Chapter 15: Learning to Classify Chapter 16: Classifying Images Chapter 17: Detecting Objects in Images Given

More information

Preliminary Local Feature Selection by Support Vector Machine for Bag of Features

Preliminary Local Feature Selection by Support Vector Machine for Bag of Features Preliminary Local Feature Selection by Support Vector Machine for Bag of Features Tetsu Matsukawa Koji Suzuki Takio Kurita :University of Tsukuba :National Institute of Advanced Industrial Science and

More information

Outline 7/2/201011/6/

Outline 7/2/201011/6/ Outline Pattern recognition in computer vision Background on the development of SIFT SIFT algorithm and some of its variations Computational considerations (SURF) Potential improvement Summary 01 2 Pattern

More information

CS 223B Computer Vision Problem Set 3

CS 223B Computer Vision Problem Set 3 CS 223B Computer Vision Problem Set 3 Due: Feb. 22 nd, 2011 1 Probabilistic Recursion for Tracking In this problem you will derive a method for tracking a point of interest through a sequence of images.

More information

Aggregating Descriptors with Local Gaussian Metrics

Aggregating Descriptors with Local Gaussian Metrics Aggregating Descriptors with Local Gaussian Metrics Hideki Nakayama Grad. School of Information Science and Technology The University of Tokyo Tokyo, JAPAN nakayama@ci.i.u-tokyo.ac.jp Abstract Recently,

More information

Mutual Information Based Codebooks Construction for Natural Scene Categorization

Mutual Information Based Codebooks Construction for Natural Scene Categorization Chinese Journal of Electronics Vol.20, No.3, July 2011 Mutual Information Based Codebooks Construction for Natural Scene Categorization XIE Wenjie, XU De, TANG Yingjun, LIU Shuoyan and FENG Songhe (Institute

More information

Object Classification Problem

Object Classification Problem HIERARCHICAL OBJECT CATEGORIZATION" Gregory Griffin and Pietro Perona. Learning and Using Taxonomies For Fast Visual Categorization. CVPR 2008 Marcin Marszalek and Cordelia Schmid. Constructing Category

More information

CHAPTER 5 GLOBAL AND LOCAL FEATURES FOR FACE RECOGNITION

CHAPTER 5 GLOBAL AND LOCAL FEATURES FOR FACE RECOGNITION 122 CHAPTER 5 GLOBAL AND LOCAL FEATURES FOR FACE RECOGNITION 5.1 INTRODUCTION Face recognition, means checking for the presence of a face from a database that contains many faces and could be performed

More information

Learning and Inferring Depth from Monocular Images. Jiyan Pan April 1, 2009

Learning and Inferring Depth from Monocular Images. Jiyan Pan April 1, 2009 Learning and Inferring Depth from Monocular Images Jiyan Pan April 1, 2009 Traditional ways of inferring depth Binocular disparity Structure from motion Defocus Given a single monocular image, how to infer

More information

Improving Recognition through Object Sub-categorization

Improving Recognition through Object Sub-categorization Improving Recognition through Object Sub-categorization Al Mansur and Yoshinori Kuno Graduate School of Science and Engineering, Saitama University, 255 Shimo-Okubo, Sakura-ku, Saitama-shi, Saitama 338-8570,

More information

Beyond Bags of features Spatial information & Shape models

Beyond Bags of features Spatial information & Shape models Beyond Bags of features Spatial information & Shape models Jana Kosecka Many slides adapted from S. Lazebnik, FeiFei Li, Rob Fergus, and Antonio Torralba Detection, recognition (so far )! Bags of features

More information

CLASSIFICATION Experiments

CLASSIFICATION Experiments CLASSIFICATION Experiments January 27,2015 CS3710: Visual Recognition Bhavin Modi Bag of features Object Bag of words 1. Extract features 2. Learn visual vocabulary Bag of features: outline 3. Quantize

More information

By Suren Manvelyan,

By Suren Manvelyan, By Suren Manvelyan, http://www.surenmanvelyan.com/gallery/7116 By Suren Manvelyan, http://www.surenmanvelyan.com/gallery/7116 By Suren Manvelyan, http://www.surenmanvelyan.com/gallery/7116 By Suren Manvelyan,

More information

SEMANTIC SEGMENTATION AS IMAGE REPRESENTATION FOR SCENE RECOGNITION. Ahmed Bassiouny, Motaz El-Saban. Microsoft Advanced Technology Labs, Cairo, Egypt

SEMANTIC SEGMENTATION AS IMAGE REPRESENTATION FOR SCENE RECOGNITION. Ahmed Bassiouny, Motaz El-Saban. Microsoft Advanced Technology Labs, Cairo, Egypt SEMANTIC SEGMENTATION AS IMAGE REPRESENTATION FOR SCENE RECOGNITION Ahmed Bassiouny, Motaz El-Saban Microsoft Advanced Technology Labs, Cairo, Egypt ABSTRACT We introduce a novel approach towards scene

More information

Multiple-Choice Questionnaire Group C

Multiple-Choice Questionnaire Group C Family name: Vision and Machine-Learning Given name: 1/28/2011 Multiple-Choice naire Group C No documents authorized. There can be several right answers to a question. Marking-scheme: 2 points if all right

More information

Recognize Complex Events from Static Images by Fusing Deep Channels Supplementary Materials

Recognize Complex Events from Static Images by Fusing Deep Channels Supplementary Materials Recognize Complex Events from Static Images by Fusing Deep Channels Supplementary Materials Yuanjun Xiong 1 Kai Zhu 1 Dahua Lin 1 Xiaoou Tang 1,2 1 Department of Information Engineering, The Chinese University

More information

KNOWING Where am I has always being an important

KNOWING Where am I has always being an important CENTRIST: A VISUAL DESCRIPTOR FOR SCENE CATEGORIZATION 1 CENTRIST: A Visual Descriptor for Scene Categorization Jianxin Wu, Member, IEEE and James M. Rehg, Member, IEEE Abstract CENTRIST (CENsus TRansform

More information

Content-Based Image Classification: A Non-Parametric Approach

Content-Based Image Classification: A Non-Parametric Approach 1 Content-Based Image Classification: A Non-Parametric Approach Paulo M. Ferreira, Mário A.T. Figueiredo, Pedro M. Q. Aguiar Abstract The rise of the amount imagery on the Internet, as well as in multimedia

More information

Human Motion Detection and Tracking for Video Surveillance

Human Motion Detection and Tracking for Video Surveillance Human Motion Detection and Tracking for Video Surveillance Prithviraj Banerjee and Somnath Sengupta Department of Electronics and Electrical Communication Engineering Indian Institute of Technology, Kharagpur,

More information

CS229: Action Recognition in Tennis

CS229: Action Recognition in Tennis CS229: Action Recognition in Tennis Aman Sikka Stanford University Stanford, CA 94305 Rajbir Kataria Stanford University Stanford, CA 94305 asikka@stanford.edu rkataria@stanford.edu 1. Motivation As active

More information

The SIFT (Scale Invariant Feature

The SIFT (Scale Invariant Feature The SIFT (Scale Invariant Feature Transform) Detector and Descriptor developed by David Lowe University of British Columbia Initial paper ICCV 1999 Newer journal paper IJCV 2004 Review: Matt Brown s Canonical

More information

IMPROVING SPATIO-TEMPORAL FEATURE EXTRACTION TECHNIQUES AND THEIR APPLICATIONS IN ACTION CLASSIFICATION. Maral Mesmakhosroshahi, Joohee Kim

IMPROVING SPATIO-TEMPORAL FEATURE EXTRACTION TECHNIQUES AND THEIR APPLICATIONS IN ACTION CLASSIFICATION. Maral Mesmakhosroshahi, Joohee Kim IMPROVING SPATIO-TEMPORAL FEATURE EXTRACTION TECHNIQUES AND THEIR APPLICATIONS IN ACTION CLASSIFICATION Maral Mesmakhosroshahi, Joohee Kim Department of Electrical and Computer Engineering Illinois Institute

More information

Affine-invariant scene categorization

Affine-invariant scene categorization University of Wollongong Research Online Faculty of Engineering and Information Sciences - Papers: Part A Faculty of Engineering and Information Sciences 2014 Affine-invariant scene categorization Xue

More information

A Comparison of SIFT, PCA-SIFT and SURF

A Comparison of SIFT, PCA-SIFT and SURF A Comparison of SIFT, PCA-SIFT and SURF Luo Juan Computer Graphics Lab, Chonbuk National University, Jeonju 561-756, South Korea qiuhehappy@hotmail.com Oubong Gwun Computer Graphics Lab, Chonbuk National

More information

Developing Open Source code for Pyramidal Histogram Feature Sets

Developing Open Source code for Pyramidal Histogram Feature Sets Developing Open Source code for Pyramidal Histogram Feature Sets BTech Project Report by Subodh Misra subodhm@iitk.ac.in Y648 Guide: Prof. Amitabha Mukerjee Dept of Computer Science and Engineering IIT

More information

Specular 3D Object Tracking by View Generative Learning

Specular 3D Object Tracking by View Generative Learning Specular 3D Object Tracking by View Generative Learning Yukiko Shinozuka, Francois de Sorbier and Hideo Saito Keio University 3-14-1 Hiyoshi, Kohoku-ku 223-8522 Yokohama, Japan shinozuka@hvrl.ics.keio.ac.jp

More information

CEE598 - Visual Sensing for Civil Infrastructure Eng. & Mgmt.

CEE598 - Visual Sensing for Civil Infrastructure Eng. & Mgmt. CEE598 - Visual Sensing for Civil Infrastructure Eng. & Mgmt. Section 10 - Detectors part II Descriptors Mani Golparvar-Fard Department of Civil and Environmental Engineering 3129D, Newmark Civil Engineering

More information

TEXTURE CLASSIFICATION METHODS: A REVIEW

TEXTURE CLASSIFICATION METHODS: A REVIEW TEXTURE CLASSIFICATION METHODS: A REVIEW Ms. Sonal B. Bhandare Prof. Dr. S. M. Kamalapur M.E. Student Associate Professor Deparment of Computer Engineering, Deparment of Computer Engineering, K. K. Wagh

More information

Click to edit title style

Click to edit title style Class 2: Low-level Representation Liangliang Cao, Jan 31, 2013 EECS 6890 Topics in Information Processing Spring 2013, Columbia University http://rogerioferis.com/visualrecognitionandsearch Visual Recognition

More information

The most cited papers in Computer Vision

The most cited papers in Computer Vision COMPUTER VISION, PUBLICATION The most cited papers in Computer Vision In Computer Vision, Paper Talk on February 10, 2012 at 11:10 pm by gooly (Li Yang Ku) Although it s not always the case that a paper

More information

Performance Evaluation of Scale-Interpolated Hessian-Laplace and Haar Descriptors for Feature Matching

Performance Evaluation of Scale-Interpolated Hessian-Laplace and Haar Descriptors for Feature Matching Performance Evaluation of Scale-Interpolated Hessian-Laplace and Haar Descriptors for Feature Matching Akshay Bhatia, Robert Laganière School of Information Technology and Engineering University of Ottawa

More information

Visual localization using global visual features and vanishing points

Visual localization using global visual features and vanishing points Visual localization using global visual features and vanishing points Olivier Saurer, Friedrich Fraundorfer, and Marc Pollefeys Computer Vision and Geometry Group, ETH Zürich, Switzerland {saurero,fraundorfer,marc.pollefeys}@inf.ethz.ch

More information

Prof. Feng Liu. Spring /26/2017

Prof. Feng Liu. Spring /26/2017 Prof. Feng Liu Spring 2017 http://www.cs.pdx.edu/~fliu/courses/cs510/ 04/26/2017 Last Time Re-lighting HDR 2 Today Panorama Overview Feature detection Mid-term project presentation Not real mid-term 6

More information

A Comparison and Matching Point Extraction of SIFT and ISIFT

A Comparison and Matching Point Extraction of SIFT and ISIFT A Comparison and Matching Point Extraction of SIFT and ISIFT A. Swapna A. Geetha Devi M.Tech Scholar, PVPSIT, Vijayawada Associate Professor, PVPSIT, Vijayawada bswapna.naveen@gmail.com geetha.agd@gmail.com

More information

EE368 Project Report CD Cover Recognition Using Modified SIFT Algorithm

EE368 Project Report CD Cover Recognition Using Modified SIFT Algorithm EE368 Project Report CD Cover Recognition Using Modified SIFT Algorithm Group 1: Mina A. Makar Stanford University mamakar@stanford.edu Abstract In this report, we investigate the application of the Scale-Invariant

More information

SIFT: SCALE INVARIANT FEATURE TRANSFORM SURF: SPEEDED UP ROBUST FEATURES BASHAR ALSADIK EOS DEPT. TOPMAP M13 3D GEOINFORMATION FROM IMAGES 2014

SIFT: SCALE INVARIANT FEATURE TRANSFORM SURF: SPEEDED UP ROBUST FEATURES BASHAR ALSADIK EOS DEPT. TOPMAP M13 3D GEOINFORMATION FROM IMAGES 2014 SIFT: SCALE INVARIANT FEATURE TRANSFORM SURF: SPEEDED UP ROBUST FEATURES BASHAR ALSADIK EOS DEPT. TOPMAP M13 3D GEOINFORMATION FROM IMAGES 2014 SIFT SIFT: Scale Invariant Feature Transform; transform image

More information

SEARCH BY MOBILE IMAGE BASED ON VISUAL AND SPATIAL CONSISTENCY. Xianglong Liu, Yihua Lou, Adams Wei Yu, Bo Lang

SEARCH BY MOBILE IMAGE BASED ON VISUAL AND SPATIAL CONSISTENCY. Xianglong Liu, Yihua Lou, Adams Wei Yu, Bo Lang SEARCH BY MOBILE IMAGE BASED ON VISUAL AND SPATIAL CONSISTENCY Xianglong Liu, Yihua Lou, Adams Wei Yu, Bo Lang State Key Laboratory of Software Development Environment Beihang University, Beijing 100191,

More information

Short Survey on Static Hand Gesture Recognition

Short Survey on Static Hand Gesture Recognition Short Survey on Static Hand Gesture Recognition Huu-Hung Huynh University of Science and Technology The University of Danang, Vietnam Duc-Hoang Vo University of Science and Technology The University of

More information

A Novel Extreme Point Selection Algorithm in SIFT

A Novel Extreme Point Selection Algorithm in SIFT A Novel Extreme Point Selection Algorithm in SIFT Ding Zuchun School of Electronic and Communication, South China University of Technolog Guangzhou, China zucding@gmail.com Abstract. This paper proposes

More information

Evaluation of Local Space-time Descriptors based on Cuboid Detector in Human Action Recognition

Evaluation of Local Space-time Descriptors based on Cuboid Detector in Human Action Recognition International Journal of Innovation and Applied Studies ISSN 2028-9324 Vol. 9 No. 4 Dec. 2014, pp. 1708-1717 2014 Innovative Space of Scientific Research Journals http://www.ijias.issr-journals.org/ Evaluation

More information

Sparse coding for image classification

Sparse coding for image classification Sparse coding for image classification Columbia University Electrical Engineering: Kun Rong(kr2496@columbia.edu) Yongzhou Xiang(yx2211@columbia.edu) Yin Cui(yc2776@columbia.edu) Outline Background Introduction

More information

Exploring Bag of Words Architectures in the Facial Expression Domain

Exploring Bag of Words Architectures in the Facial Expression Domain Exploring Bag of Words Architectures in the Facial Expression Domain Karan Sikka, Tingfan Wu, Josh Susskind, and Marian Bartlett Machine Perception Laboratory, University of California San Diego {ksikka,ting,josh,marni}@mplab.ucsd.edu

More information

Visual Object Recognition

Visual Object Recognition Perceptual and Sensory Augmented Computing Visual Object Recognition Tutorial Visual Object Recognition Bastian Leibe Computer Vision Laboratory ETH Zurich Chicago, 14.07.2008 & Kristen Grauman Department

More information

SCALE INVARIANT FEATURE TRANSFORM (SIFT)

SCALE INVARIANT FEATURE TRANSFORM (SIFT) 1 SCALE INVARIANT FEATURE TRANSFORM (SIFT) OUTLINE SIFT Background SIFT Extraction Application in Content Based Image Search Conclusion 2 SIFT BACKGROUND Scale-invariant feature transform SIFT: to detect

More information

A NEW FEATURE BASED IMAGE REGISTRATION ALGORITHM INTRODUCTION

A NEW FEATURE BASED IMAGE REGISTRATION ALGORITHM INTRODUCTION A NEW FEATURE BASED IMAGE REGISTRATION ALGORITHM Karthik Krish Stuart Heinrich Wesley E. Snyder Halil Cakir Siamak Khorram North Carolina State University Raleigh, 27695 kkrish@ncsu.edu sbheinri@ncsu.edu

More information

Fuzzy based Multiple Dictionary Bag of Words for Image Classification

Fuzzy based Multiple Dictionary Bag of Words for Image Classification Available online at www.sciencedirect.com Procedia Engineering 38 (2012 ) 2196 2206 International Conference on Modeling Optimisation and Computing Fuzzy based Multiple Dictionary Bag of Words for Image

More information

Extracting Spatio-temporal Local Features Considering Consecutiveness of Motions

Extracting Spatio-temporal Local Features Considering Consecutiveness of Motions Extracting Spatio-temporal Local Features Considering Consecutiveness of Motions Akitsugu Noguchi and Keiji Yanai Department of Computer Science, The University of Electro-Communications, 1-5-1 Chofugaoka,

More information

Invariant Features of Local Textures a rotation invariant local texture descriptor

Invariant Features of Local Textures a rotation invariant local texture descriptor Invariant Features of Local Textures a rotation invariant local texture descriptor Pranam Janney and Zhenghua Yu 1 School of Computer Science and Engineering University of New South Wales Sydney, Australia

More information

Recognition with Bag-ofWords. (Borrowing heavily from Tutorial Slides by Li Fei-fei)

Recognition with Bag-ofWords. (Borrowing heavily from Tutorial Slides by Li Fei-fei) Recognition with Bag-ofWords (Borrowing heavily from Tutorial Slides by Li Fei-fei) Recognition So far, we ve worked on recognizing edges Now, we ll work on recognizing objects We will use a bag-of-words

More information

Bag of Words Models. CS4670 / 5670: Computer Vision Noah Snavely. Bag-of-words models 11/26/2013

Bag of Words Models. CS4670 / 5670: Computer Vision Noah Snavely. Bag-of-words models 11/26/2013 CS4670 / 5670: Computer Vision Noah Snavely Bag-of-words models Object Bag of words Bag of Words Models Adapted from slides by Rob Fergus and Svetlana Lazebnik 1 Object Bag of words Origin 1: Texture Recognition

More information

Image Processing. Image Features

Image Processing. Image Features Image Processing Image Features Preliminaries 2 What are Image Features? Anything. What they are used for? Some statements about image fragments (patches) recognition Search for similar patches matching

More information

Shape Descriptor using Polar Plot for Shape Recognition.

Shape Descriptor using Polar Plot for Shape Recognition. Shape Descriptor using Polar Plot for Shape Recognition. Brijesh Pillai ECE Graduate Student, Clemson University bpillai@clemson.edu Abstract : This paper presents my work on computing shape models that

More information

CPPP/UFMS at ImageCLEF 2014: Robot Vision Task

CPPP/UFMS at ImageCLEF 2014: Robot Vision Task CPPP/UFMS at ImageCLEF 2014: Robot Vision Task Rodrigo de Carvalho Gomes, Lucas Correia Ribas, Amaury Antônio de Castro Junior, Wesley Nunes Gonçalves Federal University of Mato Grosso do Sul - Ponta Porã

More information

Basic Problem Addressed. The Approach I: Training. Main Idea. The Approach II: Testing. Why a set of vocabularies?

Basic Problem Addressed. The Approach I: Training. Main Idea. The Approach II: Testing. Why a set of vocabularies? Visual Categorization With Bags of Keypoints. ECCV,. G. Csurka, C. Bray, C. Dance, and L. Fan. Shilpa Gulati //7 Basic Problem Addressed Find a method for Generic Visual Categorization Visual Categorization:

More information

Computer Vision for HCI. Topics of This Lecture

Computer Vision for HCI. Topics of This Lecture Computer Vision for HCI Interest Points Topics of This Lecture Local Invariant Features Motivation Requirements, Invariances Keypoint Localization Features from Accelerated Segment Test (FAST) Harris Shi-Tomasi

More information

Scene Classification with Low-dimensional Semantic Spaces and Weak Supervision

Scene Classification with Low-dimensional Semantic Spaces and Weak Supervision Scene Classification with Low-dimensional Semantic Spaces and Weak Supervision Nikhil Rasiwasia Nuno Vasconcelos Department of Electrical and Computer Engineering University of California, San Diego nikux@ucsd.edu,

More information

Feature Based Registration - Image Alignment

Feature Based Registration - Image Alignment Feature Based Registration - Image Alignment Image Registration Image registration is the process of estimating an optimal transformation between two or more images. Many slides from Alexei Efros http://graphics.cs.cmu.edu/courses/15-463/2007_fall/463.html

More information

CS 231A Computer Vision (Fall 2011) Problem Set 4

CS 231A Computer Vision (Fall 2011) Problem Set 4 CS 231A Computer Vision (Fall 2011) Problem Set 4 Due: Nov. 30 th, 2011 (9:30am) 1 Part-based models for Object Recognition (50 points) One approach to object recognition is to use a deformable part-based

More information

A region-dependent image matching method for image and video annotation

A region-dependent image matching method for image and video annotation A region-dependent image matching method for image and video annotation Golnaz Abdollahian, Murat Birinci, Fernando Diaz-de-Maria, Moncef Gabbouj,EdwardJ.Delp Video and Image Processing Laboratory (VIPER

More information